JERARCA HOWTO

NAME
    JERARCA - Iterative hierarchical clustering utility

SYNOPSIS
    Linux users, from the console:

        jerarca <Graph file> <Iterative algorithm> <Tree algorithm> <Iterations>

    Windows users, from a command window opened in the directory where
    Jerarca is located:

        Jerarca.exe <Graph file> <Iterative algorithm> <Tree algorithm> <Iterations>

DESCRIPTION
    This page documents Jerarca, a suite of algorithms designed to
    efficiently convert unweighted undirected graphs into hierarchical
    trees by means of iterative hierarchical clustering. An iterative
    algorithm is used to create a matrix of distances between every pair
    of nodes of the graph. Then, a phylogenetic algorithm builds a
    hierarchical tree based on those distances. Once the tree is created,
    the program reads the dendrogram and extracts the partition of nodes
    that best represents the community structure of the graph.

OPTIONS
    Four input parameters must be given, in the order shown in the
    SYNOPSIS:

    Graph file:
        File containing a list of edges. Each edge is represented by a
        pair of nodes separated by a tab or a space.

    Iterative algorithm:
        Name of the iterative algorithm that will be run to create the
        matrix of distances among the nodes of the graph: RCluster,
        UVCluster, SCluster or all three of them. Four options are
        valid: r, uv, s and all.

    Tree algorithm:
        Name of the phylogenetic algorithm that will be used to build
        the tree from the matrix of distances: UPGMA, Neighbor-Joining
        or both. Three options are valid: u, nj and all.

    Iterations:
        Number of iterations that the iterative algorithm will perform.

OUTPUT FILES
    filename_tree_IterativeAlg_TreeAlg.nwk:
        Computed tree structure of the graph in Newick format.

    filename_partitionH_IterativeAlg_TreeAlg.txt,
    filename_partitionQ_IterativeAlg_TreeAlg.txt:
        File containing the most modular partition of the graph, based
        either on the cumulative hypergeometric distribution of the
        links (H) or on the Modularity (Q).

    filename_partitionH_IterativeAlg_TreeAlg.meg,
    filename_partitionQ_IterativeAlg_TreeAlg.meg:
        File that can be directly imported into the phylogenetic package
        MEGA 4. The file includes the matrix of distances among nodes
        and the most modular distribution of nodes into clusters, based
        either on the cumulative hypergeometric distribution of the
        links (H) or on the Modularity (Q).

    filename_partitionH_IterativeAlg_TreeAlg.att,
    filename_partitionQ_IterativeAlg_TreeAlg.att:
        File that can be imported into Cytoscape as node attributes.
        Each node is assigned to the cluster defined by the most modular
        partition of the tree, based either on the cumulative
        hypergeometric distribution of the links (H) or on the
        Modularity (Q).

USAGE EXAMPLES
    (Linux users)

        jerarca saccharomyces_interactome.tab s u 60000

    Jerarca will perform 60000 iterations of the SCluster algorithm to
    compute the matrix of distances between pairs of nodes, and then the
    UPGMA algorithm will be used to build a tree based on those
    distances.

    (Windows users)

        Jerarca.exe mitochondrial_ribosome.tab all nj 1000

    Jerarca will perform 1000 iterations of each algorithm (RCluster,
    UVCluster and SCluster) and will compute their respective matrices
    of distances between pairs of nodes. Then, the Neighbor-Joining
    algorithm will be used to build a tree based on each of those
    matrices.
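
PRE-FLIGHT CHECK (ILLUSTRATIVE SKETCH)
    The following Python sketch is not part of Jerarca. It only shows
    one way to check an edge list and the four parameters before
    launching a long run. The function names check_edge_lines and
    build_command are hypothetical helpers introduced here. The sketch
    assumes that blank lines in the graph file can be skipped and that
    a positive iteration count is required; this page states neither.
    It also splits on any run of whitespace, which is slightly more
    permissive than the single tab or space described under OPTIONS.

```python
# Option letters taken from the OPTIONS section of this page.
ITERATIVE_OPTIONS = {"r", "uv", "s", "all"}
TREE_OPTIONS = {"u", "nj", "all"}


def check_edge_lines(lines):
    """Parse edge-list lines into (node, node) pairs.

    Assumption: blank lines are ignored. Every other line must hold
    exactly two whitespace-separated node names.
    """
    edges = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are harmless
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                "line %d: expected 2 node names, found %d" % (number, len(fields))
            )
        edges.append((fields[0], fields[1]))
    return edges


def build_command(graph_file, iterative, tree, iterations, executable="jerarca"):
    """Return the argument list in the order given in the SYNOPSIS."""
    if iterative not in ITERATIVE_OPTIONS:
        raise ValueError("iterative algorithm must be one of r, uv, s, all")
    if tree not in TREE_OPTIONS:
        raise ValueError("tree algorithm must be one of u, nj, all")
    if int(iterations) <= 0:
        # assumption: a non-positive iteration count is never intended
        raise ValueError("iterations must be a positive integer")
    return [executable, graph_file, iterative, tree, str(int(iterations))]


if __name__ == "__main__":
    sample = ["YAL001C\tYBR123C", "YBR123C YDR362C", ""]
    print(check_edge_lines(sample))
    print(build_command("saccharomyces_interactome.tab", "s", "u", 60000))
```

    The returned list can be handed to subprocess.run from the Python
    standard library if the jerarca executable is on the PATH. On
    Windows, pass executable="Jerarca.exe" and run from the directory
    where Jerarca is located.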