JERARCA HOWTO

   NAME

	JERARCA - Iterative hierarchical clustering utility



   SYNOPSIS

	Linux users from the console:
	jerarca  <Graph file>  <Iterative algorithm>  <Tree algorithm>  <Iterations>

	Windows users, from a command window opened in the directory where Jerarca is located:
	Jerarca.exe  <Graph file>  <Iterative algorithm>  <Tree algorithm>  <Iterations>



   DESCRIPTION

	This page documents Jerarca, a suite of algorithms designed to efficiently
	convert unweighted undirected graphs into hierarchical trees by means of
	iterative hierarchical clustering. First, an iterative algorithm computes a
	matrix of distances between every pair of nodes of the graph. Then, a
	phylogenetic algorithm builds a hierarchical tree from those distances. Once
	the tree is built, the program reads the dendrogram and extracts the
	partition of nodes that best represents the community structure of the graph.
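
	The tree-building step can be illustrated with a minimal sketch of textbook
	UPGMA in Python. This is not Jerarca's own code: the function name upgma, the
	node labels and the toy distance values are invented for illustration, and
	the real distances would come from the iterative algorithm.

```python
from itertools import combinations


def upgma(labels, dist):
    """Return a Newick string built with textbook UPGMA.

    labels: list of node names.
    dist:   dict mapping a pair (a, b) of node names to their distance.
    """
    size = {name: 1 for name in labels}        # leaves under each cluster
    height = {name: 0.0 for name in labels}    # height of each cluster root
    d = {frozenset(pair): value for pair, value in dist.items()}
    active = list(labels)

    while len(active) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(active, 2), key=lambda p: d[frozenset(p)])
        h = d[frozenset((a, b))] / 2.0
        merged = "(%s:%g,%s:%g)" % (a, h - height[a], b, h - height[b])
        size[merged] = size[a] + size[b]
        height[merged] = h

        # Distance to the merged cluster is the size-weighted average.
        active = [c for c in active if c not in (a, b)]
        for c in active:
            d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / size[merged]
        active.append(merged)

    return active[0] + ";"


# Toy distance matrix (invented values, not Jerarca output).
toy = {
    ("A", "B"): 2, ("A", "C"): 8, ("A", "D"): 8,
    ("B", "C"): 8, ("B", "D"): 8, ("C", "D"): 4,
}
print(upgma(["A", "B", "C", "D"], toy))  # ((A:1,B:1):3,(C:2,D:2):2);
```

	The result is a Newick string, the same format used by the .nwk output file.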



   OPTIONS

	All four input parameters are required, in the order shown in the SYNOPSIS:
	
	Graph file:
	  File containing a list of edges. Each edge is represented by a pair of nodes
	  separated by a tab or space.
	
	Iterative algorithm:
	  Iterative algorithm that will be run to create the matrix of distances
	  among the nodes of the graph.
	  Four options are valid: r (RCluster), uv (UVCluster), s (SCluster) and
	  all (all three algorithms).
	
	Tree algorithm:
	  Phylogenetic algorithm that will be used to build the tree from the matrix
	  of distances.
	  Three options are valid: u (UPGMA), nj (Neighbor-Joining) and all (both).
	
	Iterations:
	  Number of iterations that the iterative algorithm will perform.
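
	The parameters above can be prepared and checked before launching a run. The
	following Python sketch writes an edge list and assembles the command line.
	The helper names write_edge_list and build_command, the file name and the
	node names are hypothetical. The sketch also assumes one edge per line and a
	positive number of iterations, neither of which is stated above.

```python
import os
import tempfile

ITERATIVE_OPTIONS = {"r", "uv", "s", "all"}   # options listed in this HOWTO
TREE_OPTIONS = {"u", "nj", "all"}             # options listed in this HOWTO


def write_edge_list(path, edges):
    """Write each edge as two node names separated by a tab.

    One edge per line is an assumption of this sketch.
    """
    with open(path, "w") as handle:
        for source, target in edges:
            handle.write("%s\t%s\n" % (source, target))


def build_command(graph_file, iterative, tree, iterations, program="jerarca"):
    """Return the argument list for one Jerarca run, validating the options."""
    if iterative not in ITERATIVE_OPTIONS:
        raise ValueError("iterative algorithm must be one of r, uv, s, all")
    if tree not in TREE_OPTIONS:
        raise ValueError("tree algorithm must be one of u, nj, all")
    iterations = int(iterations)
    if iterations < 1:  # assumption: at least one iteration is needed
        raise ValueError("iterations must be a positive integer")
    return [program, graph_file, iterative, tree, str(iterations)]


# Hypothetical three-edge graph written to a temporary directory.
graph_path = os.path.join(tempfile.mkdtemp(), "example_graph.tab")
write_edge_list(graph_path, [("n1", "n2"), ("n2", "n3"), ("n1", "n3")])
command = build_command(graph_path, "s", "u", 60000)
print(" ".join(command))
```

	The resulting list can be passed to subprocess.run, using Jerarca.exe as the
	program name on Windows.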



   OUTPUT FILES

	filename_tree_IterativeAlg_TreeAlg.nwk:
	  Computed tree structure of the graph in Newick format.

	filename_partitionH_IterativeAlg_TreeAlg.txt, filename_partitionQ_IterativeAlg_TreeAlg.txt:
	  File containing the most modular partition of the graph, based either on
	  the cumulative hypergeometric distribution of the links (H) or on the
	  modularity (Q).
	
	filename_partitionH_IterativeAlg_TreeAlg.meg, filename_partitionQ_IterativeAlg_TreeAlg.meg:
	  File that can be directly imported into the phylogenetic package MEGA 4. The
	  file includes the matrix of distances among nodes and the most modular
	  distribution of nodes into clusters, based either on the cumulative
	  hypergeometric distribution of the links (H) or on the modularity (Q).
	
	filename_partitionH_IterativeAlg_TreeAlg.att, filename_partitionQ_IterativeAlg_TreeAlg.att:
	  File that can be imported into Cytoscape as node attributes. Each node is
	  assigned to the cluster defined by the most modular partition of the tree,
	  based either on the cumulative hypergeometric distribution of the links (H)
	  or on the modularity (Q).
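
	To make the Q criterion concrete, the sketch below computes the modularity
	of a partition using the standard Newman-Girvan definition. This HOWTO does
	not say how Jerarca evaluates Q internally, so treat the formula as an
	assumption. The function name, the toy graph and the cluster labels are
	invented, and the partition is given directly as a dictionary because the
	layout of the partition files is not described here.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman-Girvan modularity of a partition of an undirected graph.

    edges:      list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping each node to its cluster label.
    """
    m = len(edges)
    internal = defaultdict(int)   # edges with both ends in the cluster
    degree = defaultdict(int)     # sum of node degrees in the cluster
    for u, v in edges:
        degree[membership[u]] += 1
        degree[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single edge (invented example).
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
toy_partition = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
q = modularity(toy_edges, toy_partition)  # equals 5/14 for this graph
print(round(q, 4))
```

	Placing all six nodes in a single cluster gives Q = 0, so the two-triangle
	split is the more modular of the two partitions.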




   USAGE EXAMPLES

	(Linux users)
	jerarca  saccharomyces_interactome.tab  s  u  60000
	
	Jerarca performs 60000 iterations of the SCluster algorithm to compute the
	matrix of distances between pairs of nodes, then uses the UPGMA algorithm to
	build a tree from those distances.
	
	
	(Windows users)
	Jerarca.exe  mitochondrial_ribosome.tab  all  nj  1000
	
	Jerarca performs 1000 iterations of each iterative algorithm (RCluster,
	UVCluster and SCluster) and computes a separate matrix of distances between
	pairs of nodes for each one. The Neighbor-Joining algorithm then builds a
	tree from each of those matrices.
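
	The way the all option multiplies the work can be sketched in Python. The
	function expand_run and the two lookup tables are hypothetical helpers, not
	part of Jerarca. Treating "all all" as every combination of the three
	iterative algorithms with the two tree algorithms is an assumption that
	generalizes the example above.

```python
# Option letters and algorithm names as listed in this HOWTO.
ITERATIVE = {"r": "RCluster", "uv": "UVCluster", "s": "SCluster"}
TREE = {"u": "UPGMA", "nj": "Neighbor-Joining"}


def expand_run(iterative, tree):
    """List the (iterative algorithm, tree algorithm) pairs a run implies."""
    iterative_keys = list(ITERATIVE) if iterative == "all" else [iterative]
    tree_keys = list(TREE) if tree == "all" else [tree]
    return [(ITERATIVE[i], TREE[t]) for i in iterative_keys for t in tree_keys]


# The Windows example above: three distance matrices, one tree each.
for pair in expand_run("all", "nj"):
    print(pair)
```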
	