ePoPE Download and Supplemental Material



ePoPE: efficient Prediction of Paralog Evolution

ePoPE is a dynamic programming algorithm designed to efficiently trace back the last common ancestor of a gene family and to automatically annotate gains and losses of its paralogs at the inner nodes of a given phylogenetic tree.

ePoPE is implemented in standard C and is available under the GNU General Public License.
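
The dynamic programming idea can be illustrated with a textbook Sankoff-style parsimony pass over paralog copy numbers. The following is a minimal Python sketch, not ePoPE's C implementation: the function name sankoff_copy_numbers, the dictionary-based tree, and the unit cost per gained or lost copy are assumptions made for illustration only, and the partition function variant is not covered.

```python
from math import inf


def sankoff_copy_numbers(tree, leaf_counts, root, gain_cost=1.0, loss_cost=1.0):
    """Textbook Sankoff parsimony on paralog copy numbers (illustrative only).

    tree        : dict mapping each inner node to the list of its children
    leaf_counts : dict mapping each leaf to its observed number of paralogs
    Returns (minimal total cost, dict node -> inferred copy number).
    """
    states = range(max(leaf_counts.values()) + 1)  # candidate copy numbers

    def edge_cost(parent_n, child_n):
        # Assumed cost model: every gained or lost copy is charged separately.
        diff = child_n - parent_n
        return diff * gain_cost if diff > 0 else -diff * loss_cost

    cost = {}    # cost[node][n]: cheapest subtree cost if node carries n copies
    choice = {}  # choice[(node, child)][n]: best child state given n at node

    def fill(node):
        # Bottom-up pass: leaves first, then inner nodes.
        children = tree.get(node, [])
        if not children:
            cost[node] = [0.0 if n == leaf_counts[node] else inf for n in states]
            return
        cost[node] = [0.0 for _ in states]
        for child in children:
            fill(child)
            best = []
            for n in states:
                m = min(states, key=lambda m: edge_cost(n, m) + cost[child][m])
                best.append(m)
                cost[node][n] += edge_cost(n, m) + cost[child][m]
            choice[(node, child)] = best

    def trace(node, n, assignment):
        # Back recursion: follow the stored choices from the root downwards.
        assignment[node] = n
        for child in tree.get(node, []):
            trace(child, choice[(node, child)][n], assignment)

    fill(root)
    root_n = min(states, key=lambda n: cost[root][n])
    assignment = {}
    trace(root, root_n, assignment)
    return cost[root][root_n], assignment


# Toy tree: root has children A and inner; inner has children B and C.
toy_tree = {"root": ["A", "inner"], "inner": ["B", "C"]}
toy_counts = {"A": 0, "B": 2, "C": 1}
total, copies = sankoff_copy_numbers(toy_tree, toy_counts, "root")
print(total, copies)
```

In this made-up example the cheapest labelling places one copy at the inner node, so one gain is charged on the branch leading to it and one more on the branch to B. The weights ePoPE actually uses (see the -w option) may differ from the unit costs assumed here.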

Download:

Latest release including an improved partition function back recursion and a new summarizing script.
ePoPE_2.1.tar.gz

All results for the Sankoff and partition function variants, and the final annotated metazoan tree.
supplements.tar.gz

Former release including new partition function variant and Newick tree parser.
ePoPE_2.0.tar.gz

Former release including minor bug fixes and an additional option in the summarize script to adjust the file endings of the individual runs.
ePoPE_1.1.tar.gz

Original release:
ePoPE_1.0.tar.gz

Contents

  1. Installation + Tutorial for 1 alignment
  2. Tutorial for >1 alignment
  3. Usage information
  4. Supplemental Material


Installation

Download the program and save it into a directory of your choice.

Unpack the archive (release 1.0 is shown here; substitute the file name of the release you downloaded):

$ tar -xzvf ePoPE_1.0.tar.gz

Install the program:

$ make

Test the installation with the example:

$ ./ePoPE -i example/example.stk -t example/example.tree.dat -p example/example.ePoPE.out.ps -o example/example.ePoPE.out

You may copy the ePoPE binary and ePoPE.summarize.pl into your bin directory, or set an alias that points to your installation.

Short tutorial for more than 1 alignment

ePoPE can be applied to a set of multiple sequence alignments. If the respective gene families are related, i.e. they all belong to a certain class of genes such as miRNAs or snoRNAs, the output of these ePoPE runs can be summarized using the provided Perl script ePoPE.summarize.pl:

	   $ ./ePoPE.summarize.pl -h
           
           Usage:
               ePoPE.summarize.pl -d DIR -o FILE [-e STR]

           Options:
               -help   Print a brief help message and exits.

               -e STR  String defining the ending of the output files from your
                       individual ePoPE runs. (Default: "ePoPE.out"). [OPTIONAL]

               -d DIR  the directory that contains the data output files of ePoPE.
                       The ending of the files is the ending you provide with -e
                       option. [REQUIRED]

               -o FILE the output file for the final summarized tree data. [REQUIRED]

	  

The summarized output is written to the file given with the -o option. It is a space-separated file containing a list of nodes with all computed labels, so it can be opened in any text editor or imported into a spreadsheet for further analysis.
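
The individual runs that feed ePoPE.summarize.pl can be scripted. The sketch below only builds the command lines and does not execute anything; it uses the options documented on this page (-i, -t, -o, -p for ePoPE and -d, -o, -e for the summarize script). The helper name plan_runs, the directories aln/ and runs/, and the output naming scheme (alignment name plus the ending ePoPE.out, mirroring the installation example) are assumptions for illustration.

```python
from pathlib import PurePosixPath
import shlex


def plan_runs(alignments, tree_file, run_dir, summary_file,
              ending="ePoPE.out", epope="./ePoPE",
              summarize="./ePoPE.summarize.pl"):
    """Return the command lines for one ePoPE run per alignment plus the
    final ePoPE.summarize.pl call. Nothing is executed here."""
    commands = []
    for aln in alignments:
        stem = PurePosixPath(aln).stem  # e.g. "fam1" for "aln/fam1.stk"
        out = PurePosixPath(run_dir) / f"{stem}.{ending}"
        # Every run uses the same tree file, as the summary step expects.
        commands.append([epope, "-i", str(aln), "-t", str(tree_file),
                         "-o", str(out), "-p", f"{out}.ps"])
    commands.append([summarize, "-d", str(run_dir), "-o", str(summary_file),
                     "-e", ending])
    return commands


# Hypothetical input files; replace them with your own alignments and tree.
plan = plan_runs(["aln/fam1.stk", "aln/fam2.stk"], "treefile", "runs",
                 "ePoPE.summarize.out")
for cmd in plan:
    print(shlex.join(cmd))
```

To actually run the plan, each entry could be passed to subprocess.run(cmd, check=True) from the Python standard library.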

The summarized ePoPE output can be visualized by calling ePoPE again. First, the output of ePoPE.summarize.pl needs to be sorted:

$ sort -nk 2 ePoPE.summarize.out > ePoPE.summarize.sort.out

Then call the program with the sorted output and the same tree that was used when applying ePoPE to the individual alignments:

$ ./ePoPE -c ePoPE.summarize.sort.out -t treefile -p ePoPE.summarize.out.ps --type all -o ePoPE.summarize.out.ePoPE.out

The file ePoPE.summarize.out.ps is the final summarized PostScript file.
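
The sort step above orders the summary lines numerically by their second column. Where a POSIX sort is not at hand, a rough Python equivalent could look like the sketch below. It is a simplification: the function name sort_by_second_field is made up, the example rows are invented and not real ePoPE output, ties keep their input order, and a field that is not a plain number is treated as zero.

```python
def sort_by_second_field(lines):
    """Order lines by the numeric value of their second whitespace-separated
    field, roughly like `sort -nk 2`."""
    def key(line):
        fields = line.split()
        try:
            return float(fields[1])
        except (IndexError, ValueError):
            return 0.0  # fields that are not plain numbers count as zero
    return sorted(lines, key=key)


rows = ["x 10 a", "y 2 b", "z 1 c"]  # made-up rows, not real ePoPE output
print(sort_by_second_field(rows))
```

Applied to the lines of ePoPE.summarize.out, the result would be written to ePoPE.summarize.sort.out before calling ePoPE with -c.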

Usage

ePoPE prints its help message when called without any arguments or with the -h option.

$ ./ePoPE
	  +---------------------------------------------------+
	  | ePoPE 1.0                                         |
	  |                                                   |
	  | ePoPE - efficient Prediction of Paralog Evolution |
	  +===================================================+

	  ePoPE predicts a maximal parsimony solution of gain and loss events of a gene family with paralogs.

	  Usage: ePoPE [ arguments ] -i ALNFILE -t TREEFILE

	  arguments: [-w WEIGHTFILE]
          [-o OUTFILE] [-p PS-OUTFILE]
          [-c COLLECTFILE]
          [-h,--help] [-v,--version]
          [--type TYPE]

	  -i FILE              Input alignment FILE in CLUSTALW/STOCKHOLM format. [REQUIRED]
	  -t FILE              Input tree FILE see example.tree.dat format. [REQUIRED]
	  -w FILE              Input weight array FILE. [OPTIONAL]

	  -o FILE              Output FILE for tree data. Default is 'INFILE.dat'. [OPTIONAL]

	  -p FILE              Output FILE for PS-tree data. Default is 'INFILE.ps'. [OPTIONAL]

	  -c FILE              FILE is a collection of calls to ePoPE with the same tree on a set of gene families
	                       created via 'ePoPE.summarize.pl'. This option forces ePoPE to draw this summarized
	                       tree. You must provide the tree file you used for the single ePoPE calls with -t option.
	                       [OPTIONAL]

	  --type TYPE          TYPE is one of {genes, gainFam, lossFam, gain, loss, all}. It is the type of values that
	                       are plotted in the tree. Default: 'all'. [OPTIONAL]

	  -h,--help            Show this help message.
	  -v,--version         Show version information.

	  Example call:

	  ./ePoPE -i example/example.stk  -t example/example.tree.dat -p example/example.ePoPE.out.ps -o example/example.ePoPE.out --type all


	  Please feel free to contact me for comments, bug-reports, etc.

	  --
	  ePoPE 1.0

	  Author: Jana Hertel
          jana@bioinf.uni-leipzig.de

	  Date:    November, 2014
	    
	  

Jana Hertel