The program shorah.py

shorah.py runs all the steps in succession. Similarly to the Local analysis, we assume that the package has been downloaded and extracted in the directory sho_bin, while the data we want to analyse are in a different folder named experiments/454_test. In the next panel we copy the files to the running directory and start the global analysis.

Running the global analysis
[user@host 454_test]$ cp path_to_sho_bin/shorah-0.4/sample_454.fasta ./
[user@host 454_test]$ cp path_to_sho_bin/shorah-0.4/ref_genome.fasta ./
[user@host 454_test]$ ls
ref_genome.fasta sample_454.fasta
[user@host 454_test]$ path_to_sho_bin/shorah-0.4/shorah.py -f sample_454.fasta -r ref_genome.fasta -j 1000 -w 90 -a 0.1 -k &> global.log
[user@host 454_test]$ ls -lrhtp
total 4.6M
-rw-r--r-- 1 user user 324K Jul 21 15:41 sample_454.fasta
-rw-r--r-- 1 user user   377 Jul 21 15:41 ref_genome.fasta
-rw-r--r-- 1 user user 1.5M Jul 22 15:27 tmp_align_f.needle
-rw-r--r-- 1 user user 1.6M Jul 22 15:27 tmp_align_r.needle
drwxr-xr-x 2 user user 4.0K Jul 22 15:28 support
drwxr-xr-x 2 user user 4.0K Jul 22 15:28 sampling
drwxr-xr-x 2 user user 4.0K Jul 22 15:28 freq
drwxr-xr-x 2 user user 4.0K Jul 22 15:28 debug
drwxr-xr-x 2 user user 4.0K Jul 22 15:28 corrected
drwxr-xr-x 2 user user 4.0K Jul 22 15:28 raw_reads
-rw-r--r-- 1 user user 441K Jul 22 15:28 sample_454.far
-rw-r--r-- 1 user user 2.9K Jul 22 15:28 s2f.log
-rw-r--r-- 1 user user 6.6K Jul 22 15:29 sample_454_cor.rest
-rw-r--r-- 1 user user 330K Jul 22 15:29 sample_454_cor.read
-rw-r--r-- 1 user user 5.4K Jul 22 15:29 sample_454_cor.geno
-rw-r--r-- 1 user user 364K Jul 22 15:29 sample_454_cor.fas
-rw-r--r-- 1 user user   162 Jul 22 15:29 proposed.dat
-rw-r--r-- 1 user user   11K Jul 22 15:29 dec.log
-rw-r--r-- 1 user user 1.7K Jul 22 15:29 shorah.log
-rw-r--r-- 1 user user 5.6K Jul 22 15:29 sample_454_cor.popl
-rw-r--r-- 1 user user   21K Jul 22 15:29 global.log

Note that we have redirected the large output of shorah.py and the programs called therein to the file global.log. The final result of the analysis is in the file sample_454_cor.popl, the first lines of which are reported here.

>HAP0_0.708656
TC-AA--A--TCACTCTTTGGCAACGACCCCTTGTC-AC-AA-T-A-A---AAG----T-AGGAGGGCAACTAAAGGAAGCTCTATTAGATAC-GGGAGCAGATGATACAGTATTAGAAG-AAAT-AG-AG--T-TG------CCAGGAAGATGGAAACCAAAAATGATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAAATAGTCATAGAAATTTGTG-G---AAAG-AAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGA-AA-TCTGTTGAC-TCAGATTGGTTGCACTTTAAATTTTCCCATTAGTCCTATTGAAACTGTACCAGTAAAATTGAAGCCAGGAATGGATGGCCC
>HAP1_0.194
TC-TG--A--TCACTCTTTGGCAGCGACCCCTCGTC-AC-AA-T-A-A---AGA----T-AGGGGGGCAACTAAAGGAAGCTCTATTAGATAC-AGGAGCAGATGATACAGTGTTAGAAG-AAAT-GA-AT--T-TA------CCAGGAAGATGGAAACCAAAAATGATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGATACCCATAGAAGTTTGTG-G---ACAT-AAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGG-AA-TCTGTTGAC-CGAGATTGGTTGCACTTTAAATTTTCCCATTAGTCCTATTGAAACTGTACCAGTAAAATTAAAGCCGGGAATGGATGGCCC
>HAP2_0.0633444
TC-AA--A--TCACTCTTTGGCAACGACCCCTTGTC-AC-AA-T-A-A---AG-----T-AGGAGGGCAACTAAAGGAAGCTCTATTAGATAC-GGGAGCAGATGATACAGTATTAGAAG-AAAT-AG-AG--T-TG------CCAGGAAGATGGAAACCAAAAATGATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAAATAGTCATAGAAATTTGTG-G---AAAG-AAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGA-AA-TCTGTTGAC-TCAGATTGGTTGCACTTTAAATTTTCCCATTAGTCCTATTGAAACTGTACCAGTAAAATTGAAGCCAGGAATGGATGGCCC

This file contains the sequences of the haplotypes reconstructed by the program mm.py, together with their frequencies as estimated by the program freqEst.
The name of each sequence is in the format HAPn_freq, where n is an ordinal number and freq is the frequency. Haplotypes are sorted in descending order of frequency.
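
The HAPn_freq naming convention makes the file easy to post-process. The following Python sketch is not part of ShoRAH; it assumes only what is described above, namely a FASTA-like file whose header lines start with ">" and follow the HAPn_freq pattern. The names Haplotype and read_popl are hypothetical helpers introduced for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Haplotype(NamedTuple):
    """One record of a .popl file (hypothetical helper type, not part of ShoRAH)."""
    index: int        # the ordinal number n in HAPn_freq
    frequency: float  # the freq part of HAPn_freq
    sequence: str     # haplotype sequence, gap characters ('-') kept as-is


def _make_record(header: str, chunks: List[str]) -> Haplotype:
    # Split "HAP0_0.708656" into "HAP0" and "0.708656" at the first underscore.
    name, _, freq = header.partition("_")
    if not name.startswith("HAP") or not freq:
        raise ValueError("unexpected header: %r" % header)
    return Haplotype(int(name[3:]), float(freq), "".join(chunks))


def read_popl(lines: Iterable[str]) -> Iterator[Haplotype]:
    """Yield Haplotype records from the lines of a FASTA-like .popl file."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _make_record(header, chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; collect all of them.
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)
```

With the run shown above, one would open sample_454_cor.popl and pass the file object to read_popl; the first record would then carry index 0 and frequency 0.708656. The parser takes any iterable of lines rather than a path, so it can be used on a file object or on a list of strings alike.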