Commit 2f6ecbfc authored by Roman Sarrazin-Gendron

updated README

parent 6594d56b
Thank you for using BayesPairing!
This software requires Python 3.5, networkx 1.9, numpy, and pickle, as well as pgmpy, which is included in the repository.
The pgmpy scripts can be found in src/pgmpy, updated from GitHub: https://github.com/pgmpy/pgmpy
RNAfold must also be in your PATH when you run BayesPairing.
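Because RNAfold has to be reachable through your PATH, it can help to check for it before launching a run. The following is a minimal sketch using only the Python standard library; the helper name missing_tools is hypothetical and is not part of BayesPairing.

```python
import shutil


def missing_tools(tools=("RNAfold",)):
    """Return the names of the executables that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required executables were found on PATH.")
```

If RNAfold is reported as missing, add the directory containing the executable to your PATH before running BayesPairing.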
BayesPairing is used through two Python scripts.
parse_sequences.py runs BayesPairing on a set of sequences and searches them for a given set of motifs.
It accepts the following arguments. Only -seq is required.
  --verbose   increase output verbosity
  -sm SM      sample more minor sequences by making the regex more flexible
              (include more nucleotides) [0.1 to 0.9, 0.9 being the most strict]
  -mc MC      allow more substitutions for the regex [1 to 3]
  -m M        number of folded candidates, must be larger than n
  -n N        number of outputted candidates
  -seq SEQ    sequences to parse, FASTA file or sequence string
  -ss SS      optional secondary structures, FASTA file
  -d D        dataset, as a pickle. Default is the dataset presented in the paper
  -k K        number of times the Bayes Net should be sampled
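The constraints stated in the option list (-m larger than -n, -sm between 0.1 and 0.9, -mc between 1 and 3) can be checked before the script is launched. The sketch below assembles an argument list from the flags listed above and validates those ranges. The helper build_command and its keyword names are hypothetical, and the sketch assumes parse_sequences.py is in the current directory; it only builds the command and does not run it.

```python
import sys


def build_command(seq, dataset=None, ss=None, sm=None, mc=None,
                  m=None, n=None, k=None, verbose=False,
                  script="parse_sequences.py"):
    """Build an argument list for parse_sequences.py (hypothetical helper)."""
    # Ranges below are taken from the option descriptions in this README.
    if m is not None and n is not None and m <= n:
        raise ValueError("-m must be larger than -n")
    if sm is not None and not 0.1 <= sm <= 0.9:
        raise ValueError("-sm must be between 0.1 and 0.9")
    if mc is not None and mc not in (1, 2, 3):
        raise ValueError("-mc must be between 1 and 3")

    cmd = [sys.executable, script, "-seq", seq]
    options = (("-d", dataset), ("-ss", ss), ("-sm", sm), ("-mc", mc),
               ("-m", m), ("-n", n), ("-k", k))
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    # Example values only; the sequence is a short placeholder.
    print(" ".join(build_command("GGGAAACCC", m=10, n=5)))
```

The resulting list can be passed to subprocess.run if you want to launch the script from Python rather than from the shell.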
To test parse_sequences.py, you can try the following command, which runs the default 40 modules on a test cDNA sequence:
python parse_sequences.py -d all_carnaval -seq ATGGCTCAGGAGACTAACCAGACCCCGGGGCCCATGCTGTGTAGCACAGGATGTGGCTTTTATGGAAATCCTAGGACAAATGGAATGTGTTCAGTTTGCTACAAAGAACATCTTCAGAGGCAGCAGAATAGTGGCAGAATGAGCCCAATGGGGACAGCTAGTGGTTCCAACAGTCCTACCTCAGATTCTGCATCTGTACAGAGAGCAGACACTAGCTTAAACAACTGTGAAGGTGCTGCTGGCAGCACATCTGAAAAATCAAGAAATGTGCCTGCGGCTGCCTTGCCTGTAACTCAGCAAATGACAGAAATGAGCATTTCAACAGAGGACAAAATAACTACCCCGAAAACAGAGGTGTGAGAGCCAGTTGTCACTCAGCCCAGCCCATCAGTTTTTCAGCCCAGTACTTCTCAGAGTAAAGAAAAAGCTCCTGAATTGCCCAAACCAAAGAAAAACAGATGTTTCATGTGCAGAAAGAAAGTTGGTCTTACAGGTTTGACTGCCGATGTGGAAATTTGTTTTGTGGACTTCACCGTTAACTCTGACAAGCACAACTGTCCGTATGATTACAAAGCAGAAGCTGCAGCAAAAATCAGAAAAGAGAATCCAGTTGTTGTGGCTGAAAAAATTCAGAGAATA
parse_sequences.py returns the score and the insertion position of the N best candidates.
module_from_desc.py is used to build a dataset to pass as the -d option of parse_sequences.py.
You can call it with a .DESC file, which describes a graph, and a .fasta file of sequences matching this structure.
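Before building a dataset, it can be useful to confirm that the .fasta file is well-formed. The sketch below is a minimal FASTA reader written for illustration only; the function read_fasta is hypothetical and is not the parser used by module_from_desc.py. It makes no assumption about the .DESC format, which is not described here.

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = [">seq1", "GGGAAA", "CCC", ">seq2", "AUGC"]
    for name, sequence in read_fasta(example):
        print(name, len(sequence))
```

Because the function accepts any iterable of lines, an open file handle can be passed to it directly.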