generator
sequence generator
usage
SeqBox generator
usage: SeqBox generator [-h] [-out OUT] [-od OUT_DIR]
                        [-gt {random,norepeat,distance}] [-gn GNUMBER]
                        [-gts GTIMES] [-gc GCHAR] [-gd GDISTANCE] [-gl GLEN]

optional arguments:
  -h, --help            show this help message and exit
  -out OUT, --out_file OUT
                        sequence out file with TSV format
  -od OUT_DIR, --out_dir OUT_DIR
                        out direction
  -gt {random,norepeat,distance}, --gtype {random,norepeat,distance}
  -gn GNUMBER, --gnumber GNUMBER
                        sequence number
  -gts GTIMES, --gtimes GTIMES
                        Number of runs
  -gc GCHAR, --gchar GCHAR
                        generator charset, like: ATCG
  -gd GDISTANCE, --gdistance GDISTANCE
                        distance
  -gl GLEN, --glen GLEN
                        generator sequence length
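The option list above can be mirrored with a small `argparse` parser. This is a hypothetical reconstruction from the help text only, not SeqBox's actual source: the function name `build_parser`, the `int` types, and the example values passed to `parse_args` (other than the length 35 and the file name `test_random.tsv`, which appear in the sample output) are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser rebuilt from the help text; not SeqBox's own code."""
    parser = argparse.ArgumentParser(prog="SeqBox generator")
    parser.add_argument("-out", "--out_file", dest="out",
                        help="sequence out file with TSV format")
    parser.add_argument("-od", "--out_dir", dest="out_dir",
                        help="out direction")
    parser.add_argument("-gt", "--gtype",
                        choices=["random", "norepeat", "distance"])
    # The int types below are assumptions; the help text does not state them.
    parser.add_argument("-gn", "--gnumber", type=int, help="sequence number")
    parser.add_argument("-gts", "--gtimes", type=int, help="Number of runs")
    parser.add_argument("-gc", "--gchar", help="generator charset, like: ATCG")
    parser.add_argument("-gd", "--gdistance", type=int, help="distance")
    parser.add_argument("-gl", "--glen", type=int,
                        help="generator sequence length")
    return parser


# Example invocation with illustrative values (the count 10 is made up).
args = build_parser().parse_args(
    ["-gt", "random", "-gl", "35", "-gn", "10", "-out", "test_random.tsv"]
)
print(args.gtype, args.glen, args.gnumber, args.out)
```

Short and long spellings of each option resolve to the same attribute, so `-gl 35` and `--glen 35` are interchangeable in this sketch.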
random
Randomly generated sequences; repetitions are possible.
API
from seqbox import SEQ
CLI
The main parameters are the length (-gl or --glen) and the number (-gn or --gnumber) of generated sequences.
output: test_random.tsv
#generator_random:seq_len:35;char_set:ATCG
GGTCCGACCTCATCTGGATGCTCCAATGTGGGCTG
AGGCATATGGATCGCCGACACCCGTGCTACAGTTA
TCAAGCGCGAACCGGGTACCTGCCGAAACCGTATA
AACAGTGTTGCGCAGTGCCTGCACTTAAACAAATC
GATATAGGGTCTCGTTAGTACGACGATTTCGCGAG
CCCACAGGTCGCAGACTCCGCTGTTGCTTGAAGGC
CGTTAAAGCTCAATCATCAACCCGATACGTTGTCT
GAGAGCCTAGAACAAGGTACACCGAAGACGAGACG
GCGCGGCTGTCCTTAGATATAGGTAGCAATACTGA
...
CGTACTGATCAAATAACCCCGCAGACGGGTAATGC
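For illustration, the random mode can be approximated in a few lines of plain Python. This is a minimal sketch and not SeqBox's implementation: the names `generate_random` and `format_tsv` and the `seed` parameter are hypothetical, and the header layout is simply copied from the sample output above.

```python
import random


def generate_random(length, number, charset="ATCG", seed=None):
    """Draw `number` sequences of `length` characters; duplicates may occur."""
    rng = random.Random(seed)  # seed is an illustrative extra for reproducibility
    return ["".join(rng.choice(charset) for _ in range(length))
            for _ in range(number)]


def format_tsv(seqs, gtype, length, charset):
    """Render a header line like the sample output, then one sequence per line."""
    header = f"#generator_{gtype}:seq_len:{length};char_set:{charset}"
    return "\n".join([header, *seqs]) + "\n"


seqs = generate_random(length=35, number=10, seed=0)
print(format_tsv(seqs, "random", 35, "ATCG"), end="")
```

Because every sequence is drawn independently, nothing prevents the same sequence from appearing twice, which matches the "repetitions are possible" description.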
norepeat
Randomly generated sequences with no repetitions.
API
CLI
The main parameters are the length (-gl or --glen) and the number (-gn or --gnumber) of generated sequences.
output: test_norepeat.tsv
#generator_norepeat:seq_len:35;char_set:ATCG
ACTAGATTTTGATTTGGTCCGGAGTTAGAGATCGT
GGGATCGAAAGGGGTCGCCTCTCTTGAGAGCATTG
GCTATTTATTCAAATAGACTATATACAACAGTACA
GGACCTGTAGCGGCGTAGAATGTGCTGTGATACGA
CCTTGGACAGTGGGGTATAACCTATGGTGTGAGTA
TCACCTTTATTCAGGCGTATCTACGGTACTATCAA
GTAGGGTTTCTACCGTTTGAGCATGTAGATGCCAT
GCTTAAGTGATGTAAGGTGGCTTACCATCATCGAA
...
CATACCACGTAACAACCCGTAGGTTCGCGTTAGGT
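One plausible reading of "non-repeating" is that no sequence appears twice in the output. Under that assumption, a simple rejection approach with a set works; the sketch below is illustrative only, and `generate_norepeat` and its parameters are hypothetical names rather than SeqBox's API.

```python
import random


def generate_norepeat(length, number, charset="ATCG", seed=None):
    """Draw `number` distinct sequences by rejecting duplicates."""
    alphabet = "".join(dict.fromkeys(charset))  # drop repeated characters
    # Guard against requests that can never be satisfied.
    if number > len(alphabet) ** length:
        raise ValueError("more sequences requested than distinct ones exist")
    rng = random.Random(seed)
    seen = set()
    result = []
    while len(result) < number:
        seq = "".join(rng.choice(alphabet) for _ in range(length))
        if seq not in seen:  # keep only sequences not produced before
            seen.add(seq)
            result.append(seq)
    return result


unique_seqs = generate_norepeat(length=35, number=10, seed=0)
print(len(unique_seqs), len(set(unique_seqs)))
```

Rejection sampling is cheap when the requested number is small relative to the number of possible sequences, as with length 35 over four characters. It slows down as the request approaches that limit.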
distance
Randomly generated sequences whose pairwise edit distances are all greater than a fixed value.
API
CLI
The main parameters are the length (-gl or --glen), the number (-gn or --gnumber), the distance (-gd or --gdistance), and the number of tries (-gts or --gtimes) of generated sequences.
output: test_distance.tsv
#generator_distance:seq_len:35;char_set:ATCG;distance:5
ACCATTAGCACCAACAGGCAAGCTCCTGCACGGTA
GTGCAGGCCCAACTTTCCCCACCTATAGGCTACGG
GACCGGGCGGGACTTTCGCCCAATCATCACATACC
AACCGGTAGTCGATGAGCGCTCATTAACACGAAGC
GTTCTGGTCATTTATCCTCCCTCAGGTACGGATTT
TTGCCGCTCAATTGAAAGGTACTGCCAGGAGTGTC
AGGCCAGAACGGATATACTAGTTGCTCCAACCTGA
ATTGACAGCAGGCGCAAGACATGCCCTAAGCCCTA
GTAACTATCCCGAGTCGACGCAGATTGTGCTTCGG
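The distance mode can be sketched as rejection sampling with a Levenshtein check: a candidate is kept only if its edit distance to every accepted sequence exceeds the threshold. The code below is an assumption-laden illustration, not SeqBox's implementation. In particular, treating the number of tries as a cap on the total number of candidates drawn is a guess, and `edit_distance`, `generate_distance`, and `max_tries` are hypothetical names.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def generate_distance(length, number, distance, charset="ATCG",
                      max_tries=10000, seed=None):
    """Collect up to `number` sequences with pairwise edit distance > `distance`.

    Stops after `max_tries` candidates (an assumed meaning of the tries option),
    so fewer than `number` sequences may be returned.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == number:
            break
        cand = "".join(rng.choice(charset) for _ in range(length))
        if all(edit_distance(cand, s) > distance for s in accepted):
            accepted.append(cand)
    return accepted


far_apart = generate_distance(length=12, number=5, distance=3, seed=0)
print(far_apart)
```

Each candidate is compared against all accepted sequences, so the cost grows quadratically with the output size. Capping the number of tries keeps the loop from running indefinitely when the constraints are too tight to satisfy.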