fasta2fasta.py - operate on sequences
- Tags
Sequences
Purpose
Perform operations (masking, renaming) on a stream of FASTA-formatted sequences.
Available edit operations are:
- translate
translate sequences using the standard genetic code.
- translate-to-stop
translate until first stop codon
- truncate-at-stop
truncate sequence at first stop codon
- back-translate
convert nucleotide sequence to peptide sequence. Requires a parameter naming a second fasta file with peptide sequences.
- mark-codons
adds a space after each codon
- apply-map
rename sequence identifiers from a given map. Requires a parameter with the filename of a map. The map is a tab-separated file mapping old to new names.
- build-map
rename sequence identifiers numerically and save the mapping in a tab-separated file. Requires a parameter with the filename of a map. The map is a tab-separated file mapping new to old names and will be newly created. Any existing file of the same name will be overwritten.
- pseudo-codons
translate, but keep register with codons
- interleaved-codons
mix amino acids and codons
- filter
remove sequences according to certain criteria. For example, --method=filter --filter-method=min-length=5 --filter-method=max-length=10
- map-codons
- remove-gaps
remove all gaps in the sequence
- mask-stops
mask all stop codons
- mask-seg
mask sequence by running seg
- mask-bias
mask sequence by running bias
- mask-codons
mask codon sequence given a masked amino acid sequence. Requires parameter with masked amino acids in fasta format.
- mask-incomplete-codons
mask codons that are partially masked or gapped
- mask-soft
combine hard-masked (NNN) sequences with unmasked sequences to generate a soft-masked sequence (masked regions in lower case)
- remove-stops
remove stop codons
- upper
convert sequence to upper case
- lower
convert sequence to lower case
- reverse-complement
build the reverse complement
- shuffle
shuffle each sequence
- sample
select a certain proportion of sequences
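To make a few of the operations listed above concrete, the following is a minimal sketch of translate, translate-to-stop, mark-codons, reverse-complement and mask-soft. It is an independent illustration, not the code used by fasta2fasta.py: the function names are hypothetical, and details such as rendering unknown codons as X or treating only N as the hard-mask character are assumptions.

```python
from itertools import product

# Standard genetic code: codons enumerated in T, C, A, G order
# (third position varying fastest) line up with this amino acid string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def translate(seq, to_stop=False):
    """Translate complete codons; unknown codons become 'X' (assumption)."""
    seq = seq.upper()
    peptide = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if to_stop and aa == "*":
            break  # translate-to-stop: end before the first stop codon
        peptide.append(aa)
    return "".join(peptide)


def mark_codons(seq):
    """Separate consecutive codons with a single space."""
    return " ".join(seq[i:i + 3] for i in range(0, len(seq), 3))


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mask_soft(unmasked, hard_masked, mask_char="N"):
    """Lower-case each position that is hard-masked in the second sequence."""
    return "".join(
        u.lower() if m.upper() == mask_char else u
        for u, m in zip(unmasked, hard_masked)
    )
```

For instance, under these assumptions `translate("ATGGCCTAAGGG")` yields `MA*G`, while `translate("ATGGCCTAAGGG", to_stop=True)` yields `MA`.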
Parameters are passed through the parameters option (-p) as a comma-separated list, in the order in which the edit operations that need them are called.
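The pairing of parameters with edit operations can be pictured with the short sketch below. It is only an illustration of the ordering rule: the set of parameter-taking methods is read off the operation list above, while the function name, the file names and the error handling are hypothetical and may differ from what the script does internally.

```python
# Methods documented above as requiring a parameter.
NEEDS_PARAMETER = {"back-translate", "apply-map", "build-map", "mask-codons"}


def pair_parameters(methods, parameters):
    """Assign comma-separated parameters to methods in call order."""
    queue = parameters.split(",") if parameters else []
    paired = []
    for method in methods:
        if method in NEEDS_PARAMETER:
            if not queue:
                raise ValueError(f"missing parameter for {method}")
            # The next unused parameter belongs to this method.
            paired.append((method, queue.pop(0)))
        else:
            paired.append((method, None))
    return paired


# Hypothetical file names, for illustration only.
print(pair_parameters(["upper", "apply-map", "mask-codons"],
                      "ids.tsv,masked.fasta"))
```

Here the first parameter goes to apply-map and the second to mask-codons, because upper takes none.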
Exclusion/inclusion is tested before applying any id mapping.
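The consequence of this ordering is that patterns must be written against the original identifiers, not the renamed ones. A minimal sketch, assuming that the include and exclude patterns are regular expressions searched within the identifier (the function and variable names are hypothetical):

```python
import re


def select_and_rename(records, id_map, include=None, exclude=None):
    """Yield (identifier, sequence) pairs, testing patterns before renaming."""
    include_rx = re.compile(include) if include else None
    exclude_rx = re.compile(exclude) if exclude else None
    for identifier, sequence in records:
        # Inclusion/exclusion sees the ORIGINAL identifier.
        if include_rx and not include_rx.search(identifier):
            continue
        if exclude_rx and exclude_rx.search(identifier):
            continue
        # Only sequences that survive are renamed; unmapped ids are kept.
        yield id_map.get(identifier, identifier), sequence


records = [("gene1", "ACGT"), ("gene2", "TTTT"), ("ctrl1", "GGGG")]
id_map = {"gene1": "ctrl_a", "gene2": "g2"}
kept = list(select_and_rename(records, id_map, exclude="^ctrl"))
```

In this example `gene1` is kept even though its new name `ctrl_a` would match the exclusion pattern, because the pattern is evaluated before the map is applied.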
Usage
Example:
python fasta2fasta.py --method=translate < in.fasta > out.fasta
Type:
python fasta2fasta.py --help
for command line help.
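When the script is driven from Python rather than from a shell, the redirections in the example can be reproduced with the standard subprocess module. This is a sketch only: the helper names are hypothetical, and it assumes that fasta2fasta.py is in the current directory and that python is on the PATH.

```python
import subprocess


def build_command(method, script="fasta2fasta.py"):
    """Build the argument list for the usage example shown above."""
    return ["python", script, f"--method={method}"]


def run_fasta2fasta(method, infile, outfile, script="fasta2fasta.py"):
    """Run the script with infile on stdin and outfile on stdout."""
    with open(infile) as stdin, open(outfile, "w") as stdout:
        subprocess.run(build_command(method, script),
                       stdin=stdin, stdout=stdout, check=True)
```

Calling `run_fasta2fasta("translate", "in.fasta", "out.fasta")` would then correspond to the shell example; `check=True` raises an exception if the script exits with a non-zero status.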
Command line options
usage: fasta2fasta [-h] [--version]
[-m {translate,translate-to-stop,truncate-at-stop,back-translate,mark-codons,apply-map,build-map,pseudo-codons,filter,interleaved-codons,map-codons,remove-gaps,mask-seg,mask-bias,mask-codons,mask-incomplete-codons,mask-stops,mask-soft,map-identifier,nop,remove-stops,upper,lower,reverse-complement,sample,shuffle}]
[-p PARAMETERS] [-x]
[--sample-proportion SAMPLE_PROPORTION]
[--exclude-pattern EXCLUDE_PATTERN]
[--include-pattern INCLUDE_PATTERN]
[--filter-method FILTER_METHODS] [-t {aa,na}]
[-l TEMPLATE_IDENTIFIER] [--map-tsv-file MAP_TSV_FILE]
[--fold-width FOLD_WIDTH] [--timeit TIMEIT_FILE]
[--timeit-name TIMEIT_NAME] [--timeit-header]
[--random-seed RANDOM_SEED] [-v LOGLEVEL]
[--log-config-filename LOG_CONFIG_FILENAME]
[--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
[-E STDERR] [-S STDOUT]