fasta2bed.py - segment sequences

Tags

Genomics Sequences Intervals FASTA BED Conversion

Purpose

This script reads genomic sequences in FASTA format and applies one of several segmentation algorithms, writing the resulting intervals in BED format.

The methods implemented (--method) include:

cpg

output the locations of all CpG sites in the genome

fixed-width-windows-gc

output fixed-width windows of a given size, with their G+C content as the score

gaps

output all locations of assembly gaps (blocks of N) in the genomic sequences

ungapped

output ungapped locations in the genomic sequences
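
The four methods above can be illustrated with a minimal Python sketch. This is not the code used by fasta2bed.py; the function names (find_cpg, find_gaps, find_ungapped, gc_windows) are hypothetical, coordinates are assumed to be 0-based and half-open as in BED, and the handling of N bases and of a short final window in the G+C calculation is an assumption that may differ from the script.

```python
import re


def find_cpg(seq):
    """Yield (start, end) for every CG dinucleotide (0-based, half-open)."""
    for match in re.finditer("CG", seq.upper()):
        yield match.start(), match.end()


def find_gaps(seq):
    """Yield (start, end) for every block of consecutive N bases."""
    for match in re.finditer("N+", seq.upper()):
        yield match.start(), match.end()


def find_ungapped(seq):
    """Yield (start, end) for every block that contains no N."""
    for match in re.finditer("[^N]+", seq.upper()):
        yield match.start(), match.end()


def gc_windows(seq, window_size, window_shift=None):
    """Yield (start, end, gc_fraction) for fixed-width windows.

    Assumption: the G+C fraction is computed over the full window length,
    N bases included, and a shorter final window is still reported.
    """
    shift = window_shift or window_size
    seq = seq.upper()
    for start in range(0, len(seq), shift):
        window = seq[start:start + window_size]
        gc_count = window.count("G") + window.count("C")
        yield start, start + len(window), gc_count / len(window)


if __name__ == "__main__":
    # Hypothetical contig name and toy sequence, for illustration only.
    contig, sequence = "chr1", "ACGNNNCGCGTT"
    for start, end in find_gaps(sequence):
        print(f"{contig}\t{start}\t{end}")
```

Regular expressions keep each interval finder to a few lines; for whole genomes, a streaming approach that processes one contig at a time would be more appropriate.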

Usage

Type:

python fasta2bed.py --method=gaps < in.fasta > out.bed

Type:

python fasta2bed.py --help

for command line help.

Command line options

usage: fasta2bed [-h] [--version]
                 [-m {fixed-width-windows-gc,cpg,windows-cpg,gaps,ungapped,windows}]
                 [-w WINDOW_SIZE] [-s WINDOW_SHIFT] [--min-cpg MIN_CPG]
                 [--min-interval-length MIN_LENGTH] [--timeit TIMEIT_FILE]
                 [--timeit-name TIMEIT_NAME] [--timeit-header]
                 [--random-seed RANDOM_SEED] [-v LOGLEVEL]
                 [--log-config-filename LOG_CONFIG_FILENAME]
                 [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
                 [-E STDERR] [-S STDOUT]