gff2bed.py - convert from gff/gtf to bed

Tags

Genomics Intervals GFF BED Conversion

Purpose

This script converts GFF or GTF formatted files to BED formatted files.

Documentation

Users can select which field of the GTF file is written to the name field of the BED file using --set-name. Choices are “gene_id”, “transcript_id”, “class”, “family”, “feature”, “source”, “repName” and “gene_biotype”. To specify that the input is in GTF format, use --is-gtf.
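To make the effect of --set-name concrete, the following is a minimal sketch of how a name could be picked from a single GTF line. It is not the code used by gff2bed.py: the helpers `parse_gtf_attributes` and `bed_name`, the "." fallback for a missing attribute, and the example line (including the transcript identifier "T1") are assumptions made for illustration.

```python
def parse_gtf_attributes(column):
    """Parse the ninth GTF column (key "value"; pairs) into a dict.

    Hypothetical helper for illustration; not part of CGAT.
    """
    attributes = {}
    for item in column.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def bed_name(gtf_line, set_name="gene_id"):
    """Return the value that would fill the BED name column.

    "source" and "feature" are read from GTF columns 2 and 3; every other
    choice is looked up in the attribute column. Returning "." when the
    attribute is absent is an assumption of this sketch.
    """
    fields = gtf_line.rstrip("\n").split("\t")
    if set_name == "source":
        return fields[1]
    if set_name == "feature":
        return fields[2]
    return parse_gtf_attributes(fields[8]).get(set_name, ".")


# Constructed example line (not copied from the test data).
example = (
    "chr18\tprotein_coding\tCDS\t3122495\t3123412\t.\t-\t0\t"
    'gene_id "ENSMUSG00000091539"; transcript_id "T1";'
)
print(bed_name(example, "gene_id"))
print(bed_name(example, "feature"))
```

Attribute-based choices such as “class”, “family” and “repName” would only resolve if the input file actually carries those attributes.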

BED files can contain multiple tracks. If required, users can use the “feature” or “source” field of the input GFF file to split the output into separate tracks using --track (default: none).
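The grouping into tracks can be pictured with the short sketch below. It is an illustration under stated assumptions, not the implementation in gff2bed.py: `group_into_tracks` and `format_tracks` are hypothetical names, only the first three BED columns are written, and the input lines are constructed for the example. The start coordinate is reduced by one because GFF/GTF intervals are 1-based and inclusive, while BED intervals are 0-based and half-open.

```python
from collections import defaultdict


def group_into_tracks(gtf_lines, track="feature"):
    """Group intervals by the GFF "source" (column 2) or "feature" (column 3).

    Hypothetical helper for illustration; not part of CGAT.
    """
    column = {"source": 1, "feature": 2}[track]
    tracks = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        # GFF start is 1-based inclusive; BED start is 0-based.
        start, end = int(fields[3]) - 1, int(fields[4])
        tracks[fields[column]].append((fields[0], start, end))
    return tracks


def format_tracks(tracks):
    """Render each group under its own "track name=..." header line."""
    out = []
    for name, intervals in tracks.items():
        out.append(f"track name={name}")
        out.extend(f"{chrom}\t{start}\t{end}" for chrom, start, end in intervals)
    return "\n".join(out)


# Constructed input lines (gene identifiers are placeholders).
lines = [
    'chr18\tprotein_coding\tCDS\t3122495\t3123412\t.\t-\t0\tgene_id "G1";',
    'chr18\tprotein_coding\texon\t3327492\t3327535\t.\t-\t.\tgene_id "G2";',
]
print(format_tracks(group_into_tracks(lines, track="feature")))
```

Grouping on "source" instead of "feature" would place both example lines under a single track, since they share the same source.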

Usage

Example:

# View input GTF file
head tests/gff2bed.py/mm9_ens67_geneset_100.gtf

# Convert GTF to bed format using gene_id as name and group by GTF feature
cat tests/gff2bed.py/mm9_ens67_geneset_100.gtf | cgat gff2bed.py --is-gtf --set-name=gene_id --track=feature > mm9_ens67_geneset_100_feature.bed

track name=CDS
chr18	3122494	3123412	ENSMUSG00000091539	0
chr18	3327491	3327535	ENSMUSG00000063889	0
chr18	3325358	3325476	ENSMUSG00000063889	0

Command line options

usage: gff2bed [-h] [--is-gtf]
               [--set-name {gene_id,transcript_id,class,family,feature,source,repName,gene_biotype}]
               [--track {feature,source,None}] [--bed12-from-transcripts]
               [--timeit TIMEIT_FILE] [--timeit-name TIMEIT_NAME]
               [--timeit-header] [--random-seed RANDOM_SEED] [-v LOGLEVEL]
               [--log-config-filename LOG_CONFIG_FILENAME]
               [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
               [-E STDERR] [-S STDOUT]