bed2gff.py - convert bed to gff/gtf

Tags

Genomics Intervals BED GFF Conversion

Purpose

This script converts a bed-formatted file to a gff or gtf-formatted file.

It aims to populate the appropriate fields in the gff file with columns in the bed file.
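
The column mapping can be illustrated with a minimal Python sketch. This is not the CGAT implementation: the function name `bed_to_gff_fields` and its defaults are hypothetical, with the `bed` source and `exon` feature values taken from the example output on this page. It assumes tab-separated BED input in the standard column order (chrom, start, end, name, score, strand) and shows the one coordinate adjustment the conversion needs. BED starts are 0-based and half-open, while GFF starts are 1-based and inclusive.

```python
def bed_to_gff_fields(bed_line, source="bed", feature="exon"):
    """Map one BED line onto the first eight GFF columns plus the BED name.

    Hypothetical helper for illustration only; not part of CGAT.
    """
    cols = bed_line.rstrip("\n").split("\t")
    chrom, start, end = cols[0], int(cols[1]), int(cols[2])
    # Optional BED columns: name, score, strand.
    name = cols[3] if len(cols) > 3 else None
    score = cols[4] if len(cols) > 4 else "."
    strand = cols[5] if len(cols) > 5 else "."
    # Only the start shifts: BED is 0-based half-open, GFF is 1-based inclusive.
    gff = [chrom, source, feature, str(start + 1), str(end), score, strand, "."]
    return gff, name


# A three-column BED interval, assumed here to be chr1:500-1000.
fields, name = bed_to_gff_fields("chr1\t500\t1000\n")
print("\t".join(fields))
```

Under these assumptions, a BED interval with start 500 and end 1000 yields a GFF start of 501 and an end of 1000, which matches the coordinates in the example output.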

If --as-gtf is set and the bed file contains a name column, the name is used as both gene_id and transcript_id. Otherwise, numeric gene_id and transcript_id values are generated according to --id-format.
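
The identifier rule can be sketched as follows. This is an illustration rather than the script's actual code: `make_gtf_attributes` is a hypothetical helper, and the default format string `"%08i"` is an assumption inferred from the zero-padded identifiers in the GTF example output, not a documented default.

```python
def make_gtf_attributes(name, counter, id_format="%08i"):
    """Build a GTF attribute string for one interval.

    Hypothetical helper; the "%08i" default is an assumption inferred
    from the example output, not a documented value.
    """
    if name is not None:
        # A BED name column is present: use it for both identifiers.
        identifier = name
    else:
        # No name column: format a running number with the id format.
        identifier = id_format % counter
    return 'gene_id "%s"; transcript_id "%s";' % (identifier, identifier)


print(make_gtf_attributes(None, 1))
print(make_gtf_attributes("geneA", 2))
```

Using a single identifier for both attributes mirrors the example output, where gene_id and transcript_id always carry the same value.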

Usage

Example:

# Preview input bed file
zcat tests/bed2gff.py/bed3/bed.gz | head
# Convert BED to GFF format
cgat bed2gff.py < tests/bed2gff.py/bed3/bed.gz > test1.gff
# View converted file (excluding logging information)
cat test1.gff | grep -v "#" | head

chr1    bed    exon    501      1000     .    .    .    gene_id "None"; transcript_id "None";
chr1    bed    exon    15001    16000    .    .    .    gene_id "None"; transcript_id "None";

Example:

# Convert BED to GTF format
cgat bed2gff.py --as-gtf < tests/bed2gff.py/bed3/bed.gz > test2.gtf
# View converted file (excluding logging information)
cat test2.gtf | grep -v "#" | head

chr1    bed    exon    501      1000     .    .    .    gene_id "00000001"; transcript_id "00000001";
chr1    bed    exon    15001    16000    .    .    .    gene_id "00000002"; transcript_id "00000002";

Type:

cgat bed2gff.py --help

for command line help.

Command line options

usage: bed2gff [-h] [-a] [-f ID_FORMAT] [--timeit TIMEIT_FILE]
               [--timeit-name TIMEIT_NAME] [--timeit-header]
               [--random-seed RANDOM_SEED] [-v LOGLEVEL]
               [--log-config-filename LOG_CONFIG_FILENAME]
               [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
               [-E STDERR] [-S STDOUT]