diff_bed.py - count differences between several bed files

Tags

Genomics Intervals BED Comparison

Purpose

Compute overlap statistics between multiple BED files. For each pairwise comparison, the script reports the number of overlapping intervals (exons) and the number of overlapping bases.

With the --update option, an existing table can be incrementally extended with additional comparisons.

The strand of intervals is ignored in comparisons.

The output table contains the following columns:

Column          Content
set             name of the set
nexons_total    number of intervals in set
nexons_ovl      number of intervals overlapping
nexons_unique   number of unique intervals
nbases_total    number of bases in gene set
nbases_ovl      number of bases overlapping
nbases_unique   number of unique bases
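
To illustrate how the counts listed above relate to each other, the following minimal sketch computes them for one set of intervals compared against another. It is not the implementation used by diff_bed.py. The function names merge and overlap_stats, the in-memory tuple input, and the definition of the "unique" counts as total minus overlapping are all assumptions made for this example.

```python
from collections import defaultdict


def merge(intervals):
    """Merge overlapping or adjacent (start, end) pairs into sorted, disjoint blocks."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def overlap_stats(a, b):
    """Compare interval set `a` against interval set `b` (hypothetical helper).

    Both arguments are lists of (contig, start, end) tuples with zero-based,
    half-open coordinates, as in BED. Strand is not considered.
    """
    by_contig_a, by_contig_b = defaultdict(list), defaultdict(list)
    for contig, start, end in a:
        by_contig_a[contig].append((start, end))
    for contig, start, end in b:
        by_contig_b[contig].append((start, end))

    nexons_ovl = nbases_total = nbases_ovl = 0
    for contig, ivs in by_contig_a.items():
        blocks_b = merge(by_contig_b.get(contig, []))
        # An interval counts as overlapping if it shares at least one base with b.
        for start, end in ivs:
            if any(s < end and start < e for s, e in blocks_b):
                nexons_ovl += 1
        # Bases are counted on merged blocks so that no base is counted twice.
        for start, end in merge(ivs):
            nbases_total += end - start
            for s, e in blocks_b:
                nbases_ovl += max(0, min(end, e) - max(start, s))

    return {
        "nexons_total": len(a),
        "nexons_ovl": nexons_ovl,
        # Assumption: "unique" means "not overlapping the other set".
        "nexons_unique": len(a) - nexons_ovl,
        "nbases_total": nbases_total,
        "nbases_ovl": nbases_ovl,
        "nbases_unique": nbases_total - nbases_ovl,
    }
```

For a = [("chr1", 0, 100), ("chr1", 200, 300), ("chr2", 0, 50)] and b = [("chr1", 50, 250)], overlap_stats(a, b) gives 3 intervals in total, 2 overlapping and 1 unique, and 250 bases in total, 100 overlapping and 150 unique. The nested loops are quadratic per contig, which keeps the sketch short but would be too slow for large files.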

Usage

For example:

python diff_bed.py *.bed.gz > out.tsv

To update results from a previous run, type:

python diff_bed.py --update=out.tsv *.bed.gz > new.tsv
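
The idea behind an incremental update can be sketched as follows: comparisons already present in the earlier table are skipped, and only the missing ones are computed. This is a guess at the general approach, not a description of what diff_bed.py does internally. The function pending_pairs is hypothetical, and treating each comparison as an unordered pair is an assumption of this sketch.

```python
from itertools import combinations


def pending_pairs(names, done):
    """Return the pairs from `names` that are not yet in `done`.

    `done` is an iterable of (name1, name2) tuples taken from an earlier
    run. Pairs are treated as unordered (an assumption of this sketch).
    """
    seen = {frozenset(pair) for pair in done}
    return [pair for pair in combinations(names, 2) if frozenset(pair) not in seen]
```

For example, pending_pairs(["a", "b", "c"], [("b", "a")]) returns [("a", "c"), ("b", "c")], so only the two comparisons involving the newly added set "c" remain to be computed.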

Type:

python diff_bed.py --help

for command line help.

Command line options

usage: diff-bed [-h] [--version] [-u FILENAME_UPDATE] [-p PATTERN_ID] [-t]
                [--timeit TIMEIT_FILE] [--timeit-name TIMEIT_NAME]
                [--timeit-header] [--random-seed RANDOM_SEED] [-v LOGLEVEL]
                [--log-config-filename LOG_CONFIG_FILENAME]
                [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
                [-E STDERR] [-S STDOUT]