bed2stats.py - summary of bed file contents
Tags: Genomics, Intervals, Summary, BED
Purpose
This script takes a BED-formatted file as input and outputs the number
of intervals and bases in the file. Counts can be subdivided by setting
the --aggregate-by command line option:
contig
    output counts per contig (column 1)
name
    output counts grouped by the name field in the BED-formatted file (column 4)
track
    output counts per track in the BED-formatted file
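
As a rough illustration of what grouping by contig or name means, here is a minimal Python sketch. It is not the script's implementation: the function name summarize_bed is hypothetical, input is assumed to be tab-separated, and grouping by track is left out for brevity.

```python
import collections
import io


def summarize_bed(lines, aggregate_by="none"):
    """Return {key: (ncontigs, nintervals, nbases)} for BED lines.

    Hypothetical helper for illustration only. Supports
    aggregate_by in {"none", "contig", "name"}.
    """
    contigs = collections.defaultdict(set)
    nintervals = collections.Counter()
    nbases = collections.Counter()
    for line in lines:
        # Skip blank lines, comments, and track/browser header lines.
        if not line.strip() or line.startswith(("#", "track ", "browser ")):
            continue
        fields = line.rstrip("\n").split("\t")
        contig, start, end = fields[0], int(fields[1]), int(fields[2])
        if aggregate_by == "contig":
            key = contig          # column 1
        elif aggregate_by == "name":
            key = fields[3]       # column 4
        else:
            key = "all"
        contigs[key].add(contig)
        nintervals[key] += 1
        nbases[key] += end - start  # BED intervals are half-open
    return {k: (len(contigs[k]), nintervals[k], nbases[k]) for k in nintervals}


# Made-up example data, not taken from a real file.
bed = io.StringIO(
    "chr1\t0\t50\tgeneA\n"
    "chr1\t100\t150\tgeneB\n"
    "chr2\t10\t60\tgeneA\n"
)
per_contig = summarize_bed(bed, aggregate_by="contig")
```

With the made-up input above, chr1 gets two intervals totalling 100 bases and chr2 gets one interval of 50 bases.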
Note that a count of bases is usually only meaningful if the submitted intervals are non-overlapping.
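
To see why overlaps matter, the following Python sketch compares a naive sum of interval lengths with a count of distinct covered bases. It shows one possible way to merge overlapping intervals before counting; the helper name count_covered_bases is hypothetical and the script itself is not claimed to do this.

```python
def count_covered_bases(intervals):
    """Count distinct bases covered by (contig, start, end) tuples.

    Hypothetical helper: sorts the intervals and merges overlapping
    ones on the same contig so shared bases are counted once.
    """
    total = 0
    current_contig, current_start, current_end = None, 0, 0
    for contig, start, end in sorted(intervals):
        if contig == current_contig and start <= current_end:
            # Overlaps (or touches) the current run: extend it.
            current_end = max(current_end, end)
        else:
            # Close the previous run and start a new one.
            total += current_end - current_start
            current_contig, current_start, current_end = contig, start, end
    total += current_end - current_start
    return total


# Made-up intervals: the first two overlap by 25 bases.
intervals = [("chr1", 0, 50), ("chr1", 25, 75), ("chr2", 0, 50)]
naive = sum(end - start for _, start, end in intervals)
merged = count_covered_bases(intervals)
```

Here the naive sum is 150, while only 125 distinct bases are covered, so a per-interval base count overstates coverage when intervals overlap.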
If the option --add-percent is given, an additional column reports the percentage of the genome covered by intervals. This requires --genome-file to be given as well.
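
The arithmetic behind such a percentage can be sketched as follows. This is an assumption about the calculation, not the script's code: contig sizes are supplied here as a plain dictionary with made-up values, whereas the script reads them from the genome file, and percent_covered is a hypothetical name.

```python
def percent_covered(nbases, contig_sizes):
    """Return the percentage of the genome covered by nbases.

    Hypothetical helper: contig_sizes maps contig name to length.
    """
    genome_size = sum(contig_sizes.values())
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * nbases / genome_size


# Made-up contig sizes, for illustration only.
contig_sizes = {"chr1": 1000, "chr2": 1000}
pct = percent_covered(500, contig_sizes)
```

With 500 covered bases on a 2000-base genome, the result is 25 percent.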
Usage
To count the number of intervals, type:
cgat bed2stats < in.bed
track | ncontigs | nintervals | nbases
all   | 23       | 556        | 27800
To count per contig:
cgat bed2stats --aggregate-by=contig < in.bed
track | ncontigs | nintervals | nbases
chrX  | 1        | 11         | 550
chr13 | 1        | 12         | 600
chr12 | 1        | 37         | 1850
…     | …        | …          | …
Type:
cgat bed2stats --help
for command line help.
Command line options
usage: bed2stats [-h] [-g GENOME_FILE] [-a {name,contig,track,none}] [-p]
                 [--timeit TIMEIT_FILE] [--timeit-name TIMEIT_NAME]
                 [--timeit-header] [--random-seed RANDOM_SEED] [-v LOGLEVEL]
                 [--log-config-filename LOG_CONFIG_FILENAME]
                 [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
                 [-E STDERR] [-S STDOUT]