This tool converts BAM files into BED files, supplying the interval for each read in the BAM file. BAM files must have a corresponding index file, i.e. example.bam and example.bam.bai.

For example:

samtools view example.bam

READ1    163    1      13040   15     76M    =      13183   219     ...
READ1    83     1      13183   7      76M    =      13040   -219    ...
READ2    147    1      13207   0      76M    =      13120   -163    ...

cgat bam2bed example.bam

1       13039   13115   READ1     15      +
1       13119   13195   READ2     0       +
1       13182   13258   READ1     7       -
1       13206   13282   READ2     0       -
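
The coordinate conversion in the example above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: sam_to_bed is a hypothetical helper, and it assumes a CIGAR consisting of a single match operation such as 76M. SAM positions are 1-based, BED intervals are 0-based and half-open, and the strand is taken from bit 0x10 of the SAM flag.

```python
def sam_to_bed(name, flag, chrom, pos, mapq, cigar):
    """Convert one SAM record to a 6-column BED tuple.

    Hypothetical helper; assumes the CIGAR is a single match
    operation such as "76M".
    """
    if not (cigar.endswith("M") and cigar[:-1].isdigit()):
        raise ValueError("this sketch only handles CIGARs like '76M'")
    aligned_length = int(cigar[:-1])
    start = pos - 1                       # SAM is 1-based, BED is 0-based
    end = start + aligned_length          # BED end coordinate is exclusive
    strand = "-" if flag & 0x10 else "+"  # 0x10: read is reverse-complemented
    return (chrom, start, end, name, mapq, strand)


# The three records shown in the samtools view output above.
records = [
    ("READ1", 163, "1", 13040, 15, "76M"),
    ("READ1", 83, "1", 13183, 7, "76M"),
    ("READ2", 147, "1", 13207, 0, "76M"),
]
intervals = [sam_to_bed(*record) for record in records]
for interval in intervals:
    print("\t".join(str(field) for field in interval))
```

The three intervals correspond to the READ1 and READ2 lines in the BED output whose mates are listed in the samtools view excerpt.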

By default, bam2bed outputs each read as a separate interval. With the option --merge-pairs, paired-end reads are merged and output as a single interval. The strand is set according to the first read in a pair.


cgat bam2bed BAMFILE [--merge-pairs] [options]

operates on the file BAMFILE, whereas:

cgat bam2bed [--merge-pairs] [options]

operates on stdin, as does:

cgat bam2bed -I BAMFILE [--merge-pairs] [options]

To merge paired-end reads and output the fragment interval, i.e. leftmost mapped base to rightmost mapped base:

cat example.bam | cgat bam2bed --merge-pairs

1       13119   13282   READ2     0       +
1       13039   13258   READ1     7       +

To use --merge-pairs on only a region of the genome, use samtools view:

samtools view -ub example.bam 1:13000-13100 | cgat bam2bed --merge-pairs

Note that this will select fragments where the first read-in-pair is in the region.


-m, --merge-pairs

Output one region per fragment rather than one region per read. A single region is created, stretching from the start of the first read in the pair to the end of the second.

Read pairs that meet the following criteria are removed:

  • Reads where one of the pair is unmapped

  • Reads that are not paired

  • Reads where the pairs are mapped to different chromosomes

  • Reads where the insert size is not between the max and min (see below)


Merged fragments are always returned on the positive (+) strand. The fragment end point is estimated as the alignment start position of the second-in-pair read plus the length of the first-in-pair read. This may be inaccurate if the reads come from an intron-aware aligner.
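
The end point estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's code: merge_pair is a hypothetical helper, positions are 0-based, and the leftmost of the two alignment starts is taken as the fragment start and the rightmost as the start of the second read, which reproduces the merged intervals in the example output.

```python
def merge_pair(chrom, name, mapq, read_start, mate_start, read_length,
               mate_chrom=None):
    """Return a merged fragment as a BED6 tuple, or None if the pair is skipped.

    Hypothetical helper. read_start and read_length describe the
    first-in-pair read, mate_start its mate; all positions are 0-based.
    """
    # Pairs mapped to different chromosomes are removed.
    if mate_chrom is not None and mate_chrom != chrom:
        return None
    start = min(read_start, mate_start)
    # Estimate: start of the rightmost read plus the length of one read.
    # Spliced alignments span more reference bases, so this can undershoot.
    end = max(read_start, mate_start) + read_length
    return (chrom, start, end, name, mapq, "+")


# READ1: first-in-pair at 13182, mate at 13039, 76 bases long.
read1_fragment = merge_pair("1", "READ1", 7, 13182, 13039, 76)
# READ2: first-in-pair at 13119, mate at 13206, 76 bases long.
read2_fragment = merge_pair("1", "READ2", 0, 13119, 13206, 76)
print(read1_fragment)
print(read2_fragment)
```

Because only the read length is added to the rightmost start, a read split across an intron would produce an end point short of the true rightmost mapped base.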

--max-insert-size, --min-insert-size

The maximum and minimum size of the insert that is allowed when using the --merge-pairs option. Read pairs closer together than the minimum or further apart than the maximum are skipped.
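
As a rough sketch of this filter, the check below accepts a fragment only if its size lies within the given bounds. The function name passes_insert_size is hypothetical, the insert size is assumed to be the length of the merged interval, and treating both bounds as inclusive with None meaning "no limit" is an assumption; the tool's actual defaults and boundary handling are not stated here.

```python
def passes_insert_size(start, end, min_insert_size=None, max_insert_size=None):
    """Return True if the fragment [start, end) is within the size bounds.

    Hypothetical helper: None disables a bound, and both bounds are
    treated as inclusive (an assumption).
    """
    size = end - start
    if min_insert_size is not None and size < min_insert_size:
        return False
    if max_insert_size is not None and size > max_insert_size:
        return False
    return True


# The merged READ1 fragment from the example spans 13039 to 13258.
print(passes_insert_size(13039, 13258, min_insert_size=100, max_insert_size=500))
print(passes_insert_size(13039, 13258, max_insert_size=200))
```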

-b, --bed-format

The format in which to output the results. Only the first n columns of the BED file are output, where n is one of 3, 4, 5 or 6.
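
The effect of this option can be illustrated by truncating a six-column interval to its first n columns. The helper format_bed below is hypothetical and only mirrors the behaviour described above; it is not the tool's implementation.

```python
def format_bed(interval, columns):
    """Format the first `columns` fields of a BED6 tuple as a tab-separated line.

    Hypothetical helper; `columns` must be 3, 4, 5 or 6.
    """
    if columns not in (3, 4, 5, 6):
        raise ValueError("columns must be one of 3, 4, 5 or 6")
    return "\t".join(str(field) for field in interval[:columns])


interval = ("1", 13039, 13115, "READ1", 15, "+")
bed3_line = format_bed(interval, 3)   # chromosome, start, end
bed6_line = format_bed(interval, 6)   # adds name, score, strand
print(bed3_line)
print(bed6_line)
```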


cgat bam2bed --help

for command line help.

Command line options

usage: bam2bed [-h] [--version] [-m] [--max-insert-size MAX_INSERT_SIZE]
               [--min-insert-size MIN_INSERT_SIZE] [--bed-format {3,4,5,6}]
               [--timeit TIMEIT_FILE] [--timeit-name TIMEIT_NAME]
               [--timeit-header] [--random-seed RANDOM_SEED] [-v LOGLEVEL]
               [--log-config-filename LOG_CONFIG_FILENAME]
               [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
               [-E STDERR] [-S STDOUT]
