bam_vs_gtf.py - compare BAM file against gene set
- Tags
Genomics NGS Genesets BAM GTF Summary
Purpose
Compare RNA-seq reads in a BAM file against reference exons to quantify exon overrun and underrun.
Documentation
This script is for validation purposes:
- Exon overrun should be minimal: reads should not extend beyond known exons.
- Spliced reads should link known exons.
Please note:
- For unspliced reads, any bases extending beyond exon boundaries are counted.
- For spliced reads, both parts of the read are examined for their overlap. As a consequence, counts are doubled for spliced reads.
- The script requires a list of non-overlapping exons as input.
- For read counts to be correct, the NH (number of hits) tag needs to be set correctly.
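The notes above can be illustrated with a minimal sketch of overrun counting. This is not the script's own implementation: the function name `overrun_bases`, the half-open coordinates, and the example exons and reads are all assumptions made for illustration. Each aligned block of a read is checked separately, so a spliced read with two blocks is examined twice.

```python
def overrun_bases(start, end, exons):
    """Count bases of the half-open block [start, end) not covered by any exon.

    `exons` is a list of non-overlapping, half-open (start, end) tuples
    on the same contig as the block (an assumption of this sketch).
    """
    covered = 0
    for exon_start, exon_end in exons:
        overlap = min(end, exon_end) - max(start, exon_start)
        if overlap > 0:
            covered += overlap
    return (end - start) - covered


# Hypothetical exons on one contig, non-overlapping as the script requires.
exons = [(100, 200), (300, 400)]

# Unspliced read: a single aligned block that runs 10 bases past the exon end.
unspliced = [(150, 210)]
# Spliced read: two aligned blocks, each examined on its own.
spliced = [(180, 200), (300, 330)]

print([overrun_bases(s, e, exons) for s, e in unspliced])  # [10]
print([overrun_bases(s, e, exons) for s, e in spliced])    # [0, 0]
```

Because overlapping exons would be counted twice in `covered`, the sketch relies on the same non-overlapping input that the script requires.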
Usage
Example:
# Preview the BAM file using Samtools view
samtools view tests/bam_vs_gtf.py/small.bam | head
# Pipe input bam to script and specify gtf file as argument
cat tests/bam_vs_gtf.py/small.bam | cgat bam_vs_gtf.py --gtf-file=tests/bam_vs_gtf.py/hg19.chr19.gtf.gz
| category | counts |
|---|---|
| spliced_bothoverlap | 0 |
| unspliced_overlap | 0 |
| unspliced_nooverrun | 0 |
| unspliced | 207 |
| unspliced_nooverlap | 207 |
| spliced_overrun | 0 |
| spliced_halfoverlap | 0 |
| spliced_exact | 0 |
| spliced_inexact | 0 |
| unspliced_overrun | 0 |
| spliced | 18 |
| spliced_underrun | 0 |
| mapped | 225 |
| unmapped | 0 |
| input | 225 |
| spliced_nooverlap | 18 |
| spliced_ignored | 0 |
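As a quick sanity check, the counts in the example output above can be tested for internal consistency. The relationships used here (input = mapped + unmapped, mapped = spliced + unspliced) are assumptions inferred from this one example, not documented guarantees of the script, and `check_totals` is a hypothetical helper.

```python
# Counts copied from the example output above.
counts = {
    "input": 225,
    "mapped": 225,
    "unmapped": 0,
    "spliced": 18,
    "unspliced": 207,
}


def check_totals(counts):
    """Return a dict mapping each assumed relationship to True/False."""
    return {
        "input == mapped + unmapped":
            counts["input"] == counts["mapped"] + counts["unmapped"],
        "mapped == spliced + unspliced":
            counts["mapped"] == counts["spliced"] + counts["unspliced"],
    }


for relation, holds in check_totals(counts).items():
    print(relation, holds)
```

In this example every read falls into a `nooverlap` category (207 unspliced, 18 spliced), which is the kind of result the validation is meant to surface.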
Type:
python bam_vs_gtf.py --help
for command line help.
Command line options
filename-exons / filename-gtf: a GTF-formatted file containing the genomic coordinates of a set of non-overlapping exons, for example from a reference genome annotation database (Ensembl, UCSC, etc.).
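Since the exons must not overlap, it can help to check a GTF file before running the script. The following is a rough sketch under stated assumptions: `read_exons` and `find_overlaps` are hypothetical helpers, not part of CGAT, the example records are invented, and strand is ignored. It relies only on the standard GTF layout of nine tab-separated columns with 1-based, inclusive start and end coordinates.

```python
from collections import defaultdict


def read_exons(lines):
    """Collect (start, end) for every 'exon' record, grouped by contig."""
    exons = defaultdict(list)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        exons[fields[0]].append((int(fields[3]), int(fields[4])))
    return exons


def find_overlaps(exons):
    """Return (contig, earlier_exon, later_exon) for each overlap found."""
    overlaps = []
    for contig, intervals in exons.items():
        furthest = None  # exon seen so far with the largest end coordinate
        for start, end in sorted(intervals):
            # GTF coordinates are inclusive, so start <= end means overlap.
            if furthest is not None and start <= furthest[1]:
                overlaps.append((contig, furthest, (start, end)))
            if furthest is None or end > furthest[1]:
                furthest = (start, end)
    return overlaps


# Invented example records; the second exon overlaps the first.
gtf_lines = [
    'chr19\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr19\tsrc\texon\t150\t250\t.\t+\t.\tgene_id "g1"; transcript_id "t2";',
    'chr19\tsrc\texon\t300\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
]

print(find_overlaps(read_exons(gtf_lines)))
# [('chr19', (100, 200), (150, 250))]
```

For a gzip-compressed file, the lines can be read with `gzip.open(path, "rt")` from the standard library and passed to `read_exons` unchanged.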
usage: bam-vs-gtf [-h] [--version] [-e gtf] [--timeit TIMEIT_FILE]
[--timeit-name TIMEIT_NAME] [--timeit-header]
[--random-seed RANDOM_SEED] [-v LOGLEVEL]
[--log-config-filename LOG_CONFIG_FILENAME]
[--tracing {function}] [-? ?] [-P OUTPUT_FILENAME_PATTERN]
[-F] [-I STDIN] [-L STDLOG] [-E STDERR] [-S STDOUT]