diff_chains.py - compare two chain formatted files

Tags

Genomics GenomeAlignment CHAIN Comparison

Purpose

Compare two genomic alignment files and calculate statistics from the comparison.

Documentation

Operates on two chain formatted files.
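
As a rough illustration of the input, the sketch below walks through a chain file in the UCSC chain layout (a `chain` header line followed by `size dt dq` lines, ending with a single `size` line) and yields the ungapped aligned blocks. The function name `iter_chain_blocks` and the toy record are hypothetical and are not taken from diff_chains.py, whose own parser may differ.

```python
def iter_chain_blocks(lines):
    """Yield (t_name, t_start, q_name, q_strand, q_start, size) per ungapped block.

    Hypothetical helper for illustration; not part of diff_chains.py.
    """
    t_name = q_name = q_strand = None
    t_pos = q_pos = 0
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if fields[0] == "chain":
            # chain score tName tSize tStrand tStart tEnd
            #       qName qSize qStrand qStart qEnd id
            t_name, t_pos = fields[2], int(fields[5])
            q_name, q_strand, q_pos = fields[7], fields[9], int(fields[10])
            continue
        size = int(fields[0])
        yield (t_name, t_pos, q_name, q_strand, q_pos, size)
        if len(fields) == 3:
            # advance past the block and the gap on each side
            t_pos += size + int(fields[1])
            q_pos += size + int(fields[2])


# Toy record (made up for this example): two blocks separated by a gap.
example = """\
chain 1000 chr1 1000 + 100 130 chr2 2000 + 500 535 1
10 5 10
15
"""
blocks = list(iter_chain_blocks(example.splitlines()))
```

Here `blocks` holds one tuple per ungapped block, with start coordinates taken directly from the chain records.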

Outputs a table with the following columns:

Column        Content
------------  ------------------------------------------
contig1       contig name
contig2       contig name
strand        strand
mapped1       mapped residues
identical1    identically mapped residues
different1    differently mapped residues
unique1       residues mapped only from set1
pmapped1      percentage of mapped residues
pidentical1   percentage of identically mapped residues
pdifferent1   percentage of differently mapped residues

Corresponding columns are reported for data set 2.
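
The counts in the table can be sketched as follows, assuming each data set has been reduced to a dictionary from source residue position to target position. The function `compare_mappings` and this dictionary representation are hypothetical, not the script's implementation. The denominators for the percentage columns are not stated here, so this sketch assumes `pidentical` and `pdifferent` are relative to the mapped residues of the same set, and it leaves `pmapped` out.

```python
def compare_mappings(map1, map2):
    """Compare two residue mappings given as dicts {source_pos: target_pos}.

    Hypothetical helper; returns statistics from the point of view of map1.
    """
    mapped = len(map1)
    shared = map1.keys() & map2.keys()  # residues mapped in both sets
    identical = sum(1 for pos in shared if map1[pos] == map2[pos])
    different = len(shared) - identical
    unique = mapped - len(shared)       # residues mapped only from map1

    def pct(n):
        # Assumption: percentages are taken relative to `mapped`.
        return 100.0 * n / mapped if mapped else 0.0

    return {
        "mapped": mapped,
        "identical": identical,
        "different": different,
        "unique": unique,
        "pidentical": pct(identical),
        "pdifferent": pct(different),
    }


# Toy mappings (made up): position 2 agrees, position 3 disagrees.
set1 = {1: 101, 2: 102, 3: 103, 4: 104}
set2 = {2: 102, 3: 203, 5: 105}
stats1 = compare_mappings(set1, set2)
stats2 = compare_mappings(set2, set1)
```

Calling the function a second time with the arguments swapped gives the statistics for data set 2.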

Usage

Example:

cgat diff_chains.py hg19ToMm10v1.chain.over.gz hg19ToMm10v2.chain.over.gz

This compares where regions of the hg19 genome map to in the mm10 genome under two different mappings.

Type:

python diff_chains.py --help

for command line help.

Command line options

usage: diff-chains [-h] [--version] [-m] [-a] [-u] [-r RESTRICT]
                   [--timeit TIMEIT_FILE] [--timeit-name TIMEIT_NAME]
                   [--timeit-header] [--random-seed RANDOM_SEED] [-v LOGLEVEL]
                   [--log-config-filename LOG_CONFIG_FILENAME]
                   [--tracing {function}] [-? ?] [-I STDIN] [-L STDLOG]
                   [-E STDERR] [-S STDOUT]