Language to serialize objects. Used in the CGAT testing framework. (YAML).
Compressed format to store genomic alignments (BAM).
File containing genomic intervals. (BED).
General transfer format. Format to store genes and transcripts (GTF).
Compressed format for displaying numerical values across genomic ranges (BIGWIG).
Format for displaying numerical values across genomic ranges (Wiggle).
Genomic alignment format (PSL).
Format to store genomic alignments (SAM).
Tab-separated values. In these tables, records are separated by new-line characters and fields by tab characters. Comment lines start with the # character and are ignored. The first uncommented line should contain the column headers. For example:
    # This is a comment
    gene_id    length
    gene1      1000
    gene2      2000
    # Another comment
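The comment and header rules for tab-separated tables can be illustrated with a short Python sketch. It relies only on the standard library `csv` module; the helper name `read_tsv` is hypothetical and is not part of the CGAT code base.

```python
import csv
import io


def read_tsv(handle):
    """Yield one dict per record, skipping lines that start with '#'.

    Hypothetical helper for illustration; not a CGAT function.
    """
    # Drop comment lines before handing the rest to the csv module.
    data_lines = (line for line in handle if not line.startswith("#"))
    # The first uncommented line supplies the column headers.
    reader = csv.DictReader(data_lines, delimiter="\t")
    for row in reader:
        yield dict(row)


# The example table from the glossary, with real tab characters.
example = io.StringIO(
    "# This is a comment\n"
    "gene_id\tlength\n"
    "gene1\t1000\n"
    "gene2\t2000\n"
    "# Another comment\n"
)
records = list(read_tsv(example))
```

All values are returned as strings; converting `length` to an integer is left to the caller.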
- edge list
Sequence format containing quality scores (FASTQ).
- test directory
Directory that contains the test.yaml file, input files and reference files for testing scripts.
- submit host
- execution host
- code directory
Transcription start site (TSS).
- production pipeline
A pipeline that performs common tasks on a certain type of data. The idea of a production pipeline is to provide common preprocessing of data and a first look at it. A project pipeline might then take data from one or more production pipelines to glean biological insight.
- project pipeline
A pipeline that is project-specific. Usually code is developed first inside a project pipeline. When it becomes generally useful, it may be refactored into a production pipeline.
Unix standard input. Most CGAT tools read data from stdin.
Unix standard output. Most CGAT tools output data to stdout.
Unix standard error. This is where error messages are written.
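The division of labour between the three standard streams can be sketched as a small filter: data comes in on one stream, results go out on a second, and diagnostics go to a third. The function name `copy_records` is hypothetical and the example uses in-memory streams so that it runs without a terminal.

```python
import io


def copy_records(infile, outfile, logfile):
    """Copy non-empty lines from infile to outfile; report the count on logfile.

    Hypothetical helper for illustration; not a CGAT function.
    """
    count = 0
    for line in infile:
        if line.strip():
            outfile.write(line)
            count += 1
    # Diagnostics are kept out of the data stream.
    logfile.write("copied %i lines\n" % count)
    return count


# In-memory stand-ins for stdin, stdout and stderr.
demo_in = io.StringIO("gene1\t1000\n\ngene2\t2000\n")
demo_out = io.StringIO()
demo_log = io.StringIO()
n_copied = copy_records(demo_in, demo_out, demo_log)
```

In a command-line script the same function would be called as `copy_records(sys.stdin, sys.stdout, sys.stderr)`, which lets the tool be chained with other tools through Unix pipes.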
Verbosity of logging information. The logging level is set with the --verbose option. A level of 0 means no logging output, 1 outputs information messages only, and 2 also outputs debugging information.
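One way to picture the three verbosity levels is as a mapping onto the levels of Python's standard `logging` module. The sketch below is an assumption made for illustration, not the CGAT implementation: the mapping table, the default of 1 and the helper names are all hypothetical.

```python
import argparse
import logging

# Assumed mapping from verbosity values to standard logging levels.
VERBOSITY_TO_LEVEL = {
    0: logging.CRITICAL + 1,  # above every standard level: nothing is emitted
    1: logging.INFO,          # information messages (and anything more severe)
    2: logging.DEBUG,         # debugging information as well
}


def build_parser():
    """Build a parser with a --verbose option (the default of 1 is an assumption)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", type=int, choices=[0, 1, 2], default=1)
    return parser


def logging_level(verbosity):
    """Translate a verbosity value into a logging-module level."""
    return VERBOSITY_TO_LEVEL[verbosity]


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(["--verbose", "2"])
level = logging_level(args.verbose)
```

The resulting `level` could then be passed to `logging.basicConfig(level=level)` before any messages are emitted.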