fastqs2fastqs.py - manipulate (merge/reconcile) fastq files
Tags: Genomics, NGS, FASTQ, FASTQ Manipulation
Purpose
This script manipulates multiple fastq files and outputs
new fastq files. Currently only the method reconcile
is implemented.
reconcile
Reconcile reads from a pair of fastq files.
This method takes two fastq files and outputs two fastq files such that every read in the output is present in both output files.
The typical use case is that two fastq files containing the first and second part of a read pair have been filtered independently, for example by quality score or by truncation. As a consequence, some reads might be missing from one file but not the other. The reconcile method outputs two files containing only the reads that are common to both input files.
The two files must be sorted by read identifier.
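Because the inputs must be sorted, it can be useful to check a file before running the script. The following is a minimal sketch, not part of fastqs2fastqs.py. The helper names `open_fastq`, `read_ids` and `is_sorted_by_id` are hypothetical, and the sketch assumes four-line FASTQ records, that the identifier is the first whitespace-delimited token of the header line, and that "sorted" means plain string order, which may not match the ordering the script expects.

```python
import gzip
import io


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_ids(handle):
    """Yield the read identifier of every 4-line FASTQ record in handle."""
    for line_number, line in enumerate(handle):
        if line_number % 4 == 0:
            # Header line: "@<identifier> [optional description]"
            fields = line[1:].split()
            if fields:
                yield fields[0]


def is_sorted_by_id(handle):
    """Return True if identifiers appear in non-decreasing string order."""
    previous = None
    for read_id in read_ids(handle):
        if previous is not None and read_id < previous:
            return False
        previous = read_id
    return True


# Small in-memory demonstration; for a real file use
# open_fastq("myReads.1.fastq.gz") instead of io.StringIO.
demo = io.StringIO("@read1\nAAA\n+\n!!!\n@read4\nGGG\n+\n!!!\n")
print(is_sorted_by_id(demo))
```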
Example input, in which read2 and read3 are each present in only one of the files:
# File1 # File 2
@read1 @read1
AAA AAA
+ +
!!! !!!
@read2 @read3
CCC TTT
+ +
!!! !!!
@read4 @read4
GGG GGG
+ +
!!! !!!
Example output, containing only the reads common to both files:
# File1 # File 2
@read1 @read1
AAA AAA
+ +
!!! !!!
@read4 @read4
GGG GGG
+ +
!!! !!!
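The behaviour shown in the example can be sketched as a merge-style walk over two inputs that are sorted by read identifier. This is an illustrative sketch under stated assumptions, not necessarily how fastqs2fastqs.py implements reconcile. The function names `parse_fastq` and `reconcile` are hypothetical here, records are assumed to span exactly four lines, and identifiers are compared verbatim as strings.

```python
import io


def parse_fastq(handle):
    """Yield (identifier, record_text) for each 4-line FASTQ record."""
    while True:
        record = [handle.readline() for _ in range(4)]
        if not record[0].strip():
            return  # end of input
        fields = record[0][1:].split()
        identifier = fields[0] if fields else ""
        yield identifier, "".join(record)


def reconcile(in1, in2, out1, out2):
    """Write only records whose identifier occurs in both sorted inputs.

    Returns the number of read pairs written.
    """
    records1 = parse_fastq(in1)
    records2 = parse_fastq(in2)
    rec1 = next(records1, None)
    rec2 = next(records2, None)
    written = 0
    while rec1 is not None and rec2 is not None:
        if rec1[0] == rec2[0]:
            # Identifier present in both files: keep the pair
            out1.write(rec1[1])
            out2.write(rec2[1])
            written += 1
            rec1 = next(records1, None)
            rec2 = next(records2, None)
        elif rec1[0] < rec2[0]:
            # Read only in the first file: skip it
            rec1 = next(records1, None)
        else:
            # Read only in the second file: skip it
            rec2 = next(records2, None)
    return written


# The example input from this page, held in memory
file1 = io.StringIO(
    "@read1\nAAA\n+\n!!!\n@read2\nCCC\n+\n!!!\n@read4\nGGG\n+\n!!!\n")
file2 = io.StringIO(
    "@read1\nAAA\n+\n!!!\n@read3\nTTT\n+\n!!!\n@read4\nGGG\n+\n!!!\n")
result1, result2 = io.StringIO(), io.StringIO()
pairs = reconcile(file1, file2, result1, result2)
print(pairs)
print(result1.getvalue(), end="")
```

The walk reads each input once and holds only one record per file in memory, which is why it depends on both inputs being sorted in the same order.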
Usage
Example:
python fastqs2fastqs.py --method=reconcile --output-filename-pattern=myReads_reconciled.%s.fastq myReads.1.fastq.gz myReads.2.fastq.gz
In this example we take a pair of fastq files, reconcile them by read identifier and output two new fastq files named myReads_reconciled.1.fastq.gz and myReads_reconciled.2.fastq.gz.
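To illustrate how a `%s` placeholder in an output pattern can expand into one file name per input, here is a minimal sketch using Python's `%` string formatting. The helper name `expand_output_pattern` and the suffixes "1" and "2" are assumptions made for this example; the script's own handling, including any compression suffix such as .gz, may differ.

```python
def expand_output_pattern(pattern, suffixes=("1", "2")):
    """Return one output file name per suffix by filling in the %s placeholder."""
    return [pattern % suffix for suffix in suffixes]


names = expand_output_pattern("myReads_reconciled.%s.fastq")
print(names)
```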
Type:
python fastqs2fastqs.py --help
for command line help.
Command line options
usage: fastqs2fastqs [-h] [--version] [-m {reconcile,filter-by-sequence}] [-c]
                     [-u] [--id-pattern-1 ID_PATTERN_1]
                     [--id-pattern-2 ID_PATTERN_2]
                     [--input-filename-fasta INPUT_FILENAME_FASTA]
                     [--filtering-kmer-size FILTERING_KMER_SIZE]
                     [--filtering-min-kmer-matches FILTERING_MIN_KMER_MATCHES]
                     [--timeit TIMEIT_FILE] [--timeit-name TIMEIT_NAME]
                     [--timeit-header] [--random-seed RANDOM_SEED]
                     [-v LOGLEVEL] [--log-config-filename LOG_CONFIG_FILENAME]
                     [--tracing {function}] [-? ?]
                     [-P OUTPUT_FILENAME_PATTERN] [-F] [-I STDIN] [-L STDLOG]
                     [-E STDERR] [-S STDOUT]