Blat - tools for working with PSL formatted files and data

This module provides a class to parse PSL formatted files such as those output by the BLAT tool.

This module defines the Blat.Match class representing a single entry and a series of iterators to iterate over PSL formatted files (iterator(), iterator_target_overlap(), …).


exception Blat.Error

Bases: Exception

Base class for exceptions in this module.

exception Blat.ParsingError(message, line=None)

Bases: Blat.Error

Exception raised for errors while parsing.

message -- explanation of the error

class Blat.Match

Bases: object

a psl formatted alignment.

Block coordinates are on the forward strand for target and on the forward/reverse strand for the query depending on the strand.

The fields mQueryFrom/To and mSbjctFrom/To are always on the forward strand.


convert coordinates.

This rescales the block positions so that they start at 0 and converts the query to forward and the sbjct to forward/reverse coordinates.

About the psl format, from the manual:


In general the coordinates in psl files are “zero based half open.” The first base in a sequence is numbered zero rather than one. When representing a range the end coordinate is not included in the range. Thus the first 100 bases of a sequence are represented as 0-100, and the second 100 bases are represented as 100-200.
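As a small illustration of the "zero based half open" convention, the ranges quoted above line up with Python slices. The toy sequence and the helper name range_length are made up for this sketch and are not part of the module.

```python
# Zero-based, half-open ranges behave exactly like Python slices:
# the start is included, the end is excluded.
sequence = "ACGT" * 75  # hypothetical 300-base toy sequence

first_100 = sequence[0:100]     # psl range 0-100
second_100 = sequence[100:200]  # psl range 100-200


def range_length(start, end):
    """Length of a half-open range (the end coordinate is excluded)."""
    return end - start
```

Because the end is excluded, adjacent ranges share a boundary value without sharing a base, and lengths are a plain subtraction.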

There is another little unusual feature in the .psl format. It has to do with how coordinates are handled on the negative strand. In the qStart/qEnd fields the coordinates are where it matches from the point of view of the forward strand (even when the match is on the reverse strand). However on the qStarts[] list, the coordinates are reversed.

This class works in forward coordinates for the query and forward/reverse coordinates for the sbjct.

For a negative strand match, the following is done:
  • invert mSbjctFrom and mSbjctTo with mSbjctLength

  • add block sizes to mQueryStarts and mSbjctStarts

  • invert mQueryStarts and mSbjctStarts

  • reverse blocksize, mQueryStarts and mSbjctStarts
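The "add block sizes, invert, reverse" steps listed above can be sketched for a single list of block starts. This is not the module's own code: the function name flip_blocks and its arguments are hypothetical, and it only shows the generic arithmetic for moving block starts from one strand's coordinate system to the other.

```python
def flip_blocks(starts, sizes, length):
    """Convert block starts between strands of a sequence of given length.

    Hypothetical helper, not part of the Blat module.
    starts -- block start positions on one strand (zero-based)
    sizes  -- block sizes, same order as starts
    length -- total length of the sequence
    """
    # 1. add block sizes to the starts to obtain the block ends
    ends = [start + size for start, size in zip(starts, sizes)]
    # 2. invert: the end of a block on one strand is its start on the other
    flipped = [length - end for end in ends]
    # 3. reverse, so that blocks are again in increasing order
    return flipped[::-1], sizes[::-1]


# Two blocks on the reverse strand of a 100-base sequence
new_starts, new_sizes = flip_blocks([10, 50], [20, 30], 100)
```

In this example the block covering 50-80 on the reverse strand becomes 20-50 on the forward strand, and the block covering 10-30 becomes 70-90. Applying the function twice returns the original lists.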


switch the target strand.

Use in cases in which a feature has been defined on the negative target strand with reverse coordinates. The result will be the same alignment using forward coordinates on the target.

This method will also update the query strand and coordinates.


build BLAT entry from a MAQ match.

see Maq.Match.


return a list of aligned blocks.


return a map from query to target.

If the strand is “-”, the coordinates for query are on the negative strand.
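To make the idea of a query-to-target map concrete, here is a minimal sketch that looks up a single query position in a list of ungapped blocks. The block layout (query_start, target_start, size) and the function name are assumptions made for this example; the object actually returned by the method may be quite different.

```python
def map_query_to_target(blocks, query_pos):
    """Map a query position to its target position, or None if unaligned.

    Hypothetical helper; blocks is assumed to be a list of
    (query_start, target_start, size) tuples in half-open coordinates.
    """
    for query_start, target_start, size in blocks:
        if query_start <= query_pos < query_start + size:
            # Within an ungapped block the offset is the same on both sides.
            return target_start + (query_pos - query_start)
    return None


example_blocks = [(0, 100, 20), (30, 150, 10)]
```

With these blocks, query position 5 maps to target position 105, while position 25 falls into a gap between blocks and has no counterpart.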


return a map from target to query.

If the strand is “-”, the coordinates for query are on the negative strand.

fromMap(map_query2target, use_strand=None)

fill entry from a map of query to target.

fromPair(query_start, query_size, query_strand, query_seq, target_start, target_size, target_strand, target_seq)

fill from two aligned sequences.

Note that sequences are case-sensitive.
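As a rough sketch of what filling an entry from two aligned sequences involves, the following derives ungapped blocks from a pair of gapped strings. The function name, the "-" gap character and the tuple layout are assumptions for illustration. The sketch ignores strand and letter case, both of which fromPair() takes into account.

```python
def aligned_blocks(query_seq, target_seq, query_start=0, target_start=0, gap="-"):
    """Return (query_start, target_start, size) for each ungapped block.

    Hypothetical helper, not the implementation of fromPair().
    Both sequences must be the same length, with gaps marked by `gap`.
    """
    if len(query_seq) != len(target_seq):
        raise ValueError("aligned sequences must have the same length")

    blocks = []
    q, t = query_start, target_start  # current residue positions
    block_q = block_t = None          # start of the block being built
    size = 0
    for q_char, t_char in zip(query_seq, target_seq):
        if q_char != gap and t_char != gap:
            if size == 0:
                block_q, block_t = q, t
            size += 1
        elif size:
            # a gap on either side closes the current block
            blocks.append((block_q, block_t, size))
            size = 0
        if q_char != gap:
            q += 1
        if t_char != gap:
            t += 1
    if size:
        blocks.append((block_q, block_t, size))
    return blocks


pair_blocks = aligned_blocks("ACGT--ACG", "ACGTTTA-G")
```

Each gap column advances only the sequence that has a residue there, so the next block starts at different offsets in query and target.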

class Blat.MatchPSLX

Bases: Blat.Match

fromPSL(other, query_sequence, sbjct_sequence)

fill entry from a psl match.

sequences are on forward strand starting at query_from and sbjct_from, respectively.


iterate over the contents of a psl file.


iterate over the contents of a psl file.


iterate over the contents of a pslx file.

Blat.iterator_target_overlap(infile, merge_distance)

iterate over psl formatted infile and return blocks of target overlapping alignments.
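The grouping can be pictured with the sketch below, which works on plain (name, start, end) tuples instead of parsed entries. It assumes the input is sorted by target name and start, and it assumes that merge_distance is the largest gap allowed between the end of the current block and the start of the next alignment. Both assumptions, and the function name, are illustrative and not taken from the module.

```python
def group_target_overlaps(records, merge_distance=0):
    """Yield lists of records whose target ranges overlap or nearly overlap.

    Hypothetical sketch. records are (target_name, start, end) tuples,
    assumed to be sorted by target_name and start.
    """
    block = []
    last_name = None
    last_end = None
    for name, start, end in records:
        # Start a new block on a new target or when the gap is too large.
        if block and (name != last_name or start > last_end + merge_distance):
            yield block
            block = []
            last_end = None
        block.append((name, start, end))
        last_name = name
        last_end = end if last_end is None else max(last_end, end)
    if block:
        yield block


example_records = [
    ("chr1", 0, 100),
    ("chr1", 50, 150),
    ("chr1", 300, 400),
    ("chr2", 10, 20),
]
strict_groups = list(group_target_overlaps(example_records))
loose_groups = list(group_target_overlaps(example_records, merge_distance=150))
```

With merge_distance=0 the third chr1 alignment forms its own block; with merge_distance=150 it is joined to the first two.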

Blat.iterator_query_overlap(infile, merge_distance)

iterate over psl formatted infile and return blocks of query overlapping alignments.

Blat.iterator_test(infile, report_step=100000)

only output parseable lines from infile.
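A minimal sketch of such a filter is shown below. It relies on the standard psl layout of 21 tab-separated columns; the names PslRecord, parse_psl_line and iter_parseable are made up for this example and do not reflect how Blat.Match or iterator_test() are implemented. The report_step argument is not modelled.

```python
from collections import namedtuple

# The 21 columns of a psl line, in order.
PSL_FIELDS = (
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
)
TEXT_FIELDS = {"strand", "qName", "tName"}
LIST_FIELDS = {"blockSizes", "qStarts", "tStarts"}

PslRecord = namedtuple("PslRecord", PSL_FIELDS)  # hypothetical record type


def parse_psl_line(line):
    """Parse one psl line; raise ValueError if it is malformed."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != len(PSL_FIELDS):
        raise ValueError("expected %d columns, got %d"
                         % (len(PSL_FIELDS), len(columns)))
    values = []
    for name, text in zip(PSL_FIELDS, columns):
        if name in LIST_FIELDS:
            # comma-separated lists carry a trailing comma
            values.append([int(x) for x in text.split(",") if x])
        elif name in TEXT_FIELDS:
            values.append(text)
        else:
            values.append(int(text))
    return PslRecord(*values)


def iter_parseable(lines):
    """Yield a record for every line that parses; skip the rest."""
    for line in lines:
        try:
            yield parse_psl_line(line)
        except ValueError:
            continue
```

Header lines such as "psLayout version 3" do not have 21 columns and are therefore skipped, as are lines with non-numeric values in numeric columns.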


iterate over the contents of a psl file per query

Blat.addAlignments(matches, shift=0, by_query=False)

build a genome-to-query alignment for all matches.

The genome alignment is shifted by shift.

Blat.getComponents(matches, max_distance=0, min_overlap=0, by_query=False)

return overlapping matches.


max_distance -- allow reads to be joined if they are # residues apart. Adjacent reads are 1 residue apart; overlapping reads are 0 residues apart.

min_overlap -- require at least # residues to be overlapping.