
   
PSI BLAST - Position Specific Iterated BLAST

PSI BLAST is another improved version of the BLAST algorithm [2].

When aligning a group of amino acid sequences, i.e., writing them one below the other (see section 3.5 for a discussion of multiple alignment), the vector of characters in a certain column (i.e., the same position in each of the aligned sequences) is called a profile. For a given profile we may compute the histogram of character types and measure how much the characters vary. When we align amino acid sequences belonging to the same protein family, we find that some regions are very similar, with profiles showing little variance. These regions, called conserved regions, define the structure and functionality typical of this family. We would like the substitution matrices we use to take into account the statistical information we have about how conserved each column is, in order to improve our alignment score.
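As a rough illustration of the per-column histogram described above, the following Python sketch counts the characters in each column of a toy alignment. The text does not say which measure of variation is intended, so Shannon entropy is used here purely as an assumption; the function names and the four sequences are made up for the example and are not real protein data.

```python
from collections import Counter
from math import log2

def column_histograms(aligned):
    """Return one Counter per column of equally long aligned sequences."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    return [Counter(column) for column in zip(*aligned)]

def column_entropy(hist):
    """Shannon entropy in bits of one column histogram (assumed measure).

    A value of 0 means every sequence has the same character there.
    """
    total = sum(hist.values())
    return -sum((n / total) * log2(n / total) for n in hist.values())

# Toy alignment: hypothetical sequences written one below the other.
alignment = ["ACDK", "ACEK", "ACFK", "ACGK"]
histograms = column_histograms(alignment)
entropies = [column_entropy(h) for h in histograms]

# Columns whose histogram holds a single character type are fully conserved.
conserved = [i for i, e in enumerate(entropies) if e == 0]
```

In this toy alignment the third column varies in every sequence, while the other three columns are fully conserved.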

In the first stage of the algorithm we perform ordinary BLAST, but with a different cost vector Vi for each column i. Initially, each vector Vi is set to the row of the substitution matrix corresponding to the i-th character of the query sequence. From the high-scoring results we build a profile for each column. We then continue to perform BLAST iteratively, using the collection of profiles as the query: each column is represented by a histogram rather than a single character, and this is compared against the database. This is equivalent to updating the position-dependent cost vectors according to the profile statistics. After each iteration we update the profiles according to the resulting sequences. We terminate the loop when we no longer find new meaningful matches.
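The iterative loop can be sketched in Python under strong simplifying assumptions. This is not the actual PSI BLAST implementation: it scores whole, ungapped, equal-length sequences instead of running a real BLAST search, it uses a hypothetical four-letter alphabet with a toy match/mismatch matrix, and it rebuilds each Vi as the histogram-weighted average of substitution-matrix rows (with the query included in the histogram), which is only one plausible reading of "updating the cost vectors according to the profile statistics". All names and the threshold are illustrative.

```python
from collections import Counter

ALPHABET = "ACDE"  # tiny hypothetical alphabet, not the 20 amino acids

def subst(a, b):
    """Toy substitution score (assumption): +2 for a match, -1 otherwise."""
    return 2 if a == b else -1

def initial_vectors(query):
    """V_i starts as the substitution-matrix row of the i-th query character."""
    return [{a: subst(q, a) for a in ALPHABET} for q in query]

def score(vectors, seq):
    """Ungapped score of seq against the position-specific cost vectors."""
    return sum(v[c] for v, c in zip(vectors, seq))

def vectors_from_profiles(sequences):
    """Rebuild each V_i as the histogram-weighted average of matrix rows."""
    vectors = []
    for column in zip(*sequences):
        hist = Counter(column)  # histogram of the characters in this column
        total = len(column)
        vectors.append({
            a: sum(n * subst(b, a) for b, n in hist.items()) / total
            for a in ALPHABET
        })
    return vectors

def psi_search(query, database, threshold, max_rounds=10):
    """Iterate until a round yields no new sequence scoring >= threshold."""
    vectors = initial_vectors(query)
    found = []
    for _ in range(max_rounds):
        hits = [s for s in database if score(vectors, s) >= threshold]
        new = [s for s in hits if s not in found]
        if not new:  # no new meaningful matches: terminate the loop
            break
        found.extend(new)
        # Assumption: the query itself is counted in the column histograms.
        vectors = vectors_from_profiles([query] + found)
    return found

# Hypothetical query and database, chosen only to exercise the loop.
database = ["ACDA", "ACAA", "DDAA"]
result = psi_search("ACDE", database, threshold=3)
```

With this toy data, "ACAA" scores below the threshold against the initial vectors and is picked up only in the second round, after the last column's vector has been updated from the first hit. That is the behaviour the iteration is meant to provide.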

It should be noted that biologists regard PSI BLAST as unreliable.


Itshack Pe`er
1999-01-10