
C+G Content

The C+G content (isochore) of a genomic sequence has a strong effect on gene density (see Figure [*]), gene length, and other gene properties. For example, gene density in C+G-rich regions is 5 times higher than in regions of moderate C+G content and 10 times higher than in A+T-rich regions. For this reason, the GENSCAN training set is divided into four categories according to the C+G content of the sequence. The categories are:
1. < 43% C+G
2. 43 - 51% C+G
3. 51 - 57% C+G
4. > 57% C+G
For each of these categories, separate initial state probabilities, transition probabilities and state length distributions are computed.
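
To make the binning concrete, here is a minimal sketch in Python of how a sequence could be assigned to one of the four categories. It is an illustration, not GENSCAN's own code: the function names cg_counts and cg_category are hypothetical, and the handling of the exact boundaries is an assumption, since the list above does not say which category a sequence with exactly 51% C+G belongs to. Here 43% falls in category 2, 51% in category 3, and 57% in category 3. Bases other than A, C, G and T (for example N) are ignored, which is also an assumption.

```python
def cg_counts(seq: str) -> tuple[int, int]:
    """Return (number of C or G bases, number of A/C/G/T bases) in seq."""
    s = seq.upper()
    cg = s.count("C") + s.count("G")
    at = s.count("A") + s.count("T")
    return cg, cg + at


def cg_category(seq: str) -> int:
    """Return the C+G category (1 to 4) of seq.

    Boundary handling is an assumption: 43% -> 2, 51% -> 3, 57% -> 3.
    Integer arithmetic avoids floating-point surprises at the boundaries.
    """
    cg, total = cg_counts(seq)
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    if 100 * cg < 43 * total:
        return 1  # < 43% C+G
    if 100 * cg < 51 * total:
        return 2  # 43 - 51% C+G
    if 100 * cg <= 57 * total:
        return 3  # 51 - 57% C+G
    return 4      # > 57% C+G


if __name__ == "__main__":
    for s in ("ATATATATAT", "ACGT", "GCGCGCGCAT"):
        print(s, cg_category(s))
```

Under this scheme, each training sequence would first be binned with cg_category, and the parameters listed above would then be estimated separately from the sequences in each bin.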

Peer Itsik
2000-12-25