When searching a query of length $m$ against a database of total length $n$, one performs $m \cdot n$ random-walk experiments, each with an exponentially decreasing probability of achieving a score $S$. Thus, the E-value for score $S$ is $E = K m n e^{-\lambda S}$, where $\lambda$ and $K$ are constants.
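As a minimal sketch, the E-value formula can be evaluated directly. The function name `e_value`, its parameter names, and the numeric values below are illustrative assumptions, not values taken from any particular scoring system.

```python
import math

def e_value(raw_score, m, n, lam, K):
    """Expected number of chance alignments with score >= raw_score.

    m, n   : query length and total database length
    lam, K : constants of the scoring system (hypothetical values below)
    """
    return K * m * n * math.exp(-lam * raw_score)

# Hypothetical parameters, chosen only for illustration.
lam, K = 0.3, 0.1
small_db = e_value(raw_score=50, m=100, n=1_000_000, lam=lam, K=K)
large_db = e_value(raw_score=50, m=100, n=2_000_000, lam=lam, K=K)
print(small_db, large_db)  # the same score gets twice the E-value in the larger database
```

Doubling the database length doubles the E-value of the same raw score, while each additional unit of raw score multiplies the E-value by $e^{-\lambda}$.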
Note that the E-value scales with the lengths of the query and the database: the same alignment would have a different E-value if these lengths were different. In addition, the E-value is exponential in the score, so it is instructive to transform it to a logarithmic scale; the result is called the Bit-score.
The Bit-score $B$ is defined from the E-value $E$ by $E = m n 2^{-B}$.
Substituting $E = K m n e^{-\lambda S}$ shows that the Bit-score is linear in the raw score $S$: $B = (\lambda S - \ln K) / \ln 2$.
In contrast to raw scores, which have little meaning without $K$ and $\lambda$, the Bit-score is measured in standard units (see e.g. ). Naturally, the significance of a given Bit-score still depends on the sizes of the query and the database.
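The conversion between raw score, Bit-score, and E-value can be sketched as follows. The function names and the numeric parameters are assumptions made for illustration; the two formulas are $B = (\lambda S - \ln K)/\ln 2$ and $E = m n 2^{-B}$.

```python
import math

def bit_score(raw_score, lam, K):
    """Convert a raw score to a Bit-score: B = (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)

def e_value_from_bits(bits, m, n):
    """E-value from a Bit-score: E = m * n * 2^(-B)."""
    return m * n * 2.0 ** (-bits)

# Hypothetical scoring-system constants and search sizes.
lam, K = 0.3, 0.1
m, n = 100, 1_000_000

bits = bit_score(50, lam, K)
via_bits = e_value_from_bits(bits, m, n)
direct = K * m * n * math.exp(-lam * 50)
print(bits, via_bits, direct)  # via_bits and direct agree up to rounding
```

Once a score is expressed in bits, only $m$ and $n$ are needed to recover the E-value; $K$ and $\lambda$ are already absorbed into $B$.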
As mentioned before, one can also ask for the P-value: the probability of observing at least one alignment with a given E-value or lower.
Define the random variable $Y_E$ to be the observed number of pairs achieving an E-value of $E$ or better (smaller).
$Y_E$ follows a Poisson distribution with mean $E$. The probability that $Y_E$ equals $r$ is $P(Y_E = r) = e^{-E} E^r / r!$, and the probability that $Y_E$ is 0 is equivalent to the probability that the best E-value is worse (larger) than $E$: $P(\text{best E-value} > E) = e^{-E}$. Specifically, the chance of finding zero alignments with score $\ge S$ is $e^{-E}$, so the probability of finding at least one such alignment is $1 - e^{-E}$. This is the P-value associated with the score $S$ (see e.g. ). Note that this model assumes independent, identically distributed (i.i.d.) trials, one for each database position.
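The Poisson model above can be sketched in a few lines. The function names `poisson_pmf` and `p_value` are hypothetical; `math.expm1` is used only to keep $1 - e^{-E}$ accurate when $E$ is very small.

```python
import math

def poisson_pmf(r, e_value):
    """P(Y_E = r) for a Poisson variable with mean e_value."""
    return math.exp(-e_value) * e_value ** r / math.factorial(r)

def p_value(e_value):
    """P(at least one hit) = 1 - exp(-E), computed stably for small E."""
    return -math.expm1(-e_value)

# Illustrative E-values only.
for e in (0.001, 0.1, 1.0, 10.0):
    print(e, poisson_pmf(0, e), p_value(e))
```

For small $E$ the P-value is approximately equal to $E$ itself, since $1 - e^{-E} \approx E$; for large $E$ it approaches 1.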