There are four parameters for the executable file:

1.       The first parameter is the filename of the FASTA file of DNA sequences.

2.       The second parameter is the filename of the file describing the motif candidates (see "File Format of motif candidate").

3.       The third parameter is the filename of the file describing the 1st order Markov model (see "File Format of descriptions of 1st order Markov model").

4.       The fourth parameter is 0 or 1.

0 means double strands are not considered. 1 means double strands are considered.

 

File Format of motif candidate

line 1       an integer: the number of motifs to rank

line 2       an integer: the type of motifs

         0 means consensus motifs on {A,C,G,T} or IUPAC alphabet     (see example0)

         1 means a set of motif words on {A,C,G,T}                  (see example1)

         3 means motifs on PWM                                 (see example3)

line 3 to the end  descriptions of the motifs

For type 0, a typical description is

@ID m1

ACCGTCC

"m1" is the name of the motif and it can be omitted, but the token "@ID" can't be omitted.

For type 1, a typical description is

@ID m1

@SIZE 3

TTTGTCAA

ATGAGATT

TCAAATCG

"m1" is the name of the motif and it can be omitted, but the token "@ID" can't be omitted.

"3" is the number of words in the motif set, and the token "@SIZE" can't be omitted.

For type 3, a typical description is

@ID GCR1

A   0   2   0   0   0

T   0   4   6   0   0

G   0   0   0   0   0

C   6   0   0   6   6

"GCR1" is the name of the motif and it can be omitted, but the token "@ID" can't be omitted.
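
As an illustration of the type 3 layout, here is a minimal Python sketch that reads one PWM description like the example above. It is not part of the tool. The function name parse_pwm_block is hypothetical, and the sketch assumes that each matrix row starts with its base letter followed by whitespace-separated numbers, as in the example.

```python
def parse_pwm_block(text):
    """Parse one type 3 motif description into (motif_id, {base: [values]}).

    Assumption: rows are labelled with A, C, G or T and may come in any
    order, as in the GCR1 example (A, T, G, C).
    """
    motif_id = ""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "@ID":
            # The name after @ID is optional.
            motif_id = fields[1] if len(fields) > 1 else ""
        elif fields[0] in ("A", "C", "G", "T"):
            rows[fields[0]] = [float(x) for x in fields[1:]]
        else:
            raise ValueError(f"unexpected line: {line!r}")
    missing = set("ACGT") - set(rows)
    if missing:
        raise ValueError(f"missing rows for: {sorted(missing)}")
    if len({len(values) for values in rows.values()}) != 1:
        raise ValueError("all rows must have the same number of columns")
    return motif_id, rows


example = """@ID GCR1
A   0   2   0   0   0
T   0   4   6   0   0
G   0   0   0   0   0
C   6   0   0   6   6
"""

motif_id, pwm = parse_pwm_block(example)
width = len(pwm["A"])
# Sum of the four base values at each motif position.
column_totals = [sum(pwm[base][i] for base in "ACGT") for i in range(width)]
```

Looking rows up by their base letter, rather than by position in the file, keeps the sketch independent of the row order.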


File Format of descriptions of 1st order Markov model

Assume the Markov region is R and R[i] is the i-th character in the chain. The parameters of the order-1 Markov model

consist of two parts. One is the initial probability, which is a 1*4 vector, and the other is the transition

probability, which is a 4*4 matrix.

A typical description of an order-1 Markov model is

 

Pr(R[0]=A)

Pr(R[0]=C)

Pr(R[0]=G)

Pr(R[0]=T)

 

Pr(R[1]=A|R[0]=A)

Pr(R[1]=C|R[0]=A)

Pr(R[1]=G|R[0]=A)

Pr(R[1]=T|R[0]=A)

Pr(R[1]=A|R[0]=C)

Pr(R[1]=C|R[0]=C)

Pr(R[1]=G|R[0]=C)

Pr(R[1]=T|R[0]=C)

Pr(R[1]=A|R[0]=G)

Pr(R[1]=C|R[0]=G)

Pr(R[1]=G|R[0]=G)

Pr(R[1]=T|R[0]=G)

Pr(R[1]=A|R[0]=T)

Pr(R[1]=C|R[0]=T)

Pr(R[1]=G|R[0]=T)

Pr(R[1]=T|R[0]=T)
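
The listing above can be checked with a short Python sketch before the file is passed to the executable. It is not part of the tool. The function name parse_markov_model is hypothetical, and the sketch assumes that the file holds 20 plain numbers separated by whitespace in the order shown (4 initial probabilities, then the 16 transition probabilities grouped by the previous base A, C, G, T). It also assumes that the initial vector and each transition row should sum to 1.

```python
BASES = "ACGT"


def parse_markov_model(text, tol=1e-6):
    """Parse an order-1 Markov model description.

    Returns (initial, transition), where initial[b] = Pr(R[0]=b) and
    transition[a][b] = Pr(R[1]=b | R[0]=a).
    """
    values = [float(token) for token in text.split()]
    if len(values) != 20:
        raise ValueError(f"expected 20 numbers, found {len(values)}")

    # First 4 values: the 1*4 initial probability vector.
    initial = dict(zip(BASES, values[:4]))

    # Next 16 values: the 4*4 transition matrix, one row per previous base.
    transition = {}
    for i, prev in enumerate(BASES):
        row = values[4 + 4 * i : 8 + 4 * i]
        transition[prev] = dict(zip(BASES, row))

    # Assumed sanity check: each probability distribution sums to 1.
    if abs(sum(initial.values()) - 1.0) > tol:
        raise ValueError("initial probabilities do not sum to 1")
    for prev, row in transition.items():
        if abs(sum(row.values()) - 1.0) > tol:
            raise ValueError(f"transition row for {prev} does not sum to 1")
    return initial, transition


# A uniform model: every probability is 0.25, one value per line.
uniform_text = "\n".join(["0.25"] * 20)
initial, transition = parse_markov_model(uniform_text)
```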