CpG Islands

The matrix below represents the model parameters for the CpG Islands problem. The capital letters for the nucleotides represent those nucleotides "in a CpG island".


Model Parameters   

Example CSV Parameters File
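Transition: 8 x 8 matrix, rows and columns A, C, T, G, a, c, t, g
Emission: 8 x 4 matrix, rows A, C, T, G, a, c, t, g; columns a, c, t, g
Initial: one probability for each state A, C, T, G, a, c, t, g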

Choose the length of the sequence to generate using the model parameters above. The sequence of output symbols, along with the sequence of hidden states that produced them, will be given below. The hidden states correspond to the columns in the transition matrix, where 0 represents the first column (A), 1 represents the second column (C), and so on. It will be useful to copy-paste these values into the inputs of the other tabs.
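The generation step described above can be sketched in a few lines of Python. This is only an illustration, not the tool's own code: the function names `sample`, `generate`, and `toy_parameters` are hypothetical, and the toy parameter values (uniform start, a 0.9 chance of staying in the same region type, each state emitting its own nucleotide) are assumptions chosen for the example.

```python
import random

STATES = "ACTGactg"   # states 0-3 are island states, 4-7 are non-island states
SYMBOLS = "actg"      # emission alphabet (columns of the emission matrix)

def sample(weights, rng):
    """Draw an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def generate(length, initial, transition, emission, seed=None):
    """Return (symbols, hidden_states) sampled from the model."""
    rng = random.Random(seed)
    symbols, states = [], []
    state = sample(initial, rng)                 # pick the starting state
    for _ in range(length):
        states.append(state)
        symbols.append(SYMBOLS[sample(emission[state], rng)])
        state = sample(transition[state], rng)   # move to the next state
    return "".join(symbols), states

def toy_parameters(stay=0.9):
    """Hypothetical parameters, NOT the values used by the tool."""
    initial = [1 / 8] * 8
    transition = [
        [(stay if (i < 4) == (j < 4) else 1 - stay) / 4 for j in range(8)]
        for i in range(8)
    ]
    # Assumption: state i always emits its own nucleotide (A and a both emit 'a').
    emission = [[1.0 if k == i % 4 else 0.0 for k in range(4)] for i in range(8)]
    return initial, transition, emission

if __name__ == "__main__":
    seq, states = generate(30, *toy_parameters(), seed=0)
    print(seq)
    print(" ".join(str(s) for s in states))
```

The two printed lines have the same shape as the values that are copied into the other tabs: a symbol sequence and the matching list of state indices.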

Enter the sequence to evaluate (usually copied from the Generate tab) and, optionally, the hidden states that produced the sequence (from the Generate tab). States 0, 1, 2, 3 correspond to the island regions; states 4, 5, 6, 7 correspond to non-island regions. The posterior probabilities for being in an island state will then be calculated and presented as a graph below. If the hidden states are given, those representing islands (0, 1, 2, and 3) will appear as areas of dark gray background on the graph.

Sequence

States (optional)
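The posterior probability of being in an island state at each position is commonly obtained with the forward-backward algorithm. The sketch below assumes that approach, with per-position scaling to avoid underflow; the function name `island_posterior` and its arguments are hypothetical, and the sequence is assumed to be non-empty and to have non-zero probability under the model.

```python
def island_posterior(seq, initial, transition, emission,
                     symbols="actg", n_island=4):
    """Posterior probability of an island state (indices < n_island) per position."""
    obs = [symbols.index(c) for c in seq.lower()]
    n, T = len(initial), len(obs)

    # Forward pass; each row is normalised and its scale factor is stored.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [initial[i] * emission[i][obs[0]] for i in range(n)]
        else:
            row = [
                emission[j][obs[t]]
                * sum(alpha[t - 1][i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        c = sum(row)
        scale.append(c)
        alpha.append([x / c for x in row])

    # Backward pass, divided by the same scale factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(transition[i][j] * emission[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n)) / scale[t + 1]
            for i in range(n)
        ]

    # alpha[t][i] * beta[t][i] is the posterior of state i at position t.
    return [sum(alpha[t][i] * beta[t][i] for i in range(n_island))
            for t in range(T)]
```

Plotting the returned list against position gives a curve of the kind described above; positions whose true state is 0, 1, 2, or 3 would be the shaded regions.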

Enter the sequence to decode (usually copied from the Generate tab) and, optionally, the hidden states that produced the sequence (from the Generate tab). The Viterbi algorithm will then be used to decode the sequence. In the output, the red symbols are the predicted islands and the gray background symbols are actual islands (if supplied), corresponding to states 0, 1, 2, 3 from the simulation.

Sequence

States (optional)
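As a rough illustration of the decoding step, here is a minimal Viterbi sketch in log space. The function name `viterbi` and its arguments are hypothetical; the parameters are assumed to be lists of probabilities laid out like the matrices above (one row per state).

```python
import math

def viterbi(seq, initial, transition, emission, symbols="actg"):
    """Return the most probable hidden-state path for seq as a list of indices."""
    if not seq:
        return []
    obs = [symbols.index(c) for c in seq.lower()]
    n = len(initial)

    def log(p):
        # Zero probabilities become -inf so they can never win a max().
        return math.log(p) if p > 0 else float("-inf")

    # Best log-probability of any path ending in each state at position 0.
    score = [log(initial[i]) + log(emission[i][obs[0]]) for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at position t + 1
    for o in obs[1:]:
        ptr, new = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: score[i] + log(transition[i][j]))
            ptr.append(best)
            new.append(score[best] + log(transition[best][j]) + log(emission[j][o]))
        back.append(ptr)
        score = new

    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

With the eight-state model, positions whose decoded state is 0, 1, 2, or 3 are the predicted islands, for example `[s < 4 for s in viterbi(sequence, initial, transition, emission)]`.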

Enter the sequence on which to train (usually copied from the Generate tab). The Baum-Welch algorithm will then be used to train the original model (given above) on this sequence. The resulting model parameters will be shown below.

Sequence
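For reference, a single Baum-Welch re-estimation step on one sequence might look like the sketch below. It is a simplified illustration under stated assumptions: the function name `baum_welch_step` is hypothetical, the observations are already converted to symbol indices, and how many iterations the tool runs or how it decides to stop is not specified here.

```python
import math

def baum_welch_step(obs, initial, transition, emission):
    """One EM update. Returns (initial, transition, emission, log_likelihood),
    where log_likelihood is that of obs under the INPUT parameters."""
    n, m, T = len(initial), len(emission[0]), len(obs)

    # Scaled forward pass.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [initial[i] * emission[i][obs[0]] for i in range(n)]
        else:
            row = [emission[j][obs[t]]
                   * sum(alpha[t - 1][i] * transition[i][j] for i in range(n))
                   for j in range(n)]
        c = sum(row)
        scale.append(c)
        alpha.append([x / c for x in row])

    # Scaled backward pass.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(transition[i][j] * emission[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) / scale[t + 1]
                   for i in range(n)]

    # Expected state occupancies and expected transition counts.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    xi = [[0.0] * n for _ in range(n)]
    for t in range(T - 1):
        for i in range(n):
            for j in range(n):
                xi[i][j] += (alpha[t][i] * transition[i][j]
                             * emission[j][obs[t + 1]] * beta[t + 1][j]
                             / scale[t + 1])

    # Re-estimate; a row with no expected counts keeps its old values.
    new_transition, new_emission = [], []
    for i in range(n):
        out = sum(xi[i])
        new_transition.append([x / out for x in xi[i]] if out > 0
                              else list(transition[i]))
        occ = sum(gamma[t][i] for t in range(T))
        counts = [0.0] * m
        for t in range(T):
            counts[obs[t]] += gamma[t][i]
        new_emission.append([x / occ for x in counts] if occ > 0
                            else list(emission[i]))

    log_likelihood = sum(math.log(c) for c in scale)
    return list(gamma[0]), new_transition, new_emission, log_likelihood
```

Calling the function repeatedly, feeding each result back in, should never decrease the returned log-likelihood, which is a convenient sanity check when experimenting with it.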