Sliding Windows

One way to locate CpG islands is to check the statistics within a sliding window. You can watch this process below by supplying a nucleotide sequence from a FASTA file. Clicking the "Start" button shows the window as it slides across the sequence. As islands are found, their nucleotide endpoints are listed in the "CpG Islands" box. The speed of the process can be controlled with the slider.

This algorithm is adapted from the one described in Takai and Jones, "Comprehensive analysis of CpG islands in human chromosomes 21 and 22," PNAS, 2002. It uses a window size of 200 nucleotides, an observed-to-expected CpG ratio threshold of 0.6, and a GC-content threshold of 0.5. For more control over parameters, we recommend EMBOSS, a popular web-based sliding-window CpG island searcher.
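The window test described above can be sketched in Python as follows. This is a minimal illustration, not the code behind this page: the function names, the inclusive (>=) threshold comparisons, the one-nucleotide step, and the rule of merging overlapping qualifying windows into a single island are all assumptions. The observed-to-expected ratio uses the common definition (number of CpG x window length) / (number of C x number of G).

```python
def window_stats(window: str) -> tuple[float, float]:
    """Return (gc_content, obs_exp_ratio) for one window of nucleotides."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count is (c * g) / n, so obs/exp = cpg * n / (c * g).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def find_cpg_islands(seq: str, window: int = 200,
                     min_obs_exp: float = 0.6,
                     min_gc: float = 0.5) -> list[tuple[int, int]]:
    """Slide a fixed-size window one nucleotide at a time.

    Returns 0-based, half-open (start, end) intervals. Overlapping or
    adjacent qualifying windows are merged (an assumption of this sketch).
    """
    seq = seq.upper()
    islands: list[tuple[int, int]] = []
    for start in range(len(seq) - window + 1):
        gc, oe = window_stats(seq[start:start + window])
        if gc >= min_gc and oe >= min_obs_exp:
            end = start + window
            if islands and start <= islands[-1][1]:
                # Extend the previous island instead of starting a new one.
                islands[-1] = (islands[-1][0], end)
            else:
                islands.append((start, end))
    return islands
```

Recounting every window from scratch keeps the sketch short but costs time proportional to sequence length times window size; for long chromosomes, the counts could instead be updated incrementally as one nucleotide leaves the window and another enters.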

In the drawing below, A, C, G, and T are represented as red, blue, yellow, and green, respectively, so a blue-yellow pair (in that order) represents a CpG dinucleotide.
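As a small illustration of this color coding, the following Python sketch maps nucleotides to the colors named above and lists where CpG dinucleotides occur. The names NUCLEOTIDE_COLORS, color_sequence, and cpg_positions are hypothetical, and the gray fallback for other symbols (such as N) is an assumption, not something this page specifies.

```python
# Color assignment as stated in the text: A red, C blue, G yellow, T green.
NUCLEOTIDE_COLORS = {"A": "red", "C": "blue", "G": "yellow", "T": "green"}


def color_sequence(seq: str) -> list[str]:
    """Map each nucleotide to its display color.

    Symbols other than A, C, G, T fall back to gray (an assumption).
    """
    return [NUCLEOTIDE_COLORS.get(base, "gray") for base in seq.upper()]


def cpg_positions(seq: str) -> list[int]:
    """Return the 0-based index of every C that is immediately followed by G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]
```

Each index returned by cpg_positions marks the blue element of a blue-yellow pair in the colored sequence.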


