This page provides additional information about this website and how its prediction algorithms are used.

Using the online scoring form

Enter protein sequences using one-letter amino acid codes. Non-standard symbols, spaces, and special characters are ignored. As a sanity check, verify on the output page that the sequence shown matches your original sequence.
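The filtering described above can be pictured with a minimal Python sketch. It is not the site's actual code: the function name `clean_sequence` is hypothetical, and it assumes that "standard" means the 20 common amino acid letters and that lowercase input is converted to uppercase.

```python
# Assumption: "standard" means the 20 common one-letter amino acid codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Drop every character that is not a standard amino acid code.

    Assumption: lowercase letters are accepted and uppercased first.
    """
    return "".join(ch for ch in raw.upper() if ch in STANDARD_AMINO_ACIDS)


# Spaces, digits, and punctuation are silently dropped.
print(clean_sequence("mk-LLT 12*ycg"))  # prints MKLLTYCG
```

Comparing the cleaned string with what you pasted is the same sanity check the output page allows you to do by eye.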

For an input protein, the online program locates the zinc finger (ZF) domains in the sequence and outputs the sequence with all ZF domains highlighted. Select the ZF domains whose DNA-binding specificities you want to predict by marking the corresponding boxes or by clicking the domains on the protein map. You can also choose between the expanded linear and the polynomial pre-trained SVM models. Note that the program assumes all fingers bind consecutive bases, which affects the predictions for the "overlap" bases. To predict the DNA-binding specificity of a particular array of ZF domains, select only those fingers.

ZF domains are detected with HMMER version 2.3.2. The original bit score is shown next to each ZF domain. No bit score threshold is applied, but the default HMMER gathering threshold for ZF domains is 17.7, so you may choose to select only confident ZF domains whose bit scores exceed 17.7. You can also check the ZF scores in the final window, when generating DNA sequence logos for your protein, to confirm that you are satisfied with the labelled ZFs.
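If you copy the reported bit scores into your own script, the thresholding suggested above could look like the following sketch. The `ZFHit` record and the `confident_hits` function are hypothetical names introduced for illustration; only the 17.7 cutoff comes from this page.

```python
from typing import List, NamedTuple

# Default HMMER gathering threshold for ZF domains, as stated on this page.
GATHERING_THRESHOLD = 17.7


class ZFHit(NamedTuple):
    """Hypothetical record for one detected ZF domain."""
    start: int        # first residue of the domain (1-based, assumed)
    end: int          # last residue of the domain
    bit_score: float  # HMMER bit score shown next to the domain


def confident_hits(hits: List[ZFHit],
                   threshold: float = GATHERING_THRESHOLD) -> List[ZFHit]:
    """Keep only domains whose bit score strictly exceeds the threshold."""
    return [hit for hit in hits if hit.bit_score > threshold]


# Illustrative values only, not real output.
hits = [ZFHit(10, 32, 25.1), ZFHit(38, 60, 12.4), ZFHit(66, 88, 17.7)]
selected = confident_hits(hits)
```

A strict comparison is used here because the text speaks of scores "exceeding" 17.7; a domain scoring exactly 17.7 would be left out under this reading.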


Prediction results

Please note that the protein may bind to either the primary or the complementary DNA strand. The predicted position weight matrix (PWM) is therefore shown as a sequence logo for the primary strand and as its reverse complement, for easier comparison with your expectation or with a known experimental PWM.
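To make the relationship between the two logos concrete, here is a small Python sketch of reverse-complementing a PWM. It assumes the matrix is held as a list of rows, one per position, with probabilities in A, C, G, T order; that layout and the function name are assumptions for illustration, not a description of the site's internals.

```python
from typing import List


def reverse_complement_pwm(pwm: List[List[float]]) -> List[List[float]]:
    """Return the reverse complement of a PWM.

    Each row is one position with probabilities ordered A, C, G, T
    (assumed layout). Reversing the rows reverses the sequence, and
    reversing each row swaps A with T and C with G.
    """
    return [row[::-1] for row in reversed(pwm)]


# Two-position example: mostly A, then always G.
pwm = [
    [0.7, 0.1, 0.1, 0.1],
    [0.0, 0.0, 1.0, 0.0],
]
rc = reverse_complement_pwm(pwm)
# rc reads: always C, then mostly T.
```

Applying the function twice returns the original matrix, which is a quick way to check a PWM you have transformed by hand.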

You can also download the PWM as a text file, with nucleotide probabilities given for each position, by pressing the "Download PWM" button.


Pre-trained model files

If you would like to test our pre-trained SVM models with external programs such as SVM_light, you can download the pre-trained model files for the expanded linear and polynomial SVMs.


Experimental database download

We have also made available for download a database of experimental data collected from 25 individual manuscripts published between 1990 and 2005 and from the Protein Data Bank. The archive is password-protected; you can request the password by contacting us. Each line in the database represents one experiment and includes the following fields: source, the data origin; dna, the DNA sequence; zf, the number of zinc fingers in the protein; f1-fN, the sequences of the corresponding zinc finger regions; and ex, the type of example. The ex field is + for binding, - for non-binding, Kd for an experimentally measured dissociation constant, and > for comparative examples in which binding of sequence A is compared with that of the subsequently listed sequence B. Please consult the list of sources for all individual references.
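For readers who plan to load the database into their own scripts, one possible in-memory representation of a record is sketched below in Python. The on-disk layout of the file is not described here, so this models only the fields named above and does not attempt to parse the file; the class name, the attribute layout, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# The four example types listed in the field description above.
EXAMPLE_TYPES = {"+", "-", "Kd", ">"}


@dataclass
class Experiment:
    """Hypothetical container for one database line."""
    source: str         # data origin
    dna: str            # DNA sequence
    fingers: List[str]  # f1..fN: zinc finger region sequences
    ex: str             # type of example: "+", "-", "Kd", or ">"

    def __post_init__(self) -> None:
        if self.ex not in EXAMPLE_TYPES:
            raise ValueError(f"unknown example type: {self.ex!r}")

    @property
    def zf(self) -> int:
        """Number of zinc fingers, derived from the finger list."""
        return len(self.fingers)


# Placeholder values for illustration, not a real database entry.
record = Experiment(source="example-source", dna="GCGTGGGCG",
                    fingers=["FINGERONE", "FINGERTWO", "FINGERTHREE"], ex="+")
```

Deriving zf from the finger list, rather than storing it separately, is a design choice of this sketch; it avoids the two values drifting apart, but you may prefer to keep the zf field as given and compare it against the number of fingers read.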


