When you choose an epitope, either B or T, or plan to use a certain protein region to induce an immune response (e.g. as an antigen for immunization, or as part of a recombinant vaccine), it is usually highly desirable that the response be specific, that is, directed against the chosen molecule (or its family of molecules), but not against other proteins.
One of the characteristics of a protein domain that allows you to evaluate this is its complexity: the higher it is, the more likely the region is unique to a particular molecule (or domain of the molecule).
The EpiQuest-C algorithm calculates the relative complexity of a region by summing the relative complexity values of the amino acid combinations that fall within a window whose size is set by the chosen Frame. Matrix C1.3, which the program uses as the basis for the complexity values of individual amino acids, is based on the same inverted disorder values as Weathers et al. (2006). Homodimers and homopolymers are considered highly disordered and carry increased penalties. An additional reason for heavily penalizing dimers and polymers of some amino acids is their negative contribution to the uniqueness of the epitope.
The values apply mostly to higher vertebrate protein sequences, as these species will usually receive the immunogen evaluated for its complexity by EpiQuest-C.
If you would like to use the program to review organized and disorganized regions of proteins from other organisms, we will soon release additional complexity matrices corrected for the variation of relative amino acid frequencies in different types of organisms.
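The frame-based summation described above can be illustrated with a minimal Python sketch. It is not the EpiQuest-C implementation: the per-residue values in TOY_VALUES, the REPEAT_PENALTY constant and the function names are all hypothetical placeholders, since the actual Matrix C1.3 values and penalty scheme are not given here. The sketch only shows the general idea of a sliding frame in which repeated residues lower the score.

```python
from typing import Dict, List

# Hypothetical per-residue complexity values (NOT the real Matrix C1.3).
TOY_VALUES: Dict[str, float] = {aa: 1.0 for aa in "ACDEFGHIKLMNPQRSTVWY"}
TOY_VALUES.update({"G": 0.4, "S": 0.5, "P": 0.5, "Q": 0.6, "W": 1.4, "C": 1.3})

# Hypothetical penalty subtracted for every residue identical to its predecessor,
# so homodimers are penalized once and longer homopolymers repeatedly.
REPEAT_PENALTY = 0.5


def frame_score(fragment: str) -> float:
    """Sum the toy complexity values of a fragment, penalizing repeats."""
    total = 0.0
    for i, aa in enumerate(fragment):
        total += TOY_VALUES[aa]
        if i > 0 and fragment[i - 1] == aa:
            total -= REPEAT_PENALTY
    return total


def complexity_profile(sequence: str, frame: int = 9) -> List[float]:
    """Return one score per frame position along the sequence."""
    if frame < 1 or frame > len(sequence):
        raise ValueError("frame must be between 1 and the sequence length")
    return [
        frame_score(sequence[i:i + frame])
        for i in range(len(sequence) - frame + 1)
    ]
```

With these toy values, a fragment such as "ADEK" scores higher than "AAAA" of the same length, which mirrors the intent of penalizing homopolymers.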
* Weathers, E.A., Paulaitis, M.E., Woolf, T.B., and Hoh, J.H. (2006). Insights into protein structure and function from disorder-complexity space. Proteins 66, 16–28.
The program analyses the presented protein sequence and returns the results in graphical and tabular form. Below is an example of the analysis of ssDNA-binding protein from Yersinia pestis:
The analysis shows that only 2 regions of high complexity (not disorganized) within the sequence exceed 9 aa in length (as shown by the bars and in the table). Please note the composition/sequence of the domains that are indicated as having poor complexity.
For more details on operating the program, please see the EpiQuest-C Manual and Analysis of Demo Sequences.
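Selecting the high-complexity regions that exceed 9 aa can be sketched as a simple run-finding step over a per-residue profile. The function name, the threshold parameter and the input format below are assumptions made for illustration; the program's own cut-off and output format may differ.

```python
from typing import List, Sequence, Tuple


def high_complexity_regions(
    scores: Sequence[float],
    threshold: float,
    min_length: int = 10,  # "exceeds 9 aa" means at least 10 residues
) -> List[Tuple[int, int]]:
    """Return (start, end) of runs scoring above threshold, 1-based inclusive."""
    regions: List[Tuple[int, int]] = []
    start = None
    # A trailing sentinel below the threshold closes a run that reaches the end.
    flags = [s > threshold for s in scores] + [False]
    for i, is_high in enumerate(flags):
        if is_high and start is None:
            start = i
        elif not is_high and start is not None:
            if i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    return regions
```

Runs shorter than min_length are dropped, so a short stretch of high values between two low-complexity segments is not reported as a region.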
Complexity: how unique is the epitope?
There are 3 immunodominant epitopes in the sequence of the NS1 protein of Dengue virus; the analysis is performed for a particular isolate of DV2 from South America. Imagine that you would like to select epitopes for a diagnostic assay that detects antibodies to immunodominant epitopes of the virus. All 3 epitopes elicit a response in case of infection, but we would like to know how specific an assay based on a particular epitope would be.
Three immunodominant epitopes of NS1 (blue frame) are correctly predicted by EpiQuest-B (Red histogram). The yellow histogram shows the relative complexity of the corresponding sequences. Please note that the complexity is highest for epitopes #3>#1>#2.
Now see the uniqueness of these epitopes of NS1 (a protein of DV2) and compare them against other types of the dengue virus.
B-epitope complexity & antibody specificity
Specificity of an antibody that will be used for research or diagnostics is of primary importance. Lately, most antibodies are developed against a selected peptide epitope from the sequence of a protein. This raises the question: can the specificity of an antibody prepared against a given epitope be predicted in advance, at the stage when the epitope has been selected?
We recently had an opportunity to analyse the specificity of several antibodies developed by a third party. All of them were raised against peptide epitopes (9-12-mers) and purified on a peptide column from hyperimmune serum. In other words, each of them was as good as an antibody against its epitope can be.
We analysed these peptide epitopes (and the entire sequences of the proteins they originate from) using EpiQuest-C, our software that determines the complexity profile (as we define it) of a protein primary sequence. Each possible 9-mer from the entire sequences and from the peptide epitopes was assigned its calculated complexity index, CGI. As some peptides were longer than 9 residues, we determined the highest and lowest CGI values for each of them.
Interestingly, all peptides could be divided into two groups, Tier 1 and Tier 2, according to their highest CGI value: in the first it was above the mean value, in the second below the mean value (Fig. 1). On average, the selected peptide epitopes did not differ from the whole population of 9-mers that could be selected from the given sequences.
More interesting still, the reactivity of the antibodies in western blot differed markedly between Tier 1 and Tier 2 with respect to specificity and background. Practically all Tier 1 antibodies were excellent in blot (10 out of 12, 83%, with 2 showing some non-specific reaction). In contrast, of the 13 antibodies in Tier 2, 3 gave no specific reaction in blot at all, and 6 showed additional reactivity to other proteins or were excessively sticky to the membrane, producing background. Only 4 antibodies (about 31%) showed a good, clean reaction like the majority of antibodies from Tier 1.
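The tier assignment can be sketched in a few lines of Python. This is an illustration only: the function and field names are hypothetical, the CGI values must come from EpiQuest-C (they are not computed here), and the sketch assumes that the "mean value" is the mean CGI over all 9-mers of the source sequences, which the text does not state explicitly.

```python
from statistics import mean
from typing import Dict, Sequence


def assign_tier(
    peptide_cgi: Sequence[float],     # CGI of every 9-mer within one peptide
    population_cgi: Sequence[float],  # CGI of all 9-mers in the source sequences
) -> Dict[str, float]:
    """Summarize a peptide and place it in Tier 1 or Tier 2."""
    highest = max(peptide_cgi)
    lowest = min(peptide_cgi)
    reference = mean(population_cgi)  # assumed reference: population mean
    tier = 1 if highest > reference else 2
    return {"highest": highest, "lowest": lowest, "mean": reference, "tier": tier}
```

For a 9-mer peptide the list holds a single value, so highest and lowest coincide; a 12-mer contributes four overlapping 9-mers.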
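As a quick check of the proportions, the counts quoted above can be tabulated as follows. The dictionary labels are paraphrases chosen for this sketch; only the counts are taken from the text.

```python
from typing import Dict

# Western blot outcomes per tier, using the counts quoted in the text.
tier1: Dict[str, int] = {"clean": 10, "some non-specific reaction": 2}
tier2: Dict[str, int] = {
    "clean": 4,
    "cross-reactive or high background": 6,
    "no specific reaction": 3,
}


def clean_fraction(counts: Dict[str, int]) -> float:
    """Fraction of antibodies in a tier that gave a clean, specific blot."""
    return counts["clean"] / sum(counts.values())


for name, counts in (("Tier 1", tier1), ("Tier 2", tier2)):
    total = sum(counts.values())
    print(f"{name}: {counts['clean']}/{total} clean ({clean_fraction(counts):.0%})")
```

The sample is small (25 antibodies in total), so these fractions describe this particular set rather than a general rule.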
Scanning epitope collections with C-Scanner
When you need to assess the relative complexity of a group of already selected epitopes (isolated from sequences because of their antigenicity, potential T-epitope activity, or other criteria), you can easily do this for up to 2000 sequences at once using the C-Scanner application, which is based on the same algorithm and values as EpiQuest-C.