In the final year of my PhD, I have been investigating different ways of aligning chemicals, to see which method is best at predicting whether a chemical will be active against a particular drug target.
Similarity is subjective, and depends on the problem domain. As an example, which two objects are most similar: an apple, a pumpkin or a basketball? All three are more or less spherical, but the pumpkin and the apple are both fruit, while the pumpkin and the basketball are a similar size.
The same subjectivity exists in chemistry. A common goal when searching for similarity in chemicals is to predict whether one compound will act in the same way as another compound that is known to have useful pharmaceutical properties. The desired “similarity” in this case is a similarity of biological activity: something which, at present, is impossible to predict. However, we can attempt to infer such a property from aspects of structural similarity.
Serotonin reuptake inhibitors are a group of chemicals that includes many useful antidepressants. Consider the examples below of compounds that act as serotonin reuptake inhibitors. In the first case (Figure 1), similarity is based on the largest common fragment (highlighted in bold).
Figure 1
The similarity here is obvious, the only difference being the Br atom. However, the same technique fails to show the biological similarity between the two inhibitors in Figure 2.
In this case, the approach of finding the largest common fragment has failed to highlight the “similarity” between the two compounds. A technique based on finding the maximum possible overlap of edges, however, is more successful (Figure 3).
Figure 3
This method, which seeks to find a set of common fragments, emphasises a different “similarity” between these compounds.
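The difference between the two approaches can be sketched on a toy example. The Python below is only an illustration under my own assumptions, not the method used to produce the figures: molecules are reduced to lists of element labels plus bonds between atom indices, bond orders are ignored, and the search is brute force, so it is only usable on a handful of atoms. The function names `largest_component` and `best_overlaps` and the two chain-like "molecules" are hypothetical, made up for this example.

```python
def largest_component(edges):
    """Number of bonds in the largest connected piece of a list of bonds."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    seen, best = set(), 0
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first walk to collect every atom reachable from `start`.
        stack, component = [start], {start}
        while stack:
            node = stack.pop()
            for nxt in neighbours[node] - component:
                component.add(nxt)
                stack.append(nxt)
        seen |= component
        size = sum(1 for u, v in edges if u in component and v in component)
        best = max(best, size)
    return best


def best_overlaps(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return (bonds in largest common fragment, bonds in maximum overlap).

    Tries every way of pairing atoms of A with same-element atoms of B
    (or leaving them unpaired), so it is exponential: toy inputs only.
    """
    bond_set_b = {frozenset(bond) for bond in bonds_b}
    best = [0, 0]  # [connected, possibly disconnected]

    def score(mapping):
        # Bonds of A whose two ends are paired with bonded atoms of B.
        shared = [(u, v) for u, v in bonds_a
                  if u in mapping and v in mapping
                  and frozenset((mapping[u], mapping[v])) in bond_set_b]
        best[0] = max(best[0], largest_component(shared))
        best[1] = max(best[1], len(shared))

    def extend(i, mapping, used):
        if i == len(atoms_a):
            score(mapping)
            return
        extend(i + 1, mapping, used)  # leave atom i of A unpaired
        for j, element in enumerate(atoms_b):
            if j not in used and element == atoms_a[i]:
                mapping[i] = j
                used.add(j)
                extend(i + 1, mapping, used)
                used.discard(j)
                del mapping[i]

    extend(0, {}, set())
    return tuple(best)


# Two made-up chains that differ only in the linking atom (O versus S).
chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
toy_a = ["C", "C", "C", "O", "N", "N"]
toy_b = ["C", "C", "C", "S", "N", "N"]
print(best_overlaps(toy_a, chain, toy_b, chain))  # prints (2, 3)
```

In this toy case the largest single common fragment is the C-C-C piece (two bonds), whereas allowing a set of fragments also picks up the N-N bond on the far side of the linker, giving an overlap of three bonds. For real compounds a cheminformatics toolkit would be needed to take care of bond orders, rings and aromaticity.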