The Role of Small Molecules in Biology
Unknown compound identification is a major challenge in metabolomics. Even establishing a broad definition of an “unknown compound” is not straightforward. An unlikely source of clarification (some might say confusion) for our field came from former US Defense Secretary Donald Rumsfeld, who once said:
Reports that say that something hasn’t happened are always interesting to me, because as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns—the ones we don’t know we don’t know. And if one looks throughout the history of our country and other free countries, it is the latter category that tends to be the difficult ones.
Following Rumsfeld, a “known known” is a metabolite in a reference database that can be matched to an experimental dataset. A “known unknown” is a metabolite that has been found before but is not in an accessible database. An “unknown unknown” is a metabolite that has not yet been discovered. The NIH Metabolomics Common Fund has funded five centers in the US to improve unknown metabolite identification. Our lab leads one of these centers, and our project is titled “Genetics and Quantum Chemistry as Tools for Unknown Metabolite Identification.”
We are using the model organism Caenorhabditis elegans and comparing both known mutants and natural isolates with the reference strain PD1074. We collaborate with Erik Andersen on C. elegans and Lauren McIntyre on study design and biostatistics. The conceptual steps that we use in this project are: