Tautomer Identification and Tautomer Structure Generation Based on the InChI Code

Abstract

An algorithm is introduced that enables fast generation of all possible prototropic tautomers resulting from the mobile H atoms and associated heteroatoms as defined in the InChI code. The InChI-derived set of possible tautomers comprises (1,3)-shifts for open-chain molecules and (1,n)-shifts (with n being an odd number >3) for ring systems. As an extension to the InChI scope, our algorithm also includes those larger (1,n)-shifts that can be constructed by joining separate but conjugated InChI sequences of tautomer-active heteroatoms. The developed algorithm is described in detail, with all major steps illustrated through explicit examples. Application to ∼72 500 organic compounds taken from EINECS (European Inventory of Existing Commercial Chemical Substances) shows that around 11% of the substances occur in different heteroatom-prototropic tautomeric forms. Additional QSAR (quantitative structure-activity relationship) predictions of their soil sorption coefficient and water solubility reveal variations across tautomers of up to more than two and four orders of magnitude, respectively. For a small subset of nine compounds, analysis of quantum chemically predicted tautomer energies supports the view that, among all tautomers of a given compound, those restricted to H atom exchanges between heteroatoms usually include the thermodynamically most stable structures.
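
To make the starting point of the abstract concrete, the following minimal Python sketch shows how mobile-H groups of the form (H,6,7) might be read from the /h layer of an InChI string, and how the hydrogens could then be distributed over the listed heteroatoms. It is an illustration under stated assumptions, not the authors' algorithm: the function names mobile_h_groups and hydrogen_placements are hypothetical, the example InChI is assumed to be that of 2-pyridone/2-hydroxypyridine, each heteroatom is assumed to take at most one mobile H, and the bond-order rearrangement and valence checks that a real tautomer generator needs are omitted.

```python
import re
from itertools import combinations

# Matches mobile-H groups such as "(H,6,7)" or "(H2,1,2,3)" in the /h layer.
# An optional "-" after the count marks a charged group.
MOBILE_H = re.compile(r"\(H(\d*)(-?)((?:,\d+)+)\)")


def mobile_h_groups(inchi):
    """Return [(n_hydrogens, [atom numbers]), ...] for each mobile-H group."""
    layers = inchi.split("/")
    # The first layer starting with lowercase "h" is the main hydrogen layer.
    h_layer = next((layer[1:] for layer in layers if layer.startswith("h")), "")
    groups = []
    for count, _charge, atoms in MOBILE_H.findall(h_layer):
        n_h = int(count) if count else 1
        groups.append((n_h, [int(a) for a in atoms.strip(",").split(",")]))
    return groups


def hydrogen_placements(n_h, atoms):
    """Enumerate placements of n_h hydrogens on distinct atoms.

    Simplifying assumption (not from the source): at most one mobile H
    per heteroatom, and no check of valence or conjugation.
    """
    return [set(choice) for choice in combinations(atoms, n_h)]


# Illustrative input, assumed to be the InChI shared by 2-pyridone and
# 2-hydroxypyridine: one mobile H shared between atoms 6 (N) and 7 (O).
example = "InChI=1S/C5H5NO/c7-5-3-1-2-4-6-5/h1-4H,(H,6,7)"
for n_h, atoms in mobile_h_groups(example):
    print(n_h, atoms, hydrogen_placements(n_h, atoms))
```

Each placement corresponds to one candidate tautomer skeleton; turning it into a valid structure would still require reassigning double bonds along the conjugated path between the heteroatoms.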

Information
Content Type: Non OER
Author(s): Torsten Thalheim, Armin Vollmer, Ralf-Uwe Ebert, Ralph Kühne, Gerrit Schüürmann
DOI: https://doi.org/10.1021/ci1001179
License: Copyright © 2019 American Chemical Society
Content Status: publish
Date Published: June 29, 2010
Content Tags: Classroom Material, Content type, Organic, Publication