PubChem chemical structure standardization

Abstract

Background:

PubChem is a chemical information repository consisting of three primary databases: Substance, Compound, and BioAssay. When individual data contributors submit chemical substance descriptions to Substance, the unique chemical structures are extracted and stored in Compound through an automated process called structure standardization. The present study describes the PubChem standardization approaches and analyzes their success rates, the reasons structures are rejected, and the modifications applied to structures during the standardization process. Furthermore, PubChem standardization is compared to the structure normalization of the IUPAC International Chemical Identifier (InChI) software, as manifested by conversion of the InChI back into a chemical structure.

Information
Content Type OER
Author(s) Volker D. Hähnke, Sunghwan Kim, Evan E. Bolton
DOI https://doi.org/10.1186/s13321-018-0293-8
Content Link https://jcheminf.biomedcentral.com/track/pdf/10.1186/s13321-018-0293-8
License Open Access
Content Status publish
Date Published August 10, 2018
Content Tags Cheminformatics, Classroom Material, Content type, InChI Applications, Organic, Publication, Search