Classroom Material

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Open Data, Open Source and Open Standards in chemistry: The Blue Obelisk five years on

Abstract Background: The Blue Obelisk movement was established in 2005 as a response to the lack of Open Data, Open Standards and Open Source (ODOSOS) in chemistry. It aims to make it easier to carry out chemistry research by promoting interoperability between chemistry software, encouraging cooperation between Open Source developers, and developing community resources and […]

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

UniChem: a unified chemical structure cross-referencing and identifier tracking system

Abstract UniChem is a freely available compound identifier mapping service on the internet, designed to optimize the efficiency with which structure-based hyperlinks may be built and maintained between chemistry-based resources. In the past, the creation and maintenance of such links at EMBL-EBI, where several chemistry-based resources exist, has required independent efforts by each of the […]

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

International chemical identifier for reactions (RInChI)

Abstract The Reaction InChI (RInChI) extends the idea of the InChI, which provides a unique descriptor of molecular structures, towards reactions. Prototype versions of the RInChI have been available since 2011. The first official release (RInChIV1.00), funded by the InChI Trust, is now available for download (https://inchi-trust.info/wp/downloads/). This release defines the format and generates hashed […]

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Comparative evaluation of open source software for mapping between metabolite identifiers in metabolic network reconstructions: application to Recon 2

Abstract Background: An important step in the reconstruction of a metabolic network is annotation of metabolites. Metabolites are generally annotated with various database or structure based identifiers. Metabolite annotations in metabolic reconstructions may be incorrect or incomplete and thus need to be updated prior to their use. Genome-scale metabolic reconstructions generally include hundreds of metabolites. […]

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Consistency of systematic chemical identifiers within and between small-molecule databases

Abstract Background: Correctness of structures and associated metadata within public and commercial chemical databases greatly impacts drug discovery research activities such as quantitative structure–property relationships modelling and compound novelty checking. MOL files, SMILES notations, IUPAC names, and InChI strings are ubiquitous file formats and systematic identifiers for chemical structures. While interchangeable for many cheminformatics purposes […]

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Towards a Universal SMILES representation – A standard method to generate canonical SMILES based on the InChI

Abstract Background: There are two line notations of chemical structures that have established themselves in the field: the SMILES string and the InChI string. The InChI aims to provide a unique, or canonical, identifier for chemical structures, while SMILES strings are widely used for storage and interchange of chemical structures, but no standard exists to […]

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

The status of the InChI project and the InChI trust

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Status of the InChI algorithm and InChI trust

10 November 2019 / Last updated : 13 November 2019 Andrew Cornell Classroom Material

Detection of IUPAC and IUPAC-like chemical names

Abstract Motivation: Chemical compounds like small signal molecules or other biological active chemical substances are an important entity class in life science publications and patents. Several representations and nomenclatures for chemicals like SMILES, InChI, IUPAC or trivial names exist. Only SMILES and InChI names allow a direct structure search, but in biomedical texts trivial names […]

7 November 2019 / Last updated : 7 November 2019 Andrew Cornell Classroom Material

Tautomer Identification and Tautomer Structure Generation Based on the InChI Code

Abstract An algorithm is introduced that enables a fast generation of all possible prototropic tautomers resulting from the mobile H atoms and associated heteroatoms as defined in the InChI code. The InChI-derived set of possible tautomers comprises (1,3)-shifts for open-chain molecules and (1,n)-shifts (with n being an odd number >3) for ring systems. In addition, our algorithm […]

7 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

How Many Miles Have We Gone, InChI by InChI?

7 November 2019 / Last updated : 13 November 2019 Andrew Cornell Classroom Material

yaInChI: Modified InChI string scheme for line notation of chemical structures

Abstract A modified InChI (International Chemical Identifier) string scheme, yaInChI (yet another InChI), is suggested as a method for including the structural information of a given molecule, making it straightforward and more easily readable. The yaInChI theme is applicable for checking the structural identity with higher sensitivity and generating three-dimensional (3-D) structures from the one-dimensional […]

6 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

InChI: connecting and navigating chemistry

Abstract The International Chemical Identifier (InChI) has had a dramatic impact on providing a means by which to deduplicate, validate and link together chemical compounds and related information across databases. Its influence has been especially valuable as the internet has exploded in terms of the amount of chemistry related information available online. This thematic issue […]

6 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

InChI in the wild: an assessment of InChIKey searching in Google

Abstract While chemical databases can be queried using the InChI string and InChIKey (IK) the latter was designed for open-web searching. It is becoming increasingly effective for this since more sources enhance crawling of their websites by the Googlebot and consequent IK indexing. Searchers who use Google as an adjunct to database access may be […]

6 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

UniChem: extension of InChI-based compound mapping to salt, connectivity and stereochemistry layers

Abstract UniChem is a low-maintenance, fast and freely available compound identifier mapping service, recently made available on the Internet. Until now, the criterion of molecular equivalence within UniChem has been on the basis of complete identity between Standard InChIs. However, a limitation of this approach is that stereoisomers, isotopes and salts of otherwise identical molecules […]
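The idea of relaxing molecular equivalence from full Standard InChI identity to connectivity only can be illustrated with InChIKeys, whose first 14-character block is a hash of the connectivity information. The following is a minimal sketch under that assumption, not UniChem's actual implementation; the function name `group_by_connectivity` is hypothetical. Salts are not handled here, because a salt has a different formula and therefore a different first block.

```python
from collections import defaultdict


def group_by_connectivity(inchikeys):
    """Group InChIKeys that share the same first (connectivity) block.

    Keys in the same group differ only in the layers hashed into the
    second block (for example stereochemistry or isotopes).
    """
    groups = defaultdict(list)
    for key in inchikeys:
        # The first hyphen-separated block is the 14-letter connectivity hash.
        first_block = key.split("-", 1)[0]
        groups[first_block].append(key)
    return dict(groups)
```

Entries whose keys fall into the same group are candidates for a connectivity-level match, even though a comparison of the complete keys would treat them as different compounds.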

6 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

On InChI and Evaluating the Quality of Cross-reference Links

Abstract Background: There are many databases of small molecules focused on different aspects of research and its applications. Some tasks may require integration of information from various databases. However, determining which entries from different databases represent the same compound is not straightforward. Integration can be based, for example, on automatically generated cross-reference links between entries. […]

5 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

InChI, the IUPAC International Chemical Identifier

Abstract This paper documents the design, layout and algorithms of the IUPAC International Chemical Identifier, InChI.

5 November 2019 / Last updated : 5 November 2019 Andrew Cornell Cheminformatics

IUPAC STANDARDS ONLINE

Abstract IUPAC Standards Online is a database built from IUPAC’s (The International Union of Pure and Applied Chemistry) standards and recommendations, which are extracted from the journal Pure and Applied Chemistry (PAC). The International Union of Pure and Applied Chemistry (IUPAC) is the organization responsible for setting the standards in chemistry that are internationally binding […]

5 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Solving the Issues in Standardisation of Stereochemical Representations

Abstract Learn about technology that solves the issue of interpreting 3D stereochemical information implied in 2D structure representations.

5 November 2019 / Last updated : 5 November 2019 Andrew Cornell Classroom Material

Current Status and Future Development in Relation to IUPAC Activities

Abstract The IUPAC International Chemical Identifier (InChI) is a non-proprietary, machine-readable chemical structure representation format enabling electronic searching, and interlinking and combining, of chemical information from different sources. It was developed from 2001 onwards at the U.S. National Institute of Standards and Technology under the auspices of IUPAC’s Chemical Identifier project. Since 2009, the InChI […]

5 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

A brief introduction to SMILES and InChI

Project for Cheminformatics Fall 2012. Part 2/2. Presentation on encodings, SMILES and InChI.

5 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Applications of the InChI in cheminformatics with the CDK and Bioclipse

Abstract Background: The InChI algorithms are written in C and are not available as a Java library. Integration into software written in Java therefore requires a bridge between C and Java libraries, provided by the Java Native Interface (JNI) technology. Results: We here describe how the InChI library is used in the Bioclipse workbench and the Chemistry […]

5 November 2019 / Last updated : 5 November 2019 Andrew Cornell Cheminformatics

Application of InChI to curate, index, and query 3-D structures

Abstract The HIV structural database (HIVSDB) is a comprehensive collection of the structures of HIV protease, both of unliganded enzyme and of its inhibitor complexes. It contains abstracts and crystallographic data such as inhibitor and protein coordinates for 248 data sets, of which only 141 are from the Protein Data Bank (PDB). Efficient annotation, indexing, […]

4 November 2019 / Last updated : 13 November 2019 Andrew Cornell Cheminformatics

Additive InChI-based optimal descriptors: QSPR modeling of fullerene C60 solubility in organic solvents

Abstract Optimal descriptors calculated with the International Chemical Identifier (InChI) have been used to construct a one-variable model of the solubility of fullerene C60 in organic solvents. Attempts to calculate the model for three splits into training and test sets gave stable results.

11 October 2019 / Last updated : 11 October 2019 Andrew Cornell Cheminformatics

IUPAC InChI (Video)

This presentation is part of Google Tech Talks and was added to the GoogleTalksArchive on August 22, 2006. The original presentation took place on November 2, 2006. ABSTRACT (Imported From YouTube Source) The central token of information in Chemistry is a chemical substance, an entity that can often be represented as a well-defined […]

26 August 2019 / Last updated : 26 November 2022 InChI OER Admin Cheminformatics

Capturing mixture composition: an open machine-readable format for representing mixed substances

Alex M. Clark, Leah R. McEwen, Peter Gedeck & Barry A. Bunin. Journal of Cheminformatics volume 11, Article number: 33 (2019). Abstract: We describe a file format that is designed to represent mixtures of compounds in a way that is fully machine readable. This […]

23 August 2019 / Last updated : 8 December 2020 Martin Walker Organic

Introduction to the International Chemical Identifier (for Organic Chemistry Undergraduates)

29 July 2019 / Last updated : 29 July 2019 Vincent Scalfani Cheminformatics

RDKit InChI Calculation with Jupyter Notebook

This RDKit InChI Calculation with Jupyter Notebook tutorial teaches the basics of how to interact with InChI using a cheminformatics toolkit in a Jupyter Notebook. The notebook has the following learning objectives: set up RDKit with a Jupyter Notebook; construct a molecule (RDKit molecular object) from a SMILES string; display molecule images; calculate […]
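The first steps of such a notebook might look like the sketch below. It is not taken from the tutorial itself: the helper name `smiles_to_inchi` is hypothetical, and it assumes RDKit is installed with InChI support. If RDKit cannot be imported, the helper returns None instead of failing.

```python
def smiles_to_inchi(smiles):
    """Return (InChI, InChIKey) for a SMILES string, or None without RDKit."""
    try:
        from rdkit import Chem  # third-party; assumed to be installed
    except ImportError:
        return None

    # Construct an RDKit molecule object from the SMILES string.
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Calculate the InChI, then derive the InChIKey from it.
    inchi = Chem.MolToInchi(mol)
    return inchi, Chem.InchiToInchiKey(inchi)


if __name__ == "__main__":
    print(smiles_to_inchi("CCO"))  # ethanol
```

In a Jupyter Notebook, evaluating the molecule object in a cell would normally also render its image, which covers the display objective.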

6 March 2019 / Last updated : 8 December 2020 Steve Wathen Classroom Material

InChI Student Worksheet

This document contains a brief introduction to InChI suitable for undergraduate students and two exercises, with answer keys. The first assignment asks about the information encoded in a sample InChI. The last question in this assignment asks students to use the InChIKey as a search term – this will be a lot easier to […]
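To see what information a sample InChI encodes, it helps to split the string into its layers. The sketch below is an illustration only, not part of the worksheet; the function name `split_inchi_layers` is hypothetical. It relies on the facts that layers are separated by "/", that the first segment is the version, that the formula layer has no prefix letter, and that every other layer starts with a lowercase prefix letter.

```python
def split_inchi_layers(inchi):
    """Split an InChI string into (layer name, content) pairs, in order."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")

    parts = [p for p in inchi[len(prefix):].split("/") if p]
    if not parts:
        raise ValueError("InChI string has no version segment")

    layers = [("version", parts[0])]
    rest = parts[1:]

    # The formula layer carries no prefix letter; it may be absent
    # (for example in the InChI of a bare proton).
    if rest and not rest[0][0].islower():
        layers.append(("formula", rest[0]))
        rest = rest[1:]

    # All remaining layers start with a one-letter lowercase prefix,
    # such as "c" (connections) or "h" (hydrogens).
    for part in rest:
        layers.append((part[0], part[1:]))
    return layers


if __name__ == "__main__":
    for name, content in split_inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"):
        print(name, content)
```

A list of pairs is used rather than a dictionary because some InChI strings repeat a prefix letter in later layers, and a dictionary would silently overwrite the earlier entry.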

19 November 2018 / Last updated : 21 November 2018 Richard Kidd Cheminformatics

Status of the IUPAC InChI Chemical Structure Standard – Today and the Future. Poster, Mainz 2018

Poster presented by Steve Heller, project director, at the Mainz cheminformatics meeting, Nov 2018. File: Mainz-poster-final-11-18

19 November 2018 / Last updated : 22 November 2018 Richard Kidd Cheminformatics

InChI Infographic

Web and high-resolution formats of an InChI infographic. Nov 2018. PDF: https://inchi-trust.info/wp/wp-content/uploads/2018/11/Web_A55856_InChi-trust_infographic.pdf Files: HRM_A55856_InChi-trust_infographic, Web_A55856_InChi-trust_infographic, HR_A55856_InChi-trust_infographic

25 July 2018 / Last updated : 7 September 2018 Vincent Scalfani Cheminformatics

Matlab InChIKey Scripts

This is a collection of Matlab scripts for working with InChIKeys: IKextract, IKfreqFH, IKstring, and IKmusic. IKextract (InChIKey Extract) can extract InChIKeys from chemical structure-data files (SDFs). This script was successfully used to extract over 90 million InChIKeys (unique chemical identifiers) from over 5000 PubChem SD files. Users can also extract other data from SDFs […]
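The extraction step can be sketched in a few lines. The example below is written in Python and is not the authors' Matlab code; the function name `extract_inchikeys` is hypothetical. It assumes the PubChem convention of storing the key on the line after a `> <PUBCHEM_IUPAC_INCHIKEY>` data header, and it checks each value against the InChIKey layout of 14, 10 and 1 uppercase letters separated by hyphens.

```python
import re

# InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def extract_inchikeys(sdf_text, tag="PUBCHEM_IUPAC_INCHIKEY"):
    """Collect the InChIKeys stored under the given data tag in SDF text."""
    keys = []
    lines = sdf_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        # SDF data headers look like "> <TAG>"; the value is on the next line.
        if line.startswith(">") and f"<{tag}>" in line:
            candidate = lines[i + 1].strip()
            if INCHIKEY_RE.fullmatch(candidate):
                keys.append(candidate)
    return keys
```

Passing a different `tag` extracts another data field in the same way, although the InChIKey pattern check would then need to be replaced or removed.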

13 July 2018 / Last updated : 31 December 2022 InChI OER Admin Classroom Material

IUPAC Name2PubChem

This submission shows you how to create a smart spreadsheet with Google Sheets that links an IUPAC name to a chemical’s PubChem landing page. This particular sheet uses the Centre for Molecular Informatics OPSIN (Open Parser for Systematic IUPAC nomenclature) web service to convert the name […]

6 December 2013 / Last updated : 8 August 2023 Rudy Potenzone Cheminformatics

Applications of the InChI in cheminformatics with the CDK and Bioclipse.
