Mixtures: Informatics for formulations and consumer products
Leah R. McEwen & Alex M. Clark
Presentation to the Royal Society of Chemistry Formulation 4.1
Capturing Mixtures — Bringing Informatics to the World of Practical Chemistry
Recorded live December 19, 2019
CDD Vault Webinar
Chris Jakober, Leah McEwen and Alex Clark
Slides and Video Available at CDD Vault
Our industry has been using cheminformatics to support drug discovery for decades, leveraging formats for describing organic molecules, such as Molfile, SMILES, and InChI. These are idealized concepts rather than a description of the laboratory reality: it is rare that a substance can be accurately described with a single molecule.
Almost every sample has an impurity level, or is dissolved in solvent, or exists as an adduct, or is explicitly combined with other substances. Mixtures often combine certainty and uncertainty within the same description: some components can be well-defined molecules with an accurately measured molar concentration, while others are estimated portions, or amorphous adjuncts.
The cheminformatics community has yet to select a standard format for describing mixtures in a machine-readable and interoperable way, so most publications fall back on text descriptions. Electronic lab notebooks and inventory databases are forced to choose between using text, using proprietary formats, or ignoring mixture composition altogether.
We will describe our work toward two new data structures, Mixfile and MInChI, which are intended to fill roles analogous to those of Molfile and InChI, respectively. We will also describe the ways in which we expect mixtures-based informatics tools to affect all industries that intersect with chemistry.
Join our expert panelists as we discuss the future of mixture representation, including:
- Introduction of the new mixtures standards, MInChI and Mixfile
- Impacts on health and chemical safety communities
- New technologies for unambiguous mixture capture
Capturing mixture composition: an open machine-readable format for representing mixed substances
Alex M. Clark, Leah R. McEwen, Peter Gedeck & Barry A. Bunin
Journal of Cheminformatics volume 11, Article number: 33 (2019)
Abstract: We describe a file format that is designed to represent mixtures of compounds in a way that is fully machine readable. This Mixfile format is intended to fill the same role for substances that are composed of multiple components as the venerable Molfile does for specifying individual structures. This much-needed data structure is intended to replace current practices for communicating information about mixtures, which usually rely on human-readable text descriptions, drawing several species within a single molecular diagram, or mutually incompatible ad hoc solutions. We describe an open source software application for editing mixture files, which can also be used as web-ready tools for manipulating the file format. We also present a corpus of mixture examples, which we have extracted from collections of text-based descriptions. Furthermore, we present an early look at the proposed IUPAC Mixtures InChI specification, instances of which can be automatically generated using the Mixfile format as a precursor.
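To make the idea of a machine-readable mixture description more concrete, here is a minimal Python sketch of a hierarchical mixture record serialized to JSON. It is only an illustration under stated assumptions: the class name `Component`, the helpers `leaves` and `to_dict`, and the field names (`name`, `smiles`, `quantity`, `units`, `relation`, `contents`) are hypothetical choices for this example and should not be read as the published Mixfile specification. The example mixture and its amounts are likewise invented for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Component:
    """One node in a mixture hierarchy (hypothetical field names)."""
    name: str
    smiles: Optional[str] = None      # structure, when one is known
    quantity: Optional[float] = None  # numeric amount, when available
    units: Optional[str] = None       # e.g. "mol/L" or "%"
    relation: Optional[str] = None    # e.g. "<" for an upper bound
    contents: List["Component"] = field(default_factory=list)


def leaves(node):
    """Yield every component that has no sub-components."""
    if not node.contents:
        yield node
    for child in node.contents:
        yield from leaves(child)


def to_dict(node):
    """Convert a node to a plain dict, omitting unset (None) fields."""
    d = {k: v for k, v in asdict(node).items()
         if v is not None and k != "contents"}
    if node.contents:
        d["contents"] = [to_dict(c) for c in node.contents]
    return d


# Illustrative mixture: a well-defined solute and solvent, plus an
# impurity that is known only as an upper bound with no structure.
mixture = Component(
    name="hydrochloric acid solution",
    contents=[
        Component(name="hydrogen chloride", smiles="Cl",
                  quantity=1.0, units="mol/L"),
        Component(name="water", smiles="O"),
        Component(name="unidentified impurity",
                  quantity=1.0, units="%", relation="<"),
    ],
)

serialized = json.dumps(to_dict(mixture), indent=2)
leaf_names = [c.name for c in leaves(mixture)]
```

Here `serialized` holds a nested JSON document and `leaf_names` lists the three individual components. The point of the sketch is that precise and imprecise information can sit side by side in one structure: the solute carries a structure and a measured concentration, while the impurity carries only a bounded estimate.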