Generative Chemical Transformer: Neural Machine Learning of Molecular Geometric Structures from Chemical Language via Attention

Hyunseung Kim; Jonggeol Na; Won Bo Lee
J. Chem. Inf. Model. 2021, 61, 12, 5804–5814.
AI/ML Published: (Dec/2021)

Discovering new materials better suited to specific purposes is an important issue in improving the quality of human life. Here, the generative chemical Transformer (GCT) is proposed: a neural network that generates molecules meeting multiple desired target conditions, based on a deep understanding of chemical language. By paying attention to characters sparsely, the attention mechanism in GCT allows a deeper understanding of molecular structures, beyond the limitations of chemical language itself that cause semantic discontinuity. The significance of language models for inverse molecular design problems is investigated by quantitatively evaluating the quality of the generated molecules. GCT generates highly realistic chemical strings that satisfy both chemical and linguistic grammar rules. Molecules parsed from the generated strings satisfy the multiple target properties simultaneously, and the generated molecules vary for a single condition set. These advances will contribute to improving the quality of human life by accelerating the discovery of desired materials.
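
To make the "linguistic grammar rules" and "semantic discontinuity" mentioned above more concrete, here is a minimal sketch, assuming the chemical language is a SMILES-like line notation. In such a notation a branch opened with "(" or a ring opened with a digit may be closed many characters later, so related symbols sit far apart in the string. The function name check_smiles_syntax is hypothetical and is not part of GCT. It checks only pairing of branches, ring-closure digits, and bracket atoms; it does not check valence or any other chemical rule.

```python
def check_smiles_syntax(smiles: str) -> bool:
    """Return True if branches, ring closures, and bracket atoms are paired.

    Simplified illustration only: it ignores valence, aromaticity, and the
    finer points of ring labels, so passing this check does not mean the
    string encodes a chemically valid molecule.
    """
    depth = 0             # current nesting depth of "(" branches
    open_rings = set()    # ring-closure digits seen once and not yet closed
    in_bracket = False    # True while inside a bracket atom such as [NH4+]

    for ch in smiles:
        if in_bracket:
            # Digits inside brackets are isotopes or counts, not ring labels.
            if ch == "]":
                in_bracket = False
            continue
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            return False  # closing bracket without an opening one
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # branch closed before it was opened
        elif ch.isdigit():
            # A ring label opens on first sight and closes on the second.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)

    return depth == 0 and not open_rings and not in_bracket
```

For example, the benzene string "c1ccccc1" passes because the ring label 1 appears twice, while "c1ccccc" fails because the ring is never closed. The two matching labels are six characters apart, which is the kind of long-range dependency that an attention mechanism can relate directly.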