The CSD Drug Subset: The Changing Chemistry and Crystallography of Small Molecule Pharmaceuticals.

Bryant, Mathew J.; Black, Simon N.; Blade, Helen; Docherty, Robert; Maloney, Andrew G. P.; Taylor, Stefan C.
J Pharm Sci 108 1655-1662 (2019).
Drug Design Published: (Jan/2019)

We report the generation and statistical analysis of the CSD drug subset: a subset of the Cambridge Structural Database (CSD) consisting of every published small-molecule crystal structure containing an approved drug molecule. Using InChI matching, we have produced a CSD Python API workflow that links CSD entries to the online database DrugBank. This has resulted in a subset of 8632 crystal structures, representing all published solid forms of 785 unique drug molecules. We hope that this new resource will lead to improvements in targeted cheminformatics and statistical model building in a pharmaceutical setting. In addition, as part of the Advanced Digital Design of Pharmaceutical Therapeutics collaboration between academia and industry, we have been given the unique opportunity to run a comparative analysis on the internal crystal structure databases of AstraZeneca and Pfizer, alongside a comparison with the CSD as a whole.
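
The InChI-matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes the InChI strings for each crystal structure and each drug record have already been generated elsewhere, and it does not call the CSD Python API or any drug database. The function name `link_by_inchi`, the refcodes, and the drug identifiers are hypothetical, and simple non-drug molecules (ethanol, methane, water) stand in for real records.

```python
from collections import defaultdict


def link_by_inchi(csd_entries, drug_records):
    """Group crystal-structure refcodes under the drug whose InChI they contain.

    csd_entries:  {refcode: [InChI of each molecular component]}
    drug_records: {drug_id: InChI of the drug molecule}
    Both inputs are assumed to be precomputed; nothing is fetched here.
    """
    # Invert the drug table so each InChI lookup is a single dict access.
    drug_by_inchi = {inchi: drug_id for drug_id, inchi in drug_records.items()}

    subset = defaultdict(list)
    for refcode, component_inchis in csd_entries.items():
        # A structure may hold several components (e.g. drug plus solvent),
        # so every component is checked against the drug table.
        for inchi in component_inchis:
            drug_id = drug_by_inchi.get(inchi)
            if drug_id is not None and refcode not in subset[drug_id]:
                subset[drug_id].append(refcode)
    return dict(subset)


# Toy data: made-up identifiers, with small molecules as stand-ins.
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,1-2H3"
METHANE = "InChI=1S/CH4/h1H4"
WATER = "InChI=1S/H2O/h1H2"

drug_records = {"DRUG-A": ETHANOL, "DRUG-B": METHANE}
csd_entries = {
    "REFCOD01": [ETHANOL, WATER],  # hydrate-like entry: stand-in drug plus water
    "REFCOD02": [ETHANOL],         # single-component entry
    "REFCOD03": [WATER],           # contains no stand-in drug
}

linked = link_by_inchi(csd_entries, drug_records)
print(linked)
```

Matching on every component rather than on the whole structure is what allows multi-component solid forms such as hydrates, solvates, and salts to be grouped under a single drug molecule, in the spirit of the "all published solid forms" count quoted above.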

