Matlab InChIKey Scripts

Authored by:
Vincent F. Scalfani

This is a collection of Matlab scripts for working with InChIKeys: IKextract, IKfreqFH, IKstring, and IKmusic.

IKextract, InChIKey Extract, extracts InChIKeys from chemical structure-data files (SDFs). The script has been used to extract more than 90 million InChIKeys (unique chemical identifiers) from more than 5000 PubChem SD files. Users can also extract other data from SDFs by specifying the desired SD tag.
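The tag-based extraction described above can be sketched in a few lines. This is a Python illustration of the general idea, not the author's Matlab code: the function name `extract_sd_tag` is hypothetical, and the sketch assumes each data item is a `> <TAG>` header line followed by a single-line value, which is how PubChem SD files store `PUBCHEM_IUPAC_INCHIKEY`.

```python
from typing import Iterable, Iterator


def extract_sd_tag(lines: Iterable[str], tag: str) -> Iterator[str]:
    """Yield the data line that follows each '> <tag>' header line.

    Hypothetical helper: assumes single-line values, as for InChIKeys.
    """
    header = f"<{tag}>"
    take_next = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if take_next:
            # The line right after the header holds the value.
            take_next = False
            if line.strip():
                yield line.strip()
        elif line.startswith(">") and header in line:
            take_next = True


# A trimmed-down record; the molfile block is replaced by a placeholder.
sample_sdf = """\
  molfile block omitted for brevity
M  END
> <PUBCHEM_COMPOUND_CID>
2244

> <PUBCHEM_IUPAC_INCHIKEY>
BSYNRYMUTXBXSQ-UHFFFAOYSA-N

$$$$
"""

extracted_keys = list(extract_sd_tag(sample_sdf.splitlines(), "PUBCHEM_IUPAC_INCHIKEY"))
extracted_cids = list(extract_sd_tag(sample_sdf.splitlines(), "PUBCHEM_COMPOUND_CID"))
print(extracted_keys, extracted_cids)
```

Because the function consumes any iterable of lines, an open file handle can be passed directly, so a large SD file never has to be loaded into memory at once.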

IKfreqFH, InChIKey frequency of first hash block, extracts the first hash block of each InChIKey and ranks the blocks by frequency. Because the first block encodes molecular connectivity, this ranking is useful for analyzing the variety of chemical connectivity in large datasets.
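As a rough illustration of the first-block frequency count, the Python sketch below splits each key at its first hyphen and tallies the 14-character prefixes. It is an assumption-laden example, not the IKfreqFH implementation: the function name `first_block_frequency` and the sample list are hypothetical.

```python
from collections import Counter


def first_block_frequency(inchikeys):
    """Return (first_block, count) pairs, most frequent first.

    Hypothetical helper: the first block is taken to be everything
    before the first hyphen of each InChIKey.
    """
    blocks = (key.split("-", 1)[0] for key in inchikeys)
    return Counter(blocks).most_common()


# Sample input: three keys share a first block and differ only in the second.
sample_keys = [
    "QNAYBMKLOCPYGJ-REOHSDBOSA-N",
    "QNAYBMKLOCPYGJ-UWTATZPHSA-N",
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
]

block_counts = first_block_frequency(sample_keys)
for block, count in block_counts:
    print(block, count)
```

Keys that share a first block have the same connectivity and differ only in the layers encoded by the second block, such as stereochemistry, so a high count points to many variants of one skeleton.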

IKstring, InChIKey String, searches for strings within InChIKeys. I use it to search the more than 90 million InChIKeys in PubChem.
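A string search over a list of InChIKeys can be as simple as a substring filter. The Python sketch below shows one way to do it; the function name `find_inchikeys` and the case-insensitive matching are assumptions made for illustration, not a description of how IKstring works internally.

```python
def find_inchikeys(inchikeys, query):
    """Return the keys that contain `query` as a substring.

    Hypothetical helper: matching is case-insensitive by assumption.
    """
    needle = query.upper()
    return [key for key in inchikeys if needle in key.upper()]


search_keys = [
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "QNAYBMKLOCPYGJ-REOHSDBOSA-N",
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",
]

# Keys whose second block is the common "UHFFFAOYSA" value.
matches = find_inchikeys(search_keys, "uhfffaoysa")
print(matches)
```

For tens of millions of keys, the same filter can be applied while streaming a file line by line instead of holding the full list in memory.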

IKmusic, InChIKey music, creates music from InChIKeys. A unique song is created for each InChIKey (i.e., every unique chemical substance has a different song!).
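The description does not say how IKmusic maps characters to sound, so the following Python sketch uses an invented mapping purely to show why each key can yield a distinct note sequence. Everything here is hypothetical: the function name `inchikey_to_notes`, the choice of MIDI note numbers starting at 48, and the treatment of hyphens as rests. The sketch only computes note numbers and produces no audio.

```python
def inchikey_to_notes(inchikey, base_note=48):
    """Map A-Z to consecutive MIDI note numbers and '-' to a rest (None).

    Hypothetical mapping chosen for illustration; it is one-to-one per
    character, so two different keys never give the same sequence.
    """
    notes = []
    for ch in inchikey.upper():
        if ch == "-":
            notes.append(None)  # rest between hash blocks
        elif "A" <= ch <= "Z":
            notes.append(base_note + ord(ch) - ord("A"))
        else:
            raise ValueError(f"unexpected character in InChIKey: {ch!r}")
    return notes


song = inchikey_to_notes("BSYNRYMUTXBXSQ-UHFFFAOYSA-N")
print(song)
```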

Information
Content Type Off-Site
Uploaded By Vincent F. Scalfani
DOI https://www.mathworks.com/matlabcentral/fileexchange/62870-matlab-inchikey-scripts-ikextract-ikfreqfh-ikstring-and-ikmusic
Date Published 07/25/2018
Date Last Modified 09/07/2018
Content Tags Cheminformatics, Curricular Material, Data Extraction, English, InChI Application, Matlab (.m), Script, Search, Software, Undergraduate