The simplified molecular input-line entry system (SMILES) and the IUPAC International Chemical Identifier (InChI) were examined as representations of molecular structure for quantitative structure-activity relationship (QSAR) models that predict the inhibitory activity of styrylquinoline derivatives against human immunodeficiency virus type 1 (HIV-1). The best model built from optimal SMILES-based descriptors has n = 26, r² = 0.6330, q² = 0.5812, s = 0.502, F = 41 for the training set and n = 10, r² = 0.7493, r²(pred) = 0.6235, R²(m) = 0.537, s = 0.541, F = 24 for the validation set. The best model built from optimal InChI-based descriptors has n = 26, r² = 0.8673, q² = 0.8456, s = 0.302, F = 157 for the training set and n = 10, r² = 0.8562, r²(pred) = 0.7715, R²(m) = 0.819, s = 0.329, F = 48 for the validation set. The InChI-based model is therefore preferable: it has the higher r² and the lower s on both sets. Both the SMILES-based and the InChI-based approaches were checked with five random splits of the data into training and test sets.
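The reported statistics can be cross-checked against one another. The following is a minimal sketch, assuming each model is an ordinary least-squares regression on a single optimal descriptor plus an intercept (k = 1); the abstract does not state this, so treat it as an assumption. Under that assumption the F ratio follows from n and r² alone, and rounding the computed values reproduces the F values quoted for all four fits. The function name `f_statistic` and the `reported` dictionary are illustrative, not taken from the paper.

```python
def f_statistic(r2: float, n: int, k: int = 1) -> float:
    """F ratio of a least-squares linear model with k descriptors and an intercept.

    F = (r2 / k) / ((1 - r2) / (n - k - 1))
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# (n, r2, F as reported in the abstract) for each model and data set.
reported = {
    "SMILES, training": (26, 0.6330, 41),
    "SMILES, validation": (10, 0.7493, 24),
    "InChI, training": (26, 0.8673, 157),
    "InChI, validation": (10, 0.8562, 48),
}

for label, (n, r2, f_reported) in reported.items():
    # k = 1 is an assumption: one optimal descriptor per model.
    f_computed = f_statistic(r2, n)
    print(f"{label}: computed F = {f_computed:.2f}, reported F = {f_reported}")
```

The agreement is consistent with a one-descriptor model but does not prove it; q², r²(pred) and R²(m) cannot be recomputed this way because they depend on the individual predictions.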