A matrix factorization method with side information for accurately modeling the bioactivities of ligands acting on GPCRs
WU Jiansheng1*, LAN Chuangchuang1, QIN Jie2, ZHU Yanxiang3, HU Haifeng2
(1 School of Geographic and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China; 2 School of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China; 3 VeriMake Research, Nanjing Qujike Info-tech Co., Ltd., Nanjing 210088, Jiangsu, China)
Abstract:
G protein-coupled receptors (GPCRs) are among the most important drug targets, accounting for about 34% of drugs on the market. In drug discovery, accurate modeling of the bioactivities of ligand molecules is critical for screening hit compounds. For any single GPCR task, however, the ligand entries with bioactivity values measured by biological assays are usually insufficient. Learning the bioactivities of ligands across multiple GPCR tasks jointly through matrix factorization can potentially improve model performance by exploiting the correlations among GPCR tasks. We propose MFSI, a matrix factorization-based method for predicting the bioactivities of ligand molecules targeting GPCRs. The method incorporates side information in the form of extended connectivity fingerprints of the ligand molecules, and it copes with the large number of missing bioactivity values in GPCR-ligand association matrices. We tested the method on 72 representative GPCR tasks covering 24 subfamilies. The results show that it is overall superior to classical single-task learning methods and matrix factorization methods. In addition, it outperforms state-of-the-art deep multi-task learning methods for predicting ligand bioactivities on most datasets (66/72), with an average improvement of 18% in r² and 12% in root mean square error over the DeepNeuralNet-QSAR predictors.
Keywords:
G protein-coupled receptors; bioactivities of ligands; matrix factorization; extended connectivity fingerprints
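
The general idea summarized in the abstract, factorizing a sparsely observed ligand-by-GPCR bioactivity matrix while using ligand fingerprints as side information, can be illustrated with a minimal sketch. This is not the MFSI algorithm itself, whose formulation is not given in this section. It assumes a simple variant in which each ligand's latent vector is a linear map of its fingerprint, the loss is computed only over observed entries, and training uses plain gradient descent. All names (`fit_masked_mf`, `rank`, `lr`, `reg`) and the synthetic data are hypothetical.

```python
import numpy as np

def fit_masked_mf(R, X, rank=3, lr=0.005, reg=0.01, epochs=3000, seed=0):
    """Fit R ~ (X @ W) @ V.T using only the observed (non-NaN) entries of R.

    R: (n_ligands, n_tasks) bioactivity matrix, NaN where unmeasured.
    X: (n_ligands, n_bits) binary fingerprint matrix (side information).
    Assumption: ligand factors are a linear function of the fingerprint.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(R)                    # True where a value was measured
    R0 = np.where(mask, R, 0.0)            # placeholder zeros, masked out below
    n_obs = mask.sum()
    W = 0.1 * rng.standard_normal((X.shape[1], rank))  # fingerprint -> latent
    V = 0.1 * rng.standard_normal((R.shape[1], rank))  # task (GPCR) factors
    history = []
    for _ in range(epochs):
        U = X @ W                          # ligand latent factors
        E = mask * (U @ V.T - R0)          # residuals on observed entries only
        history.append(float((E ** 2).sum() / n_obs))
        grad_W = X.T @ (E @ V) / n_obs + reg * W
        grad_V = E.T @ U / n_obs + reg * V
        W -= lr * grad_W
        V -= lr * grad_V
    return W, V, history

# Synthetic stand-in data (hypothetical, not from the paper).
rng = np.random.default_rng(1)
n_ligands, n_bits, n_tasks = 60, 16, 5
X = (rng.random((n_ligands, n_bits)) < 0.5).astype(float)
R_full = X @ rng.standard_normal((n_bits, 3)) @ rng.standard_normal((3, n_tasks))
R = R_full.copy()
R[rng.random(R.shape) < 0.5] = np.nan      # hide about half of the entries

W, V, history = fit_masked_mf(R, X)
R_hat = (X @ W) @ V.T                      # predictions for every ligand-task pair
print(f"masked MSE: first epoch {history[0]:.3f}, last epoch {history[-1]:.3f}")
```

Because the ligand factors depend on the fingerprints rather than on a per-ligand free parameter, a sketch of this kind can also score ligands that have no measured values at all, which is one common motivation for adding side information to matrix factorization.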