This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COC(=O)c1cc2ccccc2n1CCCCCCCCCOC(=O)Cc1ccc([N+](C)(C)C)cc1. The target protein (Q6IA69) has sequence MGRKVTVATCALNQWALDFEGNLQRILKSIEIAKNRGARYRLGPELEICGYGCWDHYYESDTLLHSFQVLAALVESPVTQDIICDVGMPVMHRNVRYNCRVIFLNRKILLIRPKMALANEGNYRELRWFTPWSRSRHTEEYFLPRMIQDLTKQETVPFGDAVLVTWDTCIGSEICEELWTPHSPHIDMGLDGVEIITNASGSHQVLRKANTRVDLVTMVTSKNGGIYLLANQKGCDGDRLYYDGCAMIAMNGSVFAQGSQFSLDDVEVLTATLDLEDVRSYRAEISSRNLAASRASPYPRVKVDFALSCHEDLLAPISEPIEWKYHSPEEEISLGPACWLWDFLRRSQQAGFLLPLSGGVDSAATACLIYSMCCQVCEAVRSGNEEVLADVRTIVNQISYTPQDPRDLCGRILTTCYMASKNSSQETCTRARELAQQIGSHHISLNIDPAVKAVMGIFSLVTGKSPLFAAHGGSSRENLALQNVQARIRMVLAYLFAQLS.... The pIC50 is 4.2. (2) The drug is CC(C)C[C@H](NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)OC(C)(C)C)C(=O)N[C@@H](CC(=O)O)C(=O)N1CCCC[C@H]1C(=O)O. The target protein (P25102) has sequence MEPNGTVHSCCLDSMALKVTISVVLTTLILITIAGNVVVCLAVSLNRRLRSLTNCFIVSLAATDLLLGLLVLPFSAIYQLSFTWSFGHVFCNIYTSLDVMLCTASILNLFMISLDRYCAVTDPLRYPVLVTPVRVAISLVFIWVISITLSFLSIHLGWNSRNGTRGGNDTFKCKVQVNEVYGLVDGLVTFYLPLLIMCVTYYRIFKIAREQAKRINHISSWKAATIREHKATVTLAAVMGAFIICWFPYFTAFVYRGLRGDDAINEAVEGIVLWLGYANSALNPILYAALNRDFRTAYQQLFHCKFASHNSHKTSLRLNNSLLPRSQSREGRWQEEKPLKLQVWSGTELTHPQGNPIR. The pIC50 is 4.5. (3) The compound is CC(C)Cn1c(CN)c(-c2ccccc2)c2cc(/C=C/C(N)=O)ccc2c1=O. 
The target protein (Q3ZCJ8) has sequence MGPWSGSRLVALLLLVYGAGSVRGDTPANCTYPDLLGTWVFQVGSSGSQRDVNCSVMGPPEKKVVVHLKKLDTAYDDFGNSGHFTIIYNQGFEIVLNDYKWFAFFKYKEEGGKVTSYCHETMTGWVHDVLGRNRACFTGRKTGNTSENVNVNTARLAGLEETYSNRLYRYNHDFVKAINAIQKSWTAAPYMEYETLTLKEMIRRGGGHSRRIPRPKPAPITAEIQKKILHLPTSWDWRNVHGINFVTPVRNQGSCGSCYSFASMGMMEARIRILTNNTQTPILSPQEVVSCSQYAQGCEGGFPYLIAGKYAQDFGLVEEDCFPYTGTDSPCRLKEGCFRYYSSEYHYVGGFYGGCNEALMKLELVHQGPMAVAFEVYDDFLHYRKGVYHHTGLRDPFNPFELTNHAVLLVGYGTDAASGLDYWIVKNSWGTSWGENGYFRIRRGTDECAIESIALAATPIPKL. The pIC50 is 3.8. (4) The drug is CCN([C@H]1CC[C@H]([C@H](N)Cc2cc(F)ccc2F)CC1)S(C)(=O)=O. The target protein (Q12884) has sequence MKTWVKIVFGVATSAVLALLVMCIVLRPSRVHNSEENTMRALTLKDILNGTFSYKTFFPNWISGQEYLHQSADNNIVLYNIETGQSYTILSNRTMKSVNASNYGLSPDRQFVYLESDYSKLWRYSYTATYYIYDLSNGEFVRGNELPRPIQYLCWSPVGSKLAYVYQNNIYLKQRPGDPPFQITFNGRENKIFNGIPDWVYEEEMLATKYALWWSPNGKFLAYAEFNDTDIPVIAYSYYGDEQYPRTINIPYPKAGAKNPVVRIFIIDTTYPAYVGPQEVPVPAMIASSDYYFSWLTWVTDERVCLQWLKRVQNVSVLSICDFREDWQTWDCPKTQEHIEESRTGWAGGFFVSTPVFSYDAISYYKIFSDKDGYKHIHYIKDTVENAIQITSGKWEAINIFRVTQDSLFYSSNEFEEYPGRRNIYRISIGSYPPSKKCVTCHLRKERCQYYTASFSDYAKYYALVCYGPGIPISTLHDGRTDQEIKILEENKELENALKN.... The pIC50 is 4.5. (5) The drug is CC[C@H]1NC[C@H](O)[C@@H]1O. The target protein (P69328) has sequence MSFRSLLALSGLVCTGLANVISKRATLDSWLSNEATVARTAILNNIGADGAWVSGADSGIVVASPSTDNPDYFYTWTRDSGLVLKTLVDLFRNGDTSLLSTIENYISAQAIVQGISNPSGDLSSGAGLGEPKFNVDETAYTGSWGRPQRDGPALRATAMIGFGQWLLDNGYTSTATDIVWPLVRNDLSYVAQYWNQTGYDLWEEVNGSSFFTIAVQHRALVEGSAFATAVGSSCSWCDSQAPEILCYLQSFWTGSFILANFDSSRSGKDANTLLGSIHTFDPEAACDDSTFQPCSPRALANHKEVVDSFRSIYTLNDGLSDSEAVAVGRYPEDTYYNGNPWFLCTLAAAEQLYDALYQWDKQGSLEVTDVSLDFFKALYSDAATGTYSSSSSTYSSIVDAVKTFADGFVSIVETHAASNGSMSEQYDKSDGEQLSARDLTWSYAALLTANNRRNSVVPASWGETSASSVPGTCAATSAIGTYSSVTVTSWPSIVATGGTT.... The pIC50 is 4.5.
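The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling: the function names are hypothetical, and the nanomolar helper assumes the raw IC50 values are given in nM, which should be checked against the actual source columns.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit; 1 nM = 1e-9 M)."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 4.5, as in several records above, corresponds to an
    # IC50 of roughly 3.2e-5 M (about 32 micromolar).
    print(f"{pic50_to_ic50_molar(4.5):.2e} M")
    # A 1 micromolar inhibitor (1000 nM) has a pIC50 of 6.
    print(ic50_nanomolar_to_pic50(1000.0))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, so the records above with pIC50 between 3.8 and 4.5 are all weak, micromolar-range binders.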