Drug-target binding data from BindingDB, based on IC50 measurements. Task: regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCOP(=O)(O)C(NCCCCCCNC(c1ccccc1O)P(=O)(O)OCC)c1ccccc1O. The target protein (P17706) has sequence MPTTIEREFEELDTQRRWQPLYLEIRNESHDYPHRVAKFPENRNRNRYRDVSPYDHSRVKLQNAENDYINASLVDIEEAQRSYILTQGPLPNTCCHFWLMVWQQKTKAVVMLNRIVEKESVKCAQYWPTDDQEMLFKETGFSVKLLSEDVKSYYTVHLLQLENINSGETRTISHFHYTTWPDFGVPESPASFLNFLFKVRESGSLNPDHGPAVIHCSAGIGRSGTFSLVDTCLVLMEKGDDINIKQVLLNMRKYRMGLIQTPDQLRFSYMAIIEGAKCIKGDSSIQKRWKELSKEDLSPAFDHSPNKIMTEKYNGNRIGLEEEKLTGDRCTGLSSKMQDTMEENSESALRKRIREDRKATTAQKVQQMKQRLNENERKRKRWLYWQPILTKMGFMSVILVGAFVGWTLFFQQNAL. The pIC50 is 2.0. (2) The drug is Cc1c(C(C)n2c(=O)c(C(=O)O)cn(-c3ccc4c(c3)oc(=O)n4C)c2=O)cccc1C(F)(F)F. The target protein sequence is MLLPALRLLLFLLGSSAEAGKIIGGTECRPHARPYMAYLEIVTPENHLSACSGFLIRRNFVMTAAHCAGRSITVLLGAHNKKVKEDTWQKLEVEKQFPHPKYDDHLVLNDIMLLKLKEKANLTLGVGTLPISAKSNSIPPGRVCRAVGWGRTNVNEPPSDTLQEVKMRILDPQACKHFEDFHQEPQLCVGNPKKIRNVYKGDSGGPLLCAGIAQGIASYVLRNAKPPSVFTRISHYRPWINKILREN. The pIC50 is 6.9. (3) The drug is CC(C)C[C@@H]1NC(=O)[C@@H](Cc2ccccc2)NC(=O)[C@H]2CCCN2C(=O)[C@@H](Cc2cc3ccccc3[nH]2)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](CCC(N)=O)NC(=O)[C@H]2CCCN2C(=O)[C@H](CC(C)C)NC(=O)[C@@H](Cc2ccccc2)NC(=O)[C@H]2CCCN2C(=O)[C@@H](Cc2cc3ccccc3[nH]2)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](CCC(N)=O)NC(=O)[C@H]2CCCN2C1=O. The target protein (O93400) has sequence MEDDIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTYNSIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF. The pIC50 is 6.0.
(4) The target protein (P59722) has sequence PRAQPAPAQPRVAPPPGGAPGAARAGGAARRGDSSTAASRVPGPEDATQAGSGPGPAEPSSEDPPPSRSPGPERASLCPAGGGPGEALSPSGGLRPNGQTKPLPALKLALEYIVPCMNKHGICVVDDFLGRETGQQIGDEVRALHDTGKFTDGQLVSQKSDSSKDIRGDKITWIEGKEPGCETIGLLMSSMDDLIRHCSGKLGNYRINGRTKAMVACYPGNGTGYVRHVDNPNGDGRCVTCIYYLNKDWDAKVSGGILRIFPEGKAQFADIEPKFDRLLFFWSDRRNPHEVQPAYATRYAITVWYFDADERARAKVKYLTGEKGVRVELKPNSVSKDV. The pIC50 is 7.1. The small molecule is COc1ccc(-c2cccc3c(O)oc(=O)c(C(=O)NCC(=O)O)c23)cc1. (5) The pIC50 is 7.4. The drug is CC(C)(O)c1ccc2c(c1)ncn2-c1ccnc(-c2cccnc2Cl)c1. The target protein (P14867) has sequence MRKSPGLSDCLWAWILLLSTLTGRSYGQPSLQDELKDNTTVFTRILDRLLDGYDNRLRPGLGERVTEVKTDIFVTSFGPVSDHDMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTFFHNGKKSVAHNMTMPNKLLRITEDGTLLYTMRLTVRAECPMHLEDFPMDAHACPLKFGSYAYTRAEVVYEWTREPARSVVVAEDGSRLNQYDLLGQTVDSGIVQSSTGEYVVMTTHFHLKRKIGYFVIQTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLSISARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYAWDGKSVVPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLATIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLFGIFNLVYWATYLNREPQLKAPTPHQ. (6) The target protein (P26010) has sequence MVALPMVLVLLLVLSRGESELDAKIPSTGDATEWRNPHLSMLGSCQPAPSCQKCILSHPSCAWCKQLNFTASGEAEARRCARREELLARGCPLEELEEPRGQQEVLQDQPLSQGARGEGATQLAPQRVRVTLRPGEPQQLQVRFLRAEGYPVDLYYLMDLSYSMKDDLERVRQLGHALLVRLQEVTHSVRIGFGSFVDKTVLPFVSTVPSKLRHPCPTRLERCQSPFSFHHVLSLTGDAQAFEREVGRQSVSGNLDSPEGGFDAILQAALCQEQIGWRNVSRLLVFTSDDTFHTAGDGKLGGIFMPSDGHCHLDSNGLYSRSTEFDYPSVGQVAQALSAANIQPIFAVTSAALPVYQELSKLIPKSAVGELSEDSSNVVQLIMDAYNSLSSTVTLEHSSLPPGVHISYESQCEGPEKREGKAEDRGQCNHVRINQTVTFWVSLQATHCLPEPHLLRLRALGFSEELIVELHTLCDCNCSDTQPQAPHCSDGQGHLQCGVC.... The small molecule is CO[C@]12CCCC[C@@]1(OC)OC1[C@H](COCC(C)C)O[C@@H](OCC(=O)O)[C@H](OCc3ccccc3)[C@H]1O2. The pIC50 is 3.4.
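The pIC50 definition used for the labels above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset tooling: the function names `pic50_from_ic50` and `ic50_molar_from_pic50` and the unit table are hypothetical helpers introduced here.

```python
import math

# Hypothetical unit table: conversion factors from common units to molar (M).
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50_from_ic50(ic50: float, unit: str = "M") -> float:
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if unit not in _TO_MOLAR:
        raise ValueError(f"unsupported unit: {unit!r}")
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * _TO_MOLAR[unit])


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A higher pIC50 corresponds to a lower IC50, i.e. a more potent compound.
    print(pic50_from_ic50(100, "nM"))
    print(ic50_molar_from_pic50(2.0))
```

Under this definition, the label 2.0 in example (1) corresponds to an IC50 of 0.01 M (10 mM), and the label 7.4 in example (5) corresponds to an IC50 of roughly 40 nM.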