From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is N#Cc1c(N)nc2nc(C3CCCCC3)ncc2c1N. The target protein (Q9AIU7) has sequence MADLSSRVNELHDLLNQYSYEYYVEDNPSVPDSEYDKLLHELIKIEEEHPEYKTVDSPTVRVGGEAQASFNKVNHDTPMLSLGNAFNEDDLRKFDQRIREQIGNVEYMCELKIDGLAVSLKYVDGYFVQGLTRGDGTTGEDITENLKTIHAIPLKMKEPLNVEVRGEAYMPRRSFLRLNEEKEKNDEQLFANPRNAAAGSLRQLDSKLTAKRKLSVFIYSVNDFTDFNARSQSEALDELDKLGFTTNKNRARVNNIDGVLEYIEKWTSQRESLPYDIDGIVIKVNDLDQQDEMGFTQKSPRWAIAYKFPAEEVVTKLLDIELSIGRTGVVTPTAILEPVKVAGTTVSRASLHNEDLIHDRDIRIGDSVVVKKAGDIIPEVVRSIPERRPEDAVTYHMPTHCPSCGHELVRIEGEVALRCINPKCQAQLVEGLIHFVSRQAMNIDGLGTKIIQQLYQSELIKDVADIFYLTEEDLLPLDRMGQKKVDNLLAAIQQAKDNSL.... The pIC50 is 6.1. (2) The compound is O=P(O)(O)OC(CNS(=O)(=O)c1ccc(F)cc1)CN1CCOCC1. The target protein (P25051) has sequence MNRIKVAILFGGCSEEHDVSVKSAIEIAANINKEKYEPLYIGITKSGVWKMCEKPCAEWENDNCYSAVLSPDKKMHGLLVKKNHEYEINHVDVAFSALHGKSGEDGSIQGLFELSGIPFVGCDIQSSAICMDKSLTYIVAKNAGIATPAFWVINKDDRPVAATFTYPVFVKPARSGSSFGVKKVNSADELDYAIESARQYDSKILIEQAVSGCEVGCAVLGNSAALVVGEVDQIRLQYGIFRIHQEVEPEKGSENAVITVPADLSAEERGRIQETAKKIYKALGCRGLARVDMFLQDNGRIVLNEVNTLPGFTSYSRYPRMMAAAGIALPELIDRLIVLALKG. The pIC50 is 3.5. (3) The drug is O=C(O)[C@H]1O[C@@H](Oc2ccc([C@@H]3[C@@H](CC[C@H](O)c4ccc(F)cc4)C(=O)N3c3ccccc3)cc2)[C@H](O)[C@@H](O)[C@@H]1O. 
The target protein (Q6T3U4) has sequence MAAAWQGWLLWALLLNSAQGELYTPTHKAGFCTFYEECGKNPELSGGLTSLSNISCLSNTPARHVTGDHLALLQRVCPRLYNGPNDTYACCSTKQLVSLDSSLSITKALLTRCPACSENFVSIHCHNTCSPDQSLFINVTRVVQRDPGQLPAVVAYEAFYQRSFAEKAYESCSRVRIPAAASLAVGSMCGVYGSALCNAQRWLNFQGDTGNGLAPLDITFHLLEPGQALADGMKPLDGKITPCNESQGEDSAACSCQDCAASCPVIPPPPALRPSFYMGRMPGWLALIIIFTAVFVLLSVVLVYLRVASNRNKNKTAGSQEAPNLPRKRRFSPHTVLGRFFESWGTRVASWPLTVLALSFIVVIALSVGLTFIELTTDPVELWSAPKSQARKEKAFHDEHFGPFFRTNQIFVTAKNRSSYKYDSLLLGPKNFSGILSLDLLQELLELQERLRHLQVWSHEAQRNISLQDICYAPLNPHNTSLTDCCVNSLLQYFQNNHTL.... The pIC50 is 5.9. (4) The small molecule is O=C(O)c1nc(-c2cccc(OCc3ccccc3)c2)[nH]c(=O)c1O. The pIC50 is 4.0. The target protein (P00582) has sequence MVQIPQNPLILVDGSSYLYRAYHAFPPLTNSAGEPTGAMYGVLNMLRSLIMQYKPTHAAVVFDAKGKTFRDELFEHYKSHRPPMPDDLRAQIEPLHAMVKAMGLPLLAVSGVEADDVIGTLAREAEKAGRPVLISTGDKDMAQLVTPNITLINTMTNTILGPEEVVNKYGVPPELIIDFLALMGDSSDNIPGVPGVGEKTAQALLQGLGGLDTLYAEPEKIAGLSFRGAKTMAAKLEQNKEVAYLSYQLATIKTDVELELTCEQLEVQQPAAEELLGLFKKYEFKRWTADVEAGKWLQAKGAKPAAKPQETSVADEAPEVTATVISYDNYVTILDEETLKAWIAKLEKAPVFAFDTETDSLDNISANLVGLSFAIEPGVAAYIPVAHDYLDAPDQISRERALELLKPLLEDEKALKVGQNLKYDRGILANYGIELRGIAFDTMLESYILNSVAGRHDMDSLAERWLKHKTITFEEIAGKGKNQLTFNQIALEEAGRYAAE.... (5) The compound is CCOc1nc(/C=C/c2ccccc2)oc1C(=O)Oc1cncc(Cl)c1. The target protein (Q82122) has sequence MGAQVSRQNVGTHSTQNMVSNGSSLNYFNINYFKDAASSGASRLDFSQDPSKFTDPVKDVLEKGIPTLQSPSVEACGYSDRIIQITRGDSTITSQDVANAVVGYGVWPHYLTPQDATAIDKPTQPDTSSNRFYTLDSKMWNSTSKGWWWKLPDALKDMGIFGENMFYHFLGRSGYTVHVQCNASKFHQGTLLVVMIPEHQLATVNKGNVNAGYKYTHPGEAGREVGTQVENEKQPSDDNWLNFDGTLLGNLLIFPHQFINLRSNNSATLIVPYVNAVPMDSMVRHNNWSLVIIPVCQLQSNNISNIVPITVSISPMCAEFSGARAKTVVQGLPVYVTPGSGQFMTTDDMQSPCALPWYHPTKEIFIPGEVKNLIEMCQVDTLIPINSTQSNIGNVSMYTVTLSPQTKLAEEIFAIKVDIASHPLATTLIGEIASYFTHWTGSLRFSFMFCGTANTTLKVLLAYTPPGIGKPRSRKEAMLGTHVVWDVGLQSTVSLVVPWI.... The pIC50 is 6.2. 
(6) The target protein sequence is MEVQLGLGRVYPRPPSKTYRGAFQNLFQSVREVIQNPGPRHPEAASAAPPGASLQQQQQQQQQQQQQQETSPRQQQKQGEDGSPQAHRRGPTGYLVLDEEQQPSQPQSAPECHPERGCVPEPGAAVAAGKGLPQQLPAPPDEDDSAAPSTLSLLGPTFPGLSSCSADLKDILSEASTMQLLQQQQQEAVSEGSSNGRAREASGAPTSSKDNYLGGTSTISDSAKELCKAVSVSMGLGVEALEHLSPGEQLRGDCMYAPVLGVPPGVRPIPCAPLAECKGSLLDDSAGKSTEDTVEYSPFKGGYTKGLEGESLGCSGSAAAGSSGTLELPSTLSLYKSGALDEAAAYQSRDYYNFPLALAGPPPPPPPPHPHARIKLENPLDYGSAWAAAAAQCRYGDLASLHGAGAAGPGSGSPSAAASSSWHTLFTAEEGQLYGPCGGGGGGGGGGGGGGAGEAGAVAPYGYTRPPQGLAGQEGDFTAPDVWYPGGMVSRVPYPSPTCV.... The compound is CC(=O)Nc1ccc(OC[C@@H]2CN(c3ccc(C#N)c(C(F)(F)F)c3)[C@H](C(F)(F)F)O2)cc1. The pIC50 is 6.8. (7) The drug is COCCn1cc(-c2ccc(-c3c4c(nn3C)CCc3cnc(Nc5ccn(CC(=O)N(C)C)n5)nc3-4)cc2)cn1. The target protein sequence is FVFLFSVVIGSIYLFLRKRQPDGPLGPLYASSNPEYLSASDVFPCSVYVPDEWEVSREKITLLRELGQGSFGMVYEGNARDIIKGEAETRVAVKTVNESASLRERIEFLNEASVMKGFTCHHVVRLLGVVSKGQPTLVVMELMAHGDLKSYLRSLRPEAENNPGRPPPTLQEMIQMAAEIADGMAYLNAKKFVHRDLAARNCMVAHDFTVKIGDFGMTRDIYETDYYRKGGKGLLPVRWMAPESLKDGVFTTSSDMWSFGVVLWEITSLAEQPYQGLSNEQVLKFVMDGGYLDQPDNCPERVTDLMRMCWQFNPKMRPTFLEIVNLLKDDLHPSFPEVSFFHSEENKAPESEELEMEFEDMENVPLDRSSHCQREEAGGRDGGSSLGFKRSYEEHIPYTHMNGGKKN. The pIC50 is 9.0.
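The pIC50 definition used above (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python to make the scale concrete. This is a minimal illustration, not part of the dataset tooling; the function names `pic50_from_ic50_molar`, `ic50_molar_from_pic50`, and `ic50_nanomolar_from_pic50` are hypothetical helpers introduced here.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nanomolar_from_pic50(pic50: float) -> float:
    """Same inversion, reported in nanomolar (1 M = 1e9 nM)."""
    return ic50_molar_from_pic50(pic50) * 1e9


if __name__ == "__main__":
    # pIC50 values taken from the examples in this record.
    for pic50 in (9.0, 6.8, 6.1, 3.5):
        print(f"pIC50 {pic50:>4} -> IC50 ~ {ic50_nanomolar_from_pic50(pic50):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: the pIC50 of 9.0 in example (7) corresponds to an IC50 of about 1 nM, while the 3.5 in example (2) corresponds to roughly 316 µM.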