From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is C[C@]1(O)CCN(c2nc(N3C[C@H](O)[C@@H](O)C3)cc(C(F)(F)F)c2C#N)C1. The target protein (Q02974) has sequence MEEKQILCVGLVVLDIINVVDKYPEEDTDRRCLSQRWQRGGNASNSCTVLSLLGARCAFMGSLAHGHVADFLVADFRRRGVDVSQVAWQSQGDTPCSCCIVNNSNGSRTIILYDTNLPDVSAKDFEKVDLTRFKWIHIEGRNASEQVKMLQRIEQYNATQPLQQKVRVSVEIEKPREELFQLFGYGEVVFVSKDVAKHLGFRSAGEALKGLYSRVKKGATLICAWAEEGADALGPDGQLLHSDAFPPPRVVDTLGAGDTFNASVIFSLSKGNSMQEALRFGCQVAGKKCGLQGFDGIV. The pIC50 is 6.5. (2) The pIC50 is 6.3. The target protein (P01111) has sequence MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNSKSFADINLYREQIKRVKDSDDVPMVLVGNKCDLPTRTVDTKQAHELAKSYGIPFIETSAKTRQGVEDAFYTLVREIRQYRMKKLNSSDDGTQGCMGLPCVVM. The compound is O=S(=O)(c1ccc(NO)cc1)N(CCOc1ccc2ccccc2c1)[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O. (3) The drug is O=c1cc(-c2ccc(O)cc2)oc2cc(O)c(O)c(O)c12. The target protein (P04745) has sequence MKLFWLLFTIGFCWAQYSSNTQQGRTSIVHLFEWRWVDIALECERYLAPKGFGGVQVSPPNENVAIHNPFRPWWERYQPVSYKLCTRSGNEDEFRNMVTRCNNVGVRIYVDAVINHMCGNAVSAGTSSTCGSYFNPGSRDFPAVPYSGWDFNDGKCKTGSGDIENYNDATQVRDCRLSGLLDLALGKDYVRSKIAEYMNHLIDIGVAGFRIDASKHMWPGDIKAILDKLHNLNSNWFPEGSKPFIYQEVIDLGGEPIKSSDYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKMAVGFMLAHPYGFTRVMSSYRWPRYFENGKDVNDWVGPPNDNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVNFRNVVDGQPFTNWYDNGSNQVAFGRGNRGFIVFNNDDWTFSLTLQTGLPAGTYCDVISGDKINGNCTGIKIYVSDDGKAHFSISNSAED.... The pIC50 is 5.0. (4) The compound is CCN(CC)C(=O)C1C(=O)N(O)C(=O)c2ccccc21. 
The target protein sequence is FLDGIDKAQEEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSPGIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTVHTDNGSNFTSTTVKAACWWAGIKQEFGIPYNPQSQGVIESMNKELKKIIGQVRDQAEHLKTAVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRDPVWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED. The pIC50 is 3.7. (5) The drug is CN1CCN(c2cc(C(=O)Nc3cccc(Nc4ccc5c(c4)NC(=O)/C5=C\c4ccc[nH]4)c3)cc(C(F)(F)F)c2)CC1. The target protein (Q06806) has sequence MVWWGSSLLLPTLFLASHVGASVDLTLLANLRITDPQRFFLTCVSGEAGAGRSSDPPLLLEKDDRIVRTFPPGQPLYLARNGSHQVTLRGFSKPSDLVGVFSCVGGAGARRTRVLYVHNSPGAHLFPDKVTHTVNKGDTAVLSAHVHKEKQTDVIWKNNGSYFNTLDWQEADDGRFQLQLQNVQPPSSGIYSATYLEASPLGSAFFRLIVRGCGAGRWGPGCVKDCPGCLHGGVCHDHDGECVCPPGFTGTRCEQACREGRFGQSCQEQCPGTAGCRGLTFCLPDPYGCSCGSGWRGSQCQEACAPGHFGADCRLQCQCQNGGTCDRFSGCVCPSGWHGVHCEKSDRIPQILSMATEVEFNIGTMPRINCAAAGNPFPVRGSMKLRKPDGTMLLSTKVIVEPDRTTAEFEVPSLTLGDSGFWECRVSTSGGQDSRRFKVNVKVPPVPLTAPRLLAKQSRQLVVSPLVSFSGDGPISSVRLHYRPQDSTIAWSAIVVDPSE.... The pIC50 is 5.2. (6) The drug is C[C@@H](N1CN([C@@H]2c3ccccc3SCc3cccc(C4CC4)c32)n2ccc(=O)c(O)c2C1=O)C(F)(F)F. The target protein (P21675) has sequence MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDDECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDESRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKVDFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLRFLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPPPPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGSGFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGEDVKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRWEDNIIWDAQAMPRLLEPPVL.... The pIC50 is 8.4. (7) The small molecule is CC(=N)N1CCC(Oc2ccc(N(Cc3cc(-c4cccc(C(=N)N)c4)no3)S(C)(=O)=O)cc2)CC1. 
The target protein (P00763) has sequence MRALLFLALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKSRIQVRLGEHNINVLEGNEQFVNAAKIIKHPNFDRKTLNNDIMLIKLSSPVKLNARVATVALPSSCAPAGTQCLISGWGNTLSSGVNEPDLLQCLDAPLLPQADCEASYPGKITDNMVCVGFLEGGKDSCQGDSGGPVVCNGELQGIVSWGYGCALPDNPGVYTKVCNYVDWIQDTIAAN. The pIC50 is 5.0. (8) The small molecule is O=S(=O)(Nc1cc(Cl)cc(Cl)c1)c1cc(Cl)cc(Cl)c1O. The target protein (P53396) has sequence MSAKAISEQTGKELLYKFICTTSAIQNRFKYARVTPDTDWARLLQDHPWLLSQNLVVKPDQLIKRRGKLGLVGVNLTLDGVKSWLKPRLGQEATVGKATGFLKNFLIEPFVPHSQAEEFYVCIYATREGDYVLFHHEGGVDVGDVDAKAQKLLVGVDEKLNPEDIKKHLLVHAPEDKKEILASFISGLFNFYEDLYFTYLEINPLVVTKDGVYVLDLAAKVDATADYICKVKWGDIEFPPPFGREAYPEEAYIADLDAKSGASLKLTLLNPKGRIWTMVAGGGASVVYSDTICDLGGVNELANYGEYSGAPSEQQTYDYAKTILSLMTREKHPDGKILIIGGSIANFTNVAATFKGIVRAIRDYQGPLKEHEVTIFVRRGGPNYQEGLRVMGEVGKTTGIPIHVFGTETHMTAIVGMALGHRPIPNQPPTAAHTANFLLNASGSTSTPAPSRTASFSESRADEVAPAKKAKPAMPQDSVPSPRSLQGKSTTLFSRHTKAI.... The pIC50 is 5.6. (9) The small molecule is C=CC(=O)Nc1cc(C(=O)NC2CCN(C)CC2)ccc1Nc1ncc(C)c(N2CCN(C(=O)Nc3ccc(C#N)cc3)CC2)n1. The target protein (P22455) has sequence MRLLLALLGVLLSVPGPPVLSLEASEEVELEPCLAPSLEQQEQELTVALGQPVRLCCGRAERGGHWYKEGSRLAPAGRVRGWRGRLEIASFLPEDAGRYLCLARGSMIVLQNLTLITGDSLTSSNDDEDPKSHRDPSNRHSYPQQAPYWTHPQRMEKKLHAVPAGNTVKFRCPAAGNPTPTIRWLKDGQAFHGENRIGGIRLRHQHWSLVMESVVPSDRGTYTCLVENAVGSIRYNYLLDVLERSPHRPILQAGLPANTTAVVGSDVELLCKVYSDAQPHIQWLKHIVINGSSFGADGFPYVQVLKTADINSSEVEVLYLRNVSAEDAGEYTCLAGNSIGLSYQSAWLTVLPEEDPTWTAAAPEARYTDIILYASGSLALAVLLLLAGLYRGQALHGRHPRPPATVQKLSRFPLARQFSLESGSSGKSSSSLVRGVRLSSSGPALLAGLVSLDLPLDPLWEFPRDRLVLGKPLGEGCFGQVVRAEAFGMDPARPDQASTV.... The pIC50 is 7.3. (10) The small molecule is O=P(O)(O)C(O)(Cc1cccc(-c2ccc(Cl)cc2)c1)P(=O)(O)O. 
The target protein sequence is MIKMSPLLLKAAVVTACLCSLAVATTVEEQTAPKPIENATTYQQELGGRGKVDSPTAPGDAVSITSGIKVMSVTTATAIIFLASAFGFSFAMYWWYVASDIKITPGKGNIMRNAHLTDEVMRNVYVISKRVSDGANAFLFAEYRYMGIFMLGFGALLYFLLGVAMSSPQGEGKDGRPPVAVEAPWVNAAFSLYAFVIGAFTSVLAGWIGMRIAVYTNSRTAVMATVGSGGSDNDVLANGSQSRGYALAFQTAFRGGITMGFALTSIGLFALFCTVKLMQTYFGDSAERLPELFECVAAFGLGGSSVACFGRVGGGIYTKAADVGADLVGKVEKNIPEDDARNPGVIADCIGDNVGDIAGMGSDLFGSFGEATCAALVIAASSAELSADFTCMMYPLLITAGGIFVCIGTALLAATNSGVKWAEDIEPTLKHQLLVSTIGATVVLVFITAYSLPDAFTVGAVETTKWRAMVCVLCGLWSGLLIGYSTEYFTSNSYRPVQEI.... The pIC50 is 3.4.
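The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical and are not part of BindingDB or the bindingdb_ic50 dataset, and the nanomolar conversion is added purely for convenience.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 values taken from examples (1), (6) and (10) in this record.
    for label, value in [("(1)", 6.5), ("(6)", 8.4), ("(10)", 3.4)]:
        print(f"example {label}: pIC50 {value} -> IC50 of about "
              f"{ic50_nm_from_pic50(value):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent binding.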