Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=c1[nH]c(=O)c2nc3ccccc3nc2[nH]1. The target protein (Q17339) has sequence MPSCTTPTYGVSTQLESQSSESPSRSSVMTPTSLDGDNSPRKRFPIIDNVPADRWPSTRRDGWSSVRAPPPARLTLSTNNRHIMSPISSAYSQTPNSLLSPAMFNPKSRSIFSPTLPATPMSYGKSSMDKSLFSPTATEPIEVEATVEYLADLVKEKKHLTLFPHMFSNVERLLDDEIGRVRVALFQTEFPRVELPEPAGDMISITEKIYVPKNEYPDYNFVGRILGPRGMTAKQLEQDTGCKIMVRGKGSMRDKSKESAHRGKANWEHLEDDLHVLVQCEDTENRVHIKLQAALEQVKKLLIPAPEGTDELKRKQLMELAIINGTYRPMKSPNPARVMTAVPLLSPTPLRSSGPVLMSPTPGSGLPSTTFGGSILSPTLTASNLLGSNVFDYSLLSPSMFDSFSSLQLASDLTFPKYPTTTSFVNSFPGLFTSASSFANQTNTNVSPSGASPSASSVNNTSF. The pIC50 is 5.8. (2) The small molecule is CCc1nc2ccc(C3CCN(CC(=O)N4CCCC4)CC3)cn2c1N(C)c1nc(-c2ccc(F)cc2)cs1. The target protein (Q64610) has sequence MARQGCLGSFQVISLFTFAISVNICLGFTASRIKRAEWDEGPPTVLSDSPWTNTSGSCKGRCFELQEVGPPDCRCDNLCKSYSSCCHDFDELCLKTARGWECTKDRCGEVRNEENACHCSEDCLSRGDCCTNYQVVCKGESHWVDDDCEEIKVPECPAGFVRPPLIIFSVDGFRASYMKKGSKVMPNIEKLRSCGTHAPYMRPVYPTKTFPNLYTLATGLYPESHGIVGNSMYDPVFDASFHLRGREKFNHRWWGGQPLWITATKQGVRAGTFFWSVSIPHERRILTILQWLSLPDNERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKTVGQLMDGLKQLRLHRCVNVIFVGDHGMEDVTCDRTEFLSNYLTNVDDITLVPGTLGRIRAKSINNSKYDPKTIIANLTCKKPDQHFKPYMKQHLPKRLHYANNRRIEDIHLLVDRRWHVARKPLDVYKKPSGKCFFQGDHGFDNKVNSMQTVFVGYGPTFKYRTKV.... The pIC50 is 6.9. (3) The compound is COCC(=O)O[C@]1(CCN(C)CCCc2nc3ccccc3[nH]2)CCc2cc(F)ccc2[C@@H]1C(C)C. 
The target protein (O95180) has sequence MTEGARAADEVRVPLGAPPPGPAALVGASPESPGAPGREAERGSELGVSPSESPAAERGAELGADEEQRVPYPALAATVFFCLGQTTRPRSWCLRLVCNPWFEHVSMLVIMLNCVTLGMFRPCEDVECGSERCNILEAFDAFIFAFFAVEMVIKMVALGLFGQKCYLGDTWNRLDFFIVVAGMMEYSLDGHNVSLSAIRTVRVLRPLRAINRVPSMRILVTLLLDTLPMLGNVLLLCFFVFFIFGIVGVQLWAGLLRNRCFLDSAFVRNNNLTFLRPYYQTEEGEENPFICSSRRDNGMQKCSHIPGRRELRMPCTLGWEAYTQPQAEGVGAARNACINWNQYYNVCRSGDSNPHNGAINFDNIGYAWIAIFQVITLEGWVDIMYYVMDAHSFYNFIYFILLIIVGSFFMINLCLVVIATQFSETKQRESQLMREQRARHLSNDSTLASFSEPGSCYEELLKYVGHIFRKVKRRSLRLYARWQSRWRKKVDPSAVQGQGP.... The pIC50 is 6.9. (4) The compound is COc1ccccc1NC(=O)COC(=O)c1cc(-c2ccco2)nc2ccccc12. The target protein (O35430) has sequence MNHLEGSAEVEVADEAPGGEVNESVEADLEHPEVEEEQQPSPPPPAGHAPEDHRAHPAPPPPPPPQEEEEERGECLARSASTESGFHNHTDTAEGDVLAAARDGYEAERAQDADDESAYAVQYRPEAEEYTEQAEAEHAEAAQRRALPNHLHFHSLEHEEAMNAAYSGYVYTHRLFHRAEDEPYAEPYADYGGLQEHVYEEIGDAPELEARDGLRLYERERDEAAAYRQEALGARLHHYDERSDGESDSPEKEAEFAPYPRMDSYEQEEDIDQIVAEVKQSMSSQSLDKAAEDMPEAEQDLERAPTPGGGHPDSPGLPAPAGQQQRVVGTPGGSEVGQRYSKEKRDAISLAIKDIKEAIEEVKTRTIRSPYTPDEPKEPIWVMRQDISPTRDCDDQRPVDGDSPSPGSSSPLGAESSITPLHPGDPTEASTNKESRKSLASFPTYVEVPGPCDPEDLIDGIIFAANYLGSTQLLSDKTPSKNVRMMQAQEAVSRIKTAQK.... The pIC50 is 5.2. (5) The drug is O=C1NCc2c(-c3ccc(F)cc3F)cc(N3CCNCC3)cc2N1c1c(Cl)cccc1Cl. The target protein sequence is MSQERPTFYRQELNKTIWEVPERYQNLSPVGSGAYGSVCAAFDTKTGHRVAVKKLSRPFQSIIHAKRTYRELRLLKHMKHENVIGLLDVFTPARSLEEFNDVYLVTHLMAADLNNIVKCQKLTDDHVQFLIYQILRGLKYIHSADIIHRDLKPSNLAVNEDCELKILDFGLARHTDDEMTGYVATRWYRAPEIMLNWMHYNQTVDIWSVGCIMAELLTGRTLFPGTDHIDQLKLILRLVGTPGAELLKKISSESARNYIQSLAQMPKMNFANVFIGANPLAVDLLEKMLVLDSDKRITAAQALAHAYFAQYHDPDDEPVADPYDQSFESRDLLIDEWKSLTYDEVISFVPPPLDQEEMES. The pIC50 is 7.6.
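
The pIC50 definition in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset itself; the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50` are hypothetical, and the nanomolar variant assumes the raw IC50 values are reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Convert an IC50 given in nM (assumed input unit) to pIC50."""
    # 1 nM = 1e-9 M, so rescale to mol/L before taking the logarithm.
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the five examples above.
    for label in (5.8, 6.9, 6.9, 5.2, 7.6):
        print(f"pIC50 {label} -> IC50 = {pic50_to_ic50(label):.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50. For instance, the label 5.8 in example (1) corresponds to an IC50 of roughly 1.6 micromolar, while 7.6 in example (5) corresponds to roughly 25 nanomolar.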