Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 4.7. The target protein sequence is MCSLKWDYDLRCGEYTLNLNEKTLIMGILNVTPDSFSDGGSYNEVDAAVRHAKEMRDEGAHIIDIGGESTRPGFAKVSVEEEIKRVVPMIQAVSKEVKLPISIDTYKAEVAKQAIEAGAHIINDIWGAKAEPKIAEVAAHYDVPIILMHNRDNMNYRNLMADMIADLYDSIKIAKDAGVRDENIILDPGIGFAKTPEQNLEAMRNLEQLNVLGYPVLLGTSRKSFIGHVLDLPVEERLEGTGATVCLGIEKGCEFVRVHDVKEMSRMAKMMDAMIGKGVK. The compound is Nc1nc2ncc(CNc3ccc(S(=O)(=O)Nc4ncccn4)cc3)nc2c(=O)[nH]1. (2) The compound is O=C1CC[C@@H](N2C(=O)c3ccccc3C2=O)C(=O)N1. The target protein (Q96SW2) has sequence MAGEGDQQDAAHNMGNHLPLLPAESEEEDEMEVEDQDSKEAKKPNIINFDTSLPTSHTYLGADMEEFHGRTLHDDDSCQVIPVLPQVMMILIPGQTLPLQLFHPQEVSMVRNLIQKDRTFAVLAYSNVQEREAQFGTTAEIYAYREEQDFGIEIVKVKAIGRQRFKVLELRTQSDGIQQAKVQILPECVLPSTMSAVQLESLNKCQIFPSKPVSREDQCSYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLREWDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLKIGSAIQRLRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVNPHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICASHIGWKFTATKKDMSPQKFWGLTRSALLPTIPDTEDEISPDKVILCL. The pIC50 is 3.5. (3) The target protein sequence is MHHHHHHSSGVDLGTENLYFQSMQGTNPYLTFHCVNQGTILLDLAPEDKEYQSVEEEMQSTIREHRDGGNAGGIFNRYNVIRIQKVVNKKLRERFCHRQKEVSEENHNHHNERMLFHGSPFINAIIHKGFDERHAYIGGMFGAGIYFAENSSKSNQYVYGIGGGTGCPTHKDRSCYICHRQMLFCRVTLGKSFLQFSTIKMAHAPPGHHSVIGRPSVNGLAYAEYVIYRGEQAYPEYLITYQIMKPEAPSQTATAAEQ. The pIC50 is 7.4. The drug is COc1ccc(-c2cc3c(C)cccc3c(=O)[nH]2)nc1. (4) The drug is CC(C)[C@H](NC(=S)Nc1ccccc1)C(=O)N[C@H]1CCOC1O. 
The target protein (P43367) has sequence YPNTFWMNPQYLIKLEEEDEDQEDGESGCTFLVGLIQKHRRRQRKMGEDMHTIGFGIYEVPEELTGQTNIHLSKNFFLTHRARERSDTFINLREVLNRFKLPPGEYILVPSTFEPNKDGDFCIRVFSEKKADYQVVDDEIEADLEENDASEDDIDDGFRRLFAQLAGEDAEISAFELQTILRRVLAKRQDIKSDGFSIETCRIMVDMLDSDGSAKLGLKEFYILWTKIQKYQKIYREIDVDRSGTMNSYEMRKALEEAGFKLPCQLHQVIVARFADDQLIIDFDNFVRCLVRLETLFRISKQLDSENTGTIELDLISWLCFSVL. The pIC50 is 5.7. (5) The drug is CCOC(C(=O)c1ccc(-c2cc(OC)c(Cl)c(OC)c2)o1)c1ccc(-c2nnc(C)s2)cc1. The pIC50 is 5.0. The target protein sequence is MDDHVTIRKKHLQRPIFRLRCLVKQLERGDVNVVDLKKNIEYAASVLEAVYIDETRRLLDTEDELSDIQTDSVPSEVRDWLASTFTRKMGMTKKKPEEKPKFRSIVHAVQAGIFVERMYRKTYHMVGLAYPAAVIVTLKDVDKWSFDVFALNEASGEHSLKFMIYELFTRYDLINRFKIPVSCLITFAEALEVGYSKYKNPYHNLIHAADVTQTVHYIMLHTGIMHWLTELEILAMVFAAAIHDYEHTGTTNNFHIQTRSDVAILYNDRSVLENHHVSAAYRLMQEEEMNILINLSKDDWRDLRNLVIEMVLSTDMSGHFQQIKNIRNSLQQPEGIDRAKTMSLILHAADISHPAKSWKLHYRWTMALMEEFFLQGDKEAELGLPFSPLCDRKSTMVAQSQIGFIDFIVEPTFSLLTDSTEKIVIPLIEEASKAETSSYVASSSTTIVGLHIADALRRSNTKGSMSDGSYSPDYSLAAVDLKSFKNNLVDIIQQNKERWK.... (6) The compound is Cc1cccc(C)c1C(=O)OCC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)OCc1ccccc1. The target protein sequence is MWTALPLLCAGAWLLSAGATAELTVNAIEKFHFTSWMKQHQKTYSSREYSHRLQVFANNWRKIQAHNQRNHTFKMGLNQFSDMSFAEIKHKYLWSEPQNCSATKSNYLRGTGPYPSSMDWRKKGNVVSPVKNQGACGSCWTFSTTGALESAVAIASGKMMTLAEQQLVDCAQNFNNHGCQGGLPSQAFEYILYNKGIMGEDSYPYIGKNGQCKFNPEKAVAFVKNVVNITLNDEAAMVEAVALYNPVSFAFEVTEDFMMYKSGVYSSNSCHKTPDKVNHAVLAVGYGEQNGLLYWIVKNSWGSNWGNNGYFLIERGKNMCGLAACASYPIPQV. The pIC50 is 5.1. (7) The compound is O=c1[nH]cnc2c(-n3cc(CCN4CCC(c5cccs5)CC4)cn3)nccc12. 
The target protein (Q9Y2K7) has sequence MEPEEERIRYSQRLRGTMRRRYEDDGISDDEIEGKRTFDLEEKLHTNKYNANFVTFMEGKDFNVEYIQRGGLRDPLIFKNSDGLGIKMPDPDFTVNDVKMCVGSRRMVDVMDVNTQKGIEMTMAQWTRYYETPEEEREKLYNVISLEFSHTRLENMVQRPSTVDFIDWVDNMWPRHLKESQTESTNAILEMQYPKVQKYCLMSVRGCYTDFHVDFGGTSVWYHIHQGGKVFWLIPPTAHNLELYENWLLSGKQGDIFLGDRVSDCQRIELKQGYTFVIPSGWIHAVYTPTDTLVFGGNFLHSFNIPMQLKIYNIEDRTRVPNKFRYPFYYEMCWYVLERYVYCITNRSHLTKEFQKESLSMDLELNGLESGNGDEEAVDREPRRLSSRRSVLTSPVANGVNLDYDGLGKTCRSLPSLKKTLAGDSSSDCSRGSHNGQVWDPQCAPRKDRQVHLTHFELEGLRCLVDKLESLPLHKKCVPTGIEDEDALIADVKILLEELA.... The pIC50 is 5.6. (8) The small molecule is O=C(CC(Cc1cccc(C(=O)O)c1)C(=O)O)NO. The target protein sequence is KSSNEATNITPKHNMKAFLDELKAENIKKFLYNFTQIPHLAGTEQNFQLAKQIQSQWKEFGLDSVELAHYDVLLSYPNKTHPNYISIINEDGNEIFNTSLFEPPPPGYENVSDIVPPFSAFSPQGMPEGDLVYVNYARTEDFFKLERDMKINCSGKIVIARYGKVFRGNKVKNAQLAGAKGVILYSDPADYFAPGVKSYPDGWNLPGGGVQRGNILNLNGAGDPLTPGYPANEYAYRRGIAEAVGLPSIPVHPIGYYDAQKLLEKMGGSAPPDSSWRGSLKVPYNVGPGFTGNFSTQKVKMHIHSTNEVTRIYNVIGTLRGAVEPDRYVILGGHRDSWVFGGIDPQSGAAVVHEIVRSFGTLKKEGWRPRRTILFASWDAEEFGLLGSTEWAEENSRLLQERGVAYINADSSIEGNYTLRVDCTPLMYSLVHNLTKELKSPDEGFEGKSLYESWTKKSPSPEFSGMPRISKLGSGNDFEVFFQRLGIASGRARYTKNWET.... The pIC50 is 4.8. (9) The target is SSSEEGLTCRGIPNSISI. The pIC50 is 3.7. The small molecule is CN(C)c1ccc(/C=C2\CN(Cc3ccccc3)C/C(=C\c3ccc(N(C)C)cc3[N+](=O)[O-])C2=O)c([N+](=O)[O-])c1.
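
The task description defines pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion (the function names `ic50_to_pic50` and `pic50_to_ic50_molar` are hypothetical helpers introduced here for illustration, not part of the dataset or of BindingDB), the relationship can be written as:

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Label values taken from the examples above; the output is IC50 in mol/L.
    for label in (3.5, 5.0, 7.4):
        print(label, pic50_to_ic50_molar(label))
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.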