Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC(=O)N1c2ccc(S(=O)(=O)N3CCN(c4ccccc4)CC3)cc2CC1C. The target protein (Q9Y239) has sequence MEEQGHSEMEIIPSESHPHIQLLKSNRELLVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPTQPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAYVDLRPWLLEIGFSPSLLTQSKVVVNTDPVSRYTQQLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIMELVGFSNESLGSLNSLACLLDHTTGILNEQGETIFILGDAGVGKSMLLQRLQSLWATGRLDAGVKFFFHFRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEEVFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDSSCPWEPAHPLVLLANLLSGKLLKGASKLLTARTGIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPERALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQHFRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRMQPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHRGMEKSLFVFTQEEVQASGLQERDMQLGFLRA.... The pIC50 is 5.0. (2) The drug is O=C(COc1ccc(OC(F)(F)F)cc1)N[C@H]1CCOC1=O. The target protein (P33905) has sequence MQHWLDKLTDLAAIEGDECILKTGLADIADHFGFTGYAYLHIQHRHITAVTNYHRQWQSTYFDKKFEALDPVVKRARSRKHIFTWSGEHERPTLSKDERAFYDHASDFGIRSGITIPIKTANGFMSMFTMASDKPVIDLDREIDAVAAAATIGQIHARISFLRTTPTAEDAAWLDPKEATYLRWIAVGKTMEEIADVEGVKYNSVRVKLREAMKRFDVRSKAHLTALAIRRKLI. The pIC50 is 6.3. (3) The small molecule is O=C(Nc1ccc(F)cc1)N1N=C(c2ccccc2)CC1(CCO)c1ccccc1. The target protein (Q13936) has sequence MVNENTRMYIPEENHQGSNYGSPRPAHANMNANAAAGLAPEHIPTPGAALSWQAAIDAARQAKLMGSAGNATISTVSSTQRKRQQYGKPKKQGSTTATRPPRALLCLTLKNPIRRACISIVEWKPFEIIILLTIFANCVALAIYIPFPEDDSNATNSNLERVEYLFLIIFTVEAFLKVIAYGLLFHPNAYLRNGWNLLDFIIVVVGLFSAILEQATKADGANALGGKGAGFDVKALRAFRVLRPLRLVSGVPSLQVVLNSIIKAMVPLLHIALLVLFVIIIYAIIGLELFMGKMHKTCYNQEGIADVPAEDDPSPCALETGHGRQCQNGTVCKPGWDGPKHGITNFDNFAFAMLTVFQCITMEGWTDVLYWVNDAVGRDWPWIYFVTLIIIGSFFVLNLVLGVLSGEFSKEREKAKARGDFQKLREKQQLEEDLKGYLDWITQAEDIDPENEDEGMDEEKPRNMSMPTSETESVNTENVAGGDIEGENCGARLAHRISKS.... The pIC50 is 5.7. (4) The small molecule is N=c1n(Cc2ccccc2)c2ccccc2n1CC(O)CN1CCOCC1. 
The target protein (P48050) has sequence MHGHSRNGQAHVPRRKRRNRFVKKNGQCNVYFANLSNKSQRYMADIFTTCVDTRWRYMLMIFSAAFLVSWLFFGLLFWCIAFFHGDLEASPGVPAAGGPAAGGGGAAPVAPKPCIMHVNGFLGAFLFSVETQTTIGYGFRCVTEECPLAVIAVVVQSIVGCVIDSFMIGTIMAKMARPKKRAQTLLFSHHAVISVRDGKLCLMWRVGNLRKSHIVEAHVRAQLIKPYMTQEGEYLPLDQRDLNVGYDIGLDRIFLVSPIIIVHEIDEDSPLYGMGKEELESEDFEIVVILEGMVEATAMTTQARSSYLASEILWGHRFEPVVFEEKSHYKVDYSRFHKTYEVAGTPCCSARELQESKITVLPAPPPPPSAFCYENELALMSQEEEEMEEEAAAAAAVAAGLGLEAGSKEEAGIIRMLEFGSHLDLERMQASLPLDNISYRRESAI. The pIC50 is 4.6. (5) The drug is O=c1c(-c2ccc(O)cc2)coc2cc(O)c(O)cc12. The pIC50 is 4.0. The target protein (O15296) has sequence MAEFRVRVSTGEAFGAGTWDKVSVSIVGTRGESPPLPLDNLGKEFTAGAEEDFQVTLPEDVGRVLLLRVHKAPPVLPLLGPLAPDAWFCRWFQLTPPRGGHLLFPCYQWLEGAGTLVLQEGTAKVSWADHHPVLQQQRQEELQARQEMYQWKAYNPGWPHCLDEKTVEDLELNIKYSTAKNANFYLQAGSAFAEMKIKGLLDRKGLWRSLNEMKRIFNFRRTPAAEHAFEHWQEDAFFASQFLNGLNPVLIRRCHYLPKNFPVTDAMVASVLGPGTSLQAELEKGSLFLVDHGILSGIQTNVINGKPQFSAAPMTLLYQSPGCGPLLPLAIQLSQTPGPNSPIFLPTDDKWDWLLAKTWVRNAEFSFHEALTHLLHSHLLPEVFTLATLRQLPHCHPLFKLLIPHTRYTLHINTLARELLIVPGQVVDRSTGIGIEGFSELIQRNMKQLNYSLLCLPEDIRTRGVEDIPGYYYRDDGMQIWGAVERFVSEIIGIYYPSDE.... (6) The drug is CN(C)CCOc1cc(-c2cn[nH]c2)ccc1NC(=O)C1COc2ccccc2O1. The target protein sequence is MSRPPPTGKMPGAPETAPGDGAGASRQRKLEALIRDPRSPINVESLLDGLNSLVLDLDFPALRKNKNIDNFLNRYEKIVKKIRGLQMKAEDYDVVKVIGRGAFGEVQLVRHKASQKVYAMKLLSKFEMIKRSDSAFFWEERDIMAFANSPWVVQLFYAFQDDRYLYMVMEYMPGGDLVNLMSNYDVPEKWAKFYTAEVVLALDAIHSMGLIHRDVKPDNMLLDKHGHLKLADFGTCMKMDETGMVHCDTAVGTPDYISPEVLKSQGGDGFYGRECDWWSVGVFLYEMLVGDTPFYADSLVGTYSKIMDHKNSLCFPEDAEISKHAKNLICAFLTDREVRLGRNGVEEIRQHPFFKNDQWHWDNIRETAAPVVPELSSDIDSSNFDDIEDDKGDVETFPIPKAFVGNQLPFIGFTYYRENLLLSDSPSCRENDSIQSRKNEESQEIQKKLYTLEEHLSNEMQAKEELEQKCKSVNTRLEKTAKELEEEITLRKSVESALRQ.... The pIC50 is 8.5. (7) The small molecule is CO[C@H]1[C@H]([C@@]2(C)O[C@@H]2CC=C(C)C)[C@]2(CC[C@H]1OC(=O)NC(=O)CCl)CO2. 
The target protein sequence is MANIDDIEKQIENIKINSDDNKNNVSKNKNILLNGVNLKDHEIKDNVKSVDYNNNNNENDTMNEINKHVKNDEYCNKENSNNNNNNNNNLDTQINETLNLNEKFEKKNEENLCSGCKKVLIKKLSCPICLKNKIFSYFCNQECFKGSWKEHQKIHENMNKENNEKEDHLKTIVKKHLSPENFDPTNRKYWVYDDHLKNFVNFKFTGDVRPWPLSKINHVPSHIERPDYAISSIPESELIYKRKSDIYVNNEEEIQRIREACILGRKTLDYAHTLVSPGVTTDEIDRKVHEFIIKNNAYPSTLNYYKFPKSCCTSVNEIVCHGIPDYRPLKSGDIINIDISVFYKGVHSDLNETYFVGDINDVPKEGKELVETCYFSLMEAIKKCKPGMFYKNIGTLIDAYVSKKNFSVVRSYSGHGVGKLFHSNPTVPHFKKNKAVGIMKPGHVFTIEPMINQGHYSDVLWPDQWTSATSDGKLSAQFEHTLLITNNGVEILTKRTQDSP.... The pIC50 is 3.5. (8) The target protein (P05091) has sequence MLRAAARFGPRLGRRLLSAAATQAVPAPNQQPEVFCNQIFINNEWHDAVSRKTFPTVNPSTGEVICQVAEGDKEDVDKAVKAARAAFQLGSPWRRMDASHRGRLLNRLADLIERDRTYLAALETLDNGKPYVISYLVDLDMVLKCLRYYAGWADKYHGKTIPIDGDFFSYTRHEPVGVCGQIIPWNFPLLMQAWKLGPALATGNVVVMKVAEQTPLTALYVANLIKEAGFPPGVVNIVPGFGPTAGAAIASHEDVDKVAFTGSTEIGRVIQVAAGSSNLKRVTLELGGKSPNIIMSDADMDWAVEQAHFALFFNQGQCCCAGSRTFVQEDIYDEFVERSVARAKSRVVGNPFDSKTEQGPQVDETQFKKILGYINTGKQEGAKLLCGGGIAADRGYFIQPTVFGDVQDGMTIAKEEIFGPVMQILKFKTIEEVVGRANNSTYGLAAAVFTKDLDKANYLSQALQAGTVWVNCYDVFGAQSPFGGYKMSGSGRELGEYGLQ.... The small molecule is C=CCOc1ccc2c(=O)c(-c3ccc(O)cc3)coc2c1. The pIC50 is 6.1. (9) The small molecule is CS(=O)(=O)NCC#CN1C(=O)[C@H](CC[C@H](O)c2ccc(F)cc2)[C@H]1c1ccc(O[C@@H]2O[C@H](C(=O)O)[C@@H](O)[C@H](O)[C@H]2O)cc1. The target protein sequence is MAEAGLRGWLLWALLLHLAQSEPYTPIHQPGYCAFYDECGKNPELSGGLMTLSNVSCLSNTPARNITGDHLILLQRICPRLYTGPNTQACCSAKQLVSLEASLSITKALLTRCPACSDNFVSLHCHNTCSPNQSLFINVTRVAQLGAGQLPAVVAYEAFYQHSFAEQSYDSCSRVHIPAAATLAVGSMCGVYGSALCNAQRWLNFQGDTGNGLAPLDITFHLLEPGQAVGSGIQPLNEGVARCNESQGDDAVACSCQDCAASCPAIAHPQALDSTFRLGRMPGGLVLIIILCSVFTVVAILLVGLRVAPTRDKSKTVDPKKGTSLSDKLSFSTHTLLGQFFQGWGTWVASWPLTILVLSVIPVVVLAAGLVFTELTTDPVELWSAPNSQARSEKAFHDQHFGPFFRTNQVILTAPNRSSYRYDSLLLGPKNFSGILDLDLLLELLELQERLRHLQVWSPEAQRNISLQHICYAPLNPDNTSLSDCCINSLLQYFQNNRTL.... The pIC50 is 7.7. (10) The compound is O=c1cc(-c2ccccc2)oc2ccccc12. 
The target protein (P0DP25,P0DP24,P0DP23) has sequence MADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK. The pIC50 is 4.0.
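The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the dataset's own preprocessing code; the function names below are hypothetical, and the nanomolar helper assumes the raw IC50 value is reported in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 5.0 corresponds to an IC50 of about 1e-5 M (10 micromolar).
    print(pic50_to_ic50_molar(5.0))
    # A pIC50 of 8.5 is a much lower IC50, i.e. a more potent compound.
    print(pic50_to_ic50_molar(8.5))
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding.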