Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1c(Oc2cccc(C(=O)O)c2)c2ccccc2n1-c1cnn(C)c1. The target protein (Q13822) has sequence MARRSSFQSCQIISLFTFAVGVNICLGFTAHRIKRAEGWEEGPPTVLSDSPWTNISGSCKGRCFELQEAGPPDCRCDNLCKSYTSCCHDFDELCLKTARGWECTKDRCGEVRNEENACHCSEDCLARGDCCTNYQVVCKGESHWVDDDCEEIKAAECPAGFVRPPLIIFSVDGFRASYMKKGSKVMPNIEKLRSCGTHSPYMRPVYPTKTFPNLYTLATGLYPESHGIVGNSMYDPVFDATFHLRGREKFNHRWWGGQPLWITATKQGVKAGTFFWSVVIPHERRILTILQWLTLPDHERPSVYAFYSEQPDFSGHKYGPFGPEMTNPLREIDKIVGQLMDGLKQLKLHRCVNVIFVGDHGMEDVTCDRTEFLSNYLTNVDDITLVPGTLGRIRSKFSNNAKYDPKAIIANLTCKKPDQHFKPYLKQHLPKRLHYANNRRIEDIHLLVERRWHVARKPLDVYKKPSGKCFFQGDHGFDNKVNSMQTVFVGYGSTFKYKTK.... The pIC50 is 4.5. (2) The drug is CC1=C(C(=O)Nc2ccc3[nH]ncc3c2)C(c2ccc(F)cc2)N=C(c2ccc(C(F)(F)F)cc2)N1. The target protein (P28327) has sequence MDFGSLETVVANSAFIAARGSFDASSGPASRDRKYLARLKLPPLSKCEALRESLDLGFEGMCLEQPIGKRLFQQFLRTHEQHGPALQLWKDIEDYDTADDALRPQKAQALRAAYLEPQAQLFCSFLDAETVARARAGAGDGLFQPLLRAVLAHLGQAPFQEFLDSLYFLRFLQWKWLEAQPMGEDWFLDFRVLGRGGFGEVFACQMKATGKLYACKKLNKKRLKKRKGYQGAMVEKKILAKVHSRFIVSLAYAFETKTDLCLVMTIMNGGDIRYHIYNVDEDNPGFQEPRAIFYTAQIVSGLEHLHQRNIIYRDLKPENVLLDDDGNVRISDLGLAVELKAGQTKTKGYAGTPGFMAPELLLGEEYDFSVDYFALGVTLYEMIAARGPFRARGEKVENKELKQRVLEQAVTYPDKFSPASKDFCEALLQKDPEKRLGFRDGSCDGLRTHPLFRDISWRQLEAGMLTPPFVPDSRTVYAKNIQDVGAFSTVKGVAFEKADT.... The pIC50 is 3.0. (3) The small molecule is CC[C@H]1NC[C@H](O)[C@@H]1O. 
The target protein (P07683) has sequence MQLFNLPLKVSFFLVLSYFSLLVSAASIPSSASVQLDSYNYDGSTFSGKIYVKNIAYSKKVTVIYADGSDNWNNNGNTIAASYSAPISGSNYEYWTFSASINGIKEFYIKYEVSGKTYYDNNNSANYQVSTSKPTTTTATATTTTAPSTSTTTPPSRSEPATFPTGNSTISSWIKKQEGISRFAMLRNINPPGSATGFIAASLSTAGPDYYYAWTRDAALTSNVIVYEYNTTLSGNKTILNVLKDYVTFSVKTQSTSTVCNCLGEPKFNPDASGYTGAWGRPQNDGPAERATTFILFADSYLTQTKDASYVTGTLKPAIFKDLDYVVNVWSNGCFDLWEEVNGVHFYTLMVMRKGLLLGADFAKRNGDSTRASTYSSTASTIANKISSFWVSSNNWIQVSQSVTGGVSKKGLDVSTLLAANLGSVDDGFFTPGSEKILATAVAVEDSFASLYPINKNLPSYLGNSIGRYPEDTYNGNGNSQGNSWFLAVTGYAELYYRAI.... The pIC50 is 4.7. (4) The compound is C[C@@H]1C(=O)N(C(=O)C2C(C)(C)C2(C)C)[C@H]2CCN(C(=O)OCc3ccccc3)[C@H]12. The target protein (P16753) has sequence MTMDEQQSQAVAPVYVGGFLARYDQSPDEAELLLPRDVVEHWLHAQGQGQPSLSVALPLNINHDDTAVVGHVAAMQSVRDGLFCLGCVTSPRFLEIVRRASEKSELVSRGPVSPLQPDKVVEFLSGSYAGLSLSSRRCDDVEAATSLSGSETTPFKHVALCSVGRRRGTLAVYGRDPEWVTQRFPDLTAADRDGLRAQWQRCGSTAVDASGDPFRSDSYGLLGNSVDALYIRERLPKLRYDKQLVGVTERESYVKASVSPEAACDIKAASAERSGDSRSQAATPAAGARVPSSSPSPPVEPPSPVQPPALPASPSVLPAESPPSLSPSEPAEAASMSHPLSAAVPAATAPPGATVAGASPAVSSLAWPHDGVYLPKDAFFSLLGASRSAVPVMYPGAVAAPPSASPAPLPLPSYPASYGAPVVGYDQLAARHFADYVDPHYPGWGRRYEPAPSLHPSYPVPPPPSPAYYRRRDSPGGMDEPPSGWERYDGGHRGQSQKQH.... The pIC50 is 3.3. (5) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](Cc1cnc[nH]1)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)NC(C)=O)[C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(=O)N[C@H]1CCCCNC(=O)C[C@@H](C(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc2c[nH]c3ccccc23)NC(=O)[C@H](CC(C)C)NC1=O)[C@@H](C)CC. The target protein sequence is MPEEVQTQDQPMETFAVQTFAFQAEIAQLMSLIYESLTDPSKLDSGK. The pIC50 is 5.1. (6) The drug is COc1cn(C(C[C@H]2CC[C@@H](OC)CC2)C(=O)Nc2ccc(C(=O)O)cc2)c(=O)cc1-c1cc(Cl)ccc1C#N. 
The target protein (P03951) has sequence MIFLYQVVHFILFTSVSGECVTQLLKDTCFEGGDITTVFTPSAKYCQVVCTYHPRCLLFTFTAESPSEDPTRWFTCVLKDSVTETLPRVNRTAAISGYSFKQCSHQISACNKDIYVDLDMKGINYNSSVAKSAQECQERCTDDVHCHFFTYATRQFPSLEHRNICLLKHTQTGTPTRITKLDKVVSGFSLKSCALSNLACIRDIFPNTVFADSNIDSVMAPDAFVCGRICTHHPGCLFFTFFSQEWPKESQRNLCLLKTSESGLPSTRIKKSKALSGFSLQSCRHSIPVFCHSSFYHDTDFLGEELDIVAAKSHEACQKLCTNAVRCQFFTYTPAQASCNEGKGKCYLKLSSNGSPTKILHGRGGISGYTLRLCKMDNECTTKIKPRIVGGTASVRGEWPWQVTLHTTSPTQRHLCGGSIIGNQWILTAAHCFYGVESPKILRVYSGILNQSEIKEDTSFFGVQEIIIHDQYKMAESGYDIALLKLETTVNYTDSQRPIC.... The pIC50 is 8.3. (7) The small molecule is Cn1cc(-c2cccc(-c3ncc(-c4cnn(C)c4)c(NCC4CCCC4N)n3)c2)cn1. The target protein sequence is RPFPFCWPLCEISRGTHNFSEELKIGEGGFGCVYRAVMRNTVYAVKRLKENADLEWTAVKQSFLTEVEQLSRFRHPNIVDFAGYCAQNGFYCLVYGFLPNGSLEDRLHCQTQACPPLSWPQRLDILLGTARAIQFLHQDSPSLIHGDIKSSNVLLDERLTPKLGDFGLARFSRFAGSSPSQSSMVARTQTVRGTLAYLPEEYIKTGRLAVDTDTFSFGVVVLETLAGQRAVKTHGARTKYLKDLVEEEAEEAGVALRSTQSTLQAGLAADAWAAPIAMQIYKKHLDPRPGPCPPELGLGLGQLACCCLHRRAKRRPPMTQVYERLEKLQAVVAGVPGHSEAASCIPPSPQENSYVSSTGRAHSGAAPWQPLAAPSGASAQAAEQLQRGPNQPVESDESLGGLSAALRSWHLTPSCPLDPAPLREAGCPQGDTAGESSWGSGPGSRPTAVEGLALGSSASSSSEPPQIIINPARQKMVQKLALYEDGALDSLQLLSSSSLP.... The pIC50 is 5.5.
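The header defines the label as pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion (the function names below are hypothetical helpers, not part of the dataset or of any BindingDB tooling, and the nanomolar input is an assumption about how raw IC50 values might be supplied):

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L from a pIC50 value."""
    return 10.0 ** (-pic50)


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Assumed unit helper: rescale nM to M before taking the log."""
    return ic50_nm * 1e-9


if __name__ == "__main__":
    # A 1 uM (1000 nM) inhibitor corresponds to a pIC50 of about 6.
    print(ic50_to_pic50(nanomolar_to_molar(1000.0)))
    # Record (6) above reports pIC50 8.3, i.e. an IC50 of roughly 5 nM.
    print(pic50_to_ic50(8.3))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent binders.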