Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Cc1sc2c(c1C)C(c1ccc(Cl)cc1)=N[C@H](CC(=O)OC(C)(C)C)c1nnc(C)n1-2. The pIC50 is 5.1. The target protein sequence is MSAESGPGTRLRNLPVMGDGLETSQMSTTQAQAQPQPANAASTNPPPPETSNPNKPKRQTNQLQYLLRVVLKTLWKHQFAWPFQQPVDAVKLNLPDYYKIIKTPMDMGTIKKRLENNYYWNAQECIQDFNTMFTNCYIYNKPGDDIVLMAEALEKLFLQKINELPTEETEIMIVQAKGRGRGRKETGTAKPGVSTVPNTTQASTPPQTQTPQPNPPPVQATPHPFPAVTPDLIVQTPVMTVVPPQPLQTPPPVPPQPQPPPAPAPQPVQSHPPIIAATPQPVKTKKGVKRKADTTTPTTIDPIHEPPSLPPEPKTTKLGQRRESSRPVKPPKKDVPDSQQHPAPEKSSKVSEQLKCCSGILKEMFAKKHAAYAWPFYKPVDVEALGLHDYCDIIKHPMDMSTIKSKLEAREYRDAQEFGADVRLMFSNCYKYNPPDHEVVAMARKLQDVFEMRFAKMPDEPEEPVVAVSSPAVPPPTKVVAPPSSSDSSSDSSSDSDSST.... (2) The compound is C=CS(=O)(=O)Nc1ccc(F)c(Nc2ncccc2-c2ncnc3[nH]cnc23)c1F. The target is CKENALLRYLLDKDD. The pIC50 is 6.3. (3) The small molecule is Cc1nc2c3cccnc3nn2c(C)c1CCC(=O)NCC1CCCCC1. The target protein (Q8TE04) has sequence MLKLVGGGGGQDWACSVAGTSLGGEEAAFEVARPGDQGKAGGGSPGWGCAGIPDSAPGAGVLQAGAVGPARGGQGAEEVGESAGGGEERRVRHPQAPALRLLNRKPQGGSGEIKTPENDLQRGRLSRGPRTAPPAPGMGDRSGQQERSVPHSPGAPVGTSAAAVNGLLHNGFHPPPVQPPHVCSRGPVGGSDAAPQRLPLLPELQPQPLLPQHDSPAKKCRLRRRMDSGRKNRPPFPWFGMDIGGTLVKLVYFEPKDITAEEEQEEVENLKSIRKYLTSNTAYGKTGIRDVHLELKNLTMCGRKGNLHFIRFPSCAMHRFIQMGSEKNFSSLHTTLCATGGGAFKFEEDFRMIADLQLHKLDELDCLIQGLLYVDSVGFNGKPECYYFENPTNPELCQKKPYCLDNPYPMLLVNMGSGVSILAVYSKDNYKRVTGTSLGGGTFLGLCCLLTGCETFEEALEMAAKGDSTNVDKLVKDIYGGDYERFGLQGSAVASSFGNM.... The pIC50 is 5.9. (4) The small molecule is Cc1cccc(CCCOP(=O)(CCCCC2(C(=O)NCC(F)(F)F)c3ccccc3-c3ccccc32)OCCCc2cccc(C)n2)n1. 
The target protein (P55156) has sequence FLCFISSYSASVKGHTTGLSLNNDRLYKLTYSTEVFLDRGKGNLQDSVGYRISSNVDVALLWRSPDGDDNQLIQITMKDVNLENVNQQRGEKSIFKGKKSSQIIRKENLEAMQRPVLLHLIHGKIKEFYSYQNEPAAIENLKRGLASLFQMQLSSGTTNEVDISGDCKVTYQAHQDKVTKIKALDSCKIERAGFTTPHQVLGVTSKATSVTTYKIEDSFVVAVLSEEIRALRLNFLQSIAGKIVSRQKLELKTTEASVRLKPGKQVAAIIKAVDSKYTAIPIVGQVFQSKCKGCPSLSEHWQSIRKHLQPDNLSKAEAVRSFLAFIKHLRTAKKEEILQILKAENKEVLPQLVDAVTSAQTPDSLDAILDFLDFKSTESVILQERFLYACAFASHPDEELLRALISKFKGSFGSNDIRESVMIIIGALVRKLCQNQGCKLKGVIEAKKLILGGLEKAEKKEDIVMYLLALKNARLPEGIPLLLKYTETGEGPISHLAATT.... The pIC50 is 7.3. (5) The small molecule is C/C=C(/C=C/C=C/C=C(C)\C=C1/CCCC(CC)=C1C(C)C)C(=O)O. The target protein (P40220) has sequence MPNFAGTWKMRSSENFDELLKALGVNAMLRKVAVAAASKPHVEIRQDGDQFYIKTSTTVRTTEINFKIGESFEEETVDGRKCRSLATWENENKIYCKQTLIEGDGPKTYWTRELANDELILTFGADDVVCTRIYVRE. The pIC50 is 6.0. (6) The small molecule is C#CCCC/C=C(\C)C(=O)N(C)[C@H](C(=O)N(C)[C@H](C(=O)N(C)[C@H](C(=O)N(C)[C@H](C(=O)N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)CC)C(C)C)C(C)C)[C@@H](C)CC)[C@@H](C)CC. The target protein (O35235) has sequence MRRASRDYGKYLRSSEEMGSGPGVPHEGPLHPAPSAPAPAPPPAASRSMFLALLGLGLGQVVCSIALFLYFRAQMDPNRISEDSTHCFYRILRLHENADLQDSTLESEDTLPDSCRRMKQAFQGAVQKELQHIVGPQRFSGAPAMMEGSWLDVAQRGKPEAQPFAHLTINAASIPSGSHKVTLSSWYHDRGWAKISNMTLSNGKLRVNQDGFYYLYANICFRHHETSGSVPTDYLQLMVYVVKTSIKIPSSHNLMKGGSTKNWSGNSEFHFYSINVGGFFKLRAGEEISIQVSNPSLLDPDQDATYFGAFKVQDID. The pIC50 is 6.1. (7) The small molecule is C[C@H]1CCCN(c2ncnc3[nH]cc(-c4ccccc4F)c23)C1. 
The target protein sequence is MASGSCQGCEEDEETLKKLIVRLNNVQEGKQIETLVQILEDLLVFTYSERASKLFQGKNIHVPLLIVLDSYMRVASVQQVGWSLLCKLIEVCPGTMQSLMGPQDVGNDWEVLGVHQLILKMLTVHNASVNLSVIGLKTLDLLLTSGKITLLILDEESDIFMLIFDAMHSFPANDEVQKLGCKALHVLFERVSEEQLTEFVENKDYMILLSALTNFKDEEEIVLHVLHCLHSLAIPCNNVEVLMSGNVRCYNIVVEAMKAFPMSERIQEVSCCLLHRLTLGNFFNILVLNEVHEFVVKAVQQYPENAALQISALSCLALLTETIFLNQDLEEKNENQENDDEGEEDKLFWLEACYKALTWHRKNKHVQEAACWALNNLLMYQNSLHEKIGDEDGHFPAHREVMLSMLMHSSSKEVFQASANALSTLLEQNVNFRKILLSKGIHLNVLELMQKHIHSPEVAESGCKMLNHLFEGSNTSLDIMAAVVPKILTVMKRHETSLPV.... The pIC50 is 6.9. (8) The target protein (O62698) has sequence MLARALLLCAAVALSGAANPCCSHPCQNRGVCMSVGFDQYKCDCTRTGFYGENCTTPEFLTRIKLLLKPTPNTVHYILTHFKGVWNIVNKISFLRNMIMRYVLTSRSHLIESPPTYNVHYSYKSWEAFSNLSYYTRALPPVPDDCPTPMGVKGRKELPDSKEVVKKVLLRRKFIPDPQGTNLMFAFFAQHFTHQFFKTDFERGPAFTKGKNHGVDLSHIYGESLERQHKLRLFKDGKMKYQMINGEMYPPTVKDTQVEMIYPPHVPEHLKFAVGQEVFGLVPGLMMYATIWLREHNRVCDVLKQEHPEWGDEQLFQTSRLILIGETIKIVIEDYVQHLSGYHFKLKFDPELLFNQQFQYQNRIAAEFNTLYHWHPLLPDVFQIDGQEYNYQQFIYNNSVLLEHGLTQFVESFTRQRAGRVAGGRNLPVAVEKVSKASIDQSREMKYQSFNEYRKRFLVKPYESFEELTGEKEMAAELEALYGDIDAMEFYPALLVEKPRP.... The small molecule is Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(S(N)(=O)=O)cc2)cc1. The pIC50 is 7.2. (9) The compound is CCCCCC[Se](=O)CC(=O)C(F)(F)F. The target protein sequence is MIQQRMLQLLLLGQLLAGPGPFCAALATVDQLTVCPPSVGCLKGTNLQGYQSERFEAFMGIPYALPPIGDLRFSNPKVMPKLLGMYDASAPKMDCIQKNYLLPTPVVYGDEDCLYLNVYRPEIRKSALPVMVYIHGGGFFGGSAGPGVTGPEYFMDSGEVILVTMAYRLGPFGFLSTQDAVMSGNFGLKDQNLALRWVQRNIRFFGGDPQRVTIFGQSAGGVAAHMHLLSPRSHGLFHRVISMSGTANVPFAIAEQPLEQARLLAEFADVPDARNLSTVKLTKALRRINATKLLNAGDGLKYWDVDHMTNFRPVVEEGLEVDAFLNAHPMDMLAQGMPTSIPLLLGTVPGEGAVRVVNILGNETLRQSFNLRFDELLQELLEFPASFSQDRREKMMDLLVEVYFQGQHEVNELTVQGFMNLISDRGFKQPLYNTIHKNVCHTPNPVYLYSFNYQGPLSYASAYTSANVTGKYGVVHCDDLLYLFRSPLLFPDFQRNSTEA.... The pIC50 is 9.2.
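
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is only an illustrative conversion, not part of the dataset or its tooling; the function names are hypothetical, and the nanomolar helper assumes that raw IC50 values are reported in nM, which the record above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 (assumes the raw value is in nM)."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels taken from examples (1) and (9) in the record above.
    for label in (5.1, 9.2):
        ic50_nm = pic50_to_ic50_molar(label) * 1e9
        print(f"pIC50 {label} corresponds to IC50 of {ic50_nm:.3g} nM")
```

Under this definition a higher pIC50 means a lower IC50: the label 9.2 in example (9) corresponds to an IC50 of roughly 0.63 nM, while the label 5.1 in example (1) corresponds to roughly 7.9 µM.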