Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CCCCCCCC/C=C\CCCCCCCC(=O)Oc1cccc2c1C(=O)C=CC2=O. The target protein (P06526) has sequence MDPLCTASSGPRKKRPRQVGASMASPPHDIKFQNLVLFILEKKMGTTRRNFLMELARRKGFRVENELSDSVTHIVAENNSGSEVLEWLQVQNIRASSQLELLDVSWLIESMGAGKPVEITGKHQLVVRTDYSATPNPGFQKTPPLAVKKISQYACQRKTTLNNYNHIFTDAFEILAENSEFKENEVSYVTFMRAASVLKSLPFTIISMKDTEGIPCLGDKVKCIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRSLSKIMSDKTLKFTKMQKAGFLYYEDLVSCVTRAEAEAVGVLVKEAVWAFLPDAFVTMTGGFRRGKKIGHDVDFLITSPGSAEDEEQLLPKVINLWEKKGLLLYYDLVESTFEKFKLPSRQVDTLDHFQKCFLILKLHHQRVDSSKSNQQEGKTWKAIRVDLVMCPYENRAFALLGWTGSRQFERDIRRYATHERKMMLDNHALYDKTKRVFLKAESEEEIFAHLGLD.... The pIC50 is 4.9. (2) The small molecule is CNC(=O)C[C@H]1COc2cc(F)ccc2N1C(=O)c1ccc2c(c1)NC(=O)CO2. The target protein sequence is TISRALTPSPVMVLENIEPEIVYAGYDSSKPDTAENLLSTLNRLAGKQMIQVVKWAKVLPGFKNLPLEDQITLIQYSWMCLSSFALSWRSYKHTNSQFLYFAPDLVFNEEKMHQSAMYELCQGMHQISLQFVRLQLTFEEYTIMKVLLLLSTIPKDGLKSQAAFEEMRTNYIKELRKMVTKCPNNSGQSWQRFYQLTKLLDSMHDLVSDLLEFCFYTFRESHALKVEFPAMLVEIISDQLPKVESGNAKPLYFHRK. The pIC50 is 6.6. (3) The drug is Clc1ccc(CC[C@@]2(Cn3ccnc3)OC[C@@H](CSC3CCCCC3)O2)cc1. The target protein sequence is MERPQLDSMSQDLSEALKEATKEVHIRAENSEFMRNFQKGQVSREGFKLVMASLYHIYTALEEEIERNKQNPVYAPLYFPEELHRRAALEQDMAFWYGPHWQEAIPYTPATQHYVKRLHEVGGTHPELLVAHAYTRYLGDLSGGQVLKKIAQKAMALPSSGEGLAFFTFPSIDNPTKFKQLYRARMNTLEMTPEVKHRVTEEAKTAFLLNIELFEELQALLTEEHKDQSPSQTEFLRQRPASLVQDTTSAETPRGKSQISTSSSQTPLLRWVLTLSFLLATVAVGIYAM. The pIC50 is 6.0. (4) The drug is O=C(Nc1ccccc1C(=O)O)c1ccc(N2CCSCC2)c(Oc2ccccc2)c1. 
The target protein (Q820T1) has sequence MKNYARISCTSRYVPENCVTNHQLSEMMDTSDEWIHSRTGISERRIVTQENTSDLCHQVAKQLLEKSGKQASEIDFILVATVTPDFNMPSVACQVQGAIGATEAFAFDISAACSGFVYALSMAEKLVLSGRYQTGLVIGGETFSKMLDWTDRSTAVLFGDGAAGVLIEAAETPHFLNEKLQADGQRWAALTSGYTINESPFYQGHKQASKTLQMEGRSIFDFAIKDVSQNILSLVTDETVDYLLLHQANVRIIDKIARKTKISREKFLTNMDKYGNTSAASIPILLDEAVENGTLILGSQQRVVLTGFGGGLTWGSLLLTL. The pIC50 is 5.9. (5) The drug is O=C(/C=C/c1cccc([N+](=O)[O-])c1)n1cnc2ccccc21. The target protein (P17861) has sequence MVVVAAAPNPADGTPKVLLLSGQPASAAGAPAGQALPLMVPAQRGASPEAASGGLPQARKRQRLTHLSPEEKALRRKLKNRVAAQTARDRKKARMSELEQQVVDLEEENQKLLLENQLLREKTHGLVVENQELRQRLGMDALVAEEEAEAKGNEVRPVAGSAESAALRLRAPLQQVQAQLSPLQNISPWILAVLTLQIQSLISCWAFWTTWTQSCSSNALPQSLPAWRSSQRSTQKDPVPYQPPFLCQWGRHQPSWKPLMN. The pIC50 is 5.3. (6) The drug is O=C(O)c1sc2c(Cl)cccc2c1O. The target protein (O62664) has sequence MSRQGISLRFPLLLLLLSPSPVLPADPGAPAPVNPCCYYPCQHQGICVRFGLDRYQCDCTRTGYYGPNCTIPEIWTWLRTTLRPSPSFVHFLLTHGRWLWDFVNATFIRDKLMRLVLTVRSNLIPSPPTYNVAHDYISWESFSNVSYYTRILPSVPRDCPTPMGTKGKKQLPDAEFLSRRFLLRRKFIPDPQGTNLMFAFFAQHFTHQFFKTSGKMGPGFTKALGHGVDLGHIYGDNLERQYQLRLFKDGKLKYQMLNGEVYPPSVEEAPVLMHYPRGIPPQSQMAVGQEVFGLLPGLMVYATIWLREHNRVCDLLKAEHPTWGDEQLFQTARLILIGETIKIVIEEYVQQLSGYFLQLKFDPELLFGAQFQYRNRIAMEFNQLYHWHPLMPDSFRVGPQDYSYEQFLFNTSMLVDYGVEALVDAFSRQPAGRIGGGRNIDHHILHVAVDVIKESRELRLQPFNEYRKRFGMKPYTSFQELTGEKEMAAELEELYGDIDA.... The pIC50 is 5.6. (7) The drug is CC(C)c1cc(Oc2c(Br)cc(NC(=O)C(=O)O)cc2Br)ccc1O. The target protein (P18113) has sequence MTPNSMTENRLPAWDKQKPHPDRGQDWKLVGMSEACLHRKSHVERRGALKNEQTSSHLIQATWASSIFHLDPDDVNDQSVSSAQTFQTEEKKCKGYIPSYLDKDELCVVCGDKATGYHYRCITCEGCKGFFRRTIQKSLHPSYSCKYEGKCIIDKVTRNQCQECRFKKCIYVGMATDLVLDDSKRLAKRKLIEENREKRRREELQKSIGHKPEPTDEEWELIKTVTEAHVATNAQGSHWKQKRKFLPEDIGQAPIVNAPEGGQVDLEAFSHFTKIITPAITRVVDFAKKLPMFCELPCEDQIILLKGCCMEIMSLRAAVRYDPDSETLTLNGEMAVTRGQLKNGGLGVVSDAIFDLGMSLSSFNLDDTEVALLQAVLLMSSDRPGLACVERIEKYQDSFLLAFEHYINYRKHHVTHFWPKLLMKVTDLRMIGACHASRFLHMKVECPTELFPPLFLEVFED. The pIC50 is 5.4. 
(8) The drug is CCN(CC)CCCCc1ccc(OC2Cc3cc(OC)c(OC)cc3C2=O)cc1. The target protein sequence is MVTEIHFLLWILLLCMLFGKSHTEEDVIITTKTGRVRGLSMPILGGTVTAFLGIPYAQPPLGSLRFKKPQPLNKWPDVYNATKYANSCYQNIDQAFPGFQGSEMWNPNTNLSEDCLYLNVWIPVPKPKNATVMVWVYGGGFQTGTSSLPVYDGKFLTRVERVIVVSMNYRVGALGFLAFPGNSEAPGNMGLFDQQLALQWIQRNIAAFGGNPKSVTLFGESAGAASVSLHLLCPQSYPLFTRAILESGSSNAPWAVKHPEEARNRTLTLAKFIGCSKENEKEIITCLRSKDPQEILLNEKLVLPSDSIRSINFGPTVDGDFLTDMPHTLLQLGKVKTAQILVGVNKDEGTAFLVYGAPGFSKDNDSLITRREFQEGLNMYFPGVSSLGKEAILFYYVDWLGDQTPEVYREAFDDIIGDYNIICPALEFTKKFAELEINAFFYYFEHRSSKLPWPEWMGVMHGYEIEFVFGLPLERRVNYTRAEEIFSRSIMKTWANFAKY.... The pIC50 is 5.5. (9) The small molecule is CCN(CC)C(C)c1ccc(-c2c(O)ccc3[nH]c(=O)c4sccc4c23)cc1F. The target protein (Q96KB5) has sequence MEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFMQKLGFGTGVNVYLMKRSPRGLSHSPWAVKKINPICNDHYRSVYQKRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSLNDLIEERYKASQDPFPAAIILKVALNMARGLKYLHQEKKLLHGDIKSSNVVIKGDFETIKICDVGVSLPLDENMTVTDPEACYIGTEPWKPKEAVEENGVITDKADIFAFGLTLWEMMTLSIPHINLSNDDDDEDKTFDESDFDDEAYYAALGTRPPINMEELDESYQKVIELFSVCTNEDPKDRPSAAHIVEALETDV. The pIC50 is 8.4. (10) The small molecule is C=C(C)[C@H]1Cc2c(ccc3c2O[C@@H]2COc4cc(OC)c(OC)cc4[C@@H]2[C@]32CC3O[C@]3(O)[C@@H](O)O2)O1. The target protein (P00591) has sequence SEVCFPRLGCFSDDAPWAGIVQRPLKILPWSPKDVDTRFLLYTNQNQNNYQELVADPSTITNSNFRMDRKTRFIIHGFIDKGEEDWLSNICKNLFKVESVNCICVDWKGGSRTGYTQASQNIRIVGAEVAYFVEVLKSSLGYSPSNVHVIGHSLGSHAAGEAGRRTNGTIERITGLDPAEPCFQGTPELVRLDPSDAKFVDVIHTDAAPIIPNLGFGMSQTVGHLDFFPNGGKQMPGCQKNILSQIVDIDGIWEGTRDFVACNHLRSYKYYADSILNPDGFAGFPCDSYNVFTANKCFPCPSEGCPQMGHYADRFPGKTNGVSQVFYLNTGDASNFARWRYKVSVTLSGKKVTGHILVSLFGNEGNSRQYEIYKGTLQPDNTHSDEFDSDVEVGDLQKVKFIWYNNNVINPTLPRVGASKITVERNDGKVYDFCSQETVREEVLLTLNPC. The pIC50 is 4.5.
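
Note on units: the header defines pIC50 = -log10(IC50 in M). The following is a minimal sketch of that conversion in both directions. The function names ic50_to_pic50 and pic50_to_ic50_nm are hypothetical helpers introduced here for illustration and are not part of the dataset. The sketch assumes IC50 values are given in molar units, as the header states.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50 in M).

    Hypothetical helper, not part of the dataset.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 (M) = 10 ** (-pIC50); multiplying by 1e9 converts M to nM.
    """
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 values taken from examples (9) and (10) above.
    for label, pic50 in (("(9)", 8.4), ("(10)", 4.5)):
        print(f"{label} pIC50 {pic50} -> IC50 of about {pic50_to_ic50_nm(pic50):.3g} nM")
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50, so the pIC50 of 8.4 in example (9) is a low-nanomolar IC50, while the 4.5 in example (10) is in the tens-of-micromolar range.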