Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1ccc(Oc2ccc(OC)c(C(=O)Oc3c(C)c(C)c(C(=O)Oc4c(C)c(C)c(C(=O)O)c(OC)c4C)c(OC)c3C)c2)cc1C(=O)Oc1c(C)c(C)c(C(=O)Oc2c(C)c(C)c(C(=O)O)c(OC)c2C)c(OC)c1C. The target protein (P14423) has sequence MKVLLLLAVVIMAFGSIQVQGSLLEFGQMILFKTGKRADVSYGFYGCHCGVGGRGSPKDATDWCCVTHDCCYNRLEKRGCGTKFLTYKFSYRGGQISCSTNQDSCRKQLCQCDKAAAECFARNKKSYSLKYQFYPNKFCKGKTPSC. The pIC50 is 4.4. (2) The drug is CC(/C(=N\OCCCCC(=O)O)c1ccccc1)n1ccnc1. The target protein (P21731) has sequence MWPNGSSLGPCFRPTNITLEERRLIASPWFAASFCVVGLASNLLALSVLAGARQGGSHTRSSFLTFLCGLVLTDFLGLLVTGTIVVSQHAALFEWHAVDPGCRLCRFMGVVMIFFGLSPLLLGAAMASERYLGITRPFSRPAVASQRRAWATVGLVWAAALALGLLPLLGVGRYTVQYPGSWCFLTLGAESGDVAFGLLFSMLGGLSVGLSFLLNTVSVATLCHVYHGQEAAQQRPRDSEVEMMAQLLGIMVVASVCWLPLLVFIAQTVLRNPPAMSPAGQLSRTTEKELLIYLRVATWNQILDPWVYILFRRAVLRRLQPRLSTRPRSLSLQPQLTQRSGLQ. The pIC50 is 3.5. (3) The compound is NCc1c(OCCCO)cccc1OCCCO. The target protein (Q9TRC7) has sequence MGRGTLALGWAGAALLLLQMLAAAERSPRTPGGKAGVFADLSAQELKAVHSFLWSQKELKLEPSGTLTMAKNSVFLIEMLLPKKQHVLKFLDKGHRRPVREARAVIFFGAQEQPNVTEFAVGPLPTPRYMRDLPPRPGHQVSWASRPISKAEYALLSHKLQEATQPLRQFFRRTTGSSFGDCHEQCLTFTDVAPRGLASGQRRTWFILQRQMPGYFLHPTGLELLVDHGSTNAQDWTVEQVWYNGKFYRSPEELAQKYNDGEVDVVILEDPLAKGKDGESLPEPALFSFYQPRGDFAVTMHGPHVVQPQGPRYSLEGNRVMYGGWSFAFRLRSSSGLQILDVHFGGERIAYEVSVQEAVALYGGHTPAGMQTKYIDVGWGLGSVTHELAPDIDCPETATFLDALHHYDADGPVLYPRALCLFEMPTGVPLRRHFNSNFSGGFNFYAGLKGQVLVLRTTSTVYNYDYIWDFIFYPNGVMEAKMHATGYVHATFYTPEGLRY.... The pIC50 is 3.0. (4) The drug is COc1ccc(CC2(C)SC(=O)C(C)C2=O)cc1. 
The target protein (P0A953) has sequence MKRAVITGLGIVSSIGNNQQEVLASLREGRSGITFSQELKDSGMRSHVWGNVKLDTTGLIDRKVVRFMSDASIYAFLSMEQAIADAGLSPEAYQNNPRVGLIAGSGGGSPRFQVFGADAMRGPRGLKAVGPYVVTKAMASGVSACLATPFKIHGVNYSISSACATSAHCIGNAVEQIQLGKQDIVFAGGGEELCWEMACEFDAMGALSTKYNDTPEKASRTYDAHRDGFVIAGGGGMVVVEELEHALARGAHIYAEIVGYGATSDGADMVAPSGEGAVRCMKMAMHGVDTPIDYLNSHGTSTPVGDVKELAAIREVFGDKSPAISATKAMTGHSLGAAGVQEAIYSLLMLEHGFIAPSINIEELDEQAAGLNIVTETTDRELTTVMSNSFGFGGTNATLVMRKLKD. The pIC50 is 3.0. (5) The small molecule is NC(=O)c1cc(-c2ccc(Cl)c(Cl)c2)cc2c1[nH]c1ccc(C(=O)N3CCOCC3)cc12. The target protein (O60674) has sequence MGMACLTMTEMEGTSTSSIYQNGDISGNANSMKQIDPVLQVYLYHSLGKSEADYLTFPSGEYVAEEICIAASKACGITPVYHNMFALMSETERIWYPPNHVFHIDESTRHNVLYRIRFYFPRWYCSGSNRAYRHGISRGAEAPLLDDFVMSYLFAQWRHDFVHGWIKVPVTHETQEECLGMAVLDMMRIAKENDQTPLAIYNSISYKTFLPKCIRAKIQDYHILTRKRIRYRFRRFIQQFSQCKATARNLKLKYLINLETLQSAFYTEKFEVKEPGSGPSGEEIFATIIITGNGGIQWSRGKHKESETLTEQDLQLYCDFPNIIDVSIKQANQEGSNESRVVTIHKQDGKNLEIELSSLREALSFVSLIDGYYRLTADAHHYLCKEVAPPAVLENIQSNCHGPISMDFAISKLKKAGNQTGLYVLRCSPKDFNKYFLTFAVERENVIEYKHCLITKNENEEYNLSGTKKNFSSLKDLLNCYQMETVRSDNIIFQFTKCCP.... The pIC50 is 6.7.
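The pIC50 definition stated in the header (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50` and the unit table `_UNIT_TO_MOLAR` are hypothetical, and the sketch assumes IC50 values arrive in one of the listed concentration units.

```python
import math

# Hypothetical lookup: factors converting common concentration units to molar (M).
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50 * _UNIT_TO_MOLAR[unit])


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the transform: return the IC50 in `unit` for a given pIC50."""
    return 10.0 ** (-pic50) / _UNIT_TO_MOLAR[unit]


if __name__ == "__main__":
    # Labels taken from examples (4) and (5) in the record above.
    for label in (3.0, 6.7):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50(label, 'nM'):.3g} nM")
```

Under this definition, a label of 6.7 as in example (5) corresponds to an IC50 of roughly 200 nM, while a label of 3.0 as in examples (3) and (4) corresponds to 1 mM. Each unit increase in pIC50 is a tenfold gain in potency.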