Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them as pIC50 (pIC50 = -log10(IC50 in M); higher means more potent).
Dataset: bindingdb_ic50 (drug-target binding data from BindingDB, using IC50 measurements).
(1) The drug is CC(Oc1ccc(Cl)cc1)c1nnc(Nc2ccc(Cl)cc2)o1. The target protein sequence is MKLSPREVEKLGLHNAGYLAQKRLARGVRLNYTEAVALIASQIMEYARDGEKTVAQLMCLGQHLLGRRQVLPAVPHLLNAVQVEATFPDGTKLVTVHDPISRENGELQEALFGSLLPVPSLDKFAETKEDNRIPGEILCEDECLTLNIGRKAVILKVTSKGDRPIQVGSHYHFIEVNPYLTFDRRKAYGMRLNIAAGTAVRFEPGDCKSVTLVSIEGNKVIRGGNAIADGPVNETNLEAAMHAVRSKGFGHEEEKDASEGFTKEDPNCPFNTFIHRKEYANKYGPTTGDKIRLGDTNLLAEIEKDYALYGDECVFGGGKVIRDGMGQSCGHPPAISLDTVITNAVIIDYTGIIKADIGIKDGLIASIGKAGNPDIMNGVFSNMIIGANTEVIAGEGLIVTAGAIDCHVHYICPQLVYEAISSGITTLVGGGTGPAAGTRATTCTPSPTQMRLMLQSTDYLPLNFGFTGKGSSSKPDELHEIIKAGAMGLKLHEDWGSTPA.... The pIC50 is 4.5.
(2) The drug is c1ccc(-c2c[nH]c(SSC3CCCCC3)n2)cc1. The target protein (P10599) has sequence MVKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEKYSNVIFLEVDVDDCQDVASECEVKCMPTFQFFKKGQKVGEFSGANKEKLEATINELV. The pIC50 is 6.3.
(3) The compound is CC1(C)C(=O)CC[C@@]2(C)[C@H]1CC[C@]1(C)[C@@H]2C(=O)C=C2[C@@H]3C[C@@](C)(C(=O)NCCn4c(CCC(=O)O)cc5ccccc54)CC[C@]3(C)CC[C@]21C. The target protein (Q13526) has sequence MADEEKLPPGWEKRMSRSSGRVYYFNHITNASQWERPSGNSSSGGKNGQGEPARVRCSHLLVKHSQSRRPSSWRQEKITRTKEEALELINGYIQKIKSGEEDFESLASQFSDCSSAKARGDLGAFSRGQMQKPFEDASFALRTGEMSGPVFTDSGIHIILRTE. The pIC50 is 5.0.
(4) The drug is O=C(Nc1ccc(Cl)cn1)c1ccccc1F. The target protein (P06582) has sequence MSLYHYFRPAQRSVFGDLMRDMALMERQFAPVCRISPSESSEIVNNDQKFAINLNVSQFKPEDLKINLDGRTLSIQGEQELKTDHGYSKKSFSRVILLPEDVDVGAVASNLSEDGKLSIEAPKKEAVQGRSIPIQQAIVEEKSAE. The pIC50 is 4.2.
(5) The drug is C[C@H](CCC(=O)N[C@H](CC(=O)O)Cc1c[nH]c2ccccc12)[C@H]1CC[C@H]2[C@@H]3CC[C@@H]4C[C@H](O)CC[C@]4(C)[C@H]3CC[C@@]21C. The target protein (Q03145) has sequence MELRAVGFCLALLWGCALAAAAAQGKEVVLLDFAAMKGELGWLTHPYGKGWDLMQNIMDDMPIYMYSVCNVVSGDQDNWLRTNWVYREEAERIFIELKFTVRDCNSFPGGASSCKETFNLYYAESDVDYGTNFQKRQFTKIDTIAPDEITVSSDFEARNVKLNVEERMVGPLTRKGFYLAFQDIGACVALLSVRVYYKKCPEMLQSLARFPETIAVAVSDTQPLATVAGTCVDHAVVPYGGEGPLMHCTVDGEWLVPIGQCLCQEGYEKVEDACRACSPGFFKSEASESPCLECPEHTLPSTEGATSCQCEEGYFRAPEDPLSMSCTRPPSAPNYLTAIGMGAKVELRWTAPKDTGGRQDIVYSVTCEQCWPESGECGPCEASVRYSEPPHALTRTSVTVSDLEPHMNYTFAVEARNGVSGLVTSRSFRTASVSINQTEPPKVRLEDRSTTSLSVTWSIPVSQQSRVWKYEVTYRKKGDANSYNVRRTEGFSVTLDDLAP.... The pIC50 is 6.0.
(6) The compound is O=C1c2ccccc2C(=O)N1CCSC(=S)N1CCCC1. The target protein sequence is MLFSRFVLLAFGSVAAVSASSIYARGRGGSSTDQPVANPYNTKEISLAAGLVQQTYCDSTENGLKIGDSELLYTMGEGYARQRVNIYHSPSLGIAVAIEGTNLFSLNSDLHDAKFWQEDPNERYIQYYPKGTKLMHGFQQAYNDLMDDIFTAVKKYKKEKNEKRVTVIGHSLGAAMGLLCAMDIELRMDGGLYKTYLFGLPRLGNPTFASFVDQKIGDKFHSIINGRDWVPTVPPRALGYQHPSDYVWIYPGNSTSAKLYPGQENVHGILTVAREFNFDDHQGIYFHTQIGAVMGECPAQVGAH. The pIC50 is 3.7.
(7) The small molecule is O=C1CCC(=O)N1. The target protein sequence is MPLDAGGQNSTQMVLAPGASIFRCRQCGQTISRRDWLLPMGGDHEHVVFNPAGMIFRVWCFSLAQGLRLIGAPSGEFSWFKGYDWTIALCGQCGSHLGWHYEGGSQPQTFFGLIKDRLAEGPAD. The pIC50 is 5.1.
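
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is only an illustrative sketch: the function names `ic50_to_pic50`, `pic50_to_ic50`, and `ic50_nm_to_pic50` are hypothetical and are not part of the dataset or of any BindingDB tooling. The nanomolar variant is an added convenience that assumes the input IC50 is given in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (1 nM = 1e-9 M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)
```

Under this definition, a label of 6.0 corresponds to an IC50 of 1 µM, and a label of 4.5 corresponds to roughly 31.6 µM, so each unit increase in pIC50 means a tenfold lower IC50.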