From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1cc(C(=O)N2CO[C@](CCN3CCC4(CC3)NC(=O)Cc3ccccc34)(c3ccc(Cl)cc3)C2)cc(OC)c1OC. The target protein (Q64077) has sequence MGACVIVTNTNISSGLESNTTGITAFSMPTWQLALWATAYLALVLVAVTGNATVTWIILAHQRMRTVTNYFIVNLALADLCMAAFNAAFNFVYASHNIWYFGRAFCYFQNLFPITAMFVSIYSMTAIAIDRYMAIVHPFQPRLSAPSTKAVIGGIWLVALALAFPQCFYSTITEDEGATKCVVAWPEDSRDKSLLLYHLVVIVLIYLLPLTVMFVAYSIIGLTLWRRAVPRHQAHGANLRHLQAKKKFVKTMVLVVVTFAICWLPYHLYFILGSFQEDIYCHKFIQQVYLALFWLAMSSTMYNPIIYCCLNRRFRSGFRLAFRCCPWVTPTEEDKLELTHTPSFSLRVNRCHTKEILFMAGDTVPSEATNGQAGGPQDRESVELSSLPGCRAGPSILAKASS. The pIC50 is 8.5. (2) The small molecule is Cc1nn(C)c(C)c1N[S+](=O)([O-])c1ccc(-c2ccnc(N3CCNCC3)c2)cc1. The target protein (Q9UVX3) has sequence MSDSKDRKGKAPEGQSSEKKDGAVNITPQMAESLLENNPALRNETAGMDKDKAAEAMRKMNIAELLTGLSVSGKNQKDMASYKFWQTQPVPRFDETSTDTGGPIKIIDPEKVSKEPDALLEGFEWATLDLTNETELQELWDLLTYHYVEDDNAMFRFRYSQSFLHWALMSPGWKKEWHVGVRATKSRKLVASICGVPTEINVRNQKLKVVEINFLCIHKKLRSKRLTPVLIKEITRRCYLNGIYQAIYTAGVVLPTPVSSCRYYHRPLDWLKLYEVGFSPLPAGSTKARQITKNHLPSTTSTPGLRPMEPKDIDTVHDLLQRYLSRFALNQAFTREEVDHWLVHKPETVKEQVVWAYVVEDPETHKITDFFSFYNLESTVIQNPKHDNVRAAYLYYYATETAFTNNMKALKERLLMLMNDALILAKKAHFDVFNALTLHDNPLFLEQLKFGAGDGQLHFYLYNYRTAPVPGGVNEKNLPDEKRMGGVGIVML. The pIC50 is 8.0. (3) The drug is O=C(Nc1cccc(C(F)(F)F)c1)Nc1c(Oc2ccc(Cl)c(Cl)c2)ccc(Cl)c1S(=O)(=O)O. 
The target protein (P14677) has sequence MKWTKRVIRYATKNRKSPAENRRRVGKSLSLLSVFVFAIFLVNFAVIIGTGTRFGTDLAKEAKKVHQTTRTVPAKRGTIYDRNGVPIAEDATSYNVYAVIDENYKSATGKILYVEKTQFNKVAEVFHKYLDMEESYVREQLSQPNLKQVSFGAKGNGITYANMMSIKKELEAAEVKGIDFTTSPNRSYPNGQFASSFIGLAQLHENEDGSKSLLGTSGMESSLNSILAGTDGIITYEKDRLGNIVPGTEQVSQRTMDGKDVYTTISSPLQSFMETQMDAFQEKVKGKYMTATLVSAKTGEILATTQRPTFDADTKEGITEDFVWRDILYQSNYEPGSTMKVMMLAAAIDNNTFPGGEVFNSSELKIADATIRDWDVNEGLTGGRTMTFSQGFAHSSNVGMTLLEQKMGDATWLDYLNRFKFGVPTRFGLTDEYAGQLPADNIVNIAQSSFGQGISVTQTQMIRAFTAIANDGVMLEPKFISAIYDPNDQTARKSQKEIVG.... The pIC50 is 4.1. (4) The compound is N#CCC[C@@H]1OCC[C@@]2(S(=O)(=O)c3ccc(Cl)cc3)c3c(F)ccc(F)c3OC[C@@H]12. The target protein (P49810) has sequence MLTFMASDSEEEVCDERTSLMSAESPTPRSCQEGRQGPEDGENTAQWRSQENEEDGEEDPDRYVCSGVPGRPPGLEEELTLKYGAKHVIMLFVPVTLCMIVVVATIKSVRFYTEKNGQLIYTPFTEDTPSVGQRLLNSVLNTLIMISVIVVMTIFLVVLYKYRCYKFIHGWLIMSSLMLLFLFTYIYLGEVLKTYNVAMDYPTLLLTVWNFGAVGMVCIHWKGPLVLQQAYLIMISALMALVFIKYLPEWSAWVILGAISVYDLVAVLCPKGPLRMLVETAQERNEPIFPALIYSSAMVWTVGMAKLDPSSQGALQLPYDPEMEEDSYDSFGEPSYPEVFEPPLTGYPGEELEEEEERGVKLGLGDFIFYSVLVGKAAATGSGDWNTTLACFVAILIGLCLTLLLLAVFKKALPALPISITFGLIFYFSTDNLVRPFMDTLASHQLYI. The pIC50 is 7.8.
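
The pIC50 definition above (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the nanomolar convenience conversion is an assumption added for readability.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (assumed unit).

    IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10**(9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 labels taken from the four examples above.
    for label in (8.5, 8.0, 4.1, 7.8):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, so the label 8.0 in example (2) corresponds to an IC50 of 10 nM.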