Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is OC(CN(Cc1cccc(OCC(F)(F)F)c1)c1cccc(Oc2ccccc2)c1)C(F)(F)F. The target protein (P11597) has sequence MLAATVLTLALLGNAHACSKGTSHEAGIVCRITKPALLVLNHETAKVIQTAFQRASYPDITGEKAMMLLGQVKYGLHNIQISHLSIASSQVELVEAKSIDVSIQNVSVVFKGTLKYGYTTAWWLGIDQSIDFEIDSAIDLQINTQLTCDSGRVRTDAPDCYLSFHKLLLHLQGEREPGWIKQLFTNFISFTLKLVLKGQICKEINVISNIMADFVQTRAASILSDGDIGVDISLTGDPVITASYLESHHKGHFIYKNVSEDLPLPTFSPTLLGDSRMLYFWFSERVFHSLAKVAFQDGRLMLSLMGDEFKAVLETWGFNTNQEIFQEVVGGFPSQAQVTVHCLKMPKISCQNKGVVVNSSVMVKFLFPRPDQQHSVAYTFEEDIVTTVQASYSKKKLFLSLLDFQITPKTVSNLTESSSESVQSFLQSMITAVGIPEVMSRLEVVFTALMNSKGVSLFDIINPEIITRDGFLLLQMDFGFPEHLLVDFLQSLS. The pIC50 is 4.2. (2) The small molecule is COc1ccccc1-c1ccc(C[C@H](CC(=O)O)NC(=O)CCC(=O)O)cc1. The target protein (Q495T6) has sequence MGKSEGPVGMVESAGRAGQKRPGFLEGGLLLLLLLVTAALVALGVLYADRRGKQLPRLASRLCFLQEERTFVKRKPRGIPEAQEVSEVCTTPGCVIAAARILQNMDPTTEPCDDFYQFACGGWLRRHVIPETNSRYSIFDVLRDELEVILKAVLENSTAKDRPAVEKARTLYRSCMNQSVIEKRGSQPLLDILEVVGGWPVAMDRWNETVGLEWELERQLALMNSQFNRRVLIDLFIWNDDQNSSRHIIYIDQPTLGMPSREYYFNGGSNRKVREAYLQFMVSVATLLREDANLPRDSCLVQEDMVQVLELETQLAKATVPQEERHDVIALYHRMGLEELQSQFGLKGFNWTLFIQTVLSSVKIKLLPDEEVVVYGIPYLQNLENIIDTYSARTIQNYLVWRLVLDRIGSLSQRFKDTRVNYRKALFGTMVEEVRWRECVGYVNSNMENAVGSLYVREAFPGDSKSMVRELIDKVRTVFVETLDELGWMDEESKKKAQEK.... The pIC50 is 8.4. (3) The compound is C[C@@H](N)[C@H]1CC[C@H](C(=O)Nc2ccncc2)CC1. 
The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTIGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIKVADFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 5.2. (4) The compound is CCn1c(=O)c(-c2ccc(-c3cncs3)cc2Cl)cc2cnc(Nc3ccc(N4CCN(C)CC4)cc3)nc21. The target protein sequence is SDEEILEKLRSIVSVGDPKKKYTRFEKIGQGASGTVYTAMDVATGQEVAIRQMNLQQQPKKELIINEILVMRENKNPNIVNYLDSYLVGDELWVVMEYLAGGSLTDVVTETCMDEGQIAAVCRECLQALEFLHSNQVIHRDIKSDNILLGMDGSVKLTDFGFCAQITPEQSKRSTMVGTPYWMAPEVVTRKAYGPKVDIWSLGIMAIEMIEGEPPYLNENPLRALYLIATNGTPELQNPEKLSAIFRDFLNRCLEMDVEKRGSAKELLQHQFLKIAKPLSSLTPLIAAAKEATKNNH. The pIC50 is 8.1. (5) The small molecule is CCC(=O)N1CCCCC1C(C)(O)[C@H]1CCC2C3CCC4C[C@@H](O[Si](C)(C)C(C)(C)C)CC[C@]4(C)C3CC[C@@]21C. The target protein (Q9LM02) has sequence MDLASNLGGKIDKSDVLTAVEKYEQYHVFHGGNEEERKANYTDMVNKYYDLATSFYEYGWGESFHFAQRWKGESLRESIKRHEHFLALQLGIQPGQKVLDVGCGIGGPLREIARFSNSVVTGLNNNEYQITRGKELNRLAGVDKTCNFVKADFMKMPFPENSFDAVYAIEATCHAPDAYGCYKEIYRVLKPGQCFAAYEWCMTDAFDPDNAEHQKIKGEIEIGDGLPDIRLTTKCLEALKQAGFEVIWEKDLAKDSPVPWYLPLDKNHFSLSSFRLTAVGRFITKNMVKILEYIRLAPQGSQRVSNFLEQAAEGLVDGGRREIFTPMYFFLARKPE. The pIC50 is 4.0.
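
The header defines pIC50 = -log10(IC50 in M). The following is a minimal sketch of that conversion in both directions, assuming IC50 values may arrive in nanomolar and must first be converted to molar. The function names are hypothetical and are not part of BindingDB or of this dataset.

```python
import math


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Convert a concentration from nanomolar (nM) to molar (M)."""
    return ic50_nm * 1e-9


def ic50_to_pic50(ic50_molar: float) -> float:
    """Apply pIC50 = -log10(IC50 in M); higher means more potent."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)
```

Under this definition, the pIC50 of 4.2 in example (1) corresponds to an IC50 of roughly 63 µM, and the pIC50 of 8.4 in example (2) to roughly 4 nM.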