Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc(OC2CCN(c3nc4c(nc3NC3CCC3)CN(C(=O)N(C)C)CC4)CC2)c(F)c1. The target protein (P46095) has sequence MNASAASLNDSQVVVVAAEGAAAAATAAGGPDTGEWGPPAAAALGAGGGANGSLELSSQLSAGPPGLLLPAVNPWDVLLCVSGTVIAGENALVVALIASTPALRTPMFVLVGSLATADLLAGCGLILHFVFQYLVPSETVSLLTVGFLVASFAASVSSLLAITVDRYLSLYNALTYYSRRTLLGVHLLLAATWTVSLGLGLLPVLGWNCLAERAACSVVRPLARSHVALLSAAFFMVFGIMLHLYVRICQVVWRHAHQIALQQHCLAPPHLAATRKGVGTLAVVLGTFGASWLPFAIYCVVGSHEDPAVYTYATLLPATYNSMINPIIYAFRNQEIQRALWLLLCGCFQSKVPFRSRSPSEV. The pIC50 is 9.9. (2) The compound is C=CC(=O)Nc1ccc2ncnc(Nc3cccc(Br)c3)c2c1. The target protein sequence is MRRRHIVRKRTLRRLLQERELVEPLTPSGEAPNQALLRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEKVKIPVAIKELREATSPKANKEILDEAYVMASVDNPHVCRLLGICLTSTVQLITQLMPFGCLLDYVREHKDNIGSQYLLNWCVQIAKGMNYLEDRRLVHRDLAARNVLVKTPQHVKITDFGLAKLLGAEEKEYHAEGGKVPIKWMALESILHRIYTHQSDVWSYGVTVWELMTFGSKPYDGIPASEISSILEKGERLPQPPICTIDVYMIMVKCWMIDADSRPKFRELIIEFSKMARDPQRYLVIQGDERMHLPSPTDSNFYRALMDEEDMDDVVDADEYLIPQQGFFSSPSTSRTPLLSSLSATSNNSTVACIDRNGLQSCPIKEDSFLQRYSSDPTGALTEDSIDDTFLPVPEYINQSVPKRPAGSVQNPVYHNQPLNPAPSRDPHYQDPHSTAVGNPEYLNTVQPTCVNSTFDSPAHWAQKGSHQISL.... The pIC50 is 9.7. (3) The drug is CC1=NN(c2cccc(C(=O)O)c2)C(=O)/C1=C\c1ccc(-c2cc(Cl)ccc2C(=O)O)o1. 
The target protein (P0C6X7) has sequence MESLVLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEAREHLKNGTCGLVELEKGVLPQLEQPYVFIKRSDALSTNHGHKVVELVAEMDGIQYGRSGITLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELGTDPIEDYEQNWNTKHGSGALRELTRELNGGAVTRYVDNNFCGPDGYPLDCIKDFLARAGKSMCTLSEQLDYIESKRGVYCCRDHEHEIAWFTERSDKSYEHQTPFEIKSAKKFDTFKGECPKFVFPLNSKVKVIQPRVEKKKTEGFMGRIRSVYPVASPQECNNMHLSTLMKCNHCDEVSWQTCDFLKATCEHCGTENLVIEGPTTCGYLPTNAVVKMPCPACQDPEIGPEHSVADYHNHSNIETRLRKGGRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDLLEILSRERVNINIVGDFHLNEEVAIILASFSASTSAFIDTIKSLDYKSFK.... The pIC50 is 4.3. (4) The drug is CCN(CC)CCCC(C)Nc1ccnc2cc(Cl)ccc12. The target protein (P9WNV1) has sequence MSSPDADQTAPEVLRQWQALAEEVREHQFRYYVRDAPIISDAEFDELLRRLEALEEQHPELRTPDSPTQLVGGAGFATDFEPVDHLERMLSLDNAFTADELAAWAGRIHAEVGDAAHYLCELKIDGVALSLVYREGRLTRASTRGDGRTGEDVTLNARTIADVPERLTPGDDYPVPEVLEVRGEVFFRLDDFQALNASLVEEGKAPFANPRNSAAGSLRQKDPAVTARRRLRMICHGLGHVEGFRPATLHQAYLALRAWGLPVSEHTTLATDLAGVRERIDYWGEHRHEVDHEIDGVVVKVDEVALQRRLGSTSRAPRWAIAYKYPPEEAQTKLLDIRVNVGRTGRITPFAFMTPVKVAGSTVGQATLHNASEIKRKGVLIGDTVVIRKAGDVIPEVLGPVVELRDGSEREFIMPTTCPECGSPLAPEKEGDADIRCPNARGCPGQLRERVFHVASRNGLDIEVLGYEAGVALLQAKVIADEGELFALTERDLLRTDLFR.... The pIC50 is 4.3. (5) The drug is NC1=N[C@@]2(c3cccc(NC(=O)c4ccc(Cl)cn4)c3)C[C@H]2CCS1. The target protein (Q6IE75) has sequence MGALLRALLLPLLAQWLLRAVPVLAPAPFTLPLQVAGAANHRASTVPGLGTPELPRADGLALALEPARATANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVRILVDTGSSNFAVAGAPHSYIDTYFDSESSSTYHSKGFEVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATIFESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISIYLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYVVFDRAQRRVGFAVSPCAEIAGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCGAILLVLILLLLFPLHCRHAPRDPE.... The pIC50 is 8.7. (6) The small molecule is COc1cc2c(c(O)c1OC)C(=O)C[C@@H](c1ccc(O)cc1)O2. 
The target protein (P59538) has sequence MTTFIPIIFSSVVVVLFVIGNFANGFIALVNSIERVKRQKISFADQILTALAVSRVGLLWVLLLNWYSTVFNPAFYSVEVRTTAYNVWAVTGHFSNWLATSLSIFYLLKIANFSNLIFLHLKRRVKSVILVMLLGPLLFLACQLFVINMKEIVRTKEYEGNLTWKIKLRSAVYLSDATVTTLGNLVPFTLTLLCFLLLICSLCKHLKKMQLHGKGSQDPSTKVHIKALQTVIFFLLLCAVYFLSIMISVWSFGSLENKPVFMFCKAIRFSYPSIHPFILIWGNKKLKQTFLSVLRQVRYWVKGEKPSSP. The pIC50 is 5.6. (7) The compound is C[C@@H]1CO[C@@](CCc2ccc(Cl)cc2)(Cn2ccnc2)O1. The pIC50 is 5.8. The target protein (P09601) has sequence MERPQPDSMPQDLSEALKEATKEVHTQAENAEFMRNFQKGQVTRDGFKLVMASLYHIYVALEEEIERNKESPVFAPVYFPEELHRKAALEQDLAFWYGPRWQEVIPYTPAMQRYVKRLHEVGRTEPELLVAHAYTRYLGDLSGGQVLKKIAQKALDLPSSGEGLAFFTFPNIASATKFKQLYRSRMNSLEMTPAVRQRVIEEAKTAFLLNIQLFEELQELLTHDTKDQSPSRAPGLRQRASNKVQDSAPVETPRGKPPLNTRSQAPLLRWVLTLSFLVATVAVGLYAM.
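
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the sketch assumes the IC50 has already been converted to mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A lower IC50 (more potent compound) gives a higher pIC50.
    for ic50 in (1e-9, 1e-6, 5e-5):
        print(f"IC50 = {ic50:.1e} M -> pIC50 = {ic50_to_pic50(ic50):.2f}")
```

Read this way, each unit of pIC50 corresponds to a tenfold change in IC50, so a label of 9.9 indicates a sub-nanomolar IC50 while a label of 4.3 indicates an IC50 in the tens of micromolar.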