Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is C[C@H](N)C(=O)N1CCC[C@H]1C(=O)O. The target protein (P12821) has sequence MGAASGRRGPGLLLPLPLLLLLPPQPALALDPGLQPGNFSADEAGAQLFAQSYNSSAEQVLFQSVAASWAHDTNITAENARRQEEAALLSQEFAEAWGQKAKELYEPIWQNFTDPQLRRIIGAVRTLGSANLPLAKRQQYNALLSNMSRIYSTAKVCLPNKTATCWSLDPDLTNILASSRSYAMLLFAWEGWHNAAGIPLKPLYEDFTALSNEAYKQDGFTDTGAYWRSWYNSPTFEDDLEHLYQQLEPLYLNLHAFVRRALHRRYGDRYINLRGPIPAHLLGDMWAQSWENIYDMVVPFPDKPNLDVTSTMLQQGWNATHMFRVAEEFFTSLELSPMPPEFWEGSMLEKPADGREVVCHASAWDFYNRKDFRIKQCTRVTMDQLSTVHHEMGHIQYYLQYKDLPVSLRRGANPGFHEAIGDVLALSVSTPEHLHKIGLLDRVTNDTESDINYLLKMALEKIAFLPFGYLVDQWRWGVFSGRTPPSRYNFDWWYLRTKYQ.... The pIC50 is 4.0. (2) The drug is O=C(NC[C@@H](O)C(=O)O)c1ccc(CN(C(=O)Nc2cccc(Cl)c2)c2ccc(C3=CCCCC3)cc2)cc1. The target protein (P47871) has sequence MPPCQPQRPLLLLLLLLACQPQVPSAQVMDFLFEKWKLYGDQCHHNLSLLPPPTELVCNRTFDKYSCWPDTPANTTANISCPWYLPWHHKVQHRFVFKRCGPDGQWVRGPRGQPWRDASQCQMDGEEIEVQKEVAKMYSSFQVMYTVGYSLSLGALLLALAILGGLSKLHCTRNAIHANLFASFVLKASSVLVIDGLLRTRYSQKIGDDLSVSTWLSDGAVAGCRVAAVFMQYGIVANYCWLLVEGLYLHNLLGLATLPERSFFSLYLGIGWGAPMLFVVPWAVVKCLFENVQCWTSNDNMGFWWILRFPVFLAILINFFIFVRIVQLLVAKLRARQMHHTDYKFRLAKSTLTLIPLLGVHEVVFAFVTDEHAQGTLRSAKLFFDLFLSSFQGLLVAVLYCFLNKEVQSELRRRWHRWRLGKVLWEERNTSNHRASSSPGHGPPSKELQFGRGGGSQDSSAETPLAGGLPRLAESPF. The pIC50 is 7.8. (3) The small molecule is Cc1nc(CCc2ccccc2)c(C(=O)O)c(C(=O)O)c1O. 
The target protein (P51576) has sequence MARRLQDELSAFFFEYDTPRMVLVRNKKVGVIFRLIQLVVLVYVIGWVFVYEKGYQTSSGLISSVSVKLKGLAVTQLQGLGPQVWDVADYVFPAHGDSSFVVMTNFIMTPQQAQGHCAENPEGGICQDDSGCTPGKAERKAQGIRTGNCVPFNGTVKTCEIFGWCPVEVDDKIPSPALLHEAENFTLFIKNSISFPRFKVNRRNLVEEVNGTYMKKCLYHKILHPLCPVFSLGYVVRESGQDFRSLAEKGGVVGITIDWECDLDWHVRHCKPIYQFHGLYGEKNLSPGFNFRFARHFVQNGTNRRHLFKVFGIRFDILVDGKAGKFDIIPTMTTIGSGIGIFGVATVLCDLLLLHILPKRHYYKQKKFKYAEDMGPGEGERDPAATSSTLGLQENMRTS. The pIC50 is 6.2. (4) The drug is O=c1c(-c2ccc(O)cc2)coc2cc(O)cc(O)c12. The target protein sequence is MRFLIIHIAVIVLPFVLMIDVKRENSFFLRHSPKRLYKKADYNNMYDKIIKKQQNRIYDVSSQINQDNINGQNISFNLTFPNYDTSIDIEDIKKILPHRYPFLLVDKVIYMQPNKTIIGLKQVSTNEPFFNGHFPQKQIMPGVLQIEALAQLAGILCLKSDDSQKNNLFLFAGVDGVRWKKPVLPGDTLTMQANLISFKSSLGIAKLSGVGYVNGKVVINISEMTFALSK. The pIC50 is 5.2. (5) The small molecule is CCCC[C@H](NC(=O)OCC1(CCCc2ccccc2)CCC1)C(=O)C(=O)N[C@H](C)c1ccccc1. The target protein (P09668) has sequence MWATLPLLCAGAWLLGVPVCGAAELCVNSLEKFHFKSWMSKHRKTYSTEEYHHRLQTFASNWRKINAHNNGNHTFKMALNQFSDMSFAEIKHKYLWSEPQNCSATKSNYLRGTGPYPPSVDWRKKGNFVSPVKNQGACGSCWTFSTTGALESAIAIATGKMLSLAEQQLVDCAQDFNNHGCQGGLPSQAFEYILYNKGIMGEDTYPYQGKDGYCKFQPGKAIGFVKDVANITIYDEEAMVEAVALYNPVSFAFEVTQDFMMYRTGIYSSTSCHKTPDKVNHAVLAVGYGEKNGIPYWIVKNSWGPQWGMNGYFLIERGKNMCGLAACASYPIPLV. The pIC50 is 5.7. (6) The compound is COC(=O)c1c(-c2ccccc2)c2cc(Br)ccc2c(=O)n1Cc1ccc(S(C)(=O)=O)cc1. The target protein (P49185) has sequence MSRSKRDNNFYSVEIADSTFTVLKRYQNLKPIGSGAQGIVCAAYDAILERNVAIKKLSRPFQNQTHAKRAYRELVLMKCVNHKNIIGLLNVFTPQKSLEEFQDVYIVMELMDANLCQVIQMELDHERMSYLLYQMLCGIKHLHSAGIIHRDLKPSNIVVKSDCTLKILDFGLARTAGTSFMMTPYVVTRYYRAPEVILGMGYKENVDLWSVGCIMGEMVCLKILFPGRDYIDQWNKVIEQLGTPCPEFMKKLQPTVRTYVENRPKYAGYSFEKLFPDVLFPADSEHNKLKASQARDLLSKMLVIDASKRISVDEALQHPYINVWYDPSEAEAPPPKIPDKQLDEREHTIEEWKELIYKEVMDLEERTKNGVIRGQPSPLGAAVINGSQHPVSSPSVNDMSSMSTDPTLASD. The pIC50 is 8.6. (7) The compound is O=C(Nc1ccc(CCNC[C@H](O)c2cccnc2)cc1)c1cccc(CCc2ccccc2)c1. 
The target protein (P08588) has sequence MGAGVLVLGASEPGNLSSAAPLPDGAATAARLLVPASPPASLLPPASESPEPLSQQWTAGMGLLMALIVLLIVAGNVLVIVAIAKTPRLQTLTNLFIMSLASADLVMGLLVVPFGATIVVWGRWEYGSFFCELWTSVDVLCVTASIETLCVIALDRYLAITSPFRYQSLLTRARARGLVCTVWAISALVSFLPILMHWWRAESDEARRCYNDPKCCDFVTNRAYAIASSVVSFYVPLCIMAFVYLRVFREAQKQVKKIDSCERRFLGGPARPPSPSPSPVPAPAPPPGPPRPAAAAATAPLANGRAGKRRPSRLVALREQKALKTLGIIMGVFTLCWLPFFLANVVKAFHRELVPDRLFVFFNWLGYANSAFNPIIYCRSPDFRKAFQRLLCCARRAARRRHATHGDRPRASGCLARPGPPPSPGAASDDDDDDVVGATPPARLLEPWAGCNGGAAADSDSSLDEPCRPGFASESKV. The pIC50 is 5.4. (8) The drug is O=C(c1ccc(C(=O)N2CCC(N3CCCC3)CC2)c(Oc2ccccc2)c1)N1CCC(N2CCCC2)CC1. The target protein (Q96T88) has sequence MWIQVRTMDGRQTHTVDSLSRLTKVEELRRKIQELFHVEPGLQRLFYRGKQMEDGHTLFDYEVRLNDTIQLLVRQSLVLPHSTKERDSELSDTDSGCCLGQSESDKSSTHGEAAAETDSRPADEDMWDETELGLYKVNEYVDARDTNMGAWFEAQVVRVTRKAPSRDEPCSSTSRPALEEDVIYHVKYDDYPENGVVQMNSRDVRARARTIIKWQDLEVGQVVMLNYNPDNPKERGFWYDAEISRKRETRTARELYANVVLGDDSLNDCRIIFVDEVFKIERPGEGSPMVDNPMRRKSGPSCKHCKDDVNRLCRVCACHLCGGRQDPDKQLMCDECDMAFHIYCLDPPLSSVPSEDEWYCPECRNDASEVVLAGERLRESKKKAKMASATSSSQRDWGKGMACVGRTKECTIVPSNHYGPIPGIPVGTMWRFRVQVSESGVHRPHVAGIHGRSNDGAYSLVLAGGYEDDVDHGNFFTYTGSGGRDLSGNKRTAEQSCDQK.... The pIC50 is 5.0.
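The labels above use the definition pIC50 = -log10(IC50 in M). A minimal sketch of that conversion follows, assuming the raw IC50 values are in molar or nanomolar units. The function names are hypothetical and are not part of the dataset or of any library.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 (assumes 1 nM = 1e-9 M)."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 4.0, as in example (1), corresponds to an IC50 of 1e-4 M (100 uM).
    print(ic50_molar_to_pic50(1e-4))
    # A pIC50 of 7.8, as in example (2), corresponds to an IC50 of roughly 16 nM.
    print(pic50_to_ic50_molar(7.8) * 1e9)
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.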