From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C(CBr)c1ccc([N+](=O)[O-])cc1. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 5.7. (2) The small molecule is C[C@@H]1NC(=O)[C@H]2CCCN2C(=O)[C@H](CCCCCC(=O)[C@@H]2CO2)NC(=O)[C@@H](C)NC1=O. The target protein sequence is SSPITGLVYDQRMMLHHNMWDSHHPELPQRISRIFSRHEELRLLSRCHRIPARLATEEELALCHSSKHISIIKSSEHMKPRDLNRLGDEYNSIFISNESYTCALLAAGSCFNSAQAILTGQVRNAVAIVRPPGHHAEKDTACGFCFFNTAALTARYAQSITRESLRVLIVDWDVHHGNGTQHIFEEDDSVLYISLHRYEDGAFFPNSEDANYDKVGLGKGRGYNVNIPWNGGKMGDPEYMAAFHHLVMPIAREFAPELVLVSAGFDAARGDPLGGFQVTPEGYAHLTHQLMSLAAGRVLIILEGGYNLTSISESMSMCTSMLLGDSPPSLDHLTPLKTSATVSINNVLRAHAPFWSSLR. The pIC50 is 6.2. (3) The compound is CC(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)C(C)C)C(C)C. The target protein (P08543) has sequence MASRPAASSPVEARAPVGGQEAGGPSAATQGEAAGAPLAHGHHVYCQRVNGVMVLSDKTPGSASYRISDNNFVQCGSNCTMIIDGDVVRGRPQDPGAAASPAPFVAVTNIGAGSDGGTAVVAFGGTPRRSAGTSTGTQTADVPTEALGGPPPPPRFTLGGGCCSCRDTRRRSAVFGGEGDPVGPAEFVSDDRSSDSDSDDSEDTDSETLSHASSDVSGGATYDDALDSDSSSDDSLQIDGPVCRPWSNDTAPLDVCPGTPGPGADAGGPSAVDPHAPTPEAGAGLAADPAVARDDAEGLSDPRPRLGTGTAYPVPLELTPENAEAVARFLGDAVNREPALMLEYFCRCAREETKRVPPRTFGSPPRLTEDDFGLLNYALVEMQRLCLDVPPVPPNAYMPYYLREYVTRLVNGFKPLVSRSARLYRILGVLVHLRIRTREASFEEWLRSKEVALDFGLTERLREHEAQLVILAQALDHYDCLIHSTPHTLVERGLQSALKY.... The pIC50 is 3.4. (4) The drug is Cc1cc(-c2cnc(CC#N)nc2N2CCC(c3[nH]cnc3C)CC2)ccc1F. 
The target protein (P19634) has sequence MVLRSGICGLSPHRIFPSLLVVVALVGLLPVLRSHGLQLSPTASTIRSSEPPRERSIGDVTTAPPEVTPESRPVNHSVTDHGMKPRKAFPVLGIDYTHVRTPFEISLWILLACLMKIGFHVIPTISSIVPESCLLIVVGLLVGGLIKGVGETPPFLQSDVFFLFLLPPIILDAGYFLPLRQFTENLGTILIFAVVGTLWNAFFLGGLMYAVCLVGGEQINNIGLLDNLLFGSIISAVDPVAVLAVFEEIHINELLHILVFGESLLNDAVTVVLYHLFEEFANYEHVGIVDIFLGFLSFFVVALGGVLVGVVYGVIAAFTSRFTSHIRVIEPLFVFLYSYMAYLSAELFHLSGIMALIASGVVMRPYVEANISHKSHTTIKYFLKMWSSVSETLIFIFLGVSTVAGSHHWNWTFVISTLLFCLIARVLGVLGLTWFINKFRIVKLTPKDQFIIAYGGLRGAIAFSLGYLLDKKHFPMCDLFLTAIITVIFFTVFVQGMTIR.... The pIC50 is 8.3. (5) The drug is O=C(O)C(=O)N[C@@H](CSCc1cccc2ccccc12)C(=O)O. The target protein (Q9UPP1) has sequence MNRSRAIVQRGRVLPPPAPLDTTNLAGRRTLQGRAKMASVPVYCLCRLPYDVTRFMIECDMCQDWFHGSCVGVEEEKAADIDLYHCPNCEVLHGPSIMKKRRGSSKGHDTHKGKPVKTGSPTFVRELRSRTFDSSDEVILKPTGNQLTVEFLEENSFSVPILVLKKDGLGMTLPSPSFTVRDVEHYVGSDKEIDVIDVTRQADCKMKLGDFVKYYYSGKREKVLNVISLEFSDTRLSNLVETPKIVRKLSWVENLWPEECVFERPNVQKYCLMSVRDSYTDFHIDFGGTSVWYHVLKGEKIFYLIRPTNANLTLFECWSSSSNQNEMFFGDQVDKCYKCSVKQGQTLFIPTGWIHAVLTPVDCLAFGGNFLHSLNIEMQLKAYEIEKRLSTADLFRFPNFETICWYVGKHILDIFRGLRENRRHPASYLVHGGKALNLAFRAWTRKEALPDHEDEIPETVRTVQLIKDLAREIRLVEDIFQQNVGKTSNIFGLQRIFPAG.... The pIC50 is 3.0. (6) The drug is O=C(c1nc2cccnn2c1-c1cncc(Cl)c1)N1CCC1. The target protein sequence is ATFPGHSQRREEFLYRSDSDYDLSPKAMSRNSSLPSEQHGDDLIVTPFAQVLASLRSVRNNFTILTNLHGTSNKRSPAASQPPVSRVNPQEESYQKLAMETLEELDWCLDQLETIQTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKDREKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFNVAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAADVAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKTMVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQGDKERERGMEISPMCDKHTASVEKSQVGFIDY.... The pIC50 is 6.6. (7) The compound is Clc1cc2ccccc2cc1Cl. 
The target protein (P20852) has sequence MLTSGLLLVAAVAFLSVLVLMSVWKQRKLSGKLPPGPTPLPFIGNFLQLNTEQMYNSLMKISQRYGPVFTIYLGPRRIVVLCGQEAVKEALVDQAEEFSGRGEQATFDWLFKGYGVVFSSGERAKQLRRFSIATLRDFGVGKRGIEERIQEEAGFLIDSFRKTNGAFIDPTFYLSRTVSNVISSIVFGDRFDYEDKEFLSLLRMMLGSFQFTATSMGQLYEMFSSVMKHLPGPQQQAFKELQGLEDFITKKVEHNQRTLDPNSPRDFIDSFLIRMLEEKKNPNTEFYMKNLVLTTLNLFFAGTETVSTTLRYGFLLLMKHPDIEAKVHEEIDRVIGRNRQPKYEDRMKMPYTEAVIHEIQRFADMIPMGLARRVTKDTKFRDFLLPKGTEVFPMLGSVLKDPKFFSNPKDFNPKHFLDDKGQFKKNDAFVPFSIGKRYCFGEGLARMELFLFLTNIMQNFHFKSTQAPQDIDVSPRLVGFATIPPTYTMSFLSR. The pIC50 is 4.2. (8) The small molecule is CC(C)CCN1c2nc(Nc3cc(F)c(O)c(F)c3)ncc2N(C)C(=O)C1C. The target protein sequence is MPLAQLKEPWPLMELVPLDPENGQASGEEAGLQPSKDEGILKEISITHHVKAGSEKADPSHFELLKVLGQGSFGKVFLVRKVTRPDNGHLYAMKVLKKATLKVRDRVRTKMERDILADVNHPFVVKLHYAFQTEGKLYLILDFLRGGDLFTRLSKEVMFTEEDVKFYLAELALGLDHLHSLGIIYRDLKPENILLDEEGHIKLTDFGLSKEAIDHEKKAYSFCGTVEYMAPEVVNRQGHTHSADWWSYGVLMFEMLTGSLPFQGKDRKETMTLILKAKLGMPQFLSTEAQSLLRALFKRNPANRLGSGPDGAEEIKRHIFYSTIDWNKLYRREIKPPFKPAVAQPDDTFYFDTEFTSRTPRDSPGIPPSAGAHQLFRGFSFVATGLMEDDSKPRATQAPLHSVVQQLHGKNLVFSDGYIVKETIGVGSYSVCKRCVHKATNMEYAVKVIDKSKRDPSEEIEILLRYGQHPNIITLKDVYDDSKHVYLVTELMRGGELLDK.... The pIC50 is 7.5. (9) The drug is CC(=O)N1CCc2[nH]nc(Nc3ccccc3)c2C1. The target protein (Q9H0E9) has sequence MATGTGKHKLLSTGPTEPWSIREKLCLASSVMRSGDQNWVSVSRAIKPFAEPGRPPDWFSQKHCASQYSELLETTETPKRKRGEKGEVVETVEDVIVRKLTAERVEELKKVIKETQERYRRLKRDAELIQAGHMDSRLDELCNDIATKKKLEEEEAEVKRKATDAAYQARQAVKTPPRRLPTVMVRSPIDSASPGGDYPLGDLTPTTMEEATSGVNESEMAVASGHLNSTGVLLEVGGVLPMIHGGEIQQTPNTVAASPAASGAPTLSRLLEAGPTQFTTPLASFTTVASEPPVKLVPPPVESVSQATIVMMPALPAPSSAPAVSTTESVAPVSQPDNCVPMEAVGDPHTVTVSMDSSEISMIINSIKEECFRSGVAEAPVGSKAPSIDGKEELDLAEKMDIAVSYTGEELDFETVGDIIAIIEDKVDDHPEVLDVAAVEAALSFCEENDDPQSLPGPWEHPIQQERDKPVPLPAPEMTVKQERLDFEETENKGIHELVD.... The pIC50 is 4.7. (10) The target is XTSFAESXKPVQQPSAFGS. The compound is CCOC(=O)CN1C(=O)C=CC1=O. The pIC50 is 5.5.
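
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion in Python. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of the dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A 1 micromolar IC50 expressed in mol/L
    print(ic50_to_pic50(1e-6))
    # The pIC50 of 8.3 reported in example (4), converted back to nM
    print(pic50_to_ic50_nm(8.3))
```

Under this definition each unit of pIC50 is a tenfold change in IC50, so the pIC50 of 8.3 in example (4) corresponds to an IC50 of roughly 5 nM, while the pIC50 of 3.0 in example (5) corresponds to 1 mM.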