From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1nnn(-c2ccc(-c3ccc(C4(C(=O)O)CC4)cc3)cc2)c1NC(=O)O[C@H](C)c1cccc(C(F)(F)F)c1. The target protein (Q92633) has sequence MAAISTSIPVISQPQFTAMNEPQCFYNESIAFFYNRSGKHLATEWNTVSKLVMGLGITVCIFIMLANLLVMVAIYVNRRFHFPIYYLMANLAAADFFAGLAYFYLMFNTGPNTRRLTVSTWLLRQGLIDTSLTASVANLLAIAIERHITVFRMQLHTRMSNRRVVVVIVVIWTMAIVMGAIPSVGWNCICDIENCSNMAPLYSDSYLVFWAIFNLVTFVVMVVLYAHIFGYVRQRTMRMSRHSSGPRRNRDTMMSLLKTVVIVLGAFIICWTPGLVLLLLDVCCPQCDVLAYEKFFLLLAEFNSAMNPIIYSYRDKEMSATFRQILCCQRSENPTGPTEGSDRSASSLNHTILAGVHSNDHSVV. The pIC50 is 6.8. (2) The pIC50 is 5.2. The target protein (Q70I53) has sequence MAIGYVWNTLYGWVDTGTGSLAAANLTARMQPISHHLAHPDTKRRFHELVCASGQIEHLTPIAAVAATDADILRAHSAAHLENMKRVSNLPTGGDTGDGITMMGNGGLEIARLSAGGAVELTRRVATGELSAGYALVNPPGHHAPHNAAMGFCIFNNTSVAAGYARAVLGMERVAILDWDVHHGNGTQDIWWNDPSVLTISLHQHLCFPPDSGYSTERGAGNGHGYNINVPLPPGSGNAAYLHAMDQVVLHALRAYRPQLIIVGSGFDASMLDPLARMMVTADGFRQMARRTIDCAADICDGRIVFVQEGGYSPHYLPFCGLAVIEELTGVRSLPDPYHEFLAGMGGNTLLDAERAAIEEIVPLLADIR. The compound is CCCCCCCCCCCC[C@@H](C)O[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O[C@@H]1O[C@H](COC(C)=O)[C@@H](O)[C@H](O)[C@H]1O. (3) The compound is CC(C)(O)C[C@H]1C=C([C@H]2CC[C@]3(C)[C@@H]2CC[C@@H]2[C@@]4(C)CC[C@H](O)C(C)(C)[C@@H]4CC[C@]23C)C(=O)O1. 
The target protein (P18031) has sequence MEMEKEFEQIDKSGSWAAIYQDIRHEASDFPCRVAKLPKNKNRNRYRDVSPFDHSRIKLHQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMVWEQKSRGVVMLNRVMEKGSLKCAQYWPQKEEKEMIFEDTNLKLTLISEDIKSYYTVRQLELENLTTQETREILHFHYTTWPDFGVPESPASFLNFLFKVRESGSLSPEHGPVVVHCSAGIGRSGTFCLADTCLLLMDKRKDPSSVDIKKVLLEMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKELSHEDLEPPPEHIPPPPRPPKRILEPHNGKCREFFPNHQWVKEETQEDKDCPIKEEKGSPLNAAPYGIESMSQDTEVRSRVVGGSLRGAQAASPAKGEPSLPEKDEDHALSYWKPFLVNMCVATVLTAGAYLCYRFLFNSNT. The pIC50 is 4.6. (4) The drug is Cc1cc(Cn2c(=O)n(Cc3cnn(C)c3)c(=O)c3cc(S(=O)(=O)NC4(C#N)CC4)ccc32)n(C)n1. The target protein (Q86W56) has sequence MNAGPGCEPCTKRPRWGAATTSPAASDARSFPSRQRRVLDPKDAHVQFRVPPSSPACVPGRAGQHRGSATSLVFKQKTITSWMDTKGIKTAESESLDSKENNNTRIESMMSSVQKDNFYQHNVEKLENVSQLSLDKSPTEKSTQYLNQHQTAAMCKWQNEGKHTEQLLESEPQTVTLVPEQFSNANIDRSPQNDDHSDTDSEENRDNQQFLTTVKLANAKQTTEDEQAREAKSHQKCSKSCDPGEDCASCQQDEIDVVPESPLSDVGSEDVGTGPKNDNKLTRQESCLGNSPPFEKESEPESPMDVDNSKNSCQDSEADEETSPGFDEQEDGSSSQTANKPSRFQARDADIEFRKRYSTKGGEVRLHFQFEGGESRTGMNDLNAKLPGNISSLNVECRNSKQHGKKDSKITDHFMRLPKAEDRRKEQWETKHQRTERKIPKYVPPHLSPDKKWLGTPIEEMRRMPRCGIRLPLLRPSANHTVTIRVDLLRAGEVPKPFPT.... The pIC50 is 7.2. (5) The compound is O=C1CCC(N2Cc3c(OCc4ccc(CN5CCOCC5)cc4)cccc3C2=O)C(=O)N1. The target protein (Q01558) has sequence MVKVGINGFGRIGRVVFRAAQMRPDIEIVGINDLLDAEYMAYSLKYDSTHGRFDGTVEVIKGALVVNGKSIRVTSERDPANLKWDEIGVEVVVESTGLFLTQETAHKHIEAGARRVVMTGPPKDDTPMFVMGVNHTTYKGQPIISNASCTTNCLAPLAKVVNEKYGIVEGLMTTVHATTATQKTVDGPSLKDWRGGRGASQNIIPSSTGAPKAVGKVYPALDGKLTGMAFRVPTPNVSVVDLTVRLEKPATYKDICAAIKAAAEGEMKGILGYTDDEVVSSDFNGVALTSVFDVKAGISLNDHFVKLVSWYDNETGYSHKVLDLILHTSAR. The pIC50 is 5.0. (6) The drug is O=C(CBr)NCC(=O)N1CCN(C(=O)C23CC4CC(CC(C4)C2)C3)CC1. 
The target protein sequence is MSETSRTAFGGRRAVPPNNSNAAEDDLPTVELQGVVPRGVNLQEFLNVTSVHLFKERWDTNKVDHHTDKYENNKLIVRRGQSFYVQIDFSRPYDPRRDLFRVEYVIGRYPQENKGTYIPVPIVSELQSGKWGAKIVMREDRSVRLSIQSSPKCIVGKFRMYVAVWTPYGVLRTSRNPETDTYILFNPWCEDDAVYLDNEKEREEYVLNDIGVIFYGEVNDIKTRSWSYGQFEDGILDTCLYVMDRAQMDLSGRGNPIKVSRVGSAMVNAKDDEGVLVGSWDNIYAYGVPPSAWTGSVDILLEYRSSENPVRYGQCWVFAGVFNTFLRCLGIPARIVTNYFSAHDNDANLQMDIFLEEDGNVNSKLTKDSVWNYHCWNEAWMTRPDLPVGFGGWQAVDSTPQENSDGMYRCGPASVQAIKHGHVCFQFDAPFVFAEVNSDLIYITAKKDGTHVVENVDATHIGKLIVTKQIGGDGMMDITDTYKFQEGQEEERLALETALM.... The pIC50 is 4.0. (7) The compound is C[C@@H](C(=O)O)n1cc(C(=O)c2ccccc2F)c2ccccc21. The target protein (O09114) has sequence MAALRMLWMGLVLLGLLGFPQTPAQGHDTVQPNFQQDKFLGRWYSAGLASNSSWFREKKAVLYMCKTVVAPSTEGGLNLTSTFLRKNQCETKIMVLQPAGAPGHYTYSSPHSGSIHSVSVVEANYDEYALLFSRGTKGPGQDFRMATLYSRTQTLKDELKEKFTTFSKAQGLTEEDIVFLPQPDKCIQE. The pIC50 is 5.5. (8) The compound is CCOc1cc(C(=O)NC(=S)Nn2cnnc2)cc(OCC)c1OCC. The target protein sequence is MDKKAREYAQDALKFIQRSGSNFLACKNLKERLENNGFINLSEGETWNLNKNEGYVLCKENRNICGFFVGKNFNIDTGSILISIGHIDSCALKISPNNNVIKKKIHQINVECYGSGLWHTWFDRSLGLSGQVLYKKGNKLVEKLIQINKSVLFLPSLAIHLQNRTRYDFSVKINYENHIKPIISTTLFNQLNKCKRNNVHHDTILTTDTKFSHKENSQNKRDDQMCHSFNDKDVSNHNLDKNTIEHLTNQQNEEKNKHTKDNPNSKDIVEHINTDNSYPLLYLLSKELNCKEEDILDFELCLMDTQEPCFTGVYEEFIEGARFDNLLGSFCVFEGFIELVNSIKNHTSNENTNHTNNITNDINDNIHNNLYISIGYDHEEIGSLSEVGARSYCTKNFIDRIISSVFKKEIHEKNLSVQEIYGNLVNRSFILNVDMAHCSHPNYPETVQDNHQLFFHEGIAIKYNTNKNYVTSPLHASLIKRTFELYYNKYKQQIKYQNFM.... The pIC50 is 6.0. (9) The drug is CC(C)C[C@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NO. 
The target protein (P83512) has sequence MIEVLLVTICLAVFPYQGSSIILESGNVNDYEVVYPRKVTELPKGAVQPKYEDAMQYEFKVNGEPVVLHLEKNKGLFSEDYSETHYSPDGRKIITYPSFEDHCYYHGRIENDADSTASISACNGLKGHFKLQGETYLIEPLKLSDSEAHAVYKYENVEKEDEAPKMCGVTETNWESYEPIKKASQSNLTPEQQRFSPRYIELAVVADHGIFTKYNSNLNTIRTRVHEMLNTVNGFYRSVDVHAPLANLEVWSKQDLIKVQKDSSKTLKSFGEWRERDLLPRISHDHAQLLTAVVFDGNTIGRAYTGGMCDPRHSVGVVRDHSKNNLWVAVTMAHELGHNLGIHHDTGSCSCGAKSCIMASVLSKVLSYEFSDCSQNQYETYLTNHNPQCILNKPLLTVSGNELLEAGE. The pIC50 is 4.3.
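
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the sketch assumes IC50 is supplied in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar.

    IC50 [M] = 10 ** (-pIC50), and 1 M = 1e9 nM,
    so IC50 [nM] = 10 ** (9 - pIC50).
    """
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 labels quoted in the records above (hypothetical selection).
    for label in (6.8, 5.2, 4.6, 4.3):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.0f} nM")
```

Under this definition, the label 6.8 in record (1) corresponds to an IC50 of roughly 158 nM, and the label 4.3 in record (9) to roughly 50 µM, which is why a higher pIC50 means a more potent compound.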