From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=c1oc2c(O)c(O)cc3c(=O)oc4c(O)c(O)cc1c4c23. The target protein (Q8MU52) has sequence MGDNIVLYYFDARGKAELIRLIFAYLGIEYTDKRFGVNGDAFVEFKNFKKEKDTPFEQVPILQIGDLILAQSQAIVRYLSKKYNICGESELNEFYADMIFCGVQDIHYKFNNTNLFKQNETTFLNEDLPKWSGYFEKLLKKNHTNNNNDKYYFVGNNLTYADLAVFNLYDDIETKYPSSLKNFPLLKAHNEFISNLPNIKNYITNRKESVY. The pIC50 is 4.1. (2) The small molecule is C[C@]12C[C@H]3CC[C@@H]4C[C@@H](O)CC[C@H]4[C@@H]3C[C@@H]1CC[C@@H]2[N+](=O)[O-]. The target protein (P28471) has sequence MVSVQKVPAIVLCSGVSLALLHVLCLATCLNESPGQNSKDEKLCPENFTRILDSLLDGYDNRLRPGFGGPVTEVKTDIYVTSFGPVSDVEMEYTMDVFFRQTWIDKRLKYDGPIEILRLNNMMVTKVWTPDTFFRNGKKSVSHNMTAPNKLFRIMRNGTILYTMRLTISAECPMRLVDFPMDGHACPLKFGSYAYPKSEMIYTWTKGPEKSVEVPKESSSLVQYDLIGQTVSSETIKSITGEYIVMTVYFHLRRKMGYFMIQTYIPCIMTVILSQVSFWINKESVPARTVFGITTVLTMTTLSISARHSLPKVSYATAMDWFIAVCFAFVFSALIEFAAVNYFTNIQMQKAKKKISKPPPEVPAAPVLKEKHTETSLQNTHANLNMRKRTNALVHSESDVNSRTEVGNHSSKTTAAQESSETTPKAHLASSPNPFSRANAAETISAAARGLSSAASPSPHGTLQPAPLRSASARPAFGARLGRIKTTVNTTGVPGNVSAT.... The pIC50 is 7.2. (3) The drug is O=c1oc(OCCCCCc2ccccc2)c(Cl)c2ccc([N+](=O)[O-])cc12. The target protein sequence is MKNFLAQQGKITLILTALCVLIYLAQQLGFEDDIMYLMHYPAYEEQDSEVWRYISHTLVHLSNLHILFNLSWFLIFGGMIERTFGSVKLLMLYVVASAITGYVQNYVSGPAFFGLSGVVYAVLGYVFIRDKLNHHLFDLPEGFFTMLLVGIALGFISPLFGVEMGNAAHISGLIVGLIWGFIDSKLRKNSLE. The pIC50 is 5.3. (4) The compound is CCCCC[C@H](O)/C=C/[C@H]1[C@H](O)C[C@H](O)[C@@H]1C/C=C\CCCC(=O)O. 
The target protein (Q9JHI3) has sequence MPDRSTKATMGAEDIHERKVSMEPRDSHQDAQPRGMFQNIKFFVLCHSILQLAQLMISGYLKSSISTVEKRFGLSSQTSGLLAAFNEVGNISLILFVSYFGSRVHRPRMIGCGAILVAVAGLLMALPHFISEPYRYDHSSPDRSQDFEASLCLPTTMAPASALSNDSCSSRTETKHLTMVGIMFTAQTLLGIGGVPIQPFGISYIDDFAHHSNSPLYLGILFAITMMGPGLAYGLGSLMLRLYVDIDRMPEGGINLTTKDPRWVGAWWLGFLISAGLVVLAASPYFFFPREMPKEKYELHFRQKVLAGGASIGSKGEELSSQHEPLKKQAGLPQIAPDLTVVQFIKVFPRVLLRTLRHPIFLLVVLSQVCTSSMVAGTATFLPKFLERQFSITASFANLLLGCLTIPLAIVGIVVGGVLVKRLHLSPMQCSALCLLGSLLCLLLSLPLFFIGCSTHHIAGITQDLGAQPGPSLFPGCSEPCSCQSDDFNPVCDTSAYVEY.... The pIC50 is 6.5. (5) The drug is C[C@H](O)[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)O. The target protein (P09790) has sequence MKAKRFKINAISLSIFLAYALTPYSEAALVRDDVDYQIFRDFAENKGKFFVGATDLSVKNKRGQNIGNALSNVPMIDFSVADVNKRIATVVDPQYAVSVKHAKAEVHTFYYGQYNGHNDVADKENEYRVVEQNNYEPHKAWGASNLGRLEDYNMARFNKFVTEVAPIAPTDAGGGLDTYKDKNRFSSFVRIGAGRQLVYEKGVYHQEGNEKGYDLRDLSQAYRYAIAGTPYKDINIDQTMNTEGLIGFGNHNKQYSAEELKQALSQDALTNYGVLGDSGSPLFAFDKQKNQWVFLGTYDYWAGYGKKSWQEWNIYKKEFADKIKQHDNAGTVKGNGEHHWKTTGTNSHIGSTAVRLANNEGDANNGQNVTFEDNGTLVLNQNINQGAGGLFFKGDYTVKGANNDITWLGAGIDVADGKKVVWQVKNPNGDRLAKIGKGTLEINGTGVNQGQLKVGDGTVILNQKADADKKVQAFSQVGIVSGRGTLVLNSSNQINPDNLY.... The pIC50 is 3.3. (6) The compound is C[C@H](NC(=O)[C@H](CS)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)NC(=O)[C@H](CS)NC(=O)CN)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CCCNC(=N)N)C(N)=O. 
The target protein (Q9UGM1) has sequence MNWSHSCISFCWIYFAASRLRAAETADGKYAQKLFNDLFEDYSNALRPVEDTDKVLNVTLQITLSQIKDMDERNQILTAYLWIRQIWHDAYLTWDRDQYDGLDSIRIPSDLVWRPDIVLYNKADDESSEPVNTNVVLRYDGLITWDAPAITKSSCVVDVTYFPFDNQQCNLTFGSWTYNGNQVDIFNALDSGDLSDFIEDVEWEVHGMPAVKNVISYGCCSEPYPDVTFTLLLKRRSSFYIVNLLIPCVLISFLAPLSFYLPAASGEKVSLGVTILLAMTVFQLMVAEIMPASENVPLIGKYYIATMALITASTALTIMVMNIHFCGAEARPVPHWARVVILKYMSRVLFVYDVGESCLSPHHSRERDHLTKVYSKLPESNLKAARNKDLSRKKDMNKRLKNDLGCQGKNPQEAESYCAQYKVLTRNIEYIAKCLKDHKATNSKGSEWKKVAKVIDRFFMWIFFIMVFVMTILIIARAD. The pIC50 is 6.7. (7) The compound is Cn1sc(=O)n(-c2ccccc2Br)c1=O. The pIC50 is 5.2. The target is XTSFAESXKPVQQPSAFGS. (8) The small molecule is CS(=O)(=O)C(C(=O)NCCS(N)(=O)=O)c1nc2ccc(-c3ccc(C(=O)NCCCO)cc3)cc2s1. The target protein (P11150) has sequence MDTSPLCFSILLVLCIFIQSSALGQSLKPEPFGRRAQAVETNKTLHEMKTRFLLFGETNQGCQIRINHPDTLQECGFNSSLPLVMIIHGWSVDGVLENWIWQMVAALKSQPAQPVNVGLVDWITLAHDHYTIAVRNTRLVGKEVAALLRWLEESVQLSRSHVHLIGYSLGAHVSGFAGSSIGGTHKIGRITGLDAAGPLFEGSAPSNRLSPDDANFVDAIHTFTREHMGLSVGIKQPIGHYDFYPNGGSFQPGCHFLELYRHIAQHGFNAITQTIKCSHERSVHLFIDSLLHAGTQSMAYPCGDMNSFSQGLCLSCKKGRCNTLGYHVRQEPRSKSKRLFLVTRAQSPFKVYHYQFKIQFINQTETPIQTTFTMSLLGTKEKMQKIPITLGKGIASNKTYSFLITLDVDIGELIMIKFKWENSAVWANVWDTVQTIIPWSTGPRHSGLVLKTIRVKAGETQQRMTFCSENTDDLLLRPTQEKIFVKCEIKSKTSKRKIR. The pIC50 is 7.2. (9) The drug is O=C(O)[C@H]1NC[C@@H]2O[C@@H]21. The target protein (Q9P0Z9) has sequence MAAQKDLWDAIVIGAGIQGCFTAYHLAKHRKRILLLEQFFLPHSRGSSHGQSRIIRKAYLEDFYTRMMHECYQIWAQLEHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSSEELKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRALQDAIRQLGGIVRDGEKVVEINPGLLVTVKTTSRSYQAKSLVITAGPWTNQLLRPLGIEMPLQTLRINVCYWREMVPGSYGVSQAFPCFLWLGLCPHHIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQILSSFVRDHLPDLKPEPAVIESCMYTNTPDEQFILDRHPKYDNIVIGAGFSGHGFKLAPVVGKILYELSMKLTPSYDLAPFRISRFPSLGKAHL. The pIC50 is 3.7. (10) The drug is C[C@H](Nc1nc(Nc2cn(C)cn2)c2cc[nH]c2n1)c1ncc(F)cn1. 
The target protein (P52332) has sequence MQYLNIKEDCNAMAFCAKMRSFKKTEVKQVVPEPGVEVTFYLLDREPLRLGSGEYTAEELCIRAAQECSISPLCHNLFALYDESTKLWYAPNRIITVDDKTSLRLHYRMRFYFTNWHGTNDNEQSVWRHSPKKQKNGYEKKRVPEATPLLDASSLEYLFAQGQYDLIKCLAPIRDPKTEQDGHDIENECLGMAVLAISHYAMMKKMQLPELPKDISYKRYIPETLNKSIRQRNLLTRMRINNVFKDFLKEFNNKTICDSSVHDLKVKYLATLETSTLTKHYGAEIFETSMLLISSENELSRCHSNDSGNVLYEVMVTGNLGIQWRQKPNVVPVEKEKNKLKRKKLEYNKHKKDDERNKLREEWNNFSYFPEITHIVIKESVVSINKQDNKNMELKLSSREEALSFVSLVDGYFRLTADAHHYLCTDVAPPLIVHNIQNGCHGPICTEYAINKLRQEGSEEGMYVLRWSCTDFDNILMTVTCFEKSEVLGGQKQFKNFQIE.... The pIC50 is 6.6.
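The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M); higher means more potent) can be sketched in Python as below. This is a minimal illustration, not part of the dataset or of any BindingDB tooling: the function names are hypothetical, and the nanomolar helper assumes the raw IC50 value is given in nM.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion, assuming the input is in nanomolar (1 nM = 1e-9 M)."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 4.1 (example 1) corresponds to an IC50 of roughly 79 micromolar.
    print(f"{pic50_to_ic50_molar(4.1) * 1e6:.1f} uM")
    # A pIC50 of 7.2 (example 2) corresponds to an IC50 of roughly 63 nanomolar.
    print(f"{pic50_to_ic50_molar(7.2) * 1e9:.1f} nM")
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values indicate more potent binders.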