From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O[C@@H]1[C@H](O)CN2CCC[C@@H](O)[C@H]12. The target protein (P53624) has sequence MYRISPIGRKSNFHSREKCLIGLVLVTLCFLCFGGIFLLPDNFGSDRVLRVYKHFRKAGPEIFIPAPPLAAHAPHRSEDPHFIGDRQRLEQKIRAELGDMLDEPPAAGGGEPGQFQVLAQQAQAPAPVAALADQPLDQDEGHAAIPVLAAPVQGDNAASQASSHPQSSAQQHNQQQPQLPLGGGGNDQAPDTLDATLEERRQKVKEMMEHAWHNYKLYAWGKNELRPLSQRPHSASIFGSYDLGATIVDGLDTLYIMGLEKEYREGRDWIERKFSLDNISAELSVFETNIRFVGGMLTLYAFTGDPLYKEKAQHVADKLLPAFQTPTGIPYALVNTKTGVAKNYGWASGGSSILSEFGTLHLEFAYLSDITGNPLYRERVQTIRQVLKEIEKPKGLYPNFLNPKTGKWGQLHMSLGALGDSYYEYLLKAWLQSGQTDEEAREMFDEAMLAILDKMVRTSPGGLTYVSDLKFDRLEHKMDHLACFSGGLFALGAATRQNDY.... The pIC50 is 8.3. (2) The drug is COc1cc2c(cc1OCCCCCCNC(C)=O)CC(c1cccs1)n1cc(C(=O)O)c(=O)cc1-2. The target protein (P03138) has sequence MGQNLSTSNPLGFFPDHQLDPAFRANTANPDWDFNPNKDTWPDANKVGAGAFGLGFTPPHGGLLGWSPQAQGILQTLPANPPPASTNRQSGRQPTPLSPPLRNTHPQAMQWNSTTFHQTLQDPRVRGLYFPAGGSSSGTVNPVLTTASPLSSIFSRIGDPALNMENITSGFLGPLLVLQAGFFLLTRILTIPQSLDSWWTSLNFLGGTTVCLGQNSQSPTSNHSPTSCPPTCPGYRWMCLRRFIIFLFILLLCLIFLLVLLDYQGMLPVCPLIPGSSTTSTGPCRTCMTTAQGTSMYPSCCCTKPSDGNCTCIPIPSSWAFGKFLWEWASARFSWLSLLVPFVQWFVGLSPTVWLSVIWMMWYWGPSLYSILSPFLPLLPIFFCLWVYI. The pIC50 is 7.7. (3) The small molecule is CC1(C)CC=C(c2cc(C3(O)CC4CCC(C3)O4)ccc2NC(=O)c2nc(C#N)c[nH]2)CC1. 
The target protein (P09581) has sequence MELGPPLVLLLATVWHGQGAPVIEPSGPELVVEPGETVTLRCVSNGSVEWDGPISPYWTLDPESPGSTLTTRNATFKNTGTYRCTELEDPMAGSTTIHLYVKDPAHSWNLLAQEVTVVEGQEAVLPCLITDPALKDSVSLMREGGRQVLRKTVYFFSPWRGFIIRKAKVLDSNTYVCKTMVNGRESTSTGIWLKVNRVHPEPPQIKLEPSKLVRIRGEAAQIVCSATNAEVGFNVILKRGDTKLEIPLNSDFQDNYYKKVRALSLNAVDFQDAGIYSCVASNDVGTRTATMNFQVVESAYLNLTSEQSLLQEVSVGDSLILTVHADAYPSIQHYNWTYLGPFFEDQRKLEFITQRAIYRYTFKLFLNRVKASEAGQYFLMAQNKAGWNNLTFELTLRYPPEVSVTWMPVNGSDVLFCDVSGYPQPSVTWMECRGHTDRCDEAQALQVWNDTHPEVLSQKPFDKVIIQSQLPIGTLKHNMTYFCKTHNSVGNSSQYFRAVS.... The pIC50 is 8.6. (4) The compound is CCc1ccc(CCOc2ccc(Cc3sc(=O)[nH]c3O)cc2)nc1. The target protein sequence is MGFLAGKKILITGLLSNKSIAYGIAKAMHREGAELAFTYVGQFKDRVEKLCAEFNPAAVLPCDVISDQEIKDLFVELGKVWDGLDAIVHSIAFAPRDQLEGNFIDCVTREGFSIAHDISAYSFAALAKEGRSMMKNRNASMVALTYIGAEKAMPSYNTMGVAKASLEATVRYTALALGEDGIKVNAVSAGPIKTLAASGISNFKKMLDYNAMVSPLKKNVDIMEVGNTVAFLCSDMATGITGEVVHVDAGYHCVSMGNVL. The pIC50 is 4.1. (5) The drug is N=C(N)NC(=O)Cn1c(-c2ccccc2)ccc1-c1ccccc1. The target protein (P0DJD7) has sequence MKWLLLLGLVALSECIMYKVPLIRKKSLRRTLSERGLLKDFLKKHNLNPARKYFPQWEAPTLVDEQPLENYLDMEYFGTIGIGTPAQDFTVVFDTGSSNLWVPSVYCSSLACTNHNRFNPEDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDQSGSVVIFGGIDSSYYTGSLNWVPVTVEGYWQITVDSITMNGEAIACAEGCQAIVDTGTSLLTGPTSPIANIQSDIGASENSDGDMVVSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSCISGFQGMNLPTESGELWILGDVFIRQYFTVFDRANNQVGLAPVA. The pIC50 is 4.3. (6) The drug is C[n+]1ccc(-c2c3nc(c(-c4cc[n+](C)cc4)c4ccc([nH]4)c(-c4cc[n+](C)cc4)c4ccc([nH]4)c(-c4cc[n+](C)cc4)c4nc2C=C4)C=C3)cc1. 
The target protein sequence is MSSMWSEYTIGGVKIYFPYKAYPSQLAMMNSILRGLNSKQHCLLESPTGSGKSLALLCSALAWQQSLSGKPADEGVSEKAEVQLSCCCACHSKDFTNNDMNQGTSRHFNYPSTPPSERNGTSSTCQDSPEKTTLAAKLSAKKQASIYRDENDDFQVEKKRIRPLETTQQIRKRHCFGTEVHNLDAKVDSGKTVKLNSPLEKINSFSPQKPPGHCSRCCCSTKQGNSQESSNTIKKDHTGKSKIPKIYFGTRTHKQIAQITRELRRTAYSGVPMTILSSRDHTCVHPEVVGNFNRNEKCMELLDGKNGKSCYFYHGVHKISDQHTLQTFQGMCKAWDIEELVSLGKKLKACPYYTARELIQDADIIFCPYNYLLDAQIRESMDLNLKEQVVILDEAHNIEDCARESASYSVTEVQLRFARDELDSMVNNNIRKKDHEPLRAVCCSLINWLEANAEYLVERDYESACKIWSGNEMLLTLHKMGITTATFPILQGHFSAVLQK.... The pIC50 is 5.6. (7) The drug is COc1cc(OC)c(S(=O)(=O)N(C)c2ccccc2)cc1NC(C)=O. The target protein (Q8DQ18) has sequence MSNFAIILAAGKGTRMKSDLPKVLHKVAGISMLEHVFRSVGAIQPEKTVTVVGHKAELVEEVLAEQTEFVTQSEQLGTGHAVMMTEPILEGLSGHTLVIAGDTPLITGESLKNLIDFHINHKNVATILTAETDNPFGYGRIVRNDNAEVLRIVEQKDATDFEKQIKEINTGTYVFDNERLFEALKNINTNNAQGEYYITDVIGIFRETGEKVGAYTLKDFDESLGVNDRVALATAESVMRRRINHKHMVNGVSFVNPEATYIDIDVEIAPEVQIEANVILKGQTKIGAETVLTNGTYVVDSTIGAGAVITNSMIEESSVADGVTVGPYAHIRPNSSLGAQVHIGNFVEVKGSSIGENTKAGHLTYIGNCEVGSNVNFGAGTITVNYDGKNKYKTVIGDNVFVGSNSTIIAPVELGDNSLVGAGSTITKDVPADAIAIGRGRQINKDEYATRLPHHPKNQ. The pIC50 is 3.7. (8) The small molecule is CN[C@@H](C)C(=O)N[C@@H]1C(=O)N(Cc2c(OC)ccc3cc(Br)ccc23)c2ccccc2N(C(=O)c2ccc(C#N)cc2)[C@H]1C. The target protein sequence is MRHHHHHHRDHFALDRPSETHADYLLRTGQVVDISDTIYPRNPAMYSEEARLKSFQNWPDYAHLTPRELASAGLYYTGIGDQVQCFACGGKLKNWEPGDFPNCFFVLGRAWSEHRRHRNLNIRSE. The pIC50 is 8.2.
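
The pIC50 definition given at the top, pIC50 = -log10(IC50 in M), can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the code assumes IC50 values are supplied in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # Labels taken from records (1) and (4) above.
    for label in (8.3, 4.1):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, the pIC50 of 8.3 in record (1) corresponds to an IC50 of roughly 5 nM, while the 4.1 in record (4) corresponds to roughly 79 µM, which is why higher values mean more potent binding.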