From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CNCCCC1=C2CC[C@H]([C@H](C)CCCC(C)C)[C@@]2(C)CCC1. The target protein (Q96WJ0) has sequence MIYGYTEKELEKTDPDGWRLIVEDTGRQRWKYLKTEEERRERPQTYMEKYFLGKNMDLPEQPAAKTPIESARKGFSFYKHLQTSDGNWACEYGGVMFLLPGLIIAMYISKIEFPDEMRIEVIRYLVNHANPEDGGWGIHIEGKSTVFGTALNYVVLRILGLGPDHPVTMKARIRLNELGGAIGCPQWGKFWLAVLNCYGWEGINPILPEFWMLPEWLPIHPSRWWVHTRAVYLPMGYIYGEKFTAPVDPLIESLREELYTQPYSSINFSKHRNTTSPVDVYVPHTRFLRVINSILTFYHTIFRFSWIKDMASKYAYKLIEYENKNTDFLCIGPVNFSIHILAVYWKEGPDSYAFKSHKERMADFLWISKKGMMMNGTNGVQLWDTSFAVQALVESGLAEDPEFKDHMIKALDFLDKCQIQKNCDDQQKCYRHRRKGAWPFSTRQQGYTVSDCTAEALKAVLLLQNLKSFPKRVSYDRLKDSVDVILSLQNKDGGFASYEL.... The pIC50 is 4.7. (2) The drug is C[C@@H](O[C@H]1CN2C(=O)[C@@H]3C[C@@H](NCc4ccccc4)C[C@@H]3[C@H]2[C@@H]1c1ccc(F)cc1)c1cc(C(F)(F)F)cc(C(F)(F)F)c1. The target protein (P25103) has sequence MDNVLPVDSDLSPNISTNTSEPNQFVQPAWQIVLWAAAYTVIVVTSVVGNVVVMWIILAHKRMRTVTNYFLVNLAFAEASMAAFNTVVNFTYAVHNEWYYGLFYCKFHNFFPIAAVFASIYSMTAVAFDRYMAIIHPLQPRLSATATKVVICVIWVLALLLAFPQGYYSTTETMPSRVVCMIEWPEHPNKIYEKVYHICVTVLIYFLPLLVIGYAYTVVGITLWASEIPGDSSDRYHEQVSAKRKVVKMMIVVVCTFAICWLPFHIFFLLPYINPDLYLKKFIQQVYLAIMWLAMSSTMYNPIIYCCLNDRFRLGFKHAFRCCPFISAGDYEGLEMKSTRYLQTQGSVYKVSRLETTISTVVGAHEEEPEDGPKATPSSLDLTSNCSSRSDSKTMTESFSFSSNVLS. The pIC50 is 8.6. (3) The drug is Cc1c(CC(=O)O)c2cccnc2n1Cc1ccc(S(C)(=O)=O)cc1. 
The target protein (Q9Z2J6) has sequence MANVTLKPLCPLLEEMVQLPNHSNSSLRYIDHVSVLLHGLASLLGLVENGLILFVVGCRMRQTVVTTWVLHLALSDLLAAASLPFFTYFLAVGHSWELGTTFCKLHSSVFFLNMFASGFLLSAISLDRCLQVVRPVWAQNHRTVAVAHRVCLMLWALAVLNTIPYFVFRDTIPRLDGRIMCYYNLLLWNPGPDRDTTCDYRQKALAVSKFLLAFMVPLAIIASSHVAVSLRLHHRGRQRTGRFVRLVAAIVVAFVLCWGPYHIFSLLEARAHSVTTLRQLASRGLPFVTSLAFFNSVVNPLLYVFTCPDMLYKLRRSLRAVLESVLVEDSDQSGGLRNRRRRASSTATPASTLLLADRIPQLRPTRLIGWMRRGSAEVPQRV. The pIC50 is 7.0. (4) The drug is CN1C(=O)[C@@H](NC(=O)c2nnc(Cc3ccccc3)[nH]2)C[C@@H](F)c2ccccc21. The target protein (Q13546) has sequence MQPDMSLNVIKMKSSDFLESAELDSGGFGKVSLCFHRTQGLMIMKTVYKGPNCIEHNEALLEEAKMMNRLRHSRVVKLLGVIIEEGKYSLVMEYMEKGNLMHVLKAEMSTPLSVKGRIILEIIEGMCYLHGKGVIHKDLKPENILVDNDFHIKIADLGLASFKMWSKLNNEEHNELREVDGTAKKNGGTLYYMAPEHLNDVNAKPTEKSDVYSFAVVLWAIFANKEPYENAICEQQLIMCIKSGNRPDVDDITEYCPREIISLMKLCWEANPEARPTFPGIEEKFRPFYLSQLEESVEEDVKSLKKEYSNENAVVKRMQSLQLDCVAVPSSRSNSATEQPGSLHSSQGLGMGPVEESWFAPSLEHPQEENEPSLQSKLQDEANYHLYGSRMDRQTKQQPRQNVAYNREEERRRRVSHDPFAQQRPYENFQNTEGKGTAYSSAASHGNAVHQPSGLTSQPQVLYQNNGLYSSHGFGTRPLDPGTAGPRVWYRPIPSHMPSL.... The pIC50 is 6.3. (5) The drug is O=C(CCN1CCC(n2ncnn2)CC1)N1CCC[C@H]1c1nc2cc(Cl)c(Cl)cc2[nH]1. The target protein (Q7TMR0) has sequence MGCRALLLLSFLLLGAATTIPPRLKTLGSPHLSASPTPDPAVARKYSVLYFEQKVDHFGFADMRTFKQRYLVADKHWQRNGGSILFYTGNEGDIVWFCNNTGFMWDVAEELKAMLVFAEHRYYGESLPFGQDSFKDSQHLNFLTSEQALADFAELIRHLEKTIPGAQGQPVIAIGGSYGGMLAAWFRMKYPHIVVGALAASAPIWQLDGMVPCGEFMKIVTNDFRKSGPYCSESIRKSWNVIDKLSGSGSGLQSLTNILHLCSPLTSEKIPTLKGWIAETWVNLAMVNYPYACNFLQPLPAWPIKEVCQYLKNPNVSDTVLLQNIFQALSVYYNYSGQAACLNISQTTTSSLGSMGWSFQACTEMVMPFCTNGIDDMFEPFLWDLEKYSNDCFNQWGVKPRPHWMTTMYGGKNISSHSNIIFSNGELDPWSGGGVTRDITDTLVAINIHDGAHHLDLRAHNAFDPSSVLLSRLLEVKHMKKWILDFYSNIQ. The pIC50 is 5.5. (6) The compound is C[C@H]1COc2c(N3CCN(N)CC3)c(F)cc3c(=O)c(C(=O)O)cn1c23. 
The target protein (Q64602) has sequence MNYSRFLTATSLARKTSPIRATVEIMSRAPKDIISLAPGSPNPKVFPFKSAVFTVENGSTIRFEGEMFQRALQYSSSYGIPELLSWLKQLQIKLHNPPTVNYSPNEGQMDLCITSGCQDGLCKVFEMLINPGDTVLVNEPLYSGALFAMKPLGCNFISVPSDDCGIIPEGLKKVLSQWKPEDSKDPTKRTPKFLYTIPNGNNPTGNSLTGDRKKEIYELARKYDFLIIEDDPYYFLQFTKPWEPTFLSMDVDGRVIRADSLSKVISSGLRVGFITGPKSLIQRIVLHTQISSLHPCTLSQLMISELLYQWGEEGFLAHVDRAIDFYKNQRDFILAAADKWLRGLAEWHVPKAGMFLWIKVNGISDAKKLIEEKAIEREILLVPGNSFFVDNSAPSSFFRASFSQVTPAQMDLVFQRLAQLIKDVS. The pIC50 is 6.0. (7) The compound is Clc1ccc(CN(Cc2ccc(Cl)cc2)c2nnn[nH]2)cc1. The target protein (P80276) has sequence MASHLVLYTGAKMPILGLGTWKSPPGKVTEAVKVAIDLGYRHIDCAHVYQNENEVGLGLQEKLQGQVVKREDLFIVSKLWCTDHEKNLVKGACQTTLRDLKLDYLDLYLIHWPTGFKPGKDPFPLDGDGNVVPDESDFVETWEAMEELVDEGLVKAIGVSNFNHLQVEKILNKPGLKYKPAVNQIEVHPYLTQEKLIEYCKSKGIVVTAYSPLGSPDRPWAKPEDPSLLEDPRIKAIAAKYNKTTAQVLIRFPMQRNLIVIPKSVTPERIAENFQVFDFELSPEDMNTLLSYNRNWRVCALMSCASHKDYPFHEEY. The pIC50 is 3.5. (8) The small molecule is O=P(O)(O)C(F)c1cccc(C(F)P(=O)(O)O)n1. The target protein (P00558) has sequence MSLSNKLTLDKLDVKGKRVVMRVDFNVPMKNNQITNNQRIKAAVPSIKFCLDNGAKSVVLMSHLGRPDGVPMPDKYSLEPVAVELKSLLGKDVLFLKDCVGPEVEKACANPAAGSVILLENLRFHVEEEGKGKDASGNKVKAEPAKIEAFRASLSKLGDVYVNDAFGTAHRAHSSMVGVNLPQKAGGFLMKKELNYFAKALESPERPFLAILGGAKVADKIQLINNMLDKVNEMIIGGGMAFTFLKVLNNMEIGTSLFDEEGAKIVKDLMSKAEKNGVKITLPVDFVTADKFDENAKTGQATVASGIPAGWMGLDCGPESSKKYAEAVTRAKQIVWNGPVGVFEWEAFARGTKALMDEVVKATSRGCITIIGGGDTATCCAKWNTEDKVSHVSTGGGASLELLEGKVLPGVDALSNI. The pIC50 is 4.1.
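For reference, the pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small unit conversion. This is only an illustration: the function names below are hypothetical and are not part of BindingDB or of any tooling for this dataset.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Labels taken from examples (2), (6) and (7) of this record.
    for label in (8.6, 6.0, 3.5):
        print(f"pIC50 {label} -> IC50 of roughly {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition a label of 6.0 corresponds to an IC50 of 1 µM and a label of 8.6 to about 2.5 nM, so each unit of pIC50 is a tenfold change in potency.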