This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is NC(=O)c1ccccc1. The target protein (P09874) has sequence MAESSDKLYRVEYAKSGRASCKKCSESIPKDSLRMAIMVQSPMFDGKVPHWYHFSCFWKVGHSIRHPDVEVDGFSELRWDDQQKVKKTAEAGGVTGKGQDGIGSKAEKTLGDFAAEYAKSNRSTCKGCMEKIEKGQVRLSKKMVDPEKPQLGMIDRWYHPGCFVKNREELGFRPEYSASQLKGFSLLATEDKEALKKQLPGVKSEGKRKGDEVDGVDEVAKKKSKKEKDKDSKLEKALKAQNDLIWNIKDELKKVCSTNDLKELLIFNKQQVPSGESAILDRVADGMVFGALLPCEECSGQLVFKSDAYYCTGDVTAWTKCMVKTQTPNRKEWVTPKEFREISYLKKLKVKKQDRIFPPETSASVAATPPPSTASAPAAVNSSASADKPLSNMKILTLGKLSRNKDEVKAMIEKLGGKLTGTANKASLCISTKKEVEKMNKKMEEVKEANIRVVSEDFLQDVSASTKSLQELFLAHILSPWGAEVKAEPVEVVAPRGKSG.... The pIC50 is 5.3. (2) The drug is CC(C)C[C@H](N)C(=O)NS(=O)(=O)CC(=O)N[C@@]1(C(=O)O)[C@@H](O)C[C@H]2C(C(N)=O)=CN(C)C[C@H]21. The target protein (P26640) has sequence MSTLYVSPHPDAFPSLRALIAARYGEAGEGPGWGGAHPRICLQPPPTSRTPFPPPRLPALEQGPGGLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTELIPAACGATLPALGLRSSAQDPQAVLGALGRALSPLEEWLRLHTYLAGEAPTLADLAAVTALLLPFRYVLDPPARRIWNNVTRWFVTCVRQPEFRAVLGEVVLYSGARPLSHQPGPEAPALPKTAAQLKKEAKKREKLEKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTPPGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVSAANPRGVFMMCIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGETTLWNPGCDHAGIATQVVVEKKLWREQGLSRHQLGREAFLQEVWKWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVRLHEEGIIYRSTRLVNWSCTLNSAISDIEVDKKELTGRTL.... The pIC50 is 6.5. (3) The small molecule is O=C(O)c1ccnc2cc(Cc3ccc(Cl)c(C(F)(F)F)c3)[nH]c12. 
The target protein (O75164) has sequence MASESETLNPSARIMTFYPTMEEFRNFSRYIAYIESQGAHRAGLAKVVPPKEWKPRASYDDIDDLVIPAPIQQLVTGQSGLFTQYNIQKKAMTVREFRKIANSDKYCTPRYSEFEELERKYWKNLTFNPPIYGADVNGTLYEKHVDEWNIGRLRTILDLVEKESGITIEGVNTPYLYFGMWKTSFAWHTEDMDLYSINYLHFGEPKSWYSVPPEHGKRLERLAKGFFPGSAQSCEAFLRHKMTLISPLMLKKYGIPFDKVTQEAGEFMITFPYGYHAGFNHGFNCAESTNFATRRWIEYGKQAVLCSCRKDMVKISMDVFVRKFQPERYKLWKAGKDNTVIDHTLPTPEAAEFLKESELPPRAGNEEECPEEDMEGVEDGEEGDLKTSLAKHRIGTKRHRVCLEIPQEVSQSELFPKEDLSSEQYEMTECPAALAPVRPTHSSVRQVEDGLTFPDYSDSTEVKFEELKNVKLEEEDEEEEQAAAALDLSVNPASVGGRLV.... The pIC50 is 6.3. (4) The drug is COc1ccc(C(OCCN2CCCC(C(=O)O)C2)(c2ccccc2)c2ccccc2)cc1. The target protein (P31646) has sequence MDNRVSGTTSNGETKPVCPVMEKVEEDGTLEREQWTNKMEFVLSVAGEIIGLGNVWRFPYLCYKNGGGAFFIPYLIFLFTCGIPVFFLETALGQYTNQGGITAWRKICPIFEGIGYASQMIVSLLNVYYIVVLAWALFYLFSSFTTDLPWGSCSHEWNTENCVEFQKTNNSLNVTSENATSPVIEFWERRVLKISDGIQHLGSLRWELVLCLLLAWIICYFCIWKGVKSTGKVVYFTATFPYLMLVVLLIRGVTLPGAAQGIQFYLYPNITRLWDPQVWMDAGTQIFFSFAICLGCLTALGSYNKYHNNCYRDCVALCILNSSTSFVAGFAIFSILGFMSQEQGVPISEVAESGPGLAFIAYPRAVVMLPFSPLWACCFFFMVVLLGLDSQFVCVESLVTALVDMYPRVFRKKNRREILILIVSVVSFFIGLIMLTEGGMYVFQLFDYYAASGMCLLFVAIFESLCVAWVYGASRFYDNIEDMIGYKPWPLIKYCWLFFT.... The pIC50 is 3.9. (5) The small molecule is C[C@@H](N)[C@H]1CC[C@H](C(=O)Nc2ccncc2)CC1. The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTIGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYMPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIKVADFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 4.7. (6) The small molecule is NCC(CC(=O)O)Cc1ccsc1. 
The target protein sequence is MAAGCLLALTLTLFQSLLIGPSSQEPFPSAVTIKSWVDKMQEDLVTLAKTASGVNQLVDIYEKYQDLYTVEPNNARQLVEIAARDIEKLLSNRSKALVRLALEAEKVQAAHQWREDFASNEVVYYNAKDDLDPEKNDSEPGSQRIKPVFIDDANFGRQISYQHAAVHIPTDIYEGSTIVLNELNWTSALDEVFKKNREEDPSLLWQVFGSATGLARYYPASPWVDNSRTPNKIDLYDVRRRPWYIQGAASPKDMLILVDVSGSVSGLTLKLIRTSVSEMLETLSDDDFVNVASFNSNAQDVSCFQHLVQANVRNKKVLKDAVNNITAKGITDYKKGFSFAFEQLLNYNVSRANCNKIIMLFTDGGEERAQEIFAKYNKDKKVRVFTFSVGQHNYDRGPIQWMACENKGYYYEIPSIGAIRINTQEYLDVLGRPMVLAGDKAKQVQWTNVYLDALELGLVITGTLPVFNITGQNENKTNLKNQLILGVMGVDVSLEDIKRL.... The pIC50 is 6.1. (7) The drug is O=C(O)C[C@H]1CCC(=O)[C@@H]1C/C=C/CCO[C@H]1OC(CO)[C@H](O)C(O)[C@H]1O. The target protein (P38417) has sequence MFPFGQKGQKIKGTMVVMQKNVLDINSITSVGGIVDQGLGFIGSAVDALTFAATKISIQLISATKADGGKGKIGKSTNLRGKITLPTLGAGEQAYDVNFEWDSDFGIPGAFYIKNFMQNEFYLKSLILEDIPNHGTIHFVCNSWVYNSKNYKTDRIFFANNTYLPSETPAPLLKYREEELKNVRGDGTGERKEWDRIYDYDVYNDLGNPDSGDKYARPVLGGSALPYPRRERTGRGKTRKDPNSEKPSDFVYLPRDEAFGHLKSSDFLAYGIKSVSQDVLPVLTDAFDGNILSLEFDNFAEVHKLYEGGVTLPTNFLSKIAPIPVIKEIFRTDGEQFLKYPPPKVMQVDKSAWMTDEEFARETIAGLNPNVIKIIEEFPLSSKLDTQAYGDHTCIIAKEHLEPNLGGLTVEQAIQNKKLFILDHHDYLIPYLRKINANTTKTYATRTIFFLKDDGTLTPLAIELSKPHPQGEEYGPVSEVYVPASEGVEAYIWLLAKAYV.... The pIC50 is 3.2. (8) The drug is COc1ccc(NS(=O)(=O)c2ccc(-c3ccc(-c4nnc(C)o4)cc3C)cc2)cc1N1CCN(C)CC1. The target protein (P28564) has sequence MEEQGIQCAPPPPATSQTGVPLANLSHNCSADDYIYQDSIALPWKVLLVALLALITLATTLSNAFVIATVYRTRKLHTPANYLIASLAVTDLLVSILVMPISTMYTVTGRWTLGQVVCDFWLSSDITCCTASIMHLCVIALDRYWAITDAVDYSAKRTPKRAAIMIVLVWVFSISISLPPFFWRQAKAEEEVLDCFVNTDHVLYTVYSTVGAFYLPTLLLIALYGRIYVEARSRILKQTPNKTGKRLTRAQLITDSPGSTSSVTSINSRVPEVPSESGSPVYVNQVKVRVSDALLEKKKLMAARERKATKTLGIILGAFIVCWLPFFIISLVMPICKDACWFHMAIFDFFNWLGYLNSLINPIIYTMSNEDFKQAFHKLIRFKCTG. The pIC50 is 7.2. (9) The small molecule is N#Cc1c(-c2cccs2)nc(SCc2cccc(C(=O)O)c2)[nH]c1=O. 
The target protein (Q8TDX5) has sequence MKIDIHSHILPKEWPDLKKRFGYGGWVQLQHHSKGEAKLLKDGKVFRVVRENCWDPEVRIREMDQKGVTVQALSTVPVMFSYWAKPEDTLNLCQLLNNDLASTVVSYPRRFVGLGTLPMQAPELAVKEMERCVKELGFPGVQIGTHVNEWDLNAQELFPVYAAAERLKCSLFVHPWDMQMDGRMAKYWLPWLVGMPAETTIAICSMIMGGVFEKFPKLKVCFAHGGGAFPFTVGRISHGFSMRPDLCAQDNPMNPKKYLGSFYTDALVHDPLSLKLLTDVIGKDKVILGTDYPFPLGELEPGKLIESMEEFDEETKNKLKAGNALAFLGLERKQFE. The pIC50 is 7.9. (10) The target protein sequence is MPPADGTSQWLRKTVDSAAVILFSKTTCPYCKKVKDVLAEAKIKHATIELDQLSNGSAIQKCLASFSKIETVPQMFVRGKFIGDSQTVLKYYSNDELAGIVNESKYDYDLIVIGGGSGGLAAGKEAAKYGAKTAVLDYVEPTPIGTTWGLGGTCVNVGCIPKKLMHQAGLLSHALEDAEHFGWSLDRSKISHNWSTMVEGVQSHIGSLNWGYKVALRDNQVTYLNAKGRLISPHEVQITDKNQKVSTITGNKIILATGERPKYPEIPGAVEYGITSDDLFSLPYFPGKTLVIGASYVALECAGFLASLGGDVTVMVRSILLRGFDQQMAEKVGDYMENHGVKFAKLCVPDEIKQLKVVDTENNKPGLLLVKGHYTDGKKFEEEFETVIFAVGREPQLSKVLCETVGVKLDKNGRVVCTDDEQTTVSNVYAIGDINAGKPQLTPVAIQAGRYLARRLFAGATELTDYSNVATTVFTPLEYGACGLSEEDAIEKYGDKDIEV.... The drug is COc1cccc(-c2no[n+]([O-])c2C#N)c1. The pIC50 is 5.0.
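
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration and not part of the dataset: the function names are hypothetical, and the nanomolar helper assumes raw IC50 values are reported in nM, which this record does not state.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Assumption: the raw IC50 is given in nM, so scale it to mol/L first."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)
```

Under this definition an IC50 of 1 µM corresponds to a pIC50 of 6.0, and each unit increase in pIC50 is a tenfold decrease in IC50, which is why higher values mean a more potent compound.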