From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is C[C@H]1CN(S(=O)(=O)C[C@]23CC[C@H](C[C@@H]2O)C3(C)C)CCN1c1ccc(C(F)(F)F)cn1. The target protein (O88410) has sequence MYLEVSERQVLDASDFAFLLENSTSPYDYGENESDFSDSPPCPQDFSLNFDRTFLPALYSLLFLLGLLGNGAVAAVLLSQRTALSSTDTFLLHLAVADVLLVLTLPLWAVDAAVQWVFGPGLCKVAGALFNINFYAGAFLLACISFDRYLSIVHATQIYRRDPRVRVALTCIVVWGLCLLFALPDFIYLSANYDQRLNATHCQYNFPQVGRTALRVLQLVAGFLLPLLVMAYCYAHILAVLLVSRGQRRFRAMRLVVVVVAAFAVCWTPYHLVVLVDILMDVGVLARNCGRESHVDVAKSVTSGMGYMHCCLNPLLYAFVGVKFREQMWMLFTRLGRSDQRGPQRQPSSSRRESSWSETTEASYLGL. The pIC50 is 6.7. (2) The target protein (Q9NPA2) has sequence MRLRLRLLALLLLLLAPPARAPKPSAQDVSLGVDWLTRYGYLPPPHPAQAQLQSPEKLRDAIKVMQRFAGLPETGRMDPGTVATMRKPRCSLPDVLGVAGLVRRRRRYALSGSVWKKRTLTWRVRSFPQSSQLSQETVRVLMSYALMAWGMESGLTFHEVDSPQGQEPDILIDFARAFHQDSYPFDGLGGTLAHAFFPGEHPISGDTHFDDEETWTFGSKDGEGTDLFAVAVHEFGHALGLGHSSAPNSIMRPFYQGPVGDPDKYRLSQDDRDGLQQLYGKAPQTPYDKPTRKPLAPPPQPPASPTHSPSFPIPDRCEGNFDAIANIRGETFFFKGPWFWRLQPSGQLVSPRPARLHRFWEGLPAQVRVVQAAYARHRDGRILLFSGPQFWVFQDRQLEGGARPLTELGLPPGEEVDAVFSWPQNGKTYLVRGRQYWRYDEAAARPDPGYPRDLSLWEGAPPSPDDVTVSNAGDTYFFKGAHYWRFPKNSIKTEPDAPQP.... The small molecule is CCOc1ccc(-c2ccccc2C(=O)NO)cc1. The pIC50 is 5.6. (3) The drug is NC(=O)[C@@H]1CCCN1C(=O)[C@H](Cc1nc(I)n(Cc2ccccc2)c1I)NC(=O)c1cnccn1. The target protein sequence is MDGPSNVSLVHGDTTLGLPEYKVVSVLLVLLVCTVGIVGNAMVVLVVLTSRDMHTPTNCYLVSLALADLIVLLAAGLPNVSDSLVGHWIYGHAGCLGITYFQYLGINVSSCSILAFTVERYIAICHPMRAQTVCTVARARRIIAGIWGVTSLYCLLWFFLVDLNVRDNQRLECGYKVSRGLYLPIYLLDFAVFFIAPLLGTLVLYGFIGRILFQSPLSQEAWQKERQSHGQSEGTPGNCSRSKSSMSSRKQ. The pIC50 is 4.3. 
(4) The small molecule is CCCCCCCCCCCC(=O)O[C@@H]1[C@H](OC)[C@@H]([C@@H](O[C@H]2OC(C(=O)N[C@H]3CCC[C@@H](C)NC3=O)=C[C@H](O)[C@@H]2O)C(N)=O)O[C@H]1n1ccc(=O)[nH]c1=O. The target protein (P0A6W3) has sequence MLVWLAEHLVKYYSGFNVFSYLTFRAIVSLLTALFISLWMGPRMIAHLQKLSFGQVVRNDGPESHFSKRGTPTMGGIMILTAIVISVLLWAYPSNPYVWCVLVVLVGYGVIGFVDDYRKVVRKDTKGLIARWKYFWMSVIALGVAFALYLAGKDTPATQLVVPFFKDVMPQLGLFYILLAYFVIVGTGNAVNLTDGLDGLAIMPTVFVAGGFALVAWATGNMNFASYLHIPYLRHAGELVIVCTAIVGAGLGFLWFNTYPAQVFMGDVGSLALGGALGIIAVLLRQEFLLVIMGGVFVVETLSVILQVGSFKLRGQRIFRMAPIHHHYELKGWPEPRVIVRFWIISLMLVLIGLATLKVR. The pIC50 is 5.7. (5) The small molecule is CNCCCOc1c(Br)cc(/C=C/C(=O)NCCCOc2c(Br)cc(CCN(C)C)cc2Br)cc1Br. The target protein (Q12778) has sequence MAEAPQVVEIDPDFEPLPRPRSCTWPLPRPEFSQSNSATSSPAPSGSAAANPDAAAGLPSASAAAVSADFMSNLSLLEESEDFPQAPGSVAAAVAAAAAAAATGGLCGDFQGPEAGCLHPAPPQPPPPGPLSQHPPVPPAAAGPLAGQPRKSSSSRRNAWGNLSYADLITKAIESSAEKRLTLSQIYEWMVKSVPYFKDKGDSNSSAGWKNSIRHNLSLHSKFIRVQNEGTGKSSWWMLNPEGGKSGKSPRRRAASMDNNSKFAKSRSRAAKKKASLQSGQEGAGDSPGSQFSKWPASPGSHSNDDFDNWSTFRPRTSSNASTISGRLSPIMTEQDDLGEGDVHSMVYPPSAAKMASTLPSLSEISNPENMENLLDNLNLLSSPTSLTVSTQSSPGTMMQQTPCYSFAPPNTSLNSPSPNYQKYTYGQSSMSPLPQMPIQTLQDNKSSYGGMSQYNCAPGLLKELLTSDSPPHNDIMTPVDPGVAQPNSRVLGQNVMMGP.... The pIC50 is 4.7. (6) The small molecule is NC(=O)CCN1CCN(CCOC2Cc3cc(Cl)ccc3Sc3ccccc32)CC1. The pIC50 is 4.8. The target protein (O60240) has sequence MAVNKGLTLLDGDLPEQENVLQRVLQLPVVSGTCECFQKTYTSTKEAHPLVASVCNAYEKGVQSASSLAAWSMEPVVRRLSTQFTAANELACRGLDHLEEKIPALQYPPEKIASELKDTISTRLRSARNSISVPIASTSDKVLGAALAGCELAWGVARDTAEFAANTRAGRLASGGADLALGSIEKVVEYLLPPDKEESAPAPGHQQAQKSPKAKPSLLSRVGALTNTLSRYTVQTMARALEQGHTVAMWIPGVVPLSSLAQWGASVAMQAVSRRRSEVRVPWLHSLAAAQEEDHEDQTDTEGEDTEEEEELETEENKFSEVAALPGPRGLLGGVAHTLQKTLQTTISAVTWAPAAVLGMAGRVLHLTPAPAVSSTKGRAMSLSDALKGVTDNVVDTVVHYVPLPRLSLMEPESEFRDIDNPPAEVERREAERRASGAPSAGPEPAPRLAQPRRSLRSAQSPGAPPGPGLEDEVATPAAPRPGFPAVPREKPKRRVSDSF.... (7) The small molecule is COc1ccc(Cn2c(CCc3c[nH]c4ccccc34)nnc2[C@@H](Cc2c[nH]c3ccccc23)NC(=O)C(C)(C)N)cc1. 
The target protein (Q92847) has sequence MWNATPSEEPGFNLTLADLDWDASPGNDSLGDELLQLFPAPLLAGVTATCVALFVVGIAGNLLTMLVVSRFRELRTTTNLYLSSMAFSDLLIFLCMPLDLVRLWQYRPWNFGDLLCKLFQFVSESCTYATVLTITALSVERYFAICFPLRAKVVVTKGRVKLVIFVIWAVAFCSAGPIFVLVGVEHENGTDPWDTNECRPTEFAVRSGLLTVMVWVSSIFFFLPVFCLTVLYSLIGRKLWRRRRGDAVVGASLRDQNHKQTVKMLAVVVFAFILCWLPFHVGRYLFSKSFEPGSLEIAQISQYCNLVSFVLFYLSAAINPILYNIMSKKYRVAVFRLLGFEPFSQRKLSTLKDESSRAWTESSINT. The pIC50 is 8.2. (8) The small molecule is CC#CCn1c(N2CCC[C@@H](N)C2)nc2c1c(=O)n(Cc1nc(C)c3ccccc3n1)c(=O)n2C. The target protein (Q8TCC7) has sequence MTFSEILDRVGSMGHFQFLHVAILGLPILNMANHNLLQIFTAATPVHHCRPPHNASTGPWVLPMGPNGKPERCLRFVHPPNASLPNDTQRAMEPCLDGWVYNSTKDSIVTEWDLVCNSNKLKEMAQSIFMAGILIGGLVLGDLSDRFGRRPILTCSYLLLAASGSGAAFSPTFPIYMVFRFLCGFGISGITLSTVILNVEWVPTRMRAIMSTALGYCYTFGQFILPGLAYAIPQWRWLQLTVSIPFFVFFLSSWWTPESIRWLVLSGKSSKALKILRRVAVFNGKKEEGERLSLEELKLNLQKEISLAKAKYTASDLFRIPMLRRMTFCLSLAWFATGFAYYSLAMGVEEFGVNLYILQIIFGGVDVPAKFITILSLSYLGRHTTQAAALLLAGGAILALTFVPLDLQTVRTVLAVFGKGCLSSSFSCLFLYTSELYPTVIRQTGMGVSNLWTRVGSMVSPLVKITGEVQPFIPNIIYGITALLGGSAALFLPETLNQPL.... The pIC50 is 4.0.
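
The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration only; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers and are not part of the bindingdb_ic50 dataset or any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # pIC50 values taken from examples (1), (7) and (8) above.
    for pic50 in (6.7, 8.2, 4.0):
        print(f"pIC50 {pic50} -> IC50 of {pic50_to_ic50_nm(pic50):.3g} nM")
```

Under this definition the pIC50 of 6.7 in example (1) corresponds to an IC50 of roughly 200 nM, and each additional pIC50 unit means a tenfold lower IC50, which is why higher values indicate a more potent compound.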