From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)(C)c1ccc(CN(Cc2cccc(CCC(=O)O)c2)S(=O)(=O)c2ccccn2)cc1. The target protein (Q62928) has sequence MDNSFNDSRRVENCESRQYLLSDESPAISSVMFTAGVLGNLIALALLARRWRGDTGCSAGSRTSISLFHVLVTELVLTDLLGTCLISPVVLASYSRNQTLVALAPESRACTYFAFTMTFFSLATMLMLFAMALERYLAIGHPYFYRRRVSRRGGLAVLPAIYGVSLLFCSLPLLNYGEYVQYCPGTWCFIQHGRTAYLQLYATVLLLLIVAVLGCNISVILNLIRMQLRSKRSRCGLSGSSLRGPGSRRRGERTSMAEETDHLILLAIMTITFAVCSLPFTIFAYMDETSSRKEKWDLRALRFLSVNSIIDPWVFVILRPPVLRLMRSVLCCRTSLRAPEAPGASCSTQQTDLCGQL. The pIC50 is 7.3. (2) The compound is C[C@H](NC(=O)[C@H](C)NC(=O)[C@H](CS)NC(=O)[C@H](C)NC(=O)[C@H]1CCCN1C(=O)[C@H](Cc1cnc[nH]1)NC(=O)[C@H](CO)NC(=O)[C@H](CS)NC(=O)[C@H](CS)NC(=O)CNC(=O)CN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CS)C(=O)O. The target protein (P30277) has sequence MALRVTRNTKINTENKAKVSMAGAKRVPVAVAASKPLLRSRTALGDIGNKVSEQSRIPLKKETKKLGSGTVTVKALPKPVDKVPVCEPEVELDEPEPEPVMEVKHSPEPILVDTPSPSPMETSGCAPAEEYLCQAFSDVILAVSDVDADDGGDPNLCSEYVKDIYAYLRQLEEEQSVRPKYLLGREVTGNMRAILIDWLIQVQMKFRLLQETMYMTVSIIDRFMQDSCVPKKMLQLVGVTAMFIASKYEEMYPPEIGDFAFVTNNTYTKHQIRQMEMKILRVLNFSLGRPLPLHFLRRASKIGEVDVEQHTLAKYLMELSMLDYDMVHFAPSQIAAGAFCLALKILDNGEWTPTLQHYLSHTEESLLPVMQHLAKNIVMVNRGLTKHMTIKNKYATSKHAKISTLAQLNCTLVQNLSKAVTKA. The pIC50 is 6.4. (3) The pIC50 is 4.6. 
The target protein (Q8NFI3) has sequence MEAAAVTVTRSATRRRRRQLQGLAAPEAGTQEEQEDQEPRPRRRRPGRSIKDEEEETVFREVVSFSPDPLPVRYYDKDTTKPISFYLSSLEELLAWKPRLEDGFNVALEPLACRQPPLSSQRPRTLLCHDMMGGYLDDRFIQGSVVQTPYAFYHWQCIDVFVYFSHHTVTIPPVGWTNTAHRHGVCVLGTFITEWNEGGRLCEAFLAGDERSYQAVADRLVQITQFFRFDGWLINIENSLSLAAVGNMPPFLRYLTTQLHRQVPGGLVLWYDSVVQSGQLKWQDELNQHNRVFFDSCDGFFTNYNWREEHLERMLGQAGERRADVYVGVDVFARGNVVGGRFDTDKSLELIRKHGFSVALFAPGWVYECLEKKDFFQNQDKFWGRLERYLPTHSICSLPFVTSFCLGMGARRVCYGQEEAVGPWYHLSAQEIQPLFGEHRLGGDGRGWVRTHCCLEDAWHGGSSLLVRGVIPPEVGNVAVRLFSLQAPVPPKIYLSMVYK.... The small molecule is COc1ccc2[nH]c([S+]([O-])Cc3ncc(C)c(OC)c3C)nc2c1. (4) The compound is CN(CC(=O)NNS(C)(=O)=O)S(=O)(=O)c1ccc(Cl)cc1. The target protein sequence is MLQLKWVFSVMWLFSRLTVCKTYVVRRSGKAASPDESLSKSFSDQPFTSLAGSCKKRCFELVEADPPNCRCDNLCKTYNMCCSDFDDHCLKTAGGFECSKERCGENRNEQHACHCSEDCMAKGDCCTNYRSLCKGDVPWLQEECEEIKNHECPAGFVRPPVIMLSVDGFRASYMKRGGTVIPNIEKLRSCGTHAPYMRPMYPTKTYPNLYTITTGLYPESHGIVGNSIHDPSFDANFNFRGKEKLNHRWWGGQPIWITAMKQGVKAGSFFWPVAIPMERRVLTMLQWLHLPDAERPYLYAMHSEQLDSYGHKLGPHSTELNSALRDVDKVIGQLMNGLKQMKLHRCINIILVGDHGMEEAHCDKTEFLSSYMSNTEDLILIPGSLGRIRARNPNNSKFDAKAVVANLTCKKPDQHFKPYLKQHLPKRLHYANNDRIEEIHLMVERKWHIARKIMKTKRNHEKCGFAGDHGYDNKINSMQTIFLGYGPAFKFKTKIPPFEN.... The pIC50 is 8.0. (5) The small molecule is FC(F)(F)Oc1ccc(CN2CCC3(CC2)OC(c2cc(Cl)ccn2)c2ccccc23)cc1. The target protein (Q7TMR0) has sequence MGCRALLLLSFLLLGAATTIPPRLKTLGSPHLSASPTPDPAVARKYSVLYFEQKVDHFGFADMRTFKQRYLVADKHWQRNGGSILFYTGNEGDIVWFCNNTGFMWDVAEELKAMLVFAEHRYYGESLPFGQDSFKDSQHLNFLTSEQALADFAELIRHLEKTIPGAQGQPVIAIGGSYGGMLAAWFRMKYPHIVVGALAASAPIWQLDGMVPCGEFMKIVTNDFRKSGPYCSESIRKSWNVIDKLSGSGSGLQSLTNILHLCSPLTSEKIPTLKGWIAETWVNLAMVNYPYACNFLQPLPAWPIKEVCQYLKNPNVSDTVLLQNIFQALSVYYNYSGQAACLNISQTTTSSLGSMGWSFQACTEMVMPFCTNGIDDMFEPFLWDLEKYSNDCFNQWGVKPRPHWMTTMYGGKNISSHSNIIFSNGELDPWSGGGVTRDITDTLVAINIHDGAHHLDLRAHNAFDPSSVLLSRLLEVKHMKKWILDFYSNIQ. The pIC50 is 9.6. (6) The small molecule is CC[C@@H](CS(=O)(=O)CC)N1C(=O)[C@@](CC)(CC(=O)O)C[C@H](c2cccc(Cl)c2)[C@H]1c1ccc(Cl)cc1. 
The target protein sequence is MCNTNMSVPTDGAVTTSQIPASEQETLVRPKPLLLKLLKSVGAQKDTYTMKEVLFYLGQYIMTKRLYDEKQQHIVYCSNDLLGDLFGVPSFSVKEHRKIYTMIYRNLVVVNQQESSDSGTSVSENRCHLEGGSDQKDLVQELQEEKPSSSHLVSRPSTSSRRRAISETEENSDELSGERQRKRHKSDS. The pIC50 is 9.3. (7) The pIC50 is 6.1. The target protein (P51576) has sequence MARRLQDELSAFFFEYDTPRMVLVRNKKVGVIFRLIQLVVLVYVIGWVFVYEKGYQTSSGLISSVSVKLKGLAVTQLQGLGPQVWDVADYVFPAHGDSSFVVMTNFIMTPQQAQGHCAENPEGGICQDDSGCTPGKAERKAQGIRTGNCVPFNGTVKTCEIFGWCPVEVDDKIPSPALLHEAENFTLFIKNSISFPRFKVNRRNLVEEVNGTYMKKCLYHKILHPLCPVFSLGYVVRESGQDFRSLAEKGGVVGITIDWECDLDWHVRHCKPIYQFHGLYGEKNLSPGFNFRFARHFVQNGTNRRHLFKVFGIRFDILVDGKAGKFDIIPTMTTIGSGIGIFGVATVLCDLLLLHILPKRHYYKQKKFKYAEDMGPGEGERDPAATSSTLGLQENMRTS. The small molecule is Cc1nc(/C=C/c2ccccc2)c(C(=O)O)c(C(=O)O)c1O.
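Note on the label scale: the definition given above, pIC50 = -log10(IC50 in M), can be sketched in Python as below. This is a minimal illustration, not part of the dataset tooling; the function names are hypothetical, and the only assumption is that IC50 is supplied in mol/L as stated in the task description.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 nM inhibitor (1e-9 M) sits at pIC50 = 9 on this scale.
    print(pic50_from_ic50_molar(1e-9))
    # Example (1) above reports pIC50 7.3, i.e. an IC50 of roughly 50 nM.
    print(round(ic50_nm_from_pic50(7.3), 1))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, which is why higher values mean more potent binding.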