From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCCC[C@@H](O)/C=C(C)/C=C/C=C\C(=O)N1CCCC1=O. The target protein (Q9QXE2) has sequence MDPQGIVKAFPKRKKSHADLSSKALAKIPKREVGEARGWLSSLRAHIMPAGIGRARAELFEKQIIHHGGQVCSAQAPGVTHIVVDEDMDYERALRLLRLPQLPPGAQLVKSTWLSLCLQEGRLTDTEGFSLPMPKRSLDEPQPSKSGQDASAPGTQRDLPRTTLSLSPPHTRAVSPPPTAEKPSRTQAQLSSEDETSDGEGPQVSSADLQALITGHYPTPPEEDGGPDPAPEALDKWVCAQPSSQKATNYNLHITEKLEVLAKAYSVQGDKWRALGYAKAINALKSFHKPVSSYQEACSIPGIGKRMAEKVMEILESGHLRKLDHISDSVPVLELFSNIWGAGTKTAQMWYHQGFRNLEDLQSLGSLTAQQAIGLKHYDDFLDRMPREEAAEIEQTVRISAQAFNPGLLCVACGSYRRGKMTCGDVDVLITHPDGRSHRGIFSCLLDSLRQQGFLTDDLVSQEENGQQQKYLGVCRLPGPGKRHRRLDIIVVPYCEFACA.... The pIC50 is 3.7. (2) The drug is O=C(O)Cc1ccccc1Nc1c(Cl)cccc1Cl. The target protein (P19224) has sequence MACLLRSFQRISAGVFFLALWGMVVGDKLLVVPQDGSHWLSMKDIVEVLSDRGHEIVVVVPEVNLLLKESKYYTRKIYPVPYDQEELKNRYQSFGNNHFAERSFLTAPQTEYRNNMIVIGLYFINCQSLLQDRDTLNFFKESKFDALFTDPALPCGVILAEYLGLPSVYLFRGFPCSLEHTFSRSPDPVSYIPRCYTKFSDHMTFSQRVANFLVNLLEPYLFYCLFSKYEELASAVLKRDVDIITLYQKVSVWLLRYDFVLEYPRPVMPNMVFIGGINCKKRKDLSQEFEAYINASGEHGIVVFSLGSMVSEIPEKKAMAIADALGKIPQTVLWRYTGTRPSNLANNTILVKWLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTLNVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVEPLDLAVFWVEFVMRHKGAPHLRPAAHDLTWYQYHSLDVIGFLLAVVLT.... The pIC50 is 4.3. (3) The drug is Cc1nc(NCCN2CCOCC2)cc(Nc2cc(-c3cccc(NS(=O)(=O)c4ccccc4)c3)n[nH]2)n1. 
The target protein (P36888) has sequence MPALARDGGQLPLLVVFSAMIFGTITNQDLPVIKCVLINHKNNDSSVGKSSSYPMVSESPEDLGCALRPQSSGTVYEAAAVEVDVSASITLQVLVDAPGNISCLWVFKHSSLNCQPHFDLQNRGVVSMVILKMTETQAGEYLLFIQSEATNYTILFTVSIRNTLLYTLRRPYFRKMENQDALVCISESVPEPIVEWVLCDSQGESCKEESPAVVKKEEKVLHELFGTDIRCCARNELGRECTRLFTIDLNQTPQTTLPQLFLKVGEPLWIRCKAVHVNHGFGLTWELENKALEEGNYFEMSTYSTNRTMIRILFAFVSSVARNDTGYYTCSSSKHPSQSALVTIVEKGFINATNSSEDYEIDQYEEFCFSVRFKAYPQIRCTWTFSRKSFPCEQKGLDNGYSISKFCNHKHQPGEYIFHAENDDAQFTKMFTLNIRRKPQVLAEASASQASCFSDGYPLPSWTWKKCSDKSPNCTEEITEGVWNRKANRKVFGQWVSSST.... The pIC50 is 7.2. (4) The compound is Clc1ccc2ccccc2c1. The target protein sequence is MLASGMLLVALLVCLTVMVLMSVWQQRKSKGKLPPGPTPLPFIGNYLQLNTEQMYNSLMKISERYGPVFTIHLGPRRVVVLCGHDAVREALVDQAEEFSGRGEQATFDWVFKGYGVVFSNGERAKQLRRFSIATLRDFGVGKRGIEERIQEEAGFLIDALRGTGGANIDPTFFLSRTVSNVISSIVFGDRFDYKDKEFLSLLRMMLGIFQFTSTSTGQLYEMFSSVMKHLPGPQQQAFQLLQGLEDFIAKKVEHNQRTLDPNSPRDFIDSFLIRMQEEEKNPNTEFYLKNLVMTTLNLFIGGTETVSTTLRYGFLLLMKHPEVEAKVHEEIDRVIGKNRQPKFEDRAKMPYMEAVIHEIQRFGDVIPMSLARRVKKDTKFRDFFLPKGTEVFPMLGSVLRDPSFFSNPQDFNPQHFLNEKGQFKKSDAFVPFSIGKRNCFGEGLARMELFLFFTTVMQNFRLKSSQSPKDIDVSPKHVGFATIPRNYTMSFLPR. The pIC50 is 5.3. (5) The compound is CC1=C(CC/C(C)=C/CCC2=CC[C@H](C3=CC(=O)O[C@H]3O)O[C@H]2O)C(C)(C)CCC1. The target protein (P14423) has sequence MKVLLLLAVVIMAFGSIQVQGSLLEFGQMILFKTGKRADVSYGFYGCHCGVGGRGSPKDATDWCCVTHDCCYNRLEKRGCGTKFLTYKFSYRGGQISCSTNQDSCRKQLCQCDKAAAECFARNKKSYSLKYQFYPNKFCKGKTPSC. The pIC50 is 4.4. (6) The drug is CC(C)C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](Cn1cc(Cc2ccccc2)nn1)NC(=O)[C@H](CO)NC(=O)CN)C(N)=O. 
The target protein (Q96T53) has sequence MEWLWLFFLHPISFYQGAAFPFALLFNYLCIMDSFSTRARYLFLLTGGGALAVAAMGSYAVLVFTPAVCAVALLCSLAPQQVHRWTFCFQMSWQTLCHLGLHYTEYYLHEPPSVRFCITLSSLMLLTQRVTSLSLDICEGKVKAASGGFRSRSSLSEHVCKALPYFSYLLFFPALLGGSLCSFQRFQARVQGSSALHPRHSFWALSWRGLQILGLECLNVAVSRVVDAGAGLTDCQQFECIYVVWTTAGLFKLTYYSHWILDDSLLHAAGFGPELGQSPGEEGYVPDADIWTLERTHRISVFSRKWNQSTARWLRRLVFQHSRAWPLLQTFAFSAWWHGLHPGQVFGFVCWAVMVEADYLIHSFANEFIRSWPMRLFYRTLTWAHTQLIIAYIMLAVEVRSLSSLWLLCNSYNSVFPMVYCILLLLLAKRKHKCN. The pIC50 is 5.2. (7) The drug is [NH3+][Pt]1([NH3+])OC(=O)C2(CCC2)C(=O)O1. The target protein (P0A7G6) has sequence MAIDENKQKALAAALGQIEKQFGKGSIMRLGEDRSMDVETISTGSLSLDIALGAGGLPMGRIVEIYGPESSGKTTLTLQVIAAAQREGKTCAFIDAEHALDPIYARKLGVDIDNLLCSQPDTGEQALEICDALARSGAVDVIVVDSVAALTPKAEIEGEIGDSHMGLAARMMSQAMRKLAGNLKQSNTLLIFINQIRMKIGVMFGNPETTTGGNALKFYASVRLDIRRIGAVKEGENVVGSETRVKVVKNKIAAPFKQAEFQILYGEGINFYGELVDLGVKEKLIEKAGAWYSYKGEKIGQGKANATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF. The pIC50 is 4.5. (8) The target protein sequence is PISPIEPVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAIKKKDSTRWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKRSVTVLDVGDAYFSVPLDKEFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLKWGFTTPDKKHQKEPPFLWMGYEHHPDKWTVQPIVLPEKDSWTVNDIQK. The compound is Cn1c2ccccc2c2c1cc(C#N)c(=O)n2-c1ccc([N+](=O)[O-])cc1. The pIC50 is 7.7. (9) The compound is CC1CCCCN1C(=O)C(=O)O. The target protein sequence is MSSVKEQLIENLIEEDEVSQSKITIVGTGAVGMACAICILLKDLADELALVDVVTDKLKGETMDLQHGSLFFNTPKIVSGKDYTVSANSKLVIITAGARQQEGESRLNLVQRNVDIMKSVIPAIVQNSPDCKMLIVSNPVDILTYVVWKLSGLPATRVIGSGCNLDSARFRYLIGQKLGVHPSSCHGWIIGEHGDSSVPLWSGVNVAGVALKSLDPKLGSDSDKDSWKNIHKEVVGSAYEIIKLKGYTSWGIGLSVTDLVKSILKNLRRVHPVSTMVKGSYGIKEEIFLSIPCVLGRNGVSDVVKVNLNSEEEALLKKSASTLWNVQKDLKF. The pIC50 is 4.4.
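
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines. This is only an illustrative helper, not part of the dataset: the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50` are hypothetical, and the nanomolar variant assumes the raw IC50 values are reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed unit)."""
    # 1 nM = 1e-9 M, so rescale to mol/L before taking the logarithm.
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (3) above.
    for label in (3.7, 7.2):
        ic50_m = pic50_to_ic50(label)
        print(f"pIC50 {label} -> IC50 {ic50_m:.3e} M")
```

Read this way, the label 7.2 in example (3) corresponds to an IC50 of roughly 63 nM, while the label 3.7 in example (1) corresponds to roughly 200 µM, which is why a higher pIC50 means a more potent compound.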