This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is [NH3+][C@H]1CN(c2ccc3nnc(C(F)(F)F)n3n2)CC[C@@H]1c1cc(F)c(F)cc1F. The target protein (Q9UHL4) has sequence MGSAPWAPVLLLALGLRGLQAGARRAPDPGFQERFFQQRLDHFNFERFGNKTFPQRFLVSDRFWVRGEGPIFFYTGNEGDVWAFANNSAFVAELAAERGALLVFAEHRYYGKSLPFGAQSTQRGHTELLTVEQALADFAELLRALRRDLGAQDAPAIAFGGSYGGMLSAYLRMKYPHLVAGALAASAPVLAVAGLGDSNQFFRDVTADFEGQSPKCTQGVREAFRQIKDLFLQGAYDTVRWEFGTCQPLSDEKDLTQLFMFARNAFTVLAMMDYPYPTDFLGPLPANPVKVGCDRLLSEAQRITGLRALAGLVYNASGSEHCYDIYRLYHSCADPTGCGTGPDARAWDYQACTEINLTFASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTSFWGGDLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLDLRASHPEDPASVVEARKLEATIIGEWVKAARREQQPALRGGPRLSL. The pIC50 is 4.0. (2) The drug is CC(=O)N1c2ccc(C(N)=O)cc2C(C)(c2ccccc2)CC1(C)C. The target protein (P01223) has sequence MTATFLMSMIFGLACGQAMSFCIPTEYMMHVERKECAYCLTINTTVCAGYCMTRDVNGKLFLPKYALSQDVCTYRDFMYKTAEIPGCPRHVTPYFSYPVAISCKCGKCNTDYSDCIHEAIKTNYCTKPQKSYMVGFSI. The pIC50 is 6.4. (3) The compound is O=C(c1n[nH]c2c1CN(C(=O)N1CCNCC1)C2)N1CCC(c2ccc(F)c(F)c2C(F)(F)F)CC1. The target protein (P02753) has sequence MKWVWALLLLAALGSGRAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTEDPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIVHNGYCDGRSERNLL. The pIC50 is 7.5. (4) The drug is CC(C)=CCc1cccc2ccc(=O)oc12. 
The target protein (O14980) has sequence MPAIMTMLADHAARQLLDFSQKLDINLLDNVVNCLYHGEGAQQRMAQEVLTHLKEHPDAWTRVDTILEFSQNMNTKYYGLQILENVIKTRWKILPRNQCEGIKKYVVGLIIKTSSDPTCVEKEKVYIGKLNMILVQILKQEWPKHWPTFISDIVGASRTSESLCQNNMVILKLLSEEVFDFSSGQITQVKSKHLKDSMCNEFSQIFQLCQFVMENSQNAPLVHATLETLLRFLNWIPLGYIFETKLISTLIYKFLNVPMFRNVSLKCLTEIAGVSVSQYEEQFVTLFTLTMMQLKQMLPLNTNIRLAYSNGKDDEQNFIQNLSLFLCTFLKEHDQLIEKRLNLRETLMEALHYMLLVSEVEETEIFKICLEYWNHLAAELYRESPFSTSASPLLSGSQHFDVPPRRQLYLPMLFKVRLLMVSRMAKPEEVLVVENDQGEVVREFMKDTDSINLYKNMRETLVYLTHLDYVDTERIMTEKLHNQVNGTEWSWKNLNTLCWA.... The pIC50 is 5.2. (5) The drug is c1cc(-c2cnn3cc(-c4ccc(OCCN5CCCCC5)cc4)cnc23)ccn1. The pIC50 is 5.5. The target protein (Q61271) has sequence MAESAGASSFFPLVVLLLAGSGGSGPRGIQALLCACTSCLQTNYTCETDGACMVSIFNLDGVEHHVRTCIPKVELVPAGKPFYCLSSEDLRNTHCCYIDFCNKIDLRVPSGHLKEPAHPSMWGPVELVGIIAGPVFLLFLIIIIVFLVINYHQRVYHNRQRLDMEDPSCEMCLSKDKTLQDLVYDLSTSGSGSGLPLFVQRTVARTIVLQEIIGKGRFGEVWRGRWRGGDVAVKIFSSREERSWFREAEIYQTVMLRHENILGFIAADNKDNGTWTQLWLVSDYHEHGSLFDYLNRYTVTIEGMIKLALSAASGLAHLHMEIVGTQGKPGIAHRDLKSKNILVKKNGMCAIADLGLAVRHDAVTDTIDIAPNQRVGTKRYMAPEVLDETINMKHFDSFKCADIYALGLVYWEIARRCNSGGVHEDYQLPYYDLVPSDPSIEEMRKVVCDQKLRPNVPNWWQSYEALRVMGKMMRECWYANGAARLTALRIKKTLSQLSVQ.... (6) The drug is NC1CC(c2ccccc2)(c2ccccc2)C1. The target protein (Q60857) has sequence METTPLNSQKVLSECKDKEDCQENGVLQKGVPTPADKAGPGQISNGYSAVPSTSAGDEAPHSTPAATTTLVAEIHQGERETWGKKMDFLLSVIGYAVDLGNIWRFPYICYQNGGGAFLLPYTIMAIFGGIPLFYMELALGQYHRNGCISIWKKICPIFKGIGYAICIIAFYIASYYNTIIAWALYYLISSFTDQLPWTSCKNSWNTGNCTNYFAQDNITWTLHSTSPAEEFYLRHVLQIHQSKGLQDLGTISWQLALCIMLIFTIIYFSIWKGVKTSGKVVWVTATFPYIVLSVLLVRGATLPGAWRGVVFYLKPNWQKLLETGVWVDAAAQIFFSLGPGFGVLLAFASYNKFNNNCYQDALVTSVVNCMTSFVSGFVIFTVLGYMAEMRNEDVSEVAKDAGPSLLFITYAEAIANMPASTFFAIIFFLMLITLGLDSTFAGLEGVITAVLDEFPHIWAKRREWFVLIVVITCILGSLLTLTSGGAYVVTLLEEYATGPA.... The pIC50 is 5.5. (7) The small molecule is CC(C)CCS(=O)(=O)Nc1cc(-c2ccc(=O)n(C)c2)nc(Oc2c(F)cccc2Cl)n1. 
The target protein (Q92793) has sequence MAENLLDGPPNPKRAKLSSPGFSANDSTDFGSLFDLENDLPDELIPNGGELGLLNSGNLVPDAASKHKQLSELLRGGSGSSINPGIGNVSASSPVQQGLGGQAQGQPNSANMASLSAMGKSPLSQGDSSAPSLPKQAASTSGPTPAASQALNPQAQKQVGLATSSPATSQTGPGICMNANFNQTHPGLLNSNSGHSLINQASQGQAQVMNGSLGAAGRGRGAGMPYPTPAMQGASSSVLAETLTQVSPQMTGHAGLNTAQAGGMAKMGITGNTSPFGQPFSQAGGQPMGATGVNPQLASKQSMVNSLPTFPTDIKNTSVTNVPNMSQMQTSVGIVPTQAIATGPTADPEKRKLIQQQLVLLLHAHKCQRREQANGEVRACSLPHCRTMKNVLNHMTHCQAGKACQVAHCASSRQIISHWKNCTRHDCPVCLPLKNASDKRNQQTILGSPASGIQNTIGSVGTGQQNATSLSNPNPIDPSSMQRAYAALGLPYMNQPQTQL.... The pIC50 is 6.3. (8) The small molecule is CC(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)N[C@H]1CC(=O)OC1O)C(C)C. The target protein (P29452) has sequence MADKILRAKRKQFINSVSIGTINGLLDELLEKRVLNQEEMDKIKLANITAMDKARDLCDHVSKKGPQASQIFITYICNEDCYLAGILELQSAPSAETFVATEDSKGGHPSSSETKEEQNKEDGTFPGLTGTLKFCPLEKAQKLWKENPSEIYPIMNTTTRTRLALIICNTEFQHLSPRVGAQVDLREMKLLLEDLGYTVKVKENLTALEMVKEVKEFAACPEHKTSDSTFLVFMSHGIQEGICGTTYSNEVSDILKVDTIFQMMNTLKCPSLKDKPKVIIIQACRGEKQGVVLLKDSVRDSEEDFLTDAIFEDDGIKKAHIEKDFIAFCSSTPDNVSWRHPVRGSLFIESLIKHMKEYAWSCDLEDIFRKVRFSFEQPEFRLQMPTADRVTLTKRFYLFPGH. The pIC50 is 7.2. (9) The drug is Cc1ccc(C(=O)Nc2cccc(C)c2N2CCC3(CC2)OCCO3)o1. The target protein sequence is MSKSKVDNQFYSVEVGDSTFTVLKRYQNLKPIGSGAQGIVCAAYDAVLDRNVAIKKLSRPFQNQTHAKRAYRELVLMKCVNHKNIISLLNVFTPQKTLEEFQDVYLVMELMDANLCQVIQMELDHERMSYLLYQMLCGIKHLHSAGIIHRDLKPSNIVVKSDCTLKILDFGLARTAGTSFMMTPYVVTRYYRAPEVILGMGYKENVDIWSVGCIMGEMVRHKILFPGRDYIDQWNKVIEQLGTPCPEFMKKLQPTVRNYVENRPKYAGLTFPKLFPDSLFPADSEHNKLKASQARDLLSKMLVIDPAKRISVDDALQHPYINVWYDPAXXXXXDEREHTIEEWKELIYKEVMNSE. The pIC50 is 6.3. 
(10) The target protein (P04578) has sequence MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNISRAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTK.... The small molecule is COCc1cc(C)n(CC(=O)Nc2ccc(Cl)cc2Cl)c(=O)c1C#N. The pIC50 is 4.7.
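
Note on the label scale: the definition above (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion, assuming the IC50 is supplied in mol/L. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical helpers for illustration only and are not part of BindingDB or of this dataset.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 (assumed helper, not a dataset API)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # 1 micromolar is 1e-6 mol/L
    print(ic50_to_pic50(1e-6))
    # A pIC50 of 4.0 corresponds to an IC50 of about 1e-4 mol/L
    print(pic50_to_ic50(4.0))
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.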