This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P30083) has sequence MRPPSPPHVRWLCVLAGALACALRPAGSQAASPQHECEYLQLIEIQRQQCLEEAQLENETTGCSKMWDNLTCWPTTPRGQAVVLDCPLIFQLFAPIHGYNISRSCTEEGWSQLEPGPYHIACGLNDRASSLDEQQQTKFYNTVKTGYTIGYSLSLASLLVAMAILSLFRKLHCTRNYIHMHLFMSFILRATAVFIKDMALFNSGEIDHCSEASVGCKAAVVFFQYCVMANFFWLLVEGLYLYTLLAVSFFSERKYFWGYILIGWGVPSVFITIWTVVRIYFEDFGCWDTIINSSLWWIIKAPILLSILVNFVLFICIIRILVQKLRPPDIGKNDSSPYSRLAKSTLLLIPLFGIHYVMFAFFPDNFKAQVKMVFELVVGSFQGFVVAILYCFLNGEVQAELRRKWRRWHLQGVLGWSSKSQHPWGGSNGATCSTQVSMLTRVSPSARRSSSFQAEVSLV. The pIC50 is 4.3. The compound is Cc1ccccc1-c1cc(C(C)(C)C)cc(C(C)C)c1CO. (2) The drug is Cc1[nH]c(=O)c(C#N)cc1-c1ccncc1. The target protein sequence is PWLVGWWDQFKRMLNRELTHLSEMSRSGNQVSEYISTTFLDKQNEVDIPSPTMKDHEKQQAPRQRPSQQPPPPGPQFQPMSQITGVKKLMHSSSLNEDSSIPRFGVKTDQEELLAQEL. The pIC50 is 4.9. (3) The drug is CCCOc1cc(C2(C)CNC(=O)O2)ccc1OC. The target protein sequence is HMKVSDDEYTKLLHDGIQPVAAIDSNFASFTYTPRSLPEDDTSMAILSMLQDMNFINNYKIDCPTLARFCLMVKKGYRDPPYHNWMHAFSVSHFCYLLYKNLELTNYLEDIEIFALFISCMCHDLDHRGTNNSFQVASKSVLAALYSSEGSVMERHHFAQAIAILNTHGCNIFDHFSRKDYQRMLDLMRDIILATDLAHHLRIFKDLQKMAEVGYDRNNKQHHRLLLCLLMTSCDLSDQTKGWKTTRKIAELIYKEFFSQGDLEKAMGNRPMEMMDREKAYIPELQISFMEHIAMPIYKLLQDLFPKAAELYERVASNREHWTKVSHKFTIRGLPSNNSLDFLDEEYEVPDLDGTRAPINGCCSLDAE. The pIC50 is 3.7. (4) The small molecule is O=C(CN1C(=O)[C@@H](COC(=O)Nc2ccc(Cl)cc2)N=C(c2ccccc2)c2ccccc21)N[C@H]1CC(=O)OC1O. The target protein sequence is QINYDAVIKKYKGNENFDHAAYDWRLHSGVTPVKDQKNCGSCWAFSSIGSVESQYAIRKNKLITLSEQELVDCSFKNYGCNGGLINNAFEDMIELGGICTDDDYPYVSDAPNLCNIDRCTEKYGIKNYLSVPDNKLKEALRFLGPISISIAVSDDFPFYKEGIFDGECGDELNHAVMLVGFGMKEIVNPLTKKGEKHYYYIIKNSWGQQWGERGFINIETDESGLMRKCGLGTDAFIPLIE. The pIC50 is 5.0. 
(5) The small molecule is CC(CC(=O)O)CC(=O)c1ccc2oc3ccccc3c2c1. The target protein (P51639) has sequence MLSRLFRMHGLFVASHPWEVIVGTVTLTICMMSMNMFTGNNKICGWNYECPKFEEDVLSSDIIILTITRCIAILYIYFQFQNLRQLGSKYILGIAGLFTIFSSFVFSTVVIHFLDKELTGLNEALPFFLLLIDLSRASALAKFALSSNSQDEVRENIARGMAILGPTFTLDALVECLVIGVGTMSGVRQLEIMCCFGCMSVLANYFVFMTFFPACVSLVLELSRESREGRPIWQLSHFARVLEEEENKPNPVTQRVKMIMSLGLVLVHAHSRWIADPSPQNSTAEQSKVSLGLAEDVSKRIEPSVSLWQFYLSKMISMDIEQVITLSLALLLAVKYIFFEQAETESTLSLKNPITSPVVTPKKAQDNCCRREPLLVRRNQKLSSVEEDPGVNQDRKVEVIKPLVAEAETSGRATFVLGASAASPPLALGAQEPGIELPSEPRPNEECLQILESAEKGAKFLSDAEIIQLVNAKHIPAYKLETLMETHERGVSIRRQLLSA.... The pIC50 is 3.4. (6) The drug is COC(=O)[C@@H](NC(C)=O)[C@@H](C)O[C@H]1O[C@H](CO)[C@H](O)[C@H](O[C@@H]2O[C@H](CO)[C@H](O)[C@H](O[C@]3(C(=O)[O-])C[C@H](O)[C@@H](NC(C)=O)[C@H]([C@H](O)[C@H](O)CNC(=O)c4ccc(F)cc4)O3)[C@H]2O)[C@H]1NC(C)=O. The target protein (P20916) has sequence MIFLTALPLFWIMISASRGGHWGAWMPSSISAFEGTCVSIPCRFDFPDELRPAVVHGVWYFNSPYPKNYPPVVFKSRTQVVHESFQGRSRLLGDLGLRNCTLLLSNVSPELGGKYYFRGDLGGYNQYTFSEHSVLDIVNTPNIVVPPEVVAGTEVEVSCMVPDNCPELRPELSWLGHEGLGEPAVLGRLREDEGTWVQVSLLHFVPTREANGHRLGCQASFPNTTLQFEGYASMDVKYPPVIVEMNSSVEAIEGSHVSLLCGADSNPPPLLTWMRDGTVLREAVAESLLLELEEVTPAEDGVYACLAENAYGQDNRTVGLSVMYAPWKPTVNGTMVAVEGETVSILCSTQSNPDPILTIFKEKQILSTVIYESELQLELPAVSPEDDGEYWCVAENQYGQRATAFNLSVEFAPVLLLESHCAAARDTVQCLCVVKSNPEPSVAFELPSRNVTVNESEREFVYSERSGLVLTSILTLRGQAQAPPRVICTARNLYGAKSLE.... The pIC50 is 7.1. (7) The small molecule is CCCCCCCCCC(=O)C(O)c1cccc(C)c1. 
The target protein (Q9KM66) has sequence MIVSMDVIKRVYQYAEPNLSLVGWMGMLGFPAYYFIWEYWFPQSYENLGLRCAAAVLFGGLVFRDSMPKKWQRYMPGYFLFTIGFCLPFFFAFMMLMNDWSTIWAMSFMASIFLHILLVHDTRVMALQALFSVLVAYLAVYGLTDFHPTTLIEWQYIPIFLFTYVFGNLCFFRNQISHETKVSIAKTFGAGIAHEMRNPLSALKTSIDVVRTMIPKPQTAAHTDYSLDAQELDLLHQILNEADDVIYSGNNAIDLLLTSIDENRVSPASFKKHSVVDVIEKAVKTFPYKNAADQHSVELEVHQPFDFFGSDTLLTYALFNLLKNAFYYQKEHFSVCISIEQTSEHNLIRVRDNGVGIAPEMLEDIFRDFYTFGKNGSYGLGLPFCRKVMSAFGGTIRCASQQGQWTEFVLSFPRYDSDTVNEIKTELLKTKSLIYIGSNQAIVRELNQLAVEDEFGFTAISAQQAVRRQDYEFEFDLILLDLDDATAQGELLPKLEGTLS.... The pIC50 is 4.3. (8) The pIC50 is 8.2. The target protein sequence is MAKATSGAAGLRLLLLLLLPLLGKVALGLYFSRDAYWEKLYVDQAAGTPLLYVHALRDAPEEVPSFRLGQHLYGTYRTRLHENNWICIQEDTGLLYLNRSLDHSSWEKLSVRNRGFPLLTVYLKVFLSPTSLREGECQWPGCARVYFSFFNTSFPACSSLKPRELCFPETRPSFRIRENRPPGTFHQFRLLPVQFLCPNISVAYRLLEGEGLPFRCAPDSLEVSTRWALDREQREKYELVAVCTVHAGAREEVVMVPFPVTVYDEDDSAPTFPAGVDTASAVVEFKRKEDTVVATLRVFDADVVPASGELVRRYTSTLLPGDTWAQQTFRVEHWPNETSVQANGSFVRATVHDYRLVLNRNLSISENRTMQLAVLVNDSDFQGPGAGVLLLHFNVSVLPVSLHLPSTYSLSVSRRARRFAQIGKVCVENCQAFSGINVQYKLHSSGANCSTLGVVTSAEDTSGILFVNDTKALRRPKCAELHYMVVATDQQTSRQAQAQL.... The drug is COCc1ccc(NC(=O)c2nn(C3CCC(C)C3)c3ncnc(N)c23)cc1. (9) The pIC50 is 5.2. The target protein (P0A6M2) has sequence MLRFLNQCSQGRGAWLLMAFTALALELTALWFQHVMLLKPCVLCIYERCALFGVLGAALIGAIAPKTPLRYVAMVIWLYSAFRGVQLTYEHTMLQLYPSPFATCDFMVRFPEWLPLDKWVPQVFVASGDCAERQWDFLGLEMPQWLLGIFIAYLIVAVLVVISQPFKAKKRDLFGR. The small molecule is CC1=C2CC(C(C)(C)O)CCC2(C)CCC1=O. (10) The target protein (P04351) has sequence MYSMQLASCVTLTLVLLVNSAPTSSSTSSSTAEAQQQQQQQQQQQQHLEQLLMDLQELLSRMENYRNLKLPRMLTFKFYLPKQATELKDLQCLEDELGPLRHVLDLTQSKSFQLEDAENFISNIRVTVVKLKGSDNTFECQFDDESATVVDFLRRWIAFCQSIISTSPQ. The drug is C/C=C/C[C@@H](C)[C@@H](O)[C@@H]1C(=O)N[C@H](CCCO)C(=O)N(C)CC(=O)N(C)[C@H](CC(C)C)C(=O)N[C@H](C(C)C)C(=O)N(C)[C@H](CC(C)C)C(=O)N[C@H](C)C(=O)N[C@@H](C)C(=O)N(C)[C@H](CC(C)C)C(=O)N(C)[C@H](CC(C)C)C(=O)N(C)[C@H](C(C)C)C(=O)N1C. The pIC50 is 7.2.
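The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset tooling: the function names are hypothetical, and the only assumption is that IC50 is expressed in molar units, as the definition says.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 (M) = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nanomolar_from_pic50(pic50: float) -> float:
    """Same inversion, reported in nanomolar (1 M = 1e9 nM)."""
    return ic50_molar_from_pic50(pic50) * 1e9


if __name__ == "__main__":
    # Labels taken from records (1) and (10) in this section.
    for label in (4.3, 7.2):
        print(f"pIC50 {label} -> IC50 {ic50_nanomolar_from_pic50(label):.1f} nM")
```

Under this definition, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding. For instance, the label 7.2 in record (10) corresponds to an IC50 of roughly 63 nM, while the label 4.3 in record (1) corresponds to roughly 50 µM.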