This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C(O)[C@@H]1CCNC[C@H]1O. The target protein (P48056) has sequence MDRKVTVHEDGCPVVSWVPEEGEMMDQKDKDQVKDRGQWTNKMEFVLSVAGEIIGLGNVWRFPYLCYKNGGGAFFIPYFIFFFSCGIPVFFLEVALGQYSSQGSVTAWRKICPLLQGIGMASVVIESYLNIYYIIILAWALFYLFSSFTWELPWTTCTNSWNTEHCVDFLNYSSTRAASYSENFTSPVMEFWERRVLGITSGIHDLGSLRWELALCLLLAWIICYFCIWKGVKSTGKVVYFTATFPYLMLIILLIRGVTLPGAYQGIVFYLKPDLLRLKDPQVWMDAGTQIFFSFAICQGCLTALGSYNKYHNNCYRDSIALCFLNSATSFVAGFVVFSILGFMAQEQGVPISEVAESGPGLAFIAFPKAVTMMPLSQLWSCLFFLMLLFLGLDSQFVCMECLVTASMDMFPQQLRKRGRRELLILAVAIVCYLMGLLLVTEGGMYIFQLFDYYASSGICLLFLSLFEVICIGWVYGADRFYDNVEDMIGYRPWPLVKIS.... The pIC50 is 3.5. (2) The drug is CCCn1c(=O)n(CC)c(=O)c2cnc3c(OC)cccc3c21. The target protein (Q99732) has sequence MSVPGPYQAATGPSSAPSAPPSYEETVAVNSYYPTPPAPMPGPTTGLVTGPDGKGMNPPSYYTQPAPIPNNNPITVQTVYVQHPITFLDRPIQMCCPSCNKMIVSQLSYNAGALTWLSCGSLCLLGCIAGCCFIPFCVDALQDVDHYCPNCRALLGTYKRL. The pIC50 is 6.2. (3) The compound is Clc1ccc(C(Nc2ccnc3cc(Cl)ccc23)c2ccc(CN3CCNCC3)cc2)cc1. The target protein (Q9N623) has sequence MKFASKKNNQKNSSKNDERYRELDNLVQEGNGSRLGGGSCLGKCAHVFKLIFKEIKDNIFIYILSIIYLSVCVMNKIFAKRTLNKIGNYSFVTSETHNFICMIMFFIVYSLFGNKKGNSKERHRSFNLQFFAISMLDACSVILAFIGLTRTTGNIQSFVLQLSIPINMFFCFLILRYRYHLYNYLGAVIIVVTIALVEMKLSFETQEENSIIFNLVLISALIPVCFSNMTREIVFKKYKIDILRLNAMVSFFQLFTSCLILPVYTLPFLKQLHLPYNEIWTNIKNGFACLFLGRNTVVENCGLGMAKLCDDCDGAWKTFALFSFFNICDNLITSYIIDKFSTMTYTIVSCIQGPAIAIAYYFKFLAGDVVREPRLLDFVTLFGYLFGSIIYRVGNIILERKKMRNEENEDSEGELTNVDSIITQ. The pIC50 is 5.0. (4) The small molecule is O=C(O)[C@H]1O[C@@H](Oc2ccc([C@@H]3[C@@H](CC[C@H](O)c4ccc(F)cc4)C(=O)N3c3ccc(F)cc3)cc2)[C@H](O)[C@@H](O)[C@@H]1O. 
The target protein sequence is MADTGLRGWLLWALLLHVAQSELYTPIHQPGYCAFYDECGKNPELSGGLAPLSNVSCLSNTPALRVTGEHLTLLQRICPRLYTGTTTYACCSPKQLLSLETSLAVTKALLTRCPTCSDNFVNLHCQNTCSPNQSLFINVTRVAGGGGGRPQAVVAYEAFYQDTFAQQTYDSCSRVRIPAAATLAVGTMCGVYGSTLCNAQRWLNFQGDTSNGLAPLDITFHLMEPGQALGSGMQALTGEIRPCNESQGNGTVACSCQDCAASCPTIPQPQALDSTFYLGGLEGGLALVIILCSAFALLTTFLVGTRLASSCGKDKTPDPKAGMSLSDKLSLSTNVILSQCFQNWGTWVASWPLTILLVSIAVVLALSGGLAFVELTTDPVELWSAPSSQARSEKAFHDQHFGPFLRTNQVILTAPNRPSYHYDSLLLGPKNFSGVLASDLLLELLELQETLRHLQVWSPEEQRHISLQDICFAPLNPHNASLSDCCINSLLQYFQSNRTH.... The pIC50 is 7.2. (5) The compound is Nc1nc2c(nc(N3CCCC3)n2[C@@H]2O[C@H](COP(=O)([O-])OP(=O)([O-])OP(=O)([O-])O)C(O)C2O)c(=O)[nH]1. The target protein (O66809) has sequence MEEFVNPCKIKVIGVGGGGSNAVNRMYEDGIEGVELYAINTDVQHLSTLKVPNKIQIGEKVTRGLGAGAKPEVGEEAALEDIDKIKEILRDTDMVFISAGLGGGTGTGAAPVIAKTAKEMGILTVAVATLPFRFEGPRKMEKALKGLEKLKESSDAYIVIHNDKIKELSNRTLTIKDAFKEVDSVLSKAVRGITSIVVTPAVINVDFADVRTTLEEGGLSIIGMGEGRGDEKADIAVEKAVTSPLLEGNTIEGARRLLVTIWTSEDIPYDIVDEVMERIHSKVHPEAEIIFGAVLEPQEQDFIRVAIVATDFPEEKFQVGEKEVKFKVIKKEEKEEPKEEPKPLSDTTYLEEEEIPAVIRRKNKRLL. The pIC50 is 4.8. (6) The small molecule is CC(C)[C@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)OCc1ccccc1)C(=O)N[C@H](/C=C/S(C)(=O)=O)CC(=O)O. The target protein (P51878) has sequence MAEDSGKKKRRKNFEAMFKGILQSGLDNFVINHMLKNNVAGQTSIQTLVPNTDQKSTSVKKDNHKKKTVKMLEYLGKDVLHGVFNYLAKHDVLTLKEEEKKKYYDTKIEDKALILVDSLRKNRVAHQMFTQTLLNMDQKITSVKPLLQIEAGPPESAESTNILKLCPREEFLRLCKKNHDEIYPIKKREDRRRLALIICNTKFDHLPARNGAHYDIVGMKRLLQGLGYTVVDEKNLTARDMESVLRAFAARPEHKSSDSTFLVLMSHGILEGICGTAHKKKKPDVLLYDTIFQIFNNRNCLSLKDKPKVIIVQACRGEKHGELWVRDSPASLALISSQSSENLEADSVCKIHEEKDFIAFCSSTPHNVSWRDRTRGSIFITELITCFQKYSCCCHLMEIFRKVQKSFEVPQAKAQMPTIERATLTRDFYLFPGN. The pIC50 is 5.5. (7) The compound is NCCCn1cc(C2=C(c3c[nH]c4ccccc34)C(=O)NC2=O)c2ccccc21. 
The target protein sequence is MDGTAAEPRPGAGSLQHAQPPPQPRKKRPEDFKFGKILGEGSFSTVVLARELATSREYAIKILEKRHIIKENKVPYVTRERDVMSRLDHPFFVKLYFTFQDDEKLYFGLSYAKNGELLKYIRKIGSFDETCTRFYTAEIVSALEYLHGKGIIHRDLKPENILLNEDMHIQIADFGTAKVLSPESKQARANSFVGTAQYVSPELLTEKSACKSSDLWALGCIIYQLVAGLPPFRAGNEYLIFQKIIKLEYDFPEKFFPKARDLVEKLLVLDATKRLGCEEMEGYGPLKAHPFFESVTWENLHQQTPPKLT. The pIC50 is 5.4. (8) The drug is CC1(C)S[C@@H]2[C@H](CS)C(=O)N2[C@H]1C(=O)O. The target protein (P14488) has sequence MKNTLLKLGVCVSLLGITPFVSTISSVQAERTVEHKVIKNETGTISISQLNKNVWVHTELGYFSGEAVPSNGLVLNTSKGLVLVDSSWDDKLTKELIEMVEKKFKKRVTDVIITHAHADRIGGMKTLKERGIKAHSTALTAELAKKNGYEEPLGDLQSVTNLKFGNMKVETFYPGKGHTEDNIVVWLPQYQILAGGCLVKSASSKDLGNVADAYVNEWSTSIENVLKRYGNINLVVPGHGEVGDRGLLLHTLDLLK. The pIC50 is 5.5. (9) The target protein (P97584) has sequence MVQAKTWTLKKHFEGFPTDSNFELRTTELPPLNNGEVLLEALFLSVDPYMRVAAKKLKEGDSMMGEQVARVVESKNSAFPTGTIVVALLGWTSHSISDGNGLRKLPAEWPDKLPLSLALGTVGMPGLTAYFGLLDICGLKGGETVLVNAAAGAVGSVVGQIAKLKGCKVVGTAGSDEKVAYLKKLGFDVAFNYKTVKSLEEALRTASPDGYDCYFDNVGGEFSNTVILQMKTFGRIAICGAISQYNRTGPCPPGPSPEVIIYQQLRMEGFIVTRWQGEVRQKALTDLMNWVSEGKIRYHEYITEGFEKMPAAFMGMLKGDNLGKTIVKA. The small molecule is CC1=C(CO)C2=C(C)C3(CC3)[C@](C)(O)C(=O)C2=C1. The pIC50 is 5.8.
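
The pIC50 definition given in the header (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names are hypothetical, not part of the dataset or of BindingDB, and the nanomolar helper assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Hypothetical helper, assuming the raw IC50 is given in nanomolar."""
    return ic50_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Labels taken from examples (1) and (4) above.
    for pic50 in (3.5, 7.2):
        print(f"pIC50 {pic50} corresponds to IC50 = {pic50_to_ic50(pic50):.3e} M")
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent.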