From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is OC[C@H]1O[C@@H](c2ccc(Cl)c(Cc3ccc(OCCOC4CCC4)cc3)c2)[C@H](O)[C@@H](O)[C@@H]1O. The target protein (P13866) has sequence MDSSTWSPKTTAVTRPVETHELIRNAADISIIVIYFVVVMAVGLWAMFSTNRGTVGGFFLAGRSMVWWPIGASLFASNIGSGHFVGLAGTGAASGIAIGGFEWNALVLVVVLGWLFVPIYIKAGVVTMPEYLRKRFGGQRIQVYLSLLSLLLYIFTKISADIFSGAIFINLALGLNLYLAIFLLLAITALYTITGGLAAVIYTDTLQTVIMLVGSLILTGFAFHEVGGYDAFMEKYMKAIPTIVSDGNTTFQEKCYTPRADSFHIFRDPLTGDLPWPGFIFGMSILTLWYWCTDQVIVQRCLSAKNMSHVKGGCILCGYLKLMPMFIMVMPGMISRILYTEKIACVVPSECEKYCGTKVGCTNIAYPTLVVELMPNGLRGLMLSVMLASLMSSLTSIFNSASTLFTMDIYAKVRKRASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFWKRVNEPGAFWGLILGLLIGISRM.... The pIC50 is 8.5. (2) The drug is Cc1sc(Nc2cc3c(cc2F)OCO3)nc1C(=O)N1CCCC(C)C1C. The target protein (Q13507) has sequence MREKGRRQAVRGPAFMFNDRGTSLTAEEERFLDAAEYGNIPVVRKMLEESKTLNVNCVDYMGQNALQLAVGNEHLEVTELLLKKENLARIGDALLLAISKGYVRIVEAILNHPGFAASKRLTLSPCEQELQDDDFYAYDEDGTRFSPDITPIILAAHCQKYEVVHMLLMKGARIERPHDYFCKCGDCMEKQRHDSFSHSRSRINAYKGLASPAYLSLSSEDPVLTALELSNELAKLANIEKEFKNDYRKLSMQCKDFVVGVLDLCRDSEEVEAILNGDLESAEPLEVHRHKASLSRVKLAIKYEVKKFVAHPNCQQQLLTIWYENLSGLREQTIAIKCLVVLVVALGLPFLAIGYWIAPCSRLGKILRSPFMKFVAHAASFIIFLGLLVFNASDRFEGITTLPNITVTDYPKQIFRVKTTQFTWTEMLIMVWVLGMMWSECKELWLEGPREYILQLWNVLDFGMLSIFIAAFTARFLAFLQATKAQQYVDSYVQESDLSE.... The pIC50 is 6.8. (3) The small molecule is CC1(C)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The target protein sequence is MRYIRLCIISLLATLPLAVHASPQPLEQIKQSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR. 
The pIC50 is 5.6. (4) The drug is O=C(CN1C(=O)C(NC(=O)Oc2ccc(Cl)cc2)CS1(=O)=O)OCc1ccccc1. The target protein sequence is MAKQKIKIKKNKIGAVLLVGLFGLLFFILVLRISYIMITGHSNGQDLVMKANEKYLVKNAQQPERGKIYDRNGKVLAEDVERYKLVAVIDKKASANSKKPRHVVDKKETAKKLSTVINMKPEEIEKRLSQKKAFQIEFGRKGTNLTYQDKLKIEKMNLPGISLLPETERFYPNGNFASHLIGRAQKNPDTGELKGALGVEKIFDSYLSGSKGSLRYIHDIWGYIAPNTKKEKQPKRGDDVHLTIDSNIQVFVEEALDGMVERYQPKDLFAVVMDAKTGEILAYSQRPTFNPETGKDFGKKWANDLYQNTYEPGSTFKSYGLAAAIQEGAFDPDKKYKSGHRDIMGSRISDWNRVGWGEIPMSLGFTYSSNTLMMHLQDLVGADKMKSWYERFGFGKSTKGMFDGEAPGQIGWSNELQQKTSSFGQSTTVTPVQMLQAQSAFFNDGNMLKPWFVNSVENPVSKRQFYKGQKQIAGKPITKDTAEKVEKQLDLVVNSKKSHA.... The pIC50 is 3.3. (5) The small molecule is CSc1ccc(/C=C(/C(=O)NCCCCCCC(=O)NO)c2ccc(F)cc2)cc1. The target protein (O09106) has sequence MAQTQGTKRKVCYYYDGDVGNYYYGQGHPMKPHRIRMTHNLLLNYGLYRKMEIYRPHKANAEEMTKYHSDDYIKFLRSIRPDNMSEYSKQMQRFNVGEDCPVFDGLFEFCQLSTGGSVASAVKLNKQQTDIAVNWAGGLHHAKKSEASGFCYVNDIVLAILELLKYHQRVLYIDIDIHHGDGVEEAFYTTDRVMTVSFHKYGEYFPGTGDLRDIGAGKGKYYAVNYPLRDGIDDESYEAIFKPVMSKVMEMFQPSAVVLQCGSDSLSGDRLGCFNLTIKGHAKCVEFVKSFNLPMLMLGGGGYTIRNVARCWTYETAVALDTEIPNELPYNDYFEYFGPDFKLHISPSNMTNQNTNEYLEKIKQRLFENLRMLPHAPGVQMQAIPEDAIPEESGDEDEEDPDKRISICSSDKRIACEEEFSDSDEEGEGGRKNSSNFKKAKRVKTEDEKEKDPEEKKEVTEEEKTKEEKPEAKGVKEEVKLA. The pIC50 is 7.2. (6) The compound is OCCN(CCO)c1nc(N2CCCCC2)c2nc(N(CCO)CCO)nc(N3CCCCC3)c2n1. The target protein sequence is MGLYLLRAGVRLPLAVALLAACCGGEALVQIGLGVGEDHLLSLPAATWLVLRLRLGVLMIALTSAVRTVSLISLERFKVAWRPYLAYLAGVLGILLARYVEQILPQSAGAAPREHFGSQLLAGTKEDIPEFKRRRRSSSVVSAEMSGCSSKSHRRTSLPCIPREQLMGHSEWDHKRGPRGSQSSGTSITVDIAVMGEAHGLITDLLADPSLPPNVCTSLRAVSNLLSTQLTFQAIHKPRVNPAVSFSENYTCSDSEESAEKDKLAIPKRLRRSLPPGLLRRVSSTWTTTTSATGLPTLEPSPVRRDRSASIKLHEAPSSSAINPDSWKNPVMMTLTKSRSFTSSYAVSASNHVKAKKQSRPGSLVKISPLSSPCSSALQGTPASSPVSKISTVQFPEPADATAKQGLSSHKALTYTQSAPDLSPHILTPPVICSSCGRPYSQGNPADGPLERSGPAIQAQSRTDDTAQVTSDYETNNNSDSSDIVQNEDETECSREPLRK.... The pIC50 is 4.4. (7) The compound is O=C1OC(=Cc2cccc3ccccc23)C(=O)C1c1ccc(C(F)(F)F)cc1. 
The target protein sequence is MADRNLRDLLAPWVPDAPSRALREMTLDSRVAAAGDLFVAVVGHQADGRRYIPQAIAQGVAAIIAEAKDEATDGEIREMHGVPVIYLSQLNERLSALAGRFYHEPSDNLRLVGVTGTNGKTTTTQLLAQWSQLLGETSAVMGTVGNGLLGKVIPTENTTGSAVDVQHELAGLVDQGATFCAMEVSSHGLVQHRVAALKFAASVFTNLSRDHLDYHGDMEHYEAAKWLLYSEHHCGQAIINADDEVGRRWLAKLPDAVAVSMEDHINPNCHGRWLKATEVNYHDSGATIRFSSSWGDGEIESHLMGAFNVSNLLLALATLLALGYPLADLLKTAARLQPVCGRMEVFTAPGKPTVVVDYAHTPDALEKALQAARLHCAGKLWCVFGCGGDRDKGKRPLMGAIAEEFADVAVVTDDNPRTEEPRAIINDILAGMLDAGHAKVMEGRAEAVTCAVMQAKENDVVLVAGKGHEDYQIVGNQRLDYSDRVTVARLLGVIA. The pIC50 is 4.8. (8) The drug is Cc1ccc(NC(=O)Nc2cccc(Cl)c2)cc1Nc1nccc(-c2cccnc2)n1. The target protein (P11274) has sequence MVDPVGFAEAWKAQFPDSEPPRMELRSVGDIEQELERCKASIRRLEQEVNQERFRMIYLQTLLAKEKKSYDRQRWGFRRAAQAPDGASEPRASASRPQPAPADGADPPPAEEPEARPDGEGSPGKARPGTARRPGAAASGERDDRGPPASVAALRSNFERIRKGHGQPGADAEKPFYVNVEFHHERGLVKVNDKEVSDRISSLGSQAMQMERKKSQHGAGSSVGDASRPPYRGRSSESSCGVDGDYEDAELNPRFLKDNLIDANGGSRPPWPPLEYQPYQSIYVGGMMEGEGKGPLLRSQSTSEQEKRLTWPRRSYSPRSFEDCGGGYTPDCSSNENLTSSEEDFSSGQSSRVSPSPTTYRMFRDKSRSPSQNSQQSFDSSSPPTPQCHKRHRHCPVVVSEATIVGVRKTGQIWPNDGEGAFHGDADGSFGTPPGYGCAADRAEEQRRHQDGLPYIDDSPSSSPHLSSKGRGSRDALVSGALESTKASELDLEKGLEMRK.... The pIC50 is 6.0. (9) The pIC50 is 6.2. The target protein sequence is MSIQHFRVALIPFFAAFCLPVFAHPETLVKVKDAEDQLGARVGYIELDLNSGKILESFRPEERFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVRELCSAAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTMPAAMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGSRGIIAALGPDGKPSRIVVIYTDGESGNYG. The drug is CCN(CC)c1ccc2cc(-c3nc4ccccc4[nH]3)c(=O)oc2c1. (10) The drug is Cc1noc(-c2cc3cc(-c4nc(C(=O)NC5CCOCC5)[nH]c4C)ccc3[nH]2)n1. 
The target protein (P33435) has sequence MHSAILATFFLLSWTPCWSLPLPYGDDDDDDLSEEDLVFAEHYLKSYYHPATLAGILKKSTVTSTVDRLREMQSFFGLEVTGKLDDPTLDIMRKPRCGVPDVGEYNVFPRTLKWSQTNLTYRIVNYTPDMSHSEVEKAFRKAFKVWSDVTPLNFTRIYDGTADIMISFGTKEHGDFYPFDGPSGLLAHAFPPGPNYGGDAHFDDDETWTSSSKGYNLFIVAAHELGHSLGLDHSKDPGALMFPIYTYTGKSHFMLPDDDVQGIQFLYGPGDEDPNPKHPKTPEKCDPALSLDAITSLRGETMIFKDRFFWRLHPQQVEAELFLTKSFWPELPNHVDAAYEHPSRDLMFIFRGRKFWALNGYDILEGYPRKISDLGFPKEVKRLSAAVHFENTGKTLFFSENHVWSYDDVNQTMDKDYPRLIEEEFPGIGNKVDAVYEKNGYIYFFNGPIQFEYSIWSNRIVRVMPTNSILWC. The pIC50 is 7.0.
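The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration rather than part of the dataset tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and chosen only for this example.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # pIC50 labels taken from entries (1), (4), and (10) above.
    for label in (8.5, 3.3, 7.0):
        print(f"pIC50 {label} ~ IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition each unit of pIC50 is a tenfold change in IC50, so the label 7.0 in entry (10) corresponds to an IC50 of about 100 nM, and the label 8.5 in entry (1) to roughly 3 nM.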