This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CN(C)c1ccc2nc3ccc(=[N+](C)C)cc-3sc2c1. The target protein (O15770) has sequence MVYDLIVIGGGSGGMAAARRAARHNAKVALVEKSRLGGTCVNVGCVPKKIMFNAASVHDILENSRHYGFDTKFSFNLPLLVERRDKYIQRLNNIYRQNLSKDKVDLYEGTASFLSENRILIKGTKDNNNKDNGPLNEEILEGRNILIAVGNKPVFPPVKGIENTISSDEFFNIKESKKIGIVGSGYIAVELINVIKRLGIDSYIFARGNRILRKFDESVINVLENDMKKNNINIVTFADVVEIKKVSDKNLSIHLSDGRIYEHFDHVIYCVGRSPDTENLNLEKLNVETNNNYIVVDENQRTSVNNIYAVGDCCMVKKSKEIEDLNLLKLYNEETYLNKKENVTEDIFYNVQLTPVAINAGRLLADRLFLKKTRKTNYKLIPTVIFSHPPIGTIGLSEEAAIQIYGKENVKIYESKFTNLFFSVYDIEPELKEKTYLKLVCVGKDELIKGLHIIGLNADEIVQGFAVALKMNATKKDFDETIPIHPTAAEEFLTLQPWMK.... The pIC50 is 5.2. (2) The drug is CC(=O)Nc1cc(Oc2ccc3c(C(=O)Nc4ccc(CN5CCN(C)CC5)c(C(F)(F)F)c4)cccc3c2)ncn1. The target protein sequence is MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRXDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFLHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pIC50 is 5.0. (3) The compound is NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1ccc2c(c1)COP(=O)(O)O2)NC(=O)OCC1c2ccccc2-c2ccccc21. The target protein (O60880) has sequence MDAVAVYHGKISRETGEKLLLATGLDGSYLLRDSESVPGVYCLCVLYHGYIYTYRVSQTETGSWSAETAPGVHKRYFRKIKNLISAFQKPDQGIVIPLQYPVEKKSSARSTQGTTGIREDPDVCLKAP. The pIC50 is 4.8. (4) The drug is C=C1C(=O)O[C@H]2C[C@H](C)[C@]3(/C=C/C(C)=O)O[C@@H]3C[C@H]12. 
The target protein (P01120) has sequence MPLNKSNIREYKLVVVGGGGVGKSALTIQLTQSHFVDEYDPTIEDSYRKQVVIDDEVSILDILDTAGQEEYSAMREQYMRNGEGFLLVYSITSKSSLDELMTYYQQILRVKDTDYVPIVVVGNKSDLENEKQVSYQDGLNMAKQMNAPFLETSAKQAINVEEAFYTLARLVRDEGGKYNKTLTENDNSKQTSQDTKGSGANSVPRNSGGHRKMSNAANGKNVNSSTTVVNARNASIESKTGLAGNQATNGKTQTDRTNIDNSTGQAGQANAQSANTVNNRVNNNSKAGQVSNAKQARKQQAAPGGNTSEASKSGSGGCCIIS. The pIC50 is 4.0. (5) The compound is CCCCCCCCCCCCNc1ccnc(NCCN(C)C)n1. The target protein (P50570) has sequence MGNRGMEELIPLVNKLQDAFSSIGQSCHLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLILQLIFSKTEHAEFLHCKSKKFTDFDEVRQEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLIDLPGITKVPVGDQPPDIEYQIKDMILQFISRESSLILAVTPANMDLANSDALKLAKEVDPQGLRTIGVITKLDLMDEGTDARDVLENKLLPLRRGYIGVVNRSQKDIEGKKDIRAALAAERKFFLSHPAYRHMADRMGTPHLQKTLNQQLTNHIRESLPALRSKLQSQLLSLEKEVEEYKNFRPDDPTRKTKALLQMVQQFGVDFEKRIEGSGDQVDTLELSGGARINRIFHERFPFELVKMEFDEKDLRREISYAIKNIHGVRTGLFTPDLAFEAIVKKQVVKLKEPCLKCVDLVIQELINTVRQCTSKLSSYPRLREETERIVTTYIREREGRTKDQILLLIDIEQSYINTNHEDFIGFANAQ.... The pIC50 is 5.2. (6) The compound is CC(C)c1ccc(NC(=O)CCc2nc(O)c3c4c(sc3n2)CCC4)cc1. The target protein (Q9H999) has sequence MKIKDAKKPSFPWFGMDIGGTLVKLSYFEPIDITAEEEQEEVESLKSIRKYLTSNVAYGSTGIRDVHLELKDLTLFGRRGNLHFIRFPTQDLPTFIQMGRDKNFSTLQTVLCATGGGAYKFEKDFRTIGNLHLHKLDELDCLVKGLLYIDSVSFNGQAECYYFANASEPERCQKMPFNLDDPYPLLVVNIGSGVSILAVHSKDNYKRVTGTSLGGGTFLGLCSLLTGCESFEEALEMASKGDSTQADKLVRDIYGGDYERFGLPGWAVASSFGNMIYKEKRESVSKEDLARATLVTITNNIGSVARMCAVNEKINRVVFVGNFLRVNTLSMKLLAYALDYWSKGQLKALFLEHEGYFGAVGALLGLPNFS. The pIC50 is 6.9. (7) The drug is N#Cc1ccc(Nc2ccc3[nH]c(=O)[nH]c3c2)c(Cl)c1. 
The target protein (P41182) has sequence MASPADSCIQFTRHASDVLLNLNRLRSRDILTDVVIVVSREQFRAHKTVLMACSGLFYSIFTDQLKCNLSVINLDPEINPEGFCILLDFMYTSRLNLREGNIMAVMATAMYLQMEHVVDTCRKFIKASEAEMVSAIKPPREEFLNSRMLMPQDIMAYRGREVVENNLPLRSAPGCESRAFAPSLYSGLSTPPASYSMYSHLPVSSLLFSDEEFRDVRMPVANPFPKERALPCDSARPVPGEYSRPTLEVSPNVCHSNIYSPKETIPEEARSDMHYSVAEGLKPAAPSARNAPYFPCDKASKEEERPSSEDEIALHFEPPNAPLNRKGLVSPQSPQKSDCQPNSPTESCSSKNACILQASGSPPAKSPTDPKACNWKKYKFIVLNSLNQNAKPEGPEQAELGRLSPRAYTAPPACQPPMEPENLDLQSPTKLSASGEDSTIPQASRLNNIVNRSMTGSPRSSSESHSPLYMHPPKCTSCGSQSPQHAEMCLHTAGPTFPEE.... The pIC50 is 4.0. (8) The small molecule is COCCCOc1cc(C(=O)N(C[C@@H]2CNC[C@H]2NCc2ccc3ccccc3c2)C(C)C)ccc1OC. The target protein (P0DJD9) has sequence MKWLLLLGLVALSECIMYKVPLIRKKSLRRTLSERGLLKDFLKKHNLNPARKYFPQWEAPTLVDEQPLENYLDMEYFGTIGIGTPAQDFTVVFDTGSSNLWVPSVYCSSLACTNHNRFNPEDSSTYQSTSETVSITYGTGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISSSGATPVFDNIWNQGLVSQDLFSVYLSADDKSGSVVIFGGIDSSYYTGSLNWVPVTVEGYWQITVDSITMNGETIACAEGCQAIVDTGTSLLTGPTSPIANIQSDIGASENSDGDMVVSCSAISSLPDIVFTINGVQYPVPPSAYILQSEGSCISGFQGMNVPTESGELWILGDVFIRQYFTVFDRANNQVGLAPVA. The pIC50 is 5.0. (9) The drug is COc1cc2c(cc1O)[C@@H](Cc1ccc(O)c(Oc3ccc(C[C@@H]4c5cc(OC)c(OC)cc5CCN4C)cc3)c1)N(C)CC2. The target protein (Q28705) has sequence MIPPNATAVMPFLTTLGEETAHLQGSSATSLARRGPLGDDGQMEALYILMVLGFFGFFTLGIMLSYIRSQKLEHSHDPFNVYIEANDWQEKDRAYFQARVLESCRGCYVLENQLAVEHPDTHLPELKPSL. The pIC50 is 4.3.
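
The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be illustrated with a short Python sketch. The function names and the choice of nanomolar as the input unit are assumptions made for illustration; they are not part of the dataset or of BindingDB.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9          # nM -> M
    return -math.log10(ic50_molar)       # pIC50 = -log10(IC50 in M)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: recover IC50 in nanomolar from pIC50."""
    ic50_molar = 10.0 ** (-pic50)        # M
    return ic50_molar * 1e9              # M -> nM


if __name__ == "__main__":
    # A higher pIC50 corresponds to a lower IC50, i.e. a more potent compound.
    for value in (4.0, 5.2, 6.9):
        print(f"pIC50 {value} -> IC50 {pic50_to_ic50_nm(value):.1f} nM")
```

Under this definition, a pIC50 of 5.2 corresponds to an IC50 of about 6.3 µM, and each one-unit increase in pIC50 is a tenfold decrease in IC50.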