This record contains drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is OCC(CO)N[C@H]1C[C@](O)(CO)[C@@H](O)[C@H](O)[C@H]1O. The pIC50 is 4.2. The target protein (P35574) has sequence MGNSFDFGVLLILLKYFKSSRSQNGHSKQIRILLLNEMEKLEKTLFRLEQGFELQFRLGPTLQGKPVTVFTNYPFPGETFNREKFRSLEWENPTEREDDSDKYCKLNLQQSGSFQYYFLQGNEKSGGGYIVVDPILRVGADNHMLHLDCVTLQTFLAKCLGPFDEWESRLRVAKESGYNMIHFTPLQTLGLSRSCYSLADQLELNPDFSRPHKKYTWSDVGQLVEKLKREWNVLCITDVVYNHTAANSKWIQEHPECAYNLVNSPHLKPAWVLDRALWHFSCDVAEGKYKNRGVPALIENDHHLNCIRKVIWEDIFPKLHLWEFFQVDVYKAVEKFRGLLTQETWRVIKSDPKQHLKIIQDPEYRRFGCTVDMNIALATFIPHDNGPAAIEECCNWFRKRIEELNSEKHQLMNYHQEQAVNCLLGNVFYERLAGHGPKLGPVTRKYPLVTRYFTFPFEEMPVSTEETMIHLPNKACFFMAHNGWVMGDDPLRNFAEPGSD.... (2) The drug is O=C(c1cc2c(Cl)cccc2[nH]1)N1C[C@]2(CCN(C3CCNC3)C2)c2ccccc21. The target protein sequence is MAGSAVDSANHLTYLFGNITREEAEDYLVQGGMTDGLYLLRQSRNYLGGFALSVAHNRKAHHYTIERELNGTYAISGGRAHASPADLCHYHSQEPDGLICLLKKPFNRPPGVQPKTGPFEDLKENLIREYVKQTWNLQGQALEQAIISQKPQLEKLIATTAHEKMPWFHGNISRDESEQTVLIGSKTNGKFLIRARDNSGSYALCLLHEGKVLHYRIDRDKTGKLSIPEGKKFDTLWQLVEHYSYKPDGLLRVLTVPCQKIGAQMGHPGSPNAHPVTWSPGGIISRIKSYSFPKPGHKKPAPPQGSRPESTVSFNPYEPTGGPWGPDRGLQREALPMDTEVYESPYADPEEIRPKEVYLDRSLLTLEDNELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAELGPLNKYLQQNRHIKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVLLVTQH.... The pIC50 is 6.2. (3) The drug is O=C(Nc1ccc(NS(=O)(=O)c2cccc(F)c2)cc1)Nc1ccccc1[N+](=O)[O-]. The target protein (Q81RP3) has sequence MTLQEQIMKALHVQPVIDPKAEIRKRVDFLKDYVKKTGAKGFVLGISGGQDSTLAGRLAQLAVEEIRNEGGNATFIAVRLPYKVQKDEDDAQLALQFIQADQSVAFDIASTVDAFSNQYENLLDESLTDFNKGNVKARIRMVTQYAIGGQKGLLVIGTDHAAEAVTGFFTKFGDGGADLLPLTGLTKRQGRALLQELGADERLYLKMPTADLLDEKPGQADETELGITYDQLDDYLEGKTVPADVAEKIEKRYTVSEHKRQVPASMFDDWWK. The pIC50 is 3.2. 
(4) The drug is CCC[C@@H]1C[C@@H](NCc2ccccc2)C[C@@]2(O1)C(=O)N(Cc1ccccc1)c1ccccc12. The target protein (O77759) has sequence MEPGGARLRLQRTEGPGGEREHQPCRDGNTETHRAPDLVKWTRHMEAVKAQLLEQAQGQLRELLDRAMWEAIQSYPSQDKPPPLPPPDSLSRTQEPSLGKQKVFIIRKSLLDELMEVQHFRTIYHMFIAGLCVFIISTLAIDFIDEGRLLLEFDLLIFSFGQLPLALVTWVPMFLSTLLAPYQALRLWARPGARGTWTLGAGLGCALLAAHALVLCALPVHVAVEHQLPPASRCVLVFEQVRFLMKSYSFLREAVPGTLRARRGEGIQAPSFSSYLYFLFCPTLIYRETYPRTPYIRWNYVAKNFAQALGCVLYACFILGRLCVPVFANMSREPFSTRALVLSILHATLPGIFMLLLIFFAFLHCWLNAFAEMLRFGDRMFYRDWWNSTSFSNYYRTWNVVVHDWLYSYVYQDGLWLLGAQARGVAMLGVFLVSAVAHEYIFCFVLGFFYPVMLILFLVIGGMLNFMMHDQHTGPAWNVLMWTMLFLGQGIQVSLYCQEW.... The pIC50 is 5.0. (5) The drug is O=C([C@@H]1CC[C@@H]2CN1C(=O)N2OS(=O)(=O)O)N1CCNCC1. The target protein sequence is MRDTRFPCLCGIAASTLLFATTPAIADEAPADRLKALVDAAVQPVMKANDIPGLAVAISLKGEPHYFSYGLASKEDGRRVTPETLFEIGSVSKTFTVTLAGYALAQDKMRLDDRASQHWPALQGSRFDGISLLDLATYTAGGLPLQFPDSVQKDQAQIRDYYRQWQPTYAPGSQRLYSNPSIGLFGYLAARSLGQPFERLMEQQLFPALGLEQTHLDVPEAALAQYAQGYGKDDRPLRVGPGPLDAEGYGVKTSAADLLRFVDANLHPERLDRPWAQALDATHRGYYKVGDMTQGLGWEAYDWPISLKRLQAGNSTPMALQPHRIARLPAPQALEGQRLLNKTGSTNGFGAYVAFVPGRDLGLVILANRNYPNAERVKIAYAILSGLEQQAKVPLKR. The pIC50 is 5.7. (6) The small molecule is C=CC(=O)N1CCC(C)(c2onc(-c3c(Cl)cc(Cl)cc3Cl)c2C(=O)n2cc(C)c3c(/C=C/C(=O)O)cccc32)CC1. The target protein sequence is MSKKISGGSVVEMQGDEMTRIIWELIKEKLIFPYVELDLHSYDLGIENRDATNDQVTKDAAEAIKKHNVGVKCATITPDEKRVEEFKLKQMWKSPNGTIRNILGGTVFREAIICKNIPRLVSGWVKPIIIGCHAYGDQYRATDFVVPGPGKVEITYTPSDGTQKVTYLVHNFEEGGGVAMGMYNQDKSIEDFAHSSFQMALSKGWPLYLSTKNTILKKYDGRFKDIFQEIYDKQYKSQFEAQKIWYEHRLIDDMVAQAMKSEGGFIWACKNYDGDVQSDSVAQGYGSLGMMTSVLVCPDGKTVEAEAAHGTVTRHYRMYQKGQETSTNPIASIFAWTRGLAHRAKLDNNKELAFFANALEEVSIETIEAGFMTKDLAACIKGLPNVQRSDYLNTFEFMDKLGENLKIKLAQAKL. The pIC50 is 7.0. (7) The compound is [NH2+]=C1CCCC(O)(O)C2NC(=[NH2+])NC2C(COC(=O)NCCNC(=O)c2ccc(C(=O)c3ccccc3)cc2)N1. 
The target protein (P15390) has sequence MASSSLPNLVPPGPHCLRPFTPESLAAIEQRAVEEEARLQRNKQMEIEEPERKPRSDLEAGKNLPLIYGDPPPEVIGIPLEDLDPYYSDKKTFIVLNKGKAIFRFSATPALYLLSPFSIVRRVAIKVLIHALFSMFIMITILTNCVFMTMSNPPSWSKHVEYTFTGIYTFESLIKMLARGFCIDDFTFLRDPWNWLDFSVITMAYVTEFVDLGNISALRTFRVLRALKTITVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALVGLQLFMGNLRQKCVRWPPPMNDTNTTWYGNDTWYSNDTWYGNDTWYINDTWNSQESWAGNSTFDWEAYINDEGNFYFLEGSNDALLCGNSSDAGHCPEGYECIKAGRNPNYGYTSYDTFSWAFLALFRLMTQDYWENLFQLTLRAAGKTYMIFFVVIIFLGSFYLINLILAVVAMAYAEQNEATLAEDQEKEEEFQQMLEKYKKHQEELEKAKAAQALESGEEADGDPTHNKD.... The pIC50 is 7.1.
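
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names below are hypothetical helpers chosen for illustration; they are not part of BindingDB or of any dataset tooling, and the nanomolar conversion assumes the raw IC50 was reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Hypothetical unit helper: convert nM to mol/L (1 nM = 1e-9 M)."""
    return ic50_nm * 1e-9


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # A pIC50 of 7.0 corresponds to an IC50 of about 1e-7 M (100 nM).
    print(pic50_to_ic50(7.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values indicate more potent compounds.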