Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is CC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](CCCNC(=N)N)C(=O)c1nccs1. The target protein (P56677) has sequence MGSNRGRKAGGGSQDFGAGLKYNSRLENMNGFEEGVEFLPANNAKKVEKRGPRRWVVLVAVLFSFLLLSLMAGLLVWHFHYRNVRVQKVFNGHLRITNEIFLDAYENSTSTEFISLASQVKEALKLLYNEVPVLGPYHKKSAVTAFSEGSVIAYYWSEFSIPPHLAEEVDRAMAVERVVTLPPRARALKSFVLTSVVAFPIDPRMLQRTQDNSCSFALHAHGAAVTRFTTPGFPNSPYPAHARCQWVLRGDADSVLSLTFRSFDVAPCDEHGSDLVTVYDSLSPMEPHAVVRLCGTFSPSYNLTFLSSQNVFLVTLITNTDRRHPGFEATFFQLPKMSSCGGFLSDTQGTFSSPYYPGHYPPNINCTWNIKVPNNRNVKVRFKLFYLVDPNVPVGSCTKDYVEINGEKYCGERSQFVVSSNSSKITVHFHSDHSYTDTGFLAEYLSYDSNDPCPGMFMCKTGRCIRKELRCDGWADCPDYSDERYCRCNATHQFTCKNQF.... The pKi is 8.1. (2) The target protein (P48169) has sequence MVSAKKVPAIALSAGVSFALLRFLCLAVCLNESPGQNQKEEKLCTENFTRILDSLLDGYDNRLRPGFGGPVTEVKTDIYVTSFGPVSDVEMEYTMDVFFRQTWIDKRLKYDGPIEILRLNNMMVTKVWTPDTFFRNGKKSVSHNMTAPNKLFRIMRNGTILYTMRLTISAECPMRLVDFPMDGHACPLKFGSYAYPKSEMIYTWTKGPEKSVEVPKESSSLVQYDLIGQTVSSETIKSITGEYIVMTVYFHLRRKMGYFMIQTYIPCIMTVILSQVSFWINKESVPARTVFGITTVLTMTTLSISARHSLPKVSYATAMDWFIAVCFAFVFSALIEFAAVNYFTNIQMEKAKRKTSKPPQEVPAAPVQREKHPEAPLQNTNANLNMRKRTNALVHSESDVGNRTEVGNHSSKSSTVVQESSKGTPRSYLASSPNPFSRANAAETISAARALPSASPTSIRTGYMPRKASVGSASTRHVFGSRLQRIKTTVNTIGATGKLS.... The pKi is 8.6. The small molecule is CN1CC2C(C(=O)OC(C)(C)C)=NCN2c2ccc(C#C[Si](C)(C)C)cc2C1=O. (3) The compound is CC(C)(C)NC(=O)N[C@H](C(=O)N1C[C@H]2[C@@H]([C@H]1C(=O)NC(CC1CCC1)C(=O)C(N)=O)C2(C)C)C(C)(C)C. 
The target protein sequence is SPITAYSQQTRGLLGCIITSLTGRDKNQVEGEVQVVSTATQSFLATCVNGVCWTVFHGAGSKTLAGPKGPITQMYTNVDQDLVGWMAPPGARSMTPCTCGSSDLYLVTRHADVIPVRRRGDGRGSLLSPRPVSYLKGSSGGPLLCPSGHVVGIFRAAVCTRGVAKAVDFVPVESMETTMRSPVFTDNSSPPAVPQTFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAGTDPNIRTGVRTITTGAPITYSTYGKFLADGGCSGGAYDIICDECHSTDSTTLGIGTVLDQAETAGARLVVLATATPPGSTVPHPNIEEVALSTTGEIPFYGKAIPIETIKGGRHLIFCHSKKKCDELAGKLSALGLNAVAYYRGLDVSVIPTSGDVVVVATDALMTGYTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTVPQDAVSRSQRRGRTGRGRGIYRFVTPGERPSGMFDSSVLCECYDAGCAWYELT.... The pKi is 7.8. (4) The small molecule is [N-]=[N+]=N[C@@H]1CC[C@H]2[C@H]3Cc4ccc(O)c5c4[C@@]2(CCN3CC2CC2)[C@@H]1O5. The target protein sequence is MCFNLTMKKKKECCAPACPSSCFPNTSWLLGWDDHDNVSAYPDLPLNEGNHTSISPTISVIITAVYSMVFVVGLVGNALVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTSFLMNSWPFGDVLCKIVVSIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKCINICIWMLSSSVGISAIVLGGTKISDGSTECALQFPTHYWYWDTVMKMCVFIFAFIIPVFIITICYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFIVCWTPIHIFVLVEALVDVPQSIAVVSIYYFCIALGYTNSSLNPILYAFLDENFKRCFKDFCFPSKHRLDRQPNSRVGNTVQDPACNRHGSQKPV. The pKi is 9.7. (5) The compound is CS(=O)(=O)N1CCC(Oc2ccc(CC(=O)N3CCC(N4C(=O)CCc5ccccc54)CC3)c(OC/C=C/I)c2)CC1. The target protein sequence is MEGTPAANWSFELDLGSGVSPGVEGNLTAGPPQRNEALARVEVAVLCLILFLALSATVRAAGLRTTTRHKHSRLFFFMKHLSIADLVVAVFQVLPQLLWDITFRFYGPDLLCRLVKYLQVVGMFASTYLLLLMSLDRCLAICQPLRSLRRRTDRLAVLATWLGCLVASAPQVHIFSLREVADGVFDCWAVFIQPWGPNAYVTWITLAVYIVPVIVLAACYGLISFKIWQNLRLKTAAAAAAAEGTEGSAAGGAARGAGS. The pKi is 8.4. (6) The compound is O=C(O)/C(=C/CCO)[C@@H](O)C(=O)O. The target protein sequence is MAKYRICLIEGDGIGHEVIPAAKRVLEAAGFDAEYVHAEAGYEYFLDHGTSVPEATYDAVENTDATLFGAATSPSGEKPAGFFGAIRHLRQKYNLYANVRPTKTRPVPHSYENVDLVIVRENTQGLYVEQERRYGDTAIADTVITREASDRIGKFAADLAMKRSKRLTVVHKSNVLPVTQGLFMNTILDHTKTVEGLSTSTMIVDNAAMQLVRNPQQFDVMVMTNMFGDILSDLAAGLVGGLGIAASGNVGDQFGIFESVHGSAPDIAGQGISNPTATILAAVIMLDHLGDHETARRLDNAINKVLAEXPRTRDLGGTAGTQEFTEAVIKALA. The pKi is 2.3. (7) The pKi is 4.3. 
The target protein (P00366) has sequence MYRYLGEALLLSRAGPAALGSASADSAALLGWARGQPAAAPQPGLVPPARRHYSEAAADREDDPNFFKMVEGFFDRGASIVEDKLVEDLKTRETEEQKRNRVRSILRIIKPCNHVLSLSFPIRRDDGSWEVIEGYRAQHSQHRTPCKGGIRYSTDVSVDEVKALASLMTYKCAVVDVPFGGAKAGVKINPKNYTDNELEKITRRFTMELAKKGFIGPGVDVPAPDMSTGEREMSWIADTYASTIGHYDINAHACVTGKPISQGGIHGRISATGRGVFHGIENFINEASYMSILGMTPGFGDKTFVVQGFGNVGLHSMRYLHRFGAKCITVGESDGSIWNPDGIDPKELEDFKLQHGTILGFPKAKIYEGSILEVDCDILIPAASEKQLTKSNAPRVKAKIIAEGANGPTTPEADKIFLERNIMVIPDLYLNAGGVTVSYFEWLNNLNHVSYGRLTFKYERDSNYHLLMSVQESLERKFGKHGGTIPIVPTAEFQDRISGA.... The small molecule is NC(=O)c1cccc([C@@H]2O[C@H](COP(=O)([O-])O[P@](=O)(O)OC[C@H]3O[C@@H](n4cnc5c(N)ncnc54)[C@H](O)[C@@H]3O)[C@@H](O)[C@H]2O)[nH+]1. (8) The pKi is 4.1. The target protein (P23724) has sequence MATIASEYSSEASNTPIEHQFNPYGDNGGTILGIAGEDFAVLAGDTRNITDYSINSRYEPKVFDCGDNIVMSANGFAADGDALVKRFKNSVKWYHFDHNDKKLSINSAARNIQHLLYGKRFFPYYVHTIIAGLDEDGKGAVYSFDPVGSYEREQCRAGGAAASLIMPFLDNQVNFKNQYEPGTNGKVKKPLKYLSVEEVIKLVRDSFTSATERHIQVGDGLEILIVTKDGVRKEFYELKRD. The small molecule is CCCCCCCCCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O. (9) The drug is Oc1cc2c(cc1O)[C@@H]1c3ccccc3CN[C@@H]1CC2. The pKi is 5.0. The target protein (Q8VDA6) has sequence MGTPASVVSEPPLWQVSTAQPRDRGRGRKQASANIFQDAELVQIQGLFQRSGDQLAEERAQIIWECAGDHRVAEALRRLRRKRPPRQNHCSRLRVPELGSTAADPQASTTDTASSEQFGNSRRTSARVHRNWNKPGPTGYLHQIRH.
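The pKi definition in the task description (pKi = -log10(Ki in M)) can be sketched as a small conversion helper. This is a minimal illustration and not part of the dataset: the function names are hypothetical, and it assumes Ki is supplied in nanomolar, which is a common reporting unit but is not stated in the record above.

```python
import math


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert an inhibition constant in nM to pKi = -log10(Ki in M).

    Assumption: the input is in nanomolar (hypothetical helper).
    """
    if ki_nm <= 0:
        raise ValueError("Ki must be a positive concentration")
    ki_molar = ki_nm * 1e-9  # nM -> M
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the transform: recover Ki in nM from a pKi value."""
    ki_molar = 10.0 ** (-pki)
    return ki_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # pKi values taken from the examples above; higher pKi means smaller Ki.
    for pki in (8.1, 9.7, 2.3):
        print(f"pKi {pki} -> Ki = {pki_to_ki_nm(pki):.3g} nM")
```

Under this definition, a pKi of 9.7 corresponds to a Ki of about 0.2 nM, while a pKi of 2.3 corresponds to a Ki of about 5 mM, which is why higher values indicate stronger inhibition.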