This record is drawn from BindingDB drug-target binding data based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is CCc1c(C(=O)NN2CCCCC2)nn(-c2ccc(Cl)cc2Cl)c1-c1ccc(Br)cc1. The target protein (P30990) has sequence MMAGMKIQLVCMLLLAFSSWSLCSDSEEEMKALEADFLTNMHTSKISKAHVPSWKMTLLNVCSLVNNLNSPAEETGEVHEEELVARRKLPTALDGFSLEAMLTIYQLHKICHSRAFQHWELIQEDILDTGNDKNGKEEVIKRKIPYILKRQLYENKPRRPYILKRDSYYY. The pKi is 6.0. (2) The compound is N[C@@](CF)(Cc1c[nH]c2ccc(O)cc12)C(=O)O. The target protein (O88533) has sequence MDSREFRRRGKEMVDYIADYLDGIEGRPVYPDVEPGYLRPLIPATAPQEPETYEDIIKDIEKIIMPGVTHWHSPYFFAYFPTASSYPAMLADMLCGAIGCIGFSWAASPACTELETVMMDWLGKMLELPEAFLAGRAGEGGGVIQGSASEATLVALLAARTKVIRQLQAASPEFTQAAIMEKLVAYTSDQAHSSVERAGLIGGIKLKAVPSDGNFSMRASALREALERDKAAGLIPFFVVATLGTTSCCSFDNLLEVGPICNQEGVWLHIDAAYAGSAFICPEFRYLLNGVEFADSFNFNPHKWLLVNFDCSAMWVKRRTDLTGAFNMDPVYLKHSHQDSGFITDYRHWQIPLGRRFRSLKMWFVFRMYGVKGLQAYIRKHVELSHEFESLVRQDPRFEICTEVILGLVCFRLKGSNELNETLLQRINSAKKIHLVPCRLRDKFVLRFAVCARTVESAHVQLAWEHISDLASSVLRAEKE. The pKi is 4.6. (3) The small molecule is CC(C)CC(=O)OC[C@H](CO)OC(=O)C=C(C(C)C)C(C)C. The target protein (P04409) has sequence MADVFPAAEPAAPQDVANRFARKGALRQKNVHEVKNHRFIARFFKQPTFCSHCTDFIWGFGKQGFQCQVCCFVVHKRCHEFVTFSCPGADKGPDTDDPRSKHKFKIHTYGSPTFCDHCGSLLYGLIHQGMKCDTCDMNVHKQCVINVPSLCGMDHTEKRGRIYLKAEVTDEKLHVTVRDAKNLIPMDPNGLSDPYVKLKLIPDPKNESKQKTKTIRSTLNPRWDESFTFKLKPSDKDRRLSEEIWDWDRTTRNDFMGSLSFGVSELMKMPASGWYKLLNQEEGEYYNVPIPEGDEEGNVELRQKFEKAKLGPAGNKVISPSEDRRQPSNNLDRVKLTDFNFLMVLGKGSFGKVMLADRKGTEELYAIKILKKDVVIQDDDVECTMVEKRVLALLDKPPFLTQLHSCFQTVDRLYFVMEYVNGGDLMYHIQQVGKFKEPQAVFYAAEISIGLFFLHKRGIIYRDLKLDNVMLDSEGHIKIADFGMCKEHMMDGVTTRTFCG.... The pKi is 6.5. (4) The compound is O=C(Nc1ccc([C@@H]2CNCCO2)nc1)Nc1cccc(C(F)(F)F)c1.
The target protein (Q923Y8) has sequence MHLCHAITNISHRNSDWSREVQASLYSLMSLIILATLVGNLIVIISISHFKQLHTPTNWLLHSMAIVDFLLGCLIMPCSMVRTVERCWYFGEILCKVHTSTDIMLSSASIFHLAFISIDRYCAVCDPLRYKAKINISTILVMILVSWSLPAVYAFGMIFLELNLKGVEELYRSQVSDLGGCSPFFSKVSGVLAFMTSFYIPGSVMLFVYYRIYFIAKGQARSINRTNVQVGLEGKSQAPQSKETKAAKTLGIMVGVFLVCWCPFFLCTVLDPFLGYVIPPSLNDALYWFGYLNSALNPMVYAFFYPWFRRALKMVLLGKIFQKDSSRSKLFL. The pKi is 8.4. (5) The drug is O=c1[nH]c(O)c(Cc2ccc3c(c2)OC(F)(F)O3)s1. The target protein (Q9NZ45) has sequence MSLTSSSSVRVEWIAAVTIAAGTAAIGYLAYKRFYVKDHRNKAMINLHIQKDNPKIVHAFDMEDLGDKAVYCRCWRSKKFPFCDGAHTKHNEETGDNVGPLIIKKKET. The pKi is 6.7. (6) The compound is O=c1ccn([C@@H]2O[C@H](CO)[C@@H](O)[C@H]2OP(=O)(O)O)c(=O)[nH]1. The target protein (P00669) has sequence MALKSLVVLPLLVLVLLLVRVQPSLGKESAAAKFERQHMDSGNSPSSSSNYCNLMMCCRKMTQGKCKPVNTFVHESLADVKAVCSQKKVTCKNGQTNCYQSKSTMRITDCRETGSSKYPNCAYKTTQVEKHIIVACGGKPSVPVHFDASV. The pKi is 4.4. (7) The compound is CC(C)CC(=O)OCC1(CO)C/C(=C/CC(CC(C)C)CC(C)C)C(=O)O1. The target protein (P28867) has sequence MAPFLRISFNSYELGSLQVEDEASQPFCAVKMKEALSTERGKTLVQKKPTMYPEWKTTFDAHIYEGRVIQIVLMRAAEDPVSEVTVGVSVLAERCKKNNGKAEFWLDLQPQAKVLMCVQYFLEDGDCKQSMRSEEEAKFPTMNRRGAIKQAKIHYIKNHEFIATFFGQPTFCSVCKEFVWGLNKQGYKCRQCNAAIHKKCIDKIIGRCTGTATNSRDTIFQKERFNIDMPHRFKVYNYMSPTFCDHCGSLLWGLVKQGLKCEDCGMNVHHKCREKVANLCGINQKLLAEALNQVTQRSSRKLDTTESVGIYQGFEKKPEVSGSDILDNNGTYGKIWEGSTRCTLENFTFQKVLGKGSFGKVLLAELKGKDKYFAIKCLKKDVVLIDDDVECTMVEKRVLALAWESPFLTHLICTFQTKDHLFFVMEFLNGGDLMFHIQDKGRFELYRATFYAAEIICGLQFLHSKGIIYRDLKLDNVMLDRDGHIKIADFGMCKENIFGE.... The pKi is 9.0. (8) The compound is CC(C)OC(=O)[C@@H]1C2CC[C@H](C[C@@H]1c1ccc(I)cc1)N2C. The target is MLLARMKPQVQPELGGADQ. The pKi is 6.5. (9) The small molecule is CN1C[C@H](CNC(=O)OCc2ccccc2)C[C@@H]2c3cccc4c3c(cn4C)C[C@H]21. 
The target protein sequence is MNLTNYTTEASVAVKPKTVTEKMLICMTLVIITTLTMLLNSAVIMAICTTRKLHQPANYLICSLAVTDLLVAVLVMPLSVMYIVMDNWRLGYFICEVWLSVDMTCCTCSILHLCVIALDRYWAITKAIEYARKRTARRAGLMILTVWTISIFISMPPLFWRSHRQVSPPPSQCTIQHDHVIYTIYSTLGAFYIPLTLILILYYRIYHAAKSLYQKRGSSRHLSNRSTDSQNSFASCKLTQTFCVSDFSTSDPTTEFEKIHTSIRIPPFDNDLDQPGERQQISSTRERKAARILGLILGAFILSWLPFFIKELIVGLSIYTVSSEVGDFLTWLGYVNSLINPLLYTSFNEDFKLAFKKLIRCREHT. The pKi is 6.0. (10) The compound is OP(O)(=S)CSc1ccccc1. The target protein (P03772) has sequence MRYYEKIDGSKYRNIWVVGDLHGCYTNLMNKLDTIGFDNKKDLLISVGDLVDRGAENVECLELITFPWFRAVRGNHEQMMIDGLSERGNVNHWLLNGGGWFFNLDYDKEILAKALAHKADELPLIIELVSKDKKYVICHADYPFDEYEFGKPVDHQQVIWNRERISNSQNGIVKEIKGADTFIFGHTPAVKPLKFANQMYIDTGAVFCGNLTLIQVQGEGA. The pKi is 3.3.
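
The pKi definition stated at the top of this record (pKi = -log10(Ki in M)) can be made concrete with a short conversion sketch. The function names below are hypothetical and are not part of BindingDB or of this dataset; the code only restates the formula and assumes Ki is supplied in molar units.

```python
import math


def ki_molar_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_molar(pki: float) -> float:
    """Invert the transform: Ki (in mol/L) = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # pKi values taken from entries (1), (7) and (10) of this record
    for pki in (6.0, 9.0, 3.3):
        print(f"pKi {pki} -> Ki = {pki_to_ki_molar(pki):.3g} M")
```

Under this definition, the pKi of 6.0 in entry (1) corresponds to a Ki of 1 micromolar, and the pKi of 9.0 in entry (7) to a Ki of 1 nanomolar, so each unit of pKi is a tenfold change in Ki.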