This is drug-target binding data from BindingDB, using Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The pKi is 7.5. The target protein (P14272) has sequence MILFKQVGYFVSLFATVSCGCLSQLYANTFFRGGDLAAIYTPDAQHCQKMCTFHPRCLLFSFLAVSPTKETDKRFGCFMKESITGTLPRIHRTGAISGHSLKQCGHQLSACHQDIYEGLDMRGSNFNISKTDSIEECQKLCTNNIHCQFFTYATKAFHRPEYRKSCLLKRSSSGTPTSIKPVDNLVSGFSLKSCALSEIGCPMDIFQHFAFADLNVSQVVTPDAFVCRTVCTFHPNCLFFTFYTNEWETESQRNVCFLKTSKSGRPSPPIIQENAVSGYSLFTCRKARPEPCHFKIYSGVAFEGEELNATFVQGADACQETCTKTIRCQFFTYSLLPQDCKAEGCKCSLRLSTDGSPTRITYEAQGSSGYSLRLCKVVESSDCTTKINARIVGGTNSSLGEWPWQVSLQVKLVSQNHMCGGSIIGRQWILTAAHCFDGIPYPDVWRIYGGILNLSEITNKTPFSSIKELIIHQKYKMSEGSYDIALIKLQTPLNYTEFQK.... The compound is CC(=O)N[C@H]1CSCc2cc3cc(c2)CSC[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](Cc2cnc[nH]2)NC[C@H](C)NC(=O)[C@@H](CSC3)NC(=O)[C@H](CCCCNC(=N)N)NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)C2CCN2C(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CO)NC1=O. (2) The compound is CCN(CC)C(=O)[C@@H]1C=C2c3cccc4[nH]cc(c34)C[C@H]2N(C)C1. The target protein (P50407) has sequence MMGVNSSGRPDLYGHLHSILLPGRGLPDWSPDGGADPGVSTWTPRLLSGVPEVAASPSPSWDGTWDNVSGCGEQINYGRAEKVVIGSILTLITLLTIAGNCLVVISVCFVKKLRQPSNYLIVSLALADLSVAVAVIPFVSVTDLIGGKWIFGHFFCNVFIAMDVMCCTASIMTLCVISIDRYLGITRPLTYPVRQNGKCMPKMILSVWLLSASITLPPLFGWAQNVNDDKVCLISQDFGYTIYSTAVAFYIPMSVMLFMYYRIYKAARKSAAKHKFPGFPRVQPESIISLNGMVKLQKEVEECANLSRLLKHERKNISIFKREQKAATTLGIIVGAFTVCWLPFFLLSTARPFICGTACSCIPLWVERTCLWLGYANSLINPFIYAFFNRDLRTTYRSLLQCQYRNINRKLSAAGMHEALKLAERPERPECVLQNSDYCRKKGHDS. The pKi is 9.3. (3) The small molecule is COC(=O)C(N)CCS(N)(=O)=O. 
The target protein sequence is MCGIFGYVNFLVDKSRGEIIDNLIEGLQRLEYRGYDSAGIAVDGKLTKDPSNGDEEYMDSIIVKTTGKVKVLKQKIIDDQIDRSAIFDNHVGIAHTRWATHGQPKTENCHPHKSDPKGEFIVVHNGIITNYAALRKYLLSKGHVFESETDTECIAKLFKHFYDLNVKAGVFPDLNELTKQVLHELEGSYGLLVKSYHYPGEVCGTRKGSPLLVGVKTDKKLKVDFVDVEFEAQQQHRPQQPQINHNGATSAAELGFIPVAPGEQNLRTSQSRAFLSEDDLPMPVEFFLSSDPASVVQHTKKVLFLEDDDIAHIYDGELRIHRASTKSAGESTVRPIQTLEMELNEIMKGPYKHFMQKEIFEQPDSAFNTMRGRIDFENCVVTLGGLKSWLSTIRRCRRIIMIACGTSYHSCLATRSIFEELTEIPVSVELASDFLDRRSPVFRDDTCVFVSQSGETADSILALQYCLERGALTVGIVNSVGSSMSRQTHCGVHINAGPEI.... The pKi is 2.5. (4) The drug is CCCC(CCC)C(=O)OCC1(CO)C/C(=C\c2cccc3c2ccn3C)C(=O)O1. The target protein (O95267) has sequence MGTLGKAREAPRKPSHGCRAASKARLEAKPANSPFPSHPSLAHITQFRMMVSLGHLAKGASLDDLIDSCIQSFDADGNLCRSNQLLQVMLTMHRIVISSAELLQKVITLYKDALAKNSPGLCLKICYFVRYWITEFWVMFKMDASLTDTMEEFQELVKAKGEELHCRLIDTTQINARDWSRKLTQRIKSNTSKKRKVSLLFDHLEPEELSEHLTYLEFKSFRRISFSDYQNYLVNSCVKENPTMERSIALCNGISQWVQLMVLSRPTPQLRAEVFIKFIQVAQKLHQLQNFNTLMAVIGGLCHSSISRLKETSSHVPHEINKVLGEMTELLSSSRNYDNYRRAYGECTDFKIPILGVHLKDLISLYEAMPDYLEDGKVNVHKLLALYNHISELVQLQEVAPPLEANKDLVHLLTLSLDLYYTEDEIYELSYAREPRNHRAPPLTPSKPPVVVDWASGVSPKPDPKTISKHVQRMVDSVFKNYDHDQDGYISQEEFEKIAA.... The pKi is 8.8. (5) The target protein (P30550) has sequence MALNDCFLLNLEVDHFMHCNISSHSADLPVNDDWSHPGILYVIPAVYGVIILIGLIGNITLIKIFCTVKSMRNVPNLFISSLALGDLLLLITCAPVDASRYLADRWLFGRIGCKLIPFIQLTSVGVSVFTLTALSADRYKAIVRPMDIQASHALMKICLKAAFIWIISMLLAIPEAVFSDLHPFHEESTNQTFISCAPYPHSNELHPKIHSMASFLVFYVIPLSIISVYYYFIAKNLIQSAYNLPVEGNIHVKKQIESRKRLAKTVLVFVGLFAFCWLPNHVIYLYRSYHYSEVDTSMLHFVTSICARLLAFTNSCVNPFALYLLSKSFRKQFNTQLLCCQPGLIIRSHSTGRSTTCMTSLKSTNPSVATFSLINGNICHERYV. The drug is CC(C)c1ccc(NC(=O)N[C@@](C)(Cc2c[nH]c3ccccc23)C(=O)NCC2(c3ccccn3)CCCCC2)cc1. The pKi is 6.6. (6) The drug is CCONC(=O)C(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(C)C)NC(=O)OCc1ccccc1. 
The target protein (P04574) has sequence MFLVNSFLKGGGGGGGGGGGLGGGLGNVLGGLISGAGGGGGGGGGGGGGGGGGGTAMRILGGVISAISEAAAQYNPEPPPPRTHYSNIEANESEEVRQFRRLFAQLAGDDMEVSATELMNILNKVVTRHPDLKTDGFGIDTCRSMVAVMDSDTTGKLGFEEFKYLWNNIKKWQAIYKQFDVDRSGTIGSSELPGAFEAAGFHLNEHLYSMIIRRYSDEGGNMDFDNFISCLVRLDAMFRAFKSLDKDGTGQIQVNIQEWLQLTMYS. The pKi is 7.2.
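
The pKi definition given above (pKi = -log10(Ki in M)) can be sketched in Python as a pair of conversions between Ki and pKi. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and Ki is assumed to be expressed in molar units before conversion.

```python
import math


def ki_molar_to_pki(ki_molar: float) -> float:
    """Convert a Ki in molar units to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("Ki must be a positive concentration in M")
    return -math.log10(ki_molar)


def pki_to_ki_molar(pki: float) -> float:
    """Invert the definition: Ki (M) = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


def pki_to_ki_nanomolar(pki: float) -> float:
    """Same inversion, reported in nM (1 M = 1e9 nM)."""
    return pki_to_ki_molar(pki) * 1e9


if __name__ == "__main__":
    # pKi labels taken from the records above
    for pki in (7.5, 9.3, 2.5, 8.8, 6.6, 7.2):
        print(f"pKi {pki:>4} -> Ki = {pki_to_ki_nanomolar(pki):.3g} nM")
```

Under this definition, a pKi of 7.5 corresponds to a Ki of about 31.6 nM, and each one-unit increase in pKi is a tenfold decrease in Ki, which is why higher values mean stronger inhibition.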