From a dataset of Drug-target binding data from BindingDB using Ki measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is COc1ccc2c(O[C@@H]3C[C@@H](C(=O)NC4(C(=O)O)CC4)N(C(=O)[C@@H](NC(=O)OC(C)(C)C)C(C)C)C3)cc(-c3ccccc3)nc2c1. The target protein sequence is VFTDNSSPPAVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGVDPNIRTGVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSILGIGTVLDQAETAGARLVVLATATPPGSVTVSHPNIEEVALSTTGEIPFYGKAIPLEVIKGGRHLIFCHSKKKCDELAAKLVALGINAVAYYRGLDVSVIPTSGDVVVVSTDALMTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTLPQDAVSRTQRRGRTGRGKPGIYRFVAPGERPSGMFDSSVLCECYDAGCAWYELTPAETTVRLRAYMNTPGLPVCQDHLEFWEGVFTGLTHIDAHFLSQTKQSGENFPYLVAYQATVCARAQAPPPSWDQMWKCLIRLKPTLHGPTPLLYRLGAVQNEVT. The pKi is 6.0. (2) The small molecule is CC(C)CN(Cc1cc(F)c2c(c1)OCCCO2)C(=O)C1CN(Cc2ccccc2)CCO1. The target protein sequence is MPPPAWMEPTVGALGENTTDTSTSFLSLVNARGAQAASFPFTLSYGDYDTALGEEEDVTKSWTFFAARIVIGMALVAIMLVCGVGNFIFITTLARYKKLRNLTNLLIANLAISDFLVAIVCCPFEMDYYVVRQLSWEHGHVLCASVNYLRTVSLYVSTNALLAIAIDRYLAIVHPLRPRMKCQTAAGLIFLVWSVSILIAIPAAYFTTETVLVIVESQEKIFCGQIWPVDQQVYYRSYFLLVFGLEFVGPVVAMTLCYARVSRELWFKAVPGFQTEQIRRRLRCRRRTVLGLVCVLSAYVLCWAPFYGFTIVRDFFPSVFVKEKHYLTAFYVVECIAMSNSMINTLCFVSVRNNTSKYLKRILRLQWRASPSGSKASADLDLRTTGMPATEEVDCIGLK. The pKi is 6.8. (3) The compound is CNCC[C@H](Oc1cccc2ccccc12)c1cccs1. The target is MLLARMKPQVQPELGGADQ. The pKi is 7.8. (4) The drug is CCCCCCCCC#Cc1ccc2c(c1)C[C@@H](CO)NC(=O)[C@H](C(C)C)N2C. 
The target protein (Q9R1K8) has sequence MGTLGKAREAPRKPCHGSRAGPKGRLEAKSTNSPLPAQPSLAQITQFRMMVSLGHLAKGASLDDLIDSCIQSFDADGNLCRSNQLLQVMLTMHRIIISSAELLQKLMNLYKDALEKNSPGICLKICYFVRYWITEFWIMFKMDASLTSTMEEFQDLVKANGEESHCHLIDTTQINSRDWSRKLTQRIKSNTSKKRKVSLLFDHLEPEELSEHLTYLEFKSFRRISFSDYQNYLVNSCVKENPTMERSIALCNGISQWVQLMVLSRPTPQLRAEVFIKFIHVAQKLHQLQNFNTLMAVIGGLCHSSISRLKETSSHVPHEINKVLGEMTELLSSCRNYDNYRRAYGECTHFKIPILGVHLKDLISLYEAMPDYLEDGKVNVQKLLALYNHINELVQLQDVAPPLDANKDLVHLLTLSLDLYYTEDEIYELSYAREPRNHRAPPLTPSKPPVVVDWASGVSPKPDPKTISKHVQRMVDSVFKNYDLDQDGYISQEEFEKIAA.... The pKi is 8.2. (5) The compound is CSCC[C@H](NC(=O)[C@H](CC(C)C)NC(=O)CNC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)NC(=O)C1CCC(=O)N1)C(N)=O. The target protein (P41539) has sequence MKILVAVAVFFLVSTQLFAEEIDANDDLNYWSDWSDSDQIKEAMPEPFEHLLQRIARRPKPQQFFGLMGKRDADSSVEKQVALLKALYGHGQISHKRHKTDSFVGLMGKRALNSVAYERSAMQNYERRRK. The pKi is 6.8. (6) The drug is c1ccc2nc(N3CCNCC3)ccc2c1. The target protein sequence is SNRSLNATATQGAWDPGTLQALKIALVVLLSIITLATVLSNAFVLTTIFLTRKLHTPANCLIGSLAMTDLLVSILVMPISIAYTTTHTWSFGQLLCDIWLSSDITCCTASILHLCVIAL. The pKi is 5.9. (7) The target protein (A7MBE0) has sequence MLTVDDVLEQVGEFGWFQKQTFLILCLLSAAFAPIYVGIVFLAFTPDHRCRSPGVAELSRRCGWSLAEELNYTVPGPGPESQCLRYEVDWNQSTLGCLDPLASLATNGSPLPLGPCEQGWVYDTPGSSIVTEFNLVCDDSWKVDLFQSCVNLGFFLGSLGVGYIADRFGRKVCLLATTLTCASLGVLTAVAPDYTSLLIFRLLQGLVSKGSWTAGYTLITEFVGLGYRRTVAILYQMAFTVGLVLLSGLAYILPHWRWLQLAVSLPIFLLLFRFWFVPESPRWLLSQKRNTEAIKIMDHIAQKNGKLPPADLKMLSLEEDVTEKLSPSFIDLFRTPNLRKYTFILMYLWFTSSVVYQGLIMHVGATGGNLYLDFLYSALVEFPAGFIILVTIDRFGRRYPLATSNLAAGLACFLMIFIPHDLPWLNIMVACVGRMGITIVFQMVCLVNAELFPTFIRNLGMMVCSSLCDLGGVLTPFLVFRLMEVWQGSPLILFAALGLV.... The compound is COC(=O)C1C(C)=NC(C)=C(C(=O)OCCCN2CCC(c3ccccc3)(c3ccccc3)CC2)C1c1cccc([N+](=O)[O-])c1. The pKi is 8.0. (8) The compound is CCCB(O)O. 
The target protein sequence is MGMRTVLTGLAGMLLGSMMPVQADMPRPTGLAADIRWTAYGVPHIRAKDERGLGYGIGYAYARDNACLLAEEIVTARGERARYFGSEGKSSAELDNLPSDIFYAWLNQPEALQAFWQAQTPAVRQLLEGYAAGFNRFLREADGKTTSCLGQPWLRAIATDDLLRLTRRLLVEGGVGQFADALVAAAPPGTEKVALSGEQAFQVAEQRRQRFRLERGSNAIAVGSERSADGKGMLLANPHFPWNGAMRFYQMHLTIPGRLDVMGASLPGLPVVNIGFSRHLAWTHTVDTSSHFTLYRLALDPKDPRRYLVDGRSLPLEEKSVAIEVRGADGKLSRVEHKVYQSIYGPLVVWPGKLDWNRSEAYALRDANLENTRVLQQWYSINQASDVADLRRRVEALQGIPWVNTLAADEQGNALYMNQSVVPYLKPELIPACAIPQLVAEGLPALQGQDSRCAWSRDPAAAQAGITPAAQLPVLLRRDFVQNSNDSAWLTNPASPLQGF.... The pKi is 2.3.
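The pKi values quoted in the examples above follow the stated definition pKi = -log10(Ki in M). As a minimal sketch of that conversion, the helper names `ki_to_pki` and `pki_to_ki` below are hypothetical and are not part of the dataset or of any BindingDB tooling:

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki (in mol/L) from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # pKi labels taken from the examples above.
    for pki in (6.0, 8.2, 2.3):
        print(f"pKi {pki} corresponds to Ki = {pki_to_ki(pki):.3g} M")
```

On this scale a pKi of 6.0 corresponds to a Ki of 1 µM, 8.2 to roughly 6.3 nM, and 2.3 to roughly 5 mM, so each unit of pKi is a tenfold change in Ki.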