Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki (drug-target binding data from BindingDB using Ki measurements). (1) The drug is CCOc1cc2c(cc1/C(C)=C\C=C\C(C)=C\C(=O)O)C(C)(C)CCC2(C)C. The target protein (P48443) has sequence MYGNYSHFMKFPAGYGGSPGHTGSTSMSPSAALSTGKPMDSHPSYTDTPVSAPRTLSAVGTPLNALGSPYRVITSAMGPPSGALAAPPGINLVAPPSSQLNVVNSVSSSEDIKPLPGLPGIGNMNYPSTSPGSLVKHICAICGDRSSGKHYGVYSCEGCKGFFKRTIRKDLIYTCRDNKDCLIDKRQRNRCQYCRYQKCLVMGMKREAVQEERQRSRERAESEAECATSGHEDMPVERILEAELAVEPKTESYGDMNMENSTNDPVTNICHAADKQLFTLVEWAKRIPHFSDLTLEDQVILLRAGWNELLIASFSHRSVSVQDGILLATGLHVHRSSAHSAGVGSIFDRVLTELVSKMKDMQMDKSELGCLRAIVLFNPDAKGLSNPSEVETLREKVYATLEAYTKQKYPEQPGRFAKLLLRLPALRSIGLKCLEHLFFFKLIGDTPIDTFLMEMLETPLQIT. The pKi is 7.8. (2) The compound is CCN(Cc1ccncc1)C(=O)C(CO)c1ccccc1. The target protein sequence is MANFTPVNGSSSNQSVRLVTSAHNRYETVEMVFIATVTGSLSLVTVVGNVLVMLSIKVNRQLQTVNNYFLFSLACADLIIGAFSMNLYTVYIIKGYWPLGAVVCDLWLALDYVVSNASVMNLLIISFDRYFCVTKPLTYPARRTTKMAGLMIAAAWVLSFVLWAPAILFWQFVVGKRTVPDNQCFIQFLSNPAVTFGTAIAAFYLPVVIMTVLYIHISLASRSRVHKHRPEGQKEKKAKTLAFLKSPLMKQSVKKPPPGEAAREELRNGKLEEAPPPALPPPPRPMADKDTSNESSSGSATQNTKERPATELSTAEATTPAMSAPPLQPRTLNPASKWSKIQIVTKQTGNECVTAIEIVPATPAGMRPAANVARKFASIARNQVRKKRQMAARERKVTRTIFAILLAFILTWTPYNVMVLVNTFCQSCIPDTVWSIGYWLCYVNSTINPACYALCNATFKKTFRHLLLCQYRNIGTAR. The pKi is 7.8. (3) The target protein sequence is MGVEIETISPGDGRTFPKKGQTCVVHYTGMLQNGKKFDSSRDRNKPFRFKIGRQEVIKGFEEGVTQMSLGQRAKLTCTPEMAYGATGHPGVIPPNATLLFDVELLRLE. The small molecule is CCC(C)(C)C(=O)C(=O)N1CCCCC1C(=O)OCCCc1cc(OC)c(OC)c(OC)c1. The pKi is 7.6. (4) The drug is O=C(O)c1ccn[nH]1. 
The target protein (Q80Z39) has sequence MSKQNHFLVINGKNCCVFRDENIAKVLPPVLGLEFVFGLLGNGLALWIFCFHLKSWKSSRIFLFNLAVADFLLIICLPFLTDNYVQNWDWRFGSIPCRVMLFMLAMNRQGSIIFLTVVAVDRYFRVVHPHHFLNKISNRTAAIISCFLWGITIGLTVHLLYTDMMTRNGDANLCSSFSICYTFRWHDAMFLLEFFLPLGIILFCSGRIIWSLRQRQMDRHVKIKRAINFIMVVAIVFVICFLPSVAVRIRIFWLLYKHNVRNCDIYSSVDLAFFTTLSFTYMNSMLDPVVYYFSSPSFPNFFSTCINRCLRRKTLGEPDNNRSTSVELTGDPSTIRSIPGALMTDPSEPGSPPYLASTSR. The pKi is 6.2. (5) The small molecule is C[C@H]1N[C@@H](CO)[C@H](O)[C@H]1O. The target protein (Q2KIM0) has sequence MRSWVVGARLLLLLQLVLVLGAVRLPPCTDPRHCTDPPRYTPDWPSLDSRPLPAWFDEAKFGVFVHWGVFSVPAWGSEWFWWHWQGEKLPQYESFMKENYPPDFSYADFGPRFTARFFNPDSWADLFKAAGAKYVVLTTKHHEGYTNWPSPVSWNWNSKDVGPHRDLVGELGTAIRKRNIRYGLYHSLLEWFHPLYLRDKKNGFKTQYFVNAKTMPELYDLVNRYKPDLIWSDGEWECPDTYWNSTDFLAWLYNDSPVKDEVVVNDRWGQNCSCHHGGYYNCKDKFQPETLPDHKWEMCTSIDQRSWGYRRDMEMADITNESTIISELVQTVSLGGNYLLNVGPTKDGLIVPIFQERLLAVGKWLSINGEAIYASKPWRVQSEKNSVWYTSKGLAVYAILLHWPEYGILSLISPIATSTTKVTMLGIQKDLKWSLNPSGKGLLVFLPQLPPAALPTEFAWTIKLTGVK. The pKi is 6.3. (6) The compound is COc1ccc2c(c1)CN(CC(CO)N1CCC(c3noc4cc(F)ccc34)CC1)C2. The target protein sequence is MNPDLDTGHNTSAPAHWGELKDDNFTGPNQTSSNSTLPQLDVTRAISVGLVLGAFILFAIVGNILVILSVACNRHLRTPTNYFIVNLAIADLLLSFTVLPFSATLEVLGYWVLGRIFCDIWAAVDVLCCTASILSLCAISIDRYIGVRYSLQYPTLVTRRKAILALLSVWVLSTVISIGPLLGWKEPAPNDDKECGVTEEPFYALFSSLGSFYIPLAVILVMYCRVYIVAKRTTKNLEAGVMKEMSNSKELTLRIHSKNFHEDTLSSTKAKGHNPRSSIAVKLFKFSREKKAAKTLGIVVGMFILCWLPFFIALPLGSLFSTLKPPDAVFKVVFWLGYFNSCLNPIIYPCSSKEFKRAFMRILGCQCRGGRRRRRRRRLGACAYTYRPWTRGGSLERSQSRKDSLDDSGSCMSGTQRTLPSASPSPGYLGRGTQPPVELCAFPEWKPGALLSLPEPPGRRGRLDSGPLFTFKLLGDPESPGTEGDTSNGGCDTTTDLANG.... The pKi is 7.7. (7) The compound is COc1ccc2c(c1)C(CNC(C)=O)C2. 
The target protein (P51050) has sequence GNAFVVSLALADLVVALYPYPLVLLAIFHNGWTLGEMHCKVSGFVMGLSVIGSIFNITAIAINRYCYICHSFAYDKVYSCWNTMLYVSLIWVLTVIATVPNFFVGSLKYDPRIYSCTFVQTASSYYTIAVVVIHFIVPITVVSFCYLRIWVLVLQVRRRVKSETKPRLKPSDFRNFLTMFVVFVIFAFCWAPLNFIGLAVAINPSEMAPKVPEWLFIISYFMAYFNSCLNAIIYGLLNQNFRNEYKRILMSLWMPRLFFQDTSKGGTDGQKSKPSPALNNNDQMKTDTL. The pKi is 6.7. (8) The drug is COc1ccc2c(O[C@@H]3C[C@H]4C(=O)N[C@]5(C(=O)O)CC5/C=C\CCCCN(C)C(=O)[C@@H]4C3)cc(-c3nc(C(C)C)cs3)nc2c1Cl. The target protein sequence is APXTAYAQQTRGLLGCIXTSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDKDLVGWPAPQGARSLTPCXCGSSDLYLVTRHADVIPVRRRGDXRGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVCTRGVAKAVDFIPVESLETTMRSPVFTDNSSPPAVPQSFQVAHLHAPTGSGKSTRVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGVDPNVRTGVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSILGIGTVLDQAETAGARLVVLATATPPGSVTVXHPNIEEVALSTTGEIPFYGKAIPLEXIKGGRHLIFCHSKKKCDELAAKLXXLGINAVAYYRGLDVSVIPTSGDVVVVATDALMTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTLETTTLPQDAVXRTQRRGRTGRGRPGIYRFVTPGERPSGMFDSSVLCECYDAGCA.... The pKi is 8.3. (9) The drug is NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](Cc1cc(-c2ccc(-c3cccc(Cl)c3)cc2)no1)CP(=O)(O)c1ccc(Br)cc1. The target protein (P24347) has sequence MAPAAWLRSAAARALLPPMLLLLLQPPPLLARALPPDAHHLHAERRGPQPWHAALPSSPAPAPATQEAPRPASSLRPPRCGVPDPSDGLSARNRQKRFVLSGGRWEKTDLTYRILRFPWQLVQEQVRQTMAEALKVWSDVTPLTFTEVHEGRADIMIDFARYWHGDDLPFDGPGGILAHAFFPKTHREGDVHFDYDETWTIGDDQGTDLLQVAAHEFGHVLGLQHTTAAKALMSAFYTFRYPLSLSPDDCRGVQHLYGQPWPTVTSRTPALGPQAGIDTNEIAPLEPDAPPDACEASFDAVSTIRGELFFFKAGFVWRLRGGQLQPGYPALASRHWQGLPSPVDAAFEDAQGHIWFFQGAQYWVYDGEKPVLGPAPLTELGLVRFPVHAALVWGPEKNKIYFFRGRDYWRFHPSTRRVDSPVPRRATDWRGVPSEIDAAFQDADGYAYFLRGRLYWKFDPVKVKALEGFPRLVGPDFFGCAEPANTFL. The pKi is 4.7. (10) The small molecule is CC(=N)[PH](O)(O)CC[C@H](N)C(=O)O. 
The target protein (P0A9C5) has sequence MSAEHVLTMLNEHEVKFVDLRFTDTKGKEQHVTIPAHQVNAEFFEEGKMFDGSSIGGWKGINESDMVLMPDASTAVIDPFFADSTLIIRCDILEPGTLQGYDRDPRSIAKRAEDYLRSTGIADTVLFGPEPEFFLFDDIRFGSSISGSHVAIDDIEGAWNSSTQYEGGNKGHRPAVKGGYFPVPPVDSAQDIRSEMCLVMEQMGLVVEAHHHEVATAGQNEVATRFNTMTKKADEIQIYKYVVHNVAHRFGKTATFMPKPMFGDNGSGMHCHMSLSKNGVNLFAGDKYAGLSEQALYYIGGVIKHAKAINALANPTTNSYKRLVPGYEAPVMLAYSARNRSASIRIPVVSSPKARRIEVRFPDPAANPYLCFAALLMAGLDGIKNKIHPGEAMDKNLYDLPPEEAKEIPQVAGSLEEALNELDLDREFLKAGGVFTDEAIDAYIALRREEDDRVRMTPHPVEFELYYSV. The pKi is 5.5.
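
The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched in code as below. This is a minimal illustration only: the function names `ki_to_pki` and `pki_to_ki_nanomolar` are hypothetical helpers introduced here and are not part of the dataset or of any BindingDB tooling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nanomolar(pki: float) -> float:
    """Invert the definition: Ki = 10**(-pKi) mol/L, reported here in nM."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # A Ki of 10 nM (1e-8 M) corresponds to a pKi of 8.
    print(ki_to_pki(1e-8))
    # A label such as "The pKi is 7.8" corresponds to a Ki of roughly 16 nM.
    print(pki_to_ki_nanomolar(7.8))
```

Because the scale is logarithmic, a difference of one pKi unit between two examples corresponds to a tenfold difference in Ki.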