This data is from Drug-target binding data from BindingDB using Kd measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is COc1cc2c(cc1OC)C1CC(=O)C(CC(C)C)CN1CC2. The target protein sequence is MALSDLVLLRWLRDSRHSRKLILFIVFLALLLDNMLLTVVAPIIPSYLYSIKHEKNSTEIQTTRPELVVSTSESIFSYYNNSTVLITGNATGTLPGGQSHKATSTQHTVANTTVPSDCPSEDRDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFAGFCIMFISTVMFAFSSSYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGKPMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLKDPYILIAAGSICFANMGIAMLEPALPIWMMETMCSRKWQLGVAFLPASISYLIGTNIFGILAHKMGRWLCALLGMVIVGISILCIPFAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDIAFAPLCFFLRSPPAKEEKMAILMDHNCPIKRKMYTQNN.... The pKd is 7.8. (2) The small molecule is COc1nc2ccc(C(O)(c3ccc(C(F)(F)F)nc3)c3cncn3C)cc2c(Cl)c1Cc1cc2ccccc2s1. The target protein sequence is MRTQIEVIPCKICGDKSSGIHYGVITCEGCKGFFRRSQRCNAAYSCTRQQNCPIDRTSRNRCQHCRLQKCLALGMSRDAVKFGRMSKKQRDSLHAEVQKQLQQRQQQQQEPVVKTPPAGAQGADTLTYTLGLPDGQLPLGSSPDLPEASACPPGLLKASGSGPSYSNNLAKAGLNGASCHLEYSPERGKAEGRESFYSTGSQLTPDRCGLRFEEHRHPGLGELGQGPDSYGSPSFRSTPEAPYASLTEIEHLVQSVCKSYRETCQLRLEDLLRQRSNIFSREEVTGYQRKSMWEMWERCAHHLTEAIQYVVEFAKRLSGFMELCQNDQIVLLKAGAMEVVLVRMCRAYNADNRTVFFEGKYGGMELFRALGCSELISSIFDFSHSLSALHFSEDEIALYTALVLINAHRPGLQEKRKVEQLQYNLELAFHHHLCKTHRQSILAKLPPKGKLRSLCSQHVERLQIFQHLHPIVVQAAFPPLYKELFSTETESPVGLSK. The pKd is 8.4. (3) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCC(=O)O)[C@@H](C)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O. 
The target protein (P9WGI5) has sequence MADAPTRATTSRVDSDLDAQSPAADLVRVYLNGIGKTALLNAAGEVELAKRIEAGLYAEHLLETRKRLGENRKRDLAAVVRDGEAARRHLLEANLRLVVSLAKRYTGRGMPLLDLIQEGNLGLIRAMEKFDYTKGFKFSTYATWWIRQAITRGMADQSRTIRLPVHLVEQVNKLARIKREMHQHLGREATDEELAAESGIPIDKINDLLEHSRDPVSLDMPVGSEEEAPLGDFIEDAEAMSAENAVIAELLHTDIRSVLATLDEREHQVIRLRFGLDDGQPRTLDQIGKLFGLSRERVRQIERDVMSKLRHGERADRLRSYAS. The pKd is 6.9. (4) The compound is CC1=C(/C=C/C(C)=C/C=C/C(C)=C/CO)C(C)(C)CCC1. The target protein (P02754) has sequence MKCLLLALALTCGAQALIVTQTMKGLDIQKVAGTWYSLAMAASDISLLDAQSAPLRVYVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTKIPAVFKIDALNENKVLVLDTDYKKYLLFCMENSAEPEQSLACQCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI. The pKd is 7.4. (5) The drug is CN1CCC[C@H]1c1ccc(=O)[nH]c1. The target protein sequence is MYDAIVVGGGFSGLKAARDLTNAGKKVLLLEGGERLGGRAYSRESRNVPGLRVEIGGAYLHRKHHPRLAAELDRYGIPTAAASEFTSFRHRLGPTAVDQAFPIPGSEAVAVEAATYTLLRDAHRIDLEKGLENQDLEDLDIPLNEYVDKLDLPPVSRQFLLAWAWNMLGQPADQASALWMLQLVAAHHYSILGVVLSLDEVFSNGSADLVDAMSQEIPEIRLQTVVTGIDQSGDVVNVTVKDGHAFQAHSVIVATPMNTWRRIVFTPALPERRRSVIEEGHGGQGLKILIHVRGAEAGIECVGDGIFPTLFDYCEVSESERLLVAFTDSGSFDPTDIGAVKDAVLYYLPEVEVLGIDYHDWIADPLFEGPWVAPRVGQFSRVHKELGEPAGRIHFVGSDVSLEFPGYIEGALETAECAVNAILHS. The pKd is 4.0. (6) The compound is CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)NC4. The target is PFCDPK1(Pfalciparum). The pKd is 9.5. (7) The pKd is 5.0. The target protein (Q96RR4) has sequence MSSCVSSQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGMESFIVVTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHLSGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSPRLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGVVKLAYNENDNTYYAMKVLSKKKLIRQAGFPRRPPPRGTRPAPGGCIQPRGPIEQVYQEIAILKKLDHPNVVKLVEVLDDPNEDHLYMVFELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKIIHRDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPAFMAPESLSETRKIFSGKALDVWAMGVTLYCFVFGQCPFMDERIMCLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIVVPEIKLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATVILVKTMIRKRSFGNPF.... The small molecule is Cc1nc(Nc2ncc(C(=O)Nc3c(C)cccc3Cl)s2)cc(N2CCN(CCO)CC2)n1.
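The definition stated at the top of this record, pKd = -log10(Kd in M), can be sketched as a pair of conversions. This is a minimal illustration, not the dataset's own preprocessing code: the function names `kd_to_pkd` and `pkd_to_kd` are hypothetical, and the input Kd is assumed to already be expressed in molar units.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in molar units to pKd.

    Hypothetical helper: assumes Kd is already in M (not nM or uM).
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in molar units from pKd."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # A Kd of 10 nM is 1e-8 M, which gives a pKd of about 8.
    print(round(kd_to_pkd(1e-8), 2))
    # A pKd of 7.8, as reported for example (1), is a Kd of roughly 16 nM.
    print(f"{pkd_to_kd(7.8) * 1e9:.1f} nM")
```

Because the scale is logarithmic, a difference of one pKd unit is a tenfold difference in Kd. The lowest value in these examples (4.0) and the highest (9.5) are therefore more than five orders of magnitude apart in affinity.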