This is drug-target binding data from BindingDB, based on Kd measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The small molecule is N[C@@H](CCC(=O)N[C@@H](CSSC[C@H](NC(=O)CC[C@H](N)C(=O)O)C(=O)NCC(=O)O)C(=O)NCC(=O)O)C(=O)O. The target protein sequence is MKLVNLTKVSAAVLAVLALAACDDKNTDGKTTAKPAAEKTFVNCVSRSPQYFSPALAMDGISYNASSQQVYNRLVEFKRGSTEIEPALAESWDVSEDGLTYTFHLRKGVKFHSNKEFTPSRDFNADDVVFSFNRQLDPNHPYHTVSKATYPYFKAMKFPTLLKSVEKVDDHTVKFTLTKRDATFVSSLGMDFTSIYSAEYADAMLKAGKPETIDTTPIGTGPFAFTGYVLDQASRYVAHKDYWKGKADFDRLIFEIIPDATARYAKLQAGQCDLIDFPNATDIEKMKTDPKVQLLSQPGLNIAYVAFNTEKAPFDNVKVRQALNLAVDKKAIIDVVYQGAGIAAKNPLPPTIWGYNDSLAESEFNIEKAKQLLAEAGYPNGFETELWVQPVVRASNPNPRRMSEIIQADWAKIGVKAKLVTYEWGDYIKRTKAGELTAGTYGWSGDNGDPDNFLSPLFGSANVGNSNYARFNSPELDALLDKALGLSDKAERTKLYEQAQ.... The pKd is 6.5. (2) The drug is O=P(O)(O)O[C@H]1[C@H](O)[C@@H](OP(=O)(O)O)[C@H](OP(=O)(O)O)[C@@H](O)[C@H]1O. The target protein (Q14573) has sequence MSEMSSFLHIGDIVSLYAEGSVNGFISTLGLVDDRCVVEPAAGDLDNPPKKFRDCLFKVCPMNRYSAQKQYWKAKQTKQDKEKIADVVLLQKLQHAAQMEQKQNDTENKKVHGDVVKYGSVIQLLHMKSNKYLTVNKRLPALLEKNAMRVTLDATGNEGSWLFIQPFWKLRSNGDNVVVGDKVILNPVNAGQPLHASNYELSDNAGCKEVNSVNCNTSWKINLFMQFRDHLEEVLKGGDVVRLFHAEQEKFLTCDEYKGKLQVFLRTTLRQSATSATSSNALWEVEVVHHDPCRGGAGHWNGLYRFKHLATGNYLAAEENPSYKGDASDPKAAGMGAQGRTGRRNAGEKIKYCLVAVPHGNDIASLFELDPTTLQKTDSFVPRNSYVRLRHLCTNTWIQSTNVPIDIEEERPIRLMLGTCPTKEDKEAFAIVSVPVSEIRDLDFANDASSMLASAVEKLNEGFISQNDRRFVIQLLEDLVFFVSDVPNNGQNVLDIMVTK.... The pKd is 6.7. (3) The small molecule is CO[C@@H]1[C@H](N(C)C(=O)c2ccccc2)C[C@H]2O[C@]1(C)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)N[C@H]4O. 
The target protein (Q9C098) has sequence MGKEPLTLKSIQVAVEELYPNKARALTLAQHSRAPSPRLRSRLFSKALKGDHRCGETETPKSCSEVAGCKAAMRHQGKIPEELSLDDRARTQKKWGRGKWEPEPSSKPPREATLEERHARGEKHLGVEIEKTSGEIIRCEKCKRERELQQSLERERLSLGTSELDMGKGPMYDVEKLVRTRSCRRSPEANPASGEEGWKGDSHRSSPRNPTQELRRPSKSMDKKEDRGPEDQESHAQGAAKAKKDLVEVLPVTEEGLREVKKDTRPMSRSKHGGWLLREHQAGFEKLRRTRGEEKEAEKEKKPCMSGGRRMTLRDDQPAKLEKEPKTRPEENKPERPSGRKPRPMGIIAANVEKHYETGRVIGDGNFAVVKECRHRETRQAYAMKIIDKSRLKGKEDMVDSEILIIQSLSHPNIVKLHEVYETDMEIYLILEYVQGGDLFDAIIESVKFPEPDAALMIMDLCKALVHMHDKSIVHRDLKPENLLVQRNEDKSTTLKLADF.... The pKd is 5.0. (4) The drug is Nc1nc2c(ncn2[C@@H]2O[C@H](COP(=O)(O)O)[C@H](O)[C@@H]2O)c(=O)[nH]1. The target protein (P62959) has sequence MADEIAKAQVAQPGGDTIFGKIIRKEIPAKIIFEDDRCLAFHDISPQAPTHFLVIPKKHISQISVADDDDESLLGHLMIVGKKCAADLGLKRGYRMVVNEGADGGQSVYHIHLHVLGGRQMNWPPG. The pKd is 4.2. (5) The drug is CO[C@H](c1ccccc1)[C@@H]1NC(=O)[C@H](C)NC(=O)[C@H](C[C@@H](C)CO)N(C)C(=O)[C@H]([C@H](O)c2cn(C(C)(C)[C@@H](O)CNCCCN)c3ccccc23)NC(=O)[C@H]([C@H](C)C=C(C)C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](C(C)C)NC1=O. The target protein sequence is MAERFTDRARRVVVLAQEEARMLNHNYIGTEHILLGLIHEGEGVAAKSLESLGISLEGVRSQVEEIIGQGQQAPSGHIPFTPRAKKVLELSLREALQLGHNYIGTEHILLGLIREGEGVAAQVLVKLGAELTRVRQQVIQLLSGY. The pKd is 5.4.
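The pKd definition stated at the top of this record (pKd = -log10(Kd in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset's own tooling: the function names `kd_to_pkd`, `pkd_to_kd`, and `kd_nm_to_pkd` are hypothetical, and the nanomolar helper assumes a Kd reported in nM that must first be converted to molar.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in molar units to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd in molar units from a pKd value."""
    return 10.0 ** (-pkd)


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Hypothetical helper: Kd given in nM (assumed unit), so 1 nM = 1e-9 M."""
    if kd_nm <= 0:
        raise ValueError("Kd must be a positive concentration in nM")
    return 9.0 - math.log10(kd_nm)


if __name__ == "__main__":
    # A smaller Kd (tighter binding) gives a larger pKd.
    print(kd_to_pkd(1e-6))
    print(pkd_to_kd(6.5))
```

Under this definition, the pKd of 6.5 in example (1) corresponds to a Kd of roughly 3.16e-7 M (about 316 nM), and the pKd of 4.2 in example (4) to a weaker Kd of roughly 6.3e-5 M.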