This is drug-target binding data from BindingDB, using Kd measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is Cc1sc2c(c1C)C(c1ccc(Cl)cc1)=N[C@@H](CC(=O)OC(C)(C)C)c1nnc(C)n1-2. The target protein sequence is NPPPPETSNPNKPKRQTNQLQYLLRVVLKTLWKHQFAWPFQQPVDAVKLNLPDYYKIIKTPMDMGTIKKRLENNYYWNAQECIQDFNTMFTNCYIYNKPGADIVLMAEALEKLFLQKINELPT. The pKd is 9.0. (2) The compound is Cc1cnc(Nc2ccc(OCCN3CCCC3)cc2)nc1Nc1cccc(S(=O)(=O)NC(C)(C)C)c1. The target protein (Q8NG66) has sequence MLKFQEAAKCVSGSTAISTYPKTLIARRYVLQQKLGSGSFGTVYLVSDKKAKRGEELKVLKEISVGELNPNETVQANLEAQLLSKLDHPAIVKFHASFVEQDNFCIITEYCEGRDLDDKIQEYKQAGKIFPENQIIEWFIQLLLGVDYMHERRILHRDLKSKNVFLKNNLLKIGDFGVSRLLMGSCDLATTLTGTPHYMSPEALKHQGYDTKSDIWSLACILYEMCCMNHAFAGSNFLSIVLKIVEGDTPSLPERYPKELNAIMESMLNKNPSLRPSAIEILKIPYLDEQLQNLMCRYSEMTLEDKNLDCQKEAAHIINAMQKRIHLQTLRALSEVQKMTPRERMRLRKLQAADEKARKLKKIVEEKYEENSKRMQELRSRNFQQLSVDVLHEKTHLKGMEEKEEQPEGRLSCSPQDEDEERWQGREEESDEPTLENLPESQPIPSMDLHELESIVEDATSDLGYHEIPEDPLVAEEYYADAFDSYCEESDEEEEEIALE.... The pKd is 5.9. (3) The drug is Cc1ccc(F)c(NC(=O)Nc2ccc(-c3cccc4[nH]nc(N)c34)cc2)c1. The target protein (Q9UKI8) has sequence MSVQSSSGSLEGPPSWSQLSTSPTPGSAAAARSLLNHTPPSGRPREGAMDELHSLDPRRQELLEARFTGVASGSTGSTGSCSVGAKASTNNESSNHSFGSLGSLSDKESETPEKKQSESSRGRKRKAENQNESSQGKSIGGRGHKISDYFEYQGGNGSSPVRGIPPAIRSPQNSHSHSTPSSSVRPNSPSPTALAFGDHPIVQPKQLSFKIIQTDLTMLKLAALESNKIQDLEKKEGRIDDLLRANCDLRRQIDEQQKLLEKYKERLNKCISMSKKLLIEKSTQEKLSSREKSMQDRLRLGHFTTVRHGASFTEQWTDGFAFQNLVKQQEWVNQQREDIERQRKLLAKRKPPTANNSQAPSTNSEPKQRKNKAVNGAENDPFVRPNLPQLLTLAEYHEQEEIFKLRLGHLKKEEAEIQAELERLERVRNLHIRELKRINNEDNSQFKDHPTLNERYLLLHLLGRGGFSEVYKAFDLYEQRYAAVKIHQLNKSWRDEKKEN.... The pKd is 5.0. (4) The drug is O=C(O)CN1Cc2cccc(NC(=O)OCC3c4ccccc4-c4ccccc43)c2N[C@@H](Cc2c[nH]c3ccccc23)C1=O.
The target protein (P9WH75) has sequence MPPTVIAEPVASGAHASYSGGPGETDYHALNAMLNLYDADGKIQFDKDREAAHQYFLQHVNQNTVFFHNQDEKLDYLIRENYYEREVLDQYSRNFVKTLLDRAYAKKFRFPTFLGAFKYYTSYTLKTFDGKRYLERFEDRVVMVALTLAAGDTALAELLVDEIIDGRFQPATPTFLNSGKKQRGEPVSCFLLRVEDNMESIGRSINSALQLSKRGGGVALLLTNIREHGAPIKNIENQSSGVIPIMKLLEDAFSYANQLGARQGAGAVYLHAHHPDIYRFLDTKRENADEKIRIKTLSLGVVIPDITFELAKRNDDMYLFSPYDVERVYGVPFADISVTEKYYEMVDDARIRKTKIKAREFFQTLAELQFESGYPYIMFEDTVNRANPIDGKITHSNLCSEILQVSTPSLFNEDLSYAKVGKDISCNLGSLNIAKTMDSPDFAQTIEVAIRALTAVSDQTHIKSVPSIEQGNNDSHAIGLGQMNLHGYLARERIFYGSDE.... The pKd is 5.1. (5) The small molecule is CO[C@]12CC[C@@]3(C[C@@H]1C(C)(C)O)[C@H]1Cc4ccc(O)c5c4[C@@]3(CCN1CC1CC1)[C@H]2O5. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSYYFCIALGATNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKd is 8.5. (6) The small molecule is O=C(Cc1ccc(Cl)cc1)Nc1cnccc1C(=O)O. The target protein sequence is MAGVGPGGYAAEFVPPPECPVFEPSWEEFTDPLSFIGRIRPLAEKTGICKIRPPKDWQPPFACEVKSFRFTPRVQRLNELEAMTRVRLDFLDQLAKFWELQGSTLKIPVVERKILDLYALSKIVASKGGFEMVTKEKKWSKVGSRLGYLPGKGTGSLLKSHYERILYPYELFQSGVSLMGVQMPNLDLKEKVEPEVLSTDTQTSPEPGTRMNILPKRTRRVKTQSESGDVSRNTELKKLQIFGAGPKVVGLAMGTKDKEDEVTRRRKVTNRSDAFNMQMRQRKGTLSVNFVDLYVCMFCGRGNNEDKLLLCDGCDDSYHTFCLIPPLPDVPKGDWRCPKCVAEECSKPREAFGFEQAVREYTLQSFGEMADNFKSDYFNMPVHMVPTELVEKEFWRLVSSIEEDVIVEYGADISSKDFGSGFPVKDGRRKILPEEEEYALSGWNLNNMPVLEQSVLAHINVDISGMKVPWLYVGMCFSSFCWHIEDHWSYSINYLHWGEP.... The pKd is 6.4.
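
The pKd definition given at the top (pKd = -log10(Kd in M)) can be illustrated with a minimal Python sketch. The function names below are hypothetical helpers written for illustration, not part of BindingDB or of any dataset loader, and the nanomolar input unit in the second helper is an assumption about how raw Kd values might be supplied.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Same conversion for a Kd given in nM (assumed input unit)."""
    return kd_to_pkd(kd_nanomolar * 1e-9)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # A Kd of 1 nM maps to a pKd of about 9, a Kd of 10 uM to about 5.
    print(round(kd_nm_to_pkd(1.0), 3))
    print(round(kd_to_pkd(1e-5), 3))
```

On this scale each unit of pKd is a tenfold change in Kd, so a pKd of 9.0 (example 1) corresponds to a Kd of about 1 nM, while a pKd of 5.0 (example 3) corresponds to about 10 µM.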