Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is CC[C@H](C)[C@@H]1NC(=O)[C@H](Cc2ccccc2)NC(=O)CC2(CCCCC2)SSC[C@H](C(=O)N(C)CC(=O)N[C@@H](CCCNC(=N)N)C(=O)NCC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(N)=O)NC1=O. The target protein (P48044) has sequence MFMASTTSAVPWHLSQPTPAGNGSEGELLTARDPLLAQAELALLSTVFVAVALSNGLVLGALVRRGRRGRWAPMHVFIGHLCLADLAVALFQVLPQLAWDATDRFRGPDALCRAVKYLQMVGMYASSYMILAMTLDRHRAICRPMLAHRHGGGTHWNRPVLLAWAFSLLFSLPQLFIFAQRDVDGSGVLDCWARFAEPWGLRAYVTWIALMVFVAPALGIAACQVLIFREIHASLGPGPVPRAGGPRRGCRPGSPAEGARVSAAVAKTVKMTLVIVIVYVLCWAPFFLVQLWAAWDPEAPREGPPFVLLMLLASLNSCTNPWIYASFSSSISSELRSLLCCTWRRAPPSPGPQEESCATASSFLAKDTPS. The pKd is 6.5. (2) The drug is CN1C(=O)C(c2c[nH]c3cc(Cl)ccc23)=C(c2ccc(Cl)cc2)C1(O)Cc1ccc(Cl)cc1. The target protein sequence is MCNTNMSVPTDGAVTTSQIPASEQETLVRPKPLLLKLLKSVGAQKDWYTMKEVLFYLGQYIMTKRLYDEKQQHIVYCSNDLLGDLFGVPSFSVKEHRKIYTMIYRNLVVVNQQESSDS. The pKd is 4.6. (3) The compound is CO[C@]12CC[C@@]3(C[C@@H]1C(C)(C)O)[C@H]1Cc4ccc(O)c5c4[C@@]3(CCN1CC1CC1)[C@H]2O5. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFASTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSYYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKd is 8.8. (4) The small molecule is C[C@]12O[C@H](C[C@]1(O)CO)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)CNC4=O. 
The target protein (O94768) has sequence MSRRRFDCRSISGLLTTTPQIPIKMENFNNFYILTSKELGRGKFAVVRQCISKSTGQEYAAKFLKKRRRGQDCRAEILHEIAVLELAKSCPRVINLHEVYENTSEIILILEYAAGGEIFSLCLPELAEMVSENDVIRLIKQILEGVYYLHQNNIVHLDLKPQNILLSSIYPLGDIKIVDFGMSRKIGHACELREIMGTPEYLAPEILNYDPITTATDMWNIGIIAYMLLTHTSPFVGEDNQETYLNISQVNVDYSEETFSSVSQLATDFIQSLLVKNPEKRPTAEICLSHSWLQQWDFENLFHPEETSSSSQTQDHSVRSSEDKTSKSSCNGTCGDREDKENIPEDSSMVSKRFRFDDSLPNPHELVSDLLC. The pKd is 8.3. (5) The small molecule is Oc1ccc(-c2nc(-c3ccncc3)c(-c3ccc(F)cc3)[nH]2)cc1. The target protein (P29320) has sequence MDCQLSILLLLSCSVLDSFGELIPQPSNEVNLLDSKTIQGELGWISYPSHGWEEISGVDEHYTPIRTYQVCNVMDHSQNNWLRTNWVPRNSAQKIYVELKFTLRDCNSIPLVLGTCKETFNLYYMESDDDHGVKFREHQFTKIDTIAADESFTQMDLGDRILKLNTEIREVGPVNKKGFYLAFQDVGACVALVSVRVYFKKCPFTVKNLAMFPDTVPMDSQSLVEVRGSCVNNSKEEDPPRMYCSTEGEWLVPIGKCSCNAGYEERGFMCQACRPGFYKALDGNMKCAKCPPHSSTQEDGSMNCRCENNYFRADKDPPSMACTRPPSSPRNVISNINETSVILDWSWPLDTGGRKDVTFNIICKKCGWNIKQCEPCSPNVRFLPRQFGLTNTTVTVTDLLAHTNYTFEIDAVNGVSELSSPPRQFAAVSITTNQAAPSPVLTIKKDRTSRNSISLSWQEPEHPNGIILDYEVKYYEKQEQETSYTILRARGTNVTISSLK.... The pKd is 5.0. (6) The drug is CCN(CC)CCCC[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](Cc1ccccc1)NC(=O)c1ccc(C(C)(C)C)cc1)C(=O)N[C@@H](CO)C(=O)OC. The target protein (Q13185) has sequence MASNKTTLQKMGKKQNGKSKKVEEAEPEEFVVEKVLDRRVVNGKVEYFLKWKGFTDADNTWEPEENLDCPELIEAFLNSQKAGKEKDGTKRKSLSDSESDDSKSKKKRDAADKPRGFARGLDPERIIGATDSSGELMFLMKWKDSDEADLVLAKEANMKCPQIVIAFYEERLTWHSCPEDEAQ. The pKd is 4.0.
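The pKd definition given in the task description (pKd = -log10(Kd in M)) can be sketched as a small conversion helper. This is a minimal illustration, not the dataset's own preprocessing code: the function names are hypothetical, and the nanomolar input unit in `kd_nm_to_pkd` is an assumption about how raw Kd values might be supplied.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Convenience wrapper, assuming the raw Kd is given in nanomolar."""
    return kd_to_pkd(kd_nanomolar * 1e-9)
```

Under this definition, the pKd of 6.5 in example (1) corresponds to a Kd of roughly 3.2e-7 M (about 316 nM), and each additional pKd unit means a tenfold lower Kd, that is, tighter binding.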