Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. From a dataset of Drug-target binding data from BindingDB using Kd measurements. (1) The compound is CCOc1ccc(/C=C2\SC(=S)N(CCN)C2=O)cc1. The target protein sequence is MAAAAAAGPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNLNKVRVAIKKISPFEHQTYCQRTLREIKILLRFRHENIIGINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDLKPSNLLLNATCDLKICDFGLARVADPDHDHTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLSNRPIFPGKHYLDQLNHILGILGSPSQEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHKRIEVEQALAHPYLEQYYDPSNEPIAEAPFKFDMELDDLPKEKLKELIFEETARFQPGYRS. The pKd is 5.2. (2) The drug is N#Cc1cnc2cc(OC[C@@H](O)CO)c(NC(=O)C[C@@H]3CCSS3)cc2c1Nc1ccc(OCc2cccc(F)c2)c(Cl)c1. The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The pKd is 7.3. (3) The compound is C[C@H]1O[C@@H](OCCOCCOCCn2cc(COc3c4cc(C(C)(C)C)cc3Cc3cc(C(C)(C)C)cc(c3OCc3cn(CCOCCOCCO[C@@H]5O[C@H](CO)[C@H](O)[C@H](O)[C@H]5O)nn3)Cc3cc(C(C)(C)C)cc(c3OCc3cn(CCOCCOCCO[C@@H]5O[C@H](CO)[C@H](O)[C@H](O)[C@H]5O)nn3)Cc3cc(C(C)(C)C)cc(c3OCc3cn(CCOCCOCCO[C@@H]5O[C@H](CO)[C@H](O)[C@H](O)[C@H]5O)nn3)C4)nn2)[C@H](O)[C@@H](O)[C@H]1O. The target protein (Q05097) has sequence MAWKGEVLANNEAGQVTSIIYNPGDVITIVAAGWASYGPTQKWGPQGDREHPDQGLICHDAFCGALVMKIGNSGTIPVNTGLFRWVAPNNVQGAITLIYNDVPGTYGNNSGSFSVNIGKDQS. The pKd is 6.8. (4) The pKd is 8.1. 
The target protein (Q9FBI2) has sequence MKIIIFRVLTFFFVIFSVNVVAKEFTLDFSTAKTYVDSLNVIRSAIGTPLQTISSGGTSLLMIDSGTGDNLFAVDVRGIDPEEGRFNNLRLIVERNNLYVTGFVNRTNNVFYRFADFSHVTFPGTTAVTLSGDSSYTTLQRVAGISRTGMQINRHSLTTSYLDLMSHSGTSLTQSVARAMLRFVTVTAEALRFRQIQRGFRTTLDDLSGRSYVMTAEDVDLTLNWGRLSSVLPDYHGQDSVRVGRISFGSINAILGSVALILNCHHHASRVARMASDEFPSMCPADGRVRGITHNKILWDSSTLGAILMRRTISS. The compound is O=C(NCCNCCSc1ccccc1)c1ccc([N+](=O)[O-])cc1. (5) The compound is Cc1cccc(OC[C@@H](O)[C@H]2CCCCN2)c1. The target protein (Q8K4Z4) has sequence MGHLGNGSDFLLAPNASHAPDHNVTRERDEAWVVGMAIVMSLIVLAIVFGNVLVITAIAKFERLQTVTNYFITSLACADLVMGLAVVPFGASHILMNMWTFGNFWCEFWTSIDVLCVTASIETLCVIAVDRYFAITSPFKYQSLLTKNKARVVILMVWVVSGLTSFLPIQMHWYRATHKDAINCYAEETCCDFFTNQAYAIASSIVSFYLPLVVMVFVYSRVFQVAKKQLQKIDRSEGRFHTQNLSQVEQDGRSGHGLRRSSKFYLKEHKALKTLGIIMGTFTLCWLPFFIVNIVHVIQDNLIPKEVYILLNWVGYVNSAFNPLIYCRSPDFRIAFQELLCLRRSALKAYGNDCSSNSNGKTDYTGEPNVCHQGQEKERELLCEDPPGTEDLVSCPGTVPSDSIDSQGRNYSTNDSLL. The pKd is 6.9. (6) The drug is CN1C[C@H](C(=O)N2CCN(c3ccccn3)CC2)C[C@@H]2c3cccc4[nH]cc(c34)C[C@H]21. The target protein (P28646) has sequence MFPNGTAPSPTSSPSSSPGGCGEGVCSRGPGSGAADGMEEPGRNSSQNGTLSEGQGSAILISFIYSVVCLVGLCGNSMVIYVILRYAKMKTATNIYILNLAIADELLMLSVPFLVTSTLLRHWPFGALLCRLVLSVDAVNMFTSIYCLTVLSVDRYVAVVHPIKAARYRRPTVAKVVNLGVWVLSLLVILPIVVFSRTAANSDGTVACNMLMPEPAQRWLVGFVLYTFLMGFLLPVGAICLCYVLIIAKMRMVALKAGWQQRKRSERKITLMVMMVVMVFVICWMPFYVVQLVNVFAEQDDATVSQLSVILGYANSCANPILYGFLSDNFKRSFQRILCLSWMDNAAEEPVDYYATALKSRAYSVEDFQPENLESGGVFRNGTCASRISTL. The pKd is 7.8. (7) The drug is COc1ccc(F)c(F)c1C(=O)c1cnc(NC2CCN(S(C)(=O)=O)CC2)nc1N. The pKd is 5.0. 
The target protein (Q15569) has sequence MAGERPPLRGPGPGPGEVPGEGPPGPGGTGGGPGRGRPSSYRALRSAVSSLARVDDFHCAEKIGAGFFSEVYKVRHRQSGQVMVLKMNKLPSNRGNTLREVQLMNRLRHPNILRFMGVCVHQGQLHALTEYMNGGTLEQLLSSPEPLSWPVRLHLALDIARGLRYLHSKGVFHRDLTSKNCLVRREDRGFTAVVGDFGLAEKIPVYREGARKEPLAVVGSPYWMAPEVLRGELYDEKADVFAFGIVLCELIARVPADPDYLPRTEDFGLDVPAFRTLVGDDCPLPFLLLAIHCCNLEPSTRAPFTEITQHLEWILEQLPEPAPLTRTALTHNQGSVARGGPSATLPRPDPRLSRSRSDLFLPPSPESPPNWGDNLTRVNPFSLREDLRGGKIKLLDTPSKPVLPLVPPSPFPSTQLPLVTTPETLVQPGTPARRCRSLPSSPELPRRMETALPGPGPPAVGPSAEEKMECEGSSPEPEPPGPAPQLPLAVATDNFISTCS.... (8) The compound is CN(C)CC(=O)N1CCC(c2ccc(NC(=O)c3ncc(C#N)[nH]3)c(C3=CCCCC3)c2)CC1. The target protein (Q9UKI8) has sequence MSVQSSSGSLEGPPSWSQLSTSPTPGSAAAARSLLNHTPPSGRPREGAMDELHSLDPRRQELLEARFTGVASGSTGSTGSCSVGAKASTNNESSNHSFGSLGSLSDKESETPEKKQSESSRGRKRKAENQNESSQGKSIGGRGHKISDYFEYQGGNGSSPVRGIPPAIRSPQNSHSHSTPSSSVRPNSPSPTALAFGDHPIVQPKQLSFKIIQTDLTMLKLAALESNKIQDLEKKEGRIDDLLRANCDLRRQIDEQQKLLEKYKERLNKCISMSKKLLIEKSTQEKLSSREKSMQDRLRLGHFTTVRHGASFTEQWTDGFAFQNLVKQQEWVNQQREDIERQRKLLAKRKPPTANNSQAPSTNSEPKQRKNKAVNGAENDPFVRPNLPQLLTLAEYHEQEEIFKLRLGHLKKEEAEIQAELERLERVRNLHIRELKRINNEDNSQFKDHPTLNERYLLLHLLGRGGFSEVYKAFDLYEQRYAAVKIHQLNKSWRDEKKEN.... The pKd is 5.0. (9) The drug is C=CC(=O)Nc1cc2c(Nc3ccc(F)c(Cl)c3)ncnc2cc1OCCCN1CCOCC1. The target protein (Q5S007) has sequence MASGSCQGCEEDEETLKKLIVRLNNVQEGKQIETLVQILEDLLVFTYSERASKLFQGKNIHVPLLIVLDSYMRVASVQQVGWSLLCKLIEVCPGTMQSLMGPQDVGNDWEVLGVHQLILKMLTVHNASVNLSVIGLKTLDLLLTSGKITLLILDEESDIFMLIFDAMHSFPANDEVQKLGCKALHVLFERVSEEQLTEFVENKDYMILLSALTNFKDEEEIVLHVLHCLHSLAIPCNNVEVLMSGNVRCYNIVVEAMKAFPMSERIQEVSCCLLHRLTLGNFFNILVLNEVHEFVVKAVQQYPENAALQISALSCLALLTETIFLNQDLEEKNENQENDDEGEEDKLFWLEACYKALTWHRKNKHVQEAACWALNNLLMYQNSLHEKIGDEDGHFPAHREVMLSMLMHSSSKEVFQASANALSTLLEQNVNFRKILLSKGIHLNVLELMQKHIHSPEVAESGCKMLNHLFEGSNTSLDIMAAVVPKILTVMKRHETSLPV.... The pKd is 6.0.
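
The pKd definition stated at the top of this record (pKd = -log10(Kd in M); higher means stronger binding) can be illustrated with a small conversion sketch. The function names `kd_to_pkd` and `pkd_to_kd` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of any dataset tooling, and the sketch assumes Kd is supplied in mol/L.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # Illustrative concentrations only (assumed values, not dataset entries).
    for kd in (1e-5, 1e-6, 1e-9):
        print(f"Kd = {kd:.0e} M -> pKd = {kd_to_pkd(kd):.1f}")
```

Under this definition, a pKd of 6.0 corresponds to a Kd of 1e-6 M (1 µM), and each unit increase in pKd means a tenfold lower Kd, that is, tighter binding.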