Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. Dataset description: Drug-target binding data from BindingDB using Ki measurements. (1) The small molecule is O=Cc1cccnc1. The target protein (P53184) has sequence MKTLIVVDMQNDFISPLGSLTVPKGEELINPISDLMQDADRDWHRIVVTRDWHPSRHISFAKNHKDKEPYSTYTYHSPRPGDDSTQEGILWPVHCVKNTWGSQLVDQIMDQVVTKHIKIVDKGFLTDREYYSAFHDIWNFHKTDMNKYLEKHHTDEVYIVGVALEYCVKATAISAAELGYKTTVLLDYTRPISDDPEVINKVKEELKAHNINVVDK. The pKi is 5.8. (2) The drug is O=c1cc(Nc2cccc(Cl)c2)[nH]c(=O)[nH]1. The target protein (P96583) has sequence MSKTVVLAEKPSVGRDLARVLKCHKKGNGYLEGDQYIVTWALGHLVTLADPEGYGKEFQSWRLEDLPIIPEPLKLVVIKKTGKQFNAVKSQLTRKDVNQIVIATDAGREGELVARWIIEKANVRKPIKRLWISSVTDKAIKEGFQKLRSGKEYENLYHSAVARAEADWIVGINATRALTTKFNAQLSCGRVQTPTLAMIAKREADIQAFTPVPYYGIRAAVDGMTLTWQDKKSKQTRTFNQDVTSRLLKNLQGKQAVVAELKKTAKKSFAPALYDLTELQRDAHKRFGFSAKETLSVLQKLYEQHKLVTYPRTDSRFLSSDIVPTLKDRLEGMEVKPYAQYVSQIKKRGIKSHKGYVNDAKVSDHHAIIPTEEPLVLSSLSDKERKLYDLIAKRFLAVLMPAFEYEETKVIAEIGGETFTAKGKTVQSQGWKAVYDMAEEDDEQEDDRDQTLPALQKGDTLAVRTLTETSGQTKPPARFNEGTLLSAMENPSAFMQGEEK.... The pKi is 4.6. (3) The small molecule is COc1cc2c(cc1-c1c(C)noc1C)[nH]c1ccncc12. The target protein sequence is WKHQFAWPFYQPVDAIKLNLPDYHKIIKNPMDMGTIKKRLENNYYWSASECMQDFNTMFTNCYIYNKPTDDIV. The pKi is 6.1. (4) The compound is CC(C)CCCC(C)CCCC(C)CCCC(C)CC(c1ccccc1)c1c(O)c2ccccc2oc1=O. The target protein (Q6TEK4) has sequence MGTTWRSPGRLRLALCLAGLALSLYALHVKAARARNEDYRALCDVGTAISCSRVFSSRWGRGFGLVEHVLGADSILNQSNSIFGCMFYTIQLLLGCLRGRWASILLILSSLVSVAGSLYLAWILFFVLYDFCIVCITTYAINAGLMLLSFQKVPEHKVKKP. The pKi is 6.7. (5) The compound is [NH3+][C@@H](CNC(=O)/C=C/C(=O)[O-])C(=O)[O-]. 
The target protein (P17169) has sequence MCGIVGAIAQRDVAEILLEGLRRLEYRGYDSAGLAVVDAEGHMTRLRRLGKVQMLAQAAEEHPLHGGTGIAHTRWATHGEPSEVNAHPHVSEHIVVVHNGIIENHEPLREELKARGYTFVSETDTEVIAHLVNWELKQGGTLREAVLRAIPQLRGAYGTVIMDSRHPDTLLAARSGSPLVIGLGMGENFIASDQLALLPVTRRFIFLEEGDIAEITRRSVNIFDKTGAEVKRQDIESNLQYDAGDKGIYRHYMQKEIYEQPNAIKNTLTGRISHGQVDLSELGPNADELLSKVEHIQILACGTSYNSGMVSRYWFESLAGIPCDVEIASEFRYRKSAVRRNSLMITLSQSGETADTLAGLRLSKELGYLGSLAICNVPGSSLVRESDLALMTNAGTEIGVASTKAFTTQLTVLLMLVAKLSRLKGLDASIEHDIVHGLQALPSRIEQMLSQDKRIEALAEDFSDKHHALFLGRGDQYPIALEGALKLKEISYIHAEAYAA.... The pKi is 4.3. (6) The compound is CCCCc1nc(Cl)c(CO)n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1. The target protein sequence is MAFPPEKYAQWSAGIALMKNILGFIIPLVFIATCYFGIRKHLLKTNSYGKNRITRDQVLNMAAAVVLAFIICWLPFHVLTFLDALAWMGIINSCEVIAVIDLALPFAILLGFTNSCINPFLYCFVGNRFQQKLRSVF. The pKi is 5.0. (7) The small molecule is CC(C)(C)NC(=O)[C@@H]1CN(Cc2cccnc2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2C[C@H]1O. The target protein sequence is PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEISLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF. The pKi is 8.6.
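
The pKi definition given in the task statement, pKi = -log10(Ki in M), can be sketched as a small conversion helper. This is only an illustrative sketch: the function names `ki_to_pki` and `pki_to_ki` are hypothetical and do not come from the dataset or from BindingDB tooling, and the code assumes Ki is supplied in molar units.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (assumed to be in mol/L) to pKi."""
    if ki_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("Ki must be a positive concentration in M")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: recover Ki in mol/L from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A Ki of 1 micromolar (1e-6 M) corresponds to a pKi of 6.
    print(ki_to_pki(1e-6))
    # A pKi of 8.6, as in example (7), corresponds to a Ki of roughly 2.5 nM.
    print(pki_to_ki(8.6))
```

Because the scale is logarithmic, each one-unit increase in pKi corresponds to a tenfold lower Ki, which is why higher values indicate stronger inhibition.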