This is drug-target binding data from BindingDB, using Ki measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is C[N+]1(C)CC[C@]23c4c5ccc(OC(=O)c6ccccc6)c4O[C@H]2[C@@H](OS(=O)(=O)[O-])C=C[C@H]3[C@H]1C5. The target protein (P47748) has sequence MESLFPAPFWEVLYGSHLQGNLSLLSPNHSGLPPHLLLNASHSAFLPLGLKVTIVGLYLAVCIGGLLGNCLVMYVILRHTKMKTATNIYIFNLALADTLVLLTLPFQATDILLGFWPFGNTLCKTVIAIDYYNMFTSTFTLTAMSVDRYVAICHPIRALDVRTSSKAQAVNVAIWALALVVGVPVAIMGSAQVEDEEIECLVEIPDPQDYWGPVFAVSIFLFSFIIPVLIISVCYSLMIRRLHGVRLLSGSREKDRNLRRITRLVLVVVAVFVGCWTPVQVFVLVQGLGVQPGSETTVAILRFCTALGYVNSCLNPILYAFLDENFKACFRKFCCASALHREMQVSDRVRSIAKDVALGCKTTETVPRPA. The pKi is 5.0. (2) The compound is CN(Cc1cnc2nc(N)nc(N)c2n1)c1ccc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)cc1. The target protein sequence is MDALCGSGELGSKFWDSNLSVHTENPDLTPCFQNSLLAWVPCIYLWVALPCYLLYLRHHCRGYIILSHLSKLKMVLGVLLWCVSWADLFYSFHGLVHGRAPAPVFFVTPLVVGVTMLLATLLIQYERLQGVQSSGVLIIFWFLCVVCAIVPFRSKILLAKAEGEISDPFRFTTFYIHFALVLSALILACFREKPPFFSAKNVDPNPYPETSAGFLSRLFFWWFTKMAIYGYRHPLEEKDLWSLKEEDRSQMVVQQLLEAWRKQEKQTARHKASAAPGKNASGEDEVLLGARPRPRKPSFLKALLATFGSSFLISACFKLIQDLLSFINPQLLSILIRFISNPMAPSWWGFLVAGLMFLCSMMQSLILQHYYHYIFVTGVKFRTGIMGVIYRKALVITNSVKRASTVGEIVNLMSVDAQRFMDLAPFLNLLWSAPLQIILAIYFLWQNLGPSVLAGVAFMVLLIPLNGAVAVKMRAFQVKQMKLKDSRIKLMSEILNGIKV.... The pKi is 2.9. (3) The drug is NCCCCCCCN. 
The target protein sequence is MVDHVSFIEVNKIRSDDECDADSHNEGDNIEDAKASVFVKSSLIPEKTDVVKGLNFDKEVDLHEFINNYKYMGFQATNLGISIDEINKMIYYKYKDENIKSEPNNENNLNCNNVSEDLNKDQENHLYHYEKKKKSCIIWLSFTSNMISSGLREIFVYLAKNKFIDVVVTTAGGIEEDIIKCFSNTYIGDFNLNGKKLRKKGWNRIGNLIVPNDNYCKFEDWLQPILNKMLHEQNEKNEQMFLKKLEKRKKKYNNNKNKNDNNNDNDNVWGNEKNDQNENQYNQGQESFKKDSNIYTNDVSNKKNHINNYINNYDSDSDDQCDMYYLSPSEFINTLGKEINDESSLIYWCYKNDIPIFCPGLTDGSLGDNLFLHNYGKKIKNNLILDIVKDIKKINSLAMNCEKSGIIILGGGLPKHHVCNANLMRNGADFAVYVNTASEYDGSDSGANTTEALSWGKIKYGQTNNHVKVFGDATILFPLMVLNSFYLYDQKRKKDM. The pKi is 4.3. (4) The drug is O=C(O)CN1C(=O)[C@@]2(CC(=O)N(Cc3cc(Cl)ccc3F)C2)c2cc(Cl)ccc21. The target protein (Q9Z2J6) has sequence MANVTLKPLCPLLEEMVQLPNHSNSSLRYIDHVSVLLHGLASLLGLVENGLILFVVGCRMRQTVVTTWVLHLALSDLLAAASLPFFTYFLAVGHSWELGTTFCKLHSSVFFLNMFASGFLLSAISLDRCLQVVRPVWAQNHRTVAVAHRVCLMLWALAVLNTIPYFVFRDTIPRLDGRIMCYYNLLLWNPGPDRDTTCDYRQKALAVSKFLLAFMVPLAIIASSHVAVSLRLHHRGRQRTGRFVRLVAAIVVAFVLCWGPYHIFSLLEARAHSVTTLRQLASRGLPFVTSLAFFNSVVNPLLYVFTCPDMLYKLRRSLRAVLESVLVEDSDQSGGLRNRRRRASSTATPASTLLLADRIPQLRPTRLIGWMRRGSAEVPQRV. The pKi is 7.1. (5) The target is MLLARMKPQVQPELGGADQ. The drug is Fc1ccc([C@@H]2CCNC[C@H]2COc2ccc3c(c2)OCO3)cc1. The pKi is 7.1. (6) The pKi is 8.4. The target protein sequence is MEGTPAANWSFELDLGSGVSPGVEGNLTAGPPQRNEALARVEVAVLCLILFLALSATVRAAGLRTTTRHKHSRLFFFMKHLSIADLVVAVFQVLPQLLWDITFRFYGPDLLCRLVKYLQVVGMFASTYLLLLMSLDRCLAICQPLRSLRRRTDRLAVLATWLGCLVASAPQVHIFSLREVADGVFDCWAVFIQPWGPNAYVTWITLAVYIVPVIVLAACYGLISFKIWQNLRLKTAAAAAAAEGTEGSAAGGAARGAGS. The compound is CS(=O)(=O)N1CCC(Oc2ccc(CC(=O)N3CCC(N4C(=O)CCc5ccccc54)CC3)c(OC/C=C\I)c2)CC1. (7) The pKi is 4.1. 
The target protein (P11707) has sequence MDLIFSLETWVLLAASLVLLYLYGTSTHGLFKKMGIPGPTPLPFIGTILEYRKGIWDFDIECRKKYGKMWGLFDGRQPLMVITDPDMIKTVLVKECYSVFTNRRSFGPVGFMKKAVSISEDEDWKRVRTLLSPTFTSGKLKEMLPIIAQYGDVLVKNLRQEAEKGKPVDLKEIFGAYSMDVITGTSFGVNIDSLRNPQDPFVKNVRRLLKFSFFDPLLLSITLFPFLTPIFEALHISMFPKDVMDFLKTSVEKIKDDRLKDKQKRRVDFLQLMINSQNSKEIDSHKALDDIEVVAQSIIILFAGYETTSSTLSFIMHLLATHPDVQQKLQEEIDTLLPNKELATYDTLVKMEYLDMVVNETLRLYPIAGRLERVCKKDVDINGTFIPKGTIVMMPTYALHRDPQHWTEPDEFRPERFSKKNKDNINPYIYHPFGAGPRNCLGMRFALMNIKLALVRLMQNFSFKLCKETQVPLKLGKQGLLQPEKPIVLKVVSRDGIIRG.... The small molecule is O=c1cc(-c2ccccc2)oc2c1ccc1ccccc12.
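
The pKi definition given in the task description (pKi = -log10(Ki in M)) can be sketched in Python as follows. This is a minimal illustration only: the function names `ki_to_pki` and `pki_to_ki` are hypothetical and are not part of BindingDB or of any dataset tooling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki (in mol/L) from a pKi value."""
    return 10.0 ** (-pki)


# A Ki of 10 micromolar (1e-5 M) corresponds to a pKi of 5;
# a smaller Ki (tighter binding) gives a larger pKi.
weak = ki_to_pki(1e-5)
strong = ki_to_pki(1e-8)
```

Because the transform is a negative logarithm, each one-unit increase in pKi corresponds to a tenfold decrease in Ki, which is why higher values mean stronger inhibition.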