This data is from Drug-target binding data from BindingDB using Ki measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is Nc1c(F)cc(S(=O)(=O)Nc2nnc(S(N)(=O)=O)s2)cc1Cl. The target protein (Q9Y2D0) has sequence MVVMNSLRVILQASPGKLLWRKFQIPRFMPARPCSLYTCTYKTRNRALHPLWESVDLVPGGDRQSPINIRWRDSVYDPGLKPLTISYDPATCLHVWNNGYSFLVEFEDSTDKSVIKGGPLEHNYRLKQFHFHWGAIDAWGSEHTVDSKCFPAELHLVHWNAVRFENFEDAALEENGLAVIGVFLKLGKHHKELQKLVDTLPSIKHKDALVEFGSFDPSCLMPTCPDYWTYSGSLTTPPLSESVTWIIKKQPVEVDHDQLEQFRTLLFTSEGEKEKRMVDNFRPLQPLMNRTVRSSFRHDYVLNVQAKPKPATSQATP. The pKi is 7.7. (2) The small molecule is C[C@@H](O)[C@H](N)C(=O)NS(=O)(=O)c1cccc(-c2ccc3[nH]ncc3c2)c1. The target protein (Q8ZDW5) has sequence MPVITLPDGSQRHYDHAVSVLDVALDIGPGLAKACIAGRVNGELVDASDLIESDAQLAIITAKDAEGLEILRHSCAHLLGHAIKQLWPDTKMAIGPVIDNGFYYDVDIEHTLTQEDLALLEKRMHELADKDYDVIKKKVSWQEARDTFAARGEDYKVAILDENISRDDRPGLYHHEEYVDMCRGPHVPNMRFCHHFKLQKTSGAYWRGDSKNKMLQRIYGTAWGDKKQLNAYLQRLEEAAKRDHRKIGKQLDLYHMQEEAPGMVFWHNDGWTIFRELETFVRMKLKEYQYQEVKGPFMMDRVLWEKTGHWENYAEHMFTTSSENREYCIKPMNCPGHVQIFNQGLKSYRDLPLRMAEFGSCHRNEPSGALHGLMRVRGFTQDDAHVFCTEEQVRDEVNSCIKMVYDMYSTFGFEKIVVKLSTRPEKRIGSDELWTRAEDDLAAALTENGIPFDYQPGEGAFYGPKIEFTLHDCLDRAWQCGTVQLDFSLPGRLSASYIGE.... The pKi is 6.9. (3) The target protein (P35523) has sequence MEQSRSQQRGGEQSWWGSDPQYQYMPFEHCTSYGLPSENGGLQHRLRKDAGPRHNVHPTQIYGHHKEQFSDREQDIGMPKKTGSSSTVDSKDEDHYSKCQDCIHRLGQVVRRKLGEDGIFLVLLGLLMALVSWSMDYVSAKSLQAYKWSYAQMQPSLPLQFLVWVTFPLVLILFSALFCHLISPQAVGSGIPEMKTILRGVVLKEYLTMKAFVAKVVALTAGLGSGIPVGKEGPFVHIASICAAVLSKFMSVFCGVYEQPYYYSDILTVGCAVGVGCCFGTPLGGVLFSIEVTSTYFAVRNYWRGFFAATFSAFVFRVLAVWNKDAVTITALFRTNFRMDFPFDLKELPAFAAIGICCGLLGAVFVYLHRQVMLGVRKHKALSQFLAKHRLLYPGIVTFVIASFTFPPGMGQFMAGELMPREAISTLFDNNTWVKHAGDPESLGQSAVWIHPRVNVVIIIFLFFVMKFWMSIVATTMPIPCGGFMPVFVLGAAFGRLVGE.... The pKi is 5.0. 
The compound is c1cncc(OC[C@@H]2CCN2)c1. (4) The pKi is 8.8. The drug is O=C(c1cc2cccc(N3CCN(CCc4ccccn4)CC3)c2o1)N1CCC1. The target protein sequence is MDIFSLGQGNNTTSSQEPFGTGGNVTGISDVTFSYQVITSLLLGTLIFCAVLGNACVVAAIALERSLQNVANYLIGSLAVTDLMVSVLVLPMAALYQVLNKWTLGQVTCDLFIALDVLCCTSSILHLCAIALDRYWAITDPIDYVNKRTPRRAAALISLTWLIGFLISIPPMLGWRTPEDRSDPDACTISKDHGYTIYSTFGAFYIPLLLMLVLYGRIFRAARFRIRKTVKKVEKKGASTSLSTSSAPPPKKSLNGQPGNGDWRRSAENRAVGAPCANGAVRQGDDEATLEVIEVHRVGNSKDHLPLPSESGATSYAPACLERKNERNAEAKRKMALARERKTVKTLGIIMGTFILCWLPFFIVALVLPFCESSCHMPALLGAIINWLGYSNSLLNPVIYAYFNKDFQNAFKKIIKCKFCRR. (5) The compound is COc1ccccc1N1CCN(CCN(C(=O)C2CCCCC2)c2ccccn2)CC1. The target protein sequence is MGVFVKDSSDSAYLTPERKLALGRGKAQGKSRQAAYLSEEKNKPRSTGTGFTQVCSLEVKKLFQIPPFWRRLKKRDAKLAKHNEEYSESVQSEPNRILRVGSDVQPGFSMYAYTGLPMELKTKHFSIQSNSVSNFAMDILCDQESSVNPTAKSLIQINHERRLYRNVYGAGEINASHLFNLTVDSENLTNVSSESSVTPPCYSSLFQLSQKNWPALLTVIVIVLTIAGNILVIMAVSLEKKLQNATNYFLMSLAIADMLLGFLVMPVSMLTILYGYAWPLPRKLCAIWIYLDVLFSTASIMHLCAISLDRYIAIRNPIHHSRFNSRTKAFAKIIAVWTISVGISMPVPVFGLQDDSKVFKKDSCLLADDNFVLVGSFVAFFIPLTIMVVTYFLTIKSLQKEAMLCVNDIGPKTKFASFSFLPQSSLSSEKLFQRSLNRDVGTSGRRTMQSISNEQKASKVLGIVFFLFVVMWCPFFITNVMAVICKESCNQEVIGELLNV.... The pKi is 5.7. (6) The drug is C[C@H](CCC(=O)Nc1ccc(S(N)(=O)=O)cc1I)[C@H]1CC[C@H]2[C@@H]3C(=O)C[C@@H]4CC(=O)CC[C@]4(C)[C@H]3CC(=O)[C@@]21C. The target protein (Q95323) has sequence MRLLLALLVLAAAPPQARAASHWCYQIQVKPSNYTCLEPDEWEGSCQNNRQSPVNIVTAKTQLDPNLGRFSFSGYNMKHQWVVQNNGHTVMVLLENKPSIAGGGLSTRYQATQLHLHWSRAMDRGSEHSFDGERFAMEMHIVHEKEKGLSGNASQNQFAEDEIAVLAFMVEDGSKNVNFQPLVEALSDIPRPNMNTTMKEGVSLFDLLPEEESLRHYFRYLGSLTTPTCDEKVVWTVFQKPIQLHRDQILAFSQKLFYDDQQKVNMTDNVRPVQSLGQRQVFRSGAPGLLLAQPLPTLLAPVLACLTVGFLR. The pKi is 7.7. 
(7) The target protein (P16638) has sequence MSAKAISEQTGKELLYKYICTTSAIQNRFKYARVTPDTDWAHLLQDHPWLLSQSLVVKPDQLIKRRGKLGLVGVNLSLDGVKSWLKPRLGHEATVGKAKGFLKNFLIEPFVPHSQAEEFYVCIYATREGDYVLFHHEGGVDVGDVDTKAQKLLVGVDEKLNAEDIKRHLLVHAPEDKKEILASFISGLFNFYEDLYFTYLEINPLVVTKDGVYILDLAAKVDATADYICKVKWGDIEFPPPFGREAYPEEAYIADLDAKSGASLKLTLLNPKGRIWTMVAGGGASVVYSDTICDLGGVNELANYGEYSGAPSEQQTYDYAKTILSLMTREKHPDGKILIIGGSIANFTNVAATFKGIVRAIRDYQGSLKEHEVTIFVRRGGPNYQEGLRVMGEVGKTTGIPIHVFGTETHMTAIVGMAWAPAIPNQPPTAAHTANFLLNASGSTSTPAPSRTASFSESRADEVAPAKKAKPAMPQDSVPSPRSLQGKSATLFSRHTKAIV.... The pKi is 6.8. The compound is O=C(O)C[C@@](O)(C(=O)O)[C@H](O)C(=O)O. (8) The compound is O=C(NO)[C@H](O)[C@H](O)COP(=O)(O)O. The target protein sequence is MSMDVGVVGLGVMGANLALNIAEKGFKVAVFNRTYSKSEEFMKANASAPFAGNLKAFETMEAFAASLKKPRKALILVQAGAATDSTIEQLKKVFEKGDILVDTGNAHFKDQGRRAQQLEAAGLRFLGMGISGGEEGARKGPAFFPGGTLSVWEEIRPIVEAAAAKADDGRPCVTMNGSGGAGSCVKMYHNSGEYAILQIWGEVFDILRAMGLNNDEVAAVLEDWKSKNFLKSYMLDISIAAARAKDKDGSYLTEHVMDRIGSKGTGLWSAQEALEIGVPAPSLNMAVVSRQFTMYKTERQANASNAPGITQSPGYTLKNKSPSGPEIKQLYDSVCIAIISCYAQMFQCLREMDKVHNFGLNLPATIATFRAGCILQGYLLKPMTEAFEKNPNISNLMCAFQTEIRAGLQNYRDMVALITSKLEVSIPVLSASLNYVTAMFTPTLKYGQLVSLQRDVFGRHGYERVDKDGRESFQWPELQ. The pKi is 8.0.
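
The header defines the label as pKi = -log10(Ki in M). A minimal sketch of that conversion follows, assuming Ki is given in mol/L; the helper names `ki_to_pki` and `pki_to_ki_nm` are hypothetical and are not part of BindingDB or of any dataset loader.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (assumed to be in mol/L) to pKi."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    # pKi = -log10(Ki in M), as defined in the dataset header
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the definition and report Ki in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pki) * 1e9
```

Under this definition, a pKi of 8.0, as in example (8), corresponds to a Ki of 10 nM, and each additional pKi unit means a tenfold lower Ki.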