This is drug-target binding data from BindingDB, using Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is Nc1nc([O-])c2ncn([C@@H]3O[C@H](COP(=O)(O)OP(=O)(O)C[P+]([O-])(O)O)[C@@H](O)[C@H]3O)c2n1. The target protein sequence is MPPQLHNGLDFSAKVIQGSLDSLPQEVRKFVEGNAQLCQPEYIHICDGSEEEYGRLLAHMQEEGVIRKLKKYDNCWLALTDPRDVARIESKTVIITQEQRDTVPIPKSGQSQLGRWMSEEDFEKAFNARFPGCMKGRTMYVIPFSMGPLGSPLAKIGIELTDSPYVVASMRIMTRMGTSVLEALGDGEFIKCLHSVGCPLPLKKPLVNNWACNPELTLIAHLPDRREIISFGSGYGGNSLLGKKCFALRIASRLAKEEGWLAEHMLILGITNPEGKKKYLAAAFPSACGKTNLAMMNPTLPGWKVECVGDDIAWMKFDAQGNLRAINPENGFFGVAPGTSVKTNPNAIKTIQKNTIFTNVAETSDGGVYWEGIDEPLAPGVTITSWKNKEWRPQDEEPCAHPNSRFCTPASQCPIIDPAWESPEGVPIEGIIFGGRRPAGVPLVYEALSWQHGVFVGAAMRSEATAAAEHKGKVIMRDPFAMRPFFGYNFGKYLAHWLSM.... The pKi is 3.2. (2) The small molecule is CNCCCN1c2ccccc2CCc2ccccc21. The target is MLLARMKPQVQPELGGADQ. The pKi is 9.4. (3) The drug is CNS(=O)(=O)Cc1ccc2[nH]cc(CCN(C)C)c2c1. The target protein (P31387) has sequence MEVSNLSGATPGLAFPPGPESCSDSPSSGRSMGSTPGGLILPGREPPFSAFTVLVVTLLVLLIAATFLWNLLVLVTILRVRAFHRVPHNLVASTAVSDVLVAVLVMPLSLVSELSAGRRWQLGRSLCHVWISFDVLCCTASIWNVAAIALDRYWTITRHLQYTLRTRSRASALMIAITWALSALIALAPLLFGWGEAYDARLQRCQVSQEPSYAVFSTCGAFYLPLAVVLFVYWKIYKAAKFRFGRRRRAVVPLPATTQAKEAPPESEMVFTARRRATVTFQTSGDSWREQKEKRAAMMVGILIGVFVLCWIPFFLTELISPLCACSLPPIWKSIFLWLGYSNSFFNPLIYTAFNKNYNNAFKSLFTKQR. The pKi is 5.1. (4) The drug is COc1ccc(S(=O)(=O)N(CC2CCCCC2)C[C@@H](O)[C@H](Cc2ccccc2)NC(=O)c2cccc(O)c2)cc1. The target protein sequence is PQITLWQRPLVTVKIGGQLREALLDTGADDTVLEDINLPGKWKPKMIGGIGGFIKVKQYEQVLIEICGKKAIGTVLVGPTPVNIIGRNMLTQIGCTLNF. The pKi is 9.2. (5) The drug is NC1CCc2c(Br)ccc(-c3ccccc3)c2CC1=O. 
The target protein (P97449) has sequence MAKGFYISKTLGILGILLGVAAVCTIIALSVVYAQEKNRNAENSATAPTLPGSTSATTATTTPAVDESKPWNQYRLPKTLIPDSYRVILRPYLTPNNQGLYIFQGNSTVRFTCNQTTDVIIIHSKKLNYTLKGNHRVVLRTLDGTPAPNIDKTELVERTEYLVVHLQGSLVEGRQYEMDSQFQGELADDLAGFYRSEYMEGDVKKVVATTQMQAADARKSFPCFDEPAMKAMFNITLIYPNNLIALSNMLPKESKPYPEDPSCTMTEFHSTPKMSTYLLAYIVSEFKNISSVSANGVQIGIWARPSAIDEGQGDYALNVTGPILNFFAQHYNTSYPLPKSDQIALPDFNAGAMENWGLVTYRESSLVFDSQSSSISNKERVVTVIAHELAHQWFGNLVTVAWWNDLWLNEGFASYVEYLGADYAEPTWNLKDLMVLNDVYRVMAVDALASSHPLSSPADEIKTPDQIMELFDSITYSKGASVIRMLSSFLTEDLFKKGLS.... The pKi is 9.5. (6) The compound is CC(C)C(c1ccc(C(C)(F)F)c(F)c1)n1nc(CF)c2c(=O)[nH]c(N(C)C)nc21. The target protein sequence is MVLVLHHILIAVVQFLRRGQQVFLKPDEPPPPPPQPCADSLQDALLSLGSVIDISGLQRAVKEALSAVLPRVETVYTYLLDGESRLVCEDPPHELPQEGKVWEAIISQKRLGCNGLGLSDLPGKPLARLVAPLAPHTQVLVIPLVDKEAGAVAAVILVHCGQLSDNEEWSLQAVEKHTLVALRRVQALQQRRPSEAPRAVQNPPEGAVEDQKGGAAYTDRDRKILQLCGELYDLDASSLQLKVLQYLQQETRASRCCLLLVSEDSLQLSCKVMGDKVLGEEISFPLTGCLGQVVEDKKSIQLKDLTSEDVQQLQSMLGCELQAMLCVPVISRATDQVVALACAFNKLEGDLFTDQDEHVIQHCFHYTSTVLTSTLAFQKEQKLKCECQALLQVAKNLFTHLDDVSVLLQEIITEARNLSNAEICSVFLLDQNELVAKVFDGGVVDDESYEIRIPADQGIAGHVATTGQILNIPDAYAHPLFYRGVDDSTGFRTRNILCFP.... The pKi is 9.5. (7) The small molecule is C[C@@H](OP(=O)(O)OP(=O)(O)O)[C@H]1OC(n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1O. The target protein (P11980) has sequence MPKPDSEAGTAFIQTQQLHAAMADTFLEHMCRLDIDSAPITARNTGIICTIGPASRSVEMLKEMIKSGMNVARLNFSHGTHEYHAETIKNVRAATESFASDPILYRPVAVALDTKGPEIRTGLIKGSGTAEVELKKGATLKITLDNAYMEKCDENILWLDYKNICKVVEVGSKIYVDDGLISLQVKEKGADYLVTEVENGGSLGSKKGVNLPGAAVDLPAVSEKDIQDLKFGVEQDVDMVFASFIRKAADVHEVRKVLGEKGKNIKIISKIENHEGVRRFDEILEASDGIMVARGDLGIEIPAEKVFLAQKMMIGRCNRAGKPVICATQMLESMIKKPRPTRAEGSDVANAVLDGADCIMLSGETAKGDYPLEAVRMQHLIAREAEAAVFHRLLFEELARASSQSTDPLEAMAMGSVEASYKCLAAALIVLTESGRSAHQVARYRPRAPIIAVTRNPQTARQAHLYRGIFPVLCKDAVLDAWAEDVDLRVNLAMNVGKAR.... The pKi is 2.1.
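
The pKi definition stated at the top (pKi = -log10(Ki in M)) can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical and not part of BindingDB or the bindingdb_ki dataset, and the nanomolar helper assumes the raw Ki value is given in nM, which the text above does not state.

```python
import math


def ki_molar_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki in M)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nanomolar_to_pki(ki_nm: float) -> float:
    """Convert a Ki given in nM (assumed unit) to pKi."""
    return ki_molar_to_pki(ki_nm * 1e-9)


def pki_to_ki_molar(pki: float) -> float:
    """Invert the definition: Ki in mol/L = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # pKi values quoted in the examples above, converted back to Ki in M.
    for pki in (3.2, 9.4, 5.1, 9.2, 9.5, 2.1):
        print(f"pKi {pki:>4} -> Ki = {pki_to_ki_molar(pki):.3e} M")
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, so the pKi 9.5 examples bind several orders of magnitude more tightly than the pKi 2.1 example.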