Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is O=C(CCCN1CCc2ccccc2C1)c1ccc(Cl)cc1. The target protein (P30729) has sequence MGNSSATGDGGLLAGRGPESLGTGTGLGGAGAAALVGGVLLIGMVLAGNSLVCVSVASERILQTPTNYFIVSLAAADLLLAVLVLPLFVYSEVQGGVWLLSPRLCDTLMAMDVMLCTASIFNLCAISVDRFVAVTVPLRYNQQGQCQLLLIAATWLLSAAVAAPVVCGLNDVPGRDPTVCCLEDRDYVVYSSICSFFLPCPLMLLLYWATFRGLRRWEAARHTKLHSRAPRRPSGPGPPVSDPTQGPLFSDCPPPSPSLRTSPTVSSRPESDLSQSPCSPGCLLPDAALAQPPAPSSRRKRGAKITGRERKAMRVLPVVVGAFLMCWTPFFVVHITRALCPACFVSPRLVSAVTWLGYVNSALNPIIYTIFNAEFRSVFRKTLRLRC. The pKi is 8.1. (2) The compound is COC(=O)c1c(C)oc(-c2ccc(C)cc2)c1CC(=O)c1ccc(C)cc1. The target protein sequence is MFSAGHKIKGTVVLMPKNELEVNPDGSAVDNLNAFLGRSVSLQLISATKADAHGKGKVGKDTFLEGINTSLPTLGAGESAFNIHFEWDGSMGIPGAFYIKNYMQVEFFLKSLTLEAISNQGTIRFVCNSWVYNTKLYKSVRIFFANHTYVPSETPAPLVEYREEELKSLRGNGTGERKEYDRIYDYDVYNDLGNPDKSEKLARPVLGGSSTFPYPRRGRTGRGPTVTDPNTEKQGEVFYVPRDENLGHLKSKDALEIGTKSLSQIVQPAFESAFDLKSTPIEFHSFQDVHDLYEGGIKLPRDVISTIIPLPVIKELYRTDGQHILKFPQPHVVQVSQSAWMTDEEFAREMIAGVNPCVIRGLEEFPPKSNLDPAIYGDQSSKITADSLDLDGYTMDEALGSRRLFMLDYHDIFMPYVRQINQLNSAKTYATRTILFLREDGTLKPVAIELSLPHSAGDLSAAVSQVVLPAKEGVESTIWLLAKAYVIVNDSCYHQLMSHW.... The pKi is 6.8. (3) The compound is COc1ccc2c(c1)c1c(n2C)CCCC1NC(=O)C1CC1. The target protein (P49288) has sequence MERPGSNGSCSGCRLEGGPAARAASGLAAVLIVTIVVDVLGNALVILSVLRNKKLRNAGNIFVVSLSVADLVVAVYPYPLILSAIFHNGWTMGNIHCQISGFLMGLSVIGSIFNITAIAINRYCYICHSLRYDKLFNLKNTCCYICLTWTLTVVAIVPNFFVGSLQYDPRIYSCTFAQTVSTSYTITVVVVHFIVPLSIVTFCYLRIWILVIQVKHRVRQDCKQKIRAADIRNFLTMFVVFVLFAVCWGPLNFIGLAVSINPSKVQPHIPEWLFVLSYFMAYFNSCLNAVIYGLLNQNFRKEYKRILLMLRTPRLLFIDVSKGGTEGLKSKPSPAVTNNNQAEIHL. The pKi is 7.5. (4) The drug is CC(C)[C@@H](NS(=O)(=O)c1ccc(-c2ccc(Br)cc2)cc1)C(=O)O. 
The target protein (P08254) has sequence MKSLPILLLLCVAVCSAYPLDGAARGEDTSMNLVQKYLENYYDLKKDVKQFVRRKDSGPVVKKIREMQKFLGLEVTGKLDSDTLEVMRKPRCGVPDVGHFRTFPGIPKWRKTHLTYRIVNYTPDLPKDAVDSAVEKALKVWEEVTPLTFSRLYEGEADIMISFAVREHGDFYPFDGPGNVLAHAYAPGPGINGDAHFDDDEQWTKDTTGTNLFLVAAHEIGHSLGLFHSANTEALMYPLYHSLTDLTRFRLSQDDINGIQSLYGPPPDSPETPLVPTEPVPPEPGTPANCDPALSFDAVSTLRGEILIFKDRHFWRKSLRKLEPELHLISSFWPSLPSGVDAAYEVTSKDLVFIFKGNQFWAIRGNEVRAGYPRGIHTLGFPPTVRKIDAAISDKEKNKTYFFVEDKYWRFDEKRNSMEPGFPKQIAEDFPGIDSKIDAVFEEFGFFYFFTGSSQLEFDPNAKKVTHTLKSNSWLNC. The pKi is 7.3. (5) The drug is CC(C)(C)C[C@@H](O)CC(=O)[O-]. The target protein (P43155) has sequence MLAFAARTVVKPLGFLKPFSLMKASSRFKAHQDALPRLPVPPLQQSLDHYLKALQPIVSEEEWAHTKQLVDEFQASGGVGERLQKGLERRARKTENWLSEWWLKTAYLQYRQPVVIYSSPGVMLPKQDFVDLQGQLRFAAKLIEGVLDFKVMIDNETLPVEYLGGKPLCMNQYYQILSSCRVPGPKQDTVSNFSKTKKPPTHITVVHNYQFFELDVYHSDGTPLTADQIFVQLEKIWNSSLQTNKEPVGILTSNHRNSWAKAYNTLIKDKVNRDSVRSIQKSIFTVCLDATMPRVSEDVYRSHVAGQMLHGGGSRLNSGNRWFDKTLQFIVAEDGSCGLVYEHAAAEGPPIVTLLDYVIEYTKKPELVRSPLVPLPMPKKLRFNITPEIKSDIEKAKQNLSIMIQDLDITVMVFHHFGKDFPKSEKLSPDAFIQMALQLAYYRIYGQACATYESASLRMFHLGRTDTIRSASMDSLTFVKAMDDSSVTEHQKVELLRKAV.... The pKi is 2.1. (6) The small molecule is CC(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@H]1CSSC[C@@H](C(=O)N[C@@H](Cc2ccc(O)cc2)C(=O)N[C@H](C(=O)N[C@@H](CCCNC(=N)N)C(N)=O)[C@@H](C)O)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CCCNC(=N)N)NC1=O)[C@@H](C)O)[C@@H](C)O. The target protein (P33033) has sequence MNSSCCLSSVSPMLPNLSEHPAAPPASNRSGSGFCEQVFIKPEVFLALGIVSLMENILVILAVVRNGNLHSPMYFFLCSLAAADMLVSLSNSLETIMIAVINSDSLTLEDQFIQHMDNIFDSMICISLVASICNLLAIAIDRYVTIFYALRYHSIMTVRKALTLIGVIWVCCGICGVMFIIYSESKMVIVCLITMFFAMVLLMGTLYIHMFLFARLHVQRIAVLPPAGVVAPQQHSCMKGAVTITILLGVFIFCWAPFFLHLVLIITCPTNPYCICYTAHFNTYLVLIMCNSVIDPLIYAFRSLELRNTFKEILCGCNSMNLG. The pKi is 6.2. (7) The target is MLLARMKPQVQPELGGADQ. The small molecule is C=CCN1CC[C@]23CCCCC2[C@H]1Cc1ccc(O)cc13. The pKi is 6.1. (8) The pKi is 3.9. 
The target protein (Q00688) has sequence MAAAVPQRAWTVEQLRSEQLPKKDIIKFLQEHGSDSFLAEHKLLGNIKNVAKTANKDHLVTAYNHLFETKRFKGTESISKVSEQVKNVKLNEDKPKETKSEETLDEGPPKYTKSVLKKGDKTNFPKKGDVVHCWYTGTLQDGTVFDTNIQTSAKKKKNAKPLSFKVGVGKVIRGWDEALLTMSKGEKARLEIEPEWAYGKKGQPDAKIPPNAKLTFEVELVDID. The compound is C[C@H]1C[C@H](C)C(=O)[C@H]([C@H](O)CC2CC(=O)NC(=O)C2)C1. (9) The drug is Nc1c(C=O)ncn1[C@@H]1O[C@H](COP(=O)([O-])[O-])[C@@H](O)[C@H]1O. The target protein (P38024) has sequence MAPAASELKLGKKVNEGKTKEVYELPDIPGCVLMQSKDQITAGNAARKDRMEGKAAISNTTTSCVFQLLQEAGIKTAFVRKQSDTAFIAAHCEMIPIEWVCRRIATGSFLKRNPGVKEGYKFYPPKIEMFYKDDANNDPQWSEEQLIEAKFSFAGLTIGKTEVDIMARSTQAIFEILEKSWQPQNCTLVDLKIEFGVNILTKEIVLADVIDNDSWRLWPSGDRSQQKDKQSYRDLKEVTPEALQMVKRNFEWVAERVELLLKTKSQGRVVVLMGSTSDLGHCEKIKKACATFGIPCELRVTSAHKGPDETLRIKAEYEGDGIPTVFVAVAGRSNGLGPVMSGNTAYPVVNCPPLSSDWGAQDVWSSLRLPSGLGCPTTLSPEGAAQFAAQIFGLNNHLVWAKLRSNMLNTWISLKQADKKLRECTL. The pKi is 4.2. (10) The small molecule is Nc1nc2c(c(=O)[nH]1)NCCS2. The target protein (P04176) has sequence MAAVVLENGVLSRKLSDFGQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFFTYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLCKEGDSIKAYGAGLLSSFGELQYCLSDKPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKEKVRTFAATIPRPFSVRYDPYTQRVEVLDNTQQLKILADSINSEVGILCNALQKIKS. The pKi is 3.3.
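
The pKi definition given in the task description (pKi = -log10(Ki in M)) can be sketched as a small conversion, shown here only as an illustration. The helper names `ki_to_pki` and `pki_to_ki_nm` are hypothetical and not part of the dataset, and the nanomolar input unit is an assumption about how raw Ki values might be supplied.

```python
import math


def ki_to_pki(ki_nm: float) -> float:
    """Convert a Ki value to pKi = -log10(Ki in M).

    Assumption: the input is in nanomolar (nM), so it is first
    converted to molar by multiplying by 1e-9.
    """
    if ki_nm <= 0:
        raise ValueError("Ki must be positive")
    ki_molar = ki_nm * 1e-9
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the definition: Ki in nM = 10 ** (9 - pKi)."""
    return 10 ** (9 - pki)


if __name__ == "__main__":
    # A 10 nM inhibitor corresponds to pKi of about 8; higher pKi means stronger inhibition.
    print(round(ki_to_pki(10.0), 2))
    # Example (1) reports pKi 8.1, which corresponds to a Ki of a few nM.
    print(round(pki_to_ki_nm(8.1), 2))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki.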