Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is COc1ccccc1OCCNCCc1ccc(O)c(O)c1. The target protein (P34971) has sequence MGAGALALGASEPCNLSSAAPLPDGAATAARLLVLASPPASLLPPASEGSAPLSQQWTAGMGLLLALIVLLIVVGNVLVIVAIAKTPRLQTLTNLFIMSLASADLVMGLLVVPFGATIVVWGRWEYGSFFCELWTSVDVLCVTASIETLCVIALDRYLAITSPFRYQSLLTRARARALVCTVWAISALVSFLPILMHWWRAESDEARRCYNDPKCCDFVTNRAYAIASSVVSFYVPLCIMAFVYLRVFREAQKQVKKIDSCERRFLGGPARPPSPEPSPSPGPPRPADSLANGRSSKRRPSRLVALREQKALKTLGIIMGVFTLCWLPFFLANVVKAFHRDLVPDRLFVFFNWLGYANSAFNPIIYCRSPDFRKAFQRLLCCARRAACRRRAAHGDRPRASGCLARAGPPPSPGAPSDDDDDDAGTTPPARLLEPWTGCNGGTTTVDSDSSLDEPGRQGFSSESKV. The pKi is 5.3. (2) The small molecule is COc1cc2c(cc1OC)CN(CCCCn1cc(-c3ccc(F)cc3)c3ccccc31)CC2. The target protein sequence is MGALAARRCVEWLLGLYFVSHIPITLFIDLQAVLPPELYPQEFSNLLRWYSKEFKDPLMQEPPVWFKSFLLCELVFQLPFFPIAAYAFFKGSCRWIRIPAIIYAAHTITTLIPILYTLLFEDFSKAVAFKGQRPESFRERLTLVGVYAPYLIIPLILLLFMLRNPYYKYEEKRKKK. The pKi is 8.3. (3) The compound is CCOC(=O)[C@@]12C[C@@H]1[C@@H](n1cnc3c(NC)nc(C#Cc4ccc(Cl)s4)nc31)[C@H](O)[C@@H]2O. The target protein (P0DMS9) has sequence MEGSPAGPIEQKEARWESSWEEQPDWTLGCLSPESQFRIPGLPGCILSFQLKVCFLPVMWLFILLSLALISDAMVMDEKVKRSFVLDTASAICNYNAHYKNHPKYWCRGYFRDYCNIIAFSPNSTNHVALRDTGNQLIVTMSCLTKEDTGWYWCGIQRDFARDDMDFTELIVTDDKGTLANDFWSGKDLSGNKTRSCKAPKVVRKADRSRTSILIICILITGLGIISVISHLTKRRRSQRNRRVGNTLKPFSRVLTPKEMAPTEQM. The pKi is 7.8. (4) The compound is C[C@H](O)[C@@H](CCOc1ccc2ccccc2c1)n1cnc(C(N)=O)c1. 
The target protein (P00813) has sequence MAQTPAFDKPKVELHVHLDGSIKPETILYYGRRRGIALPANTAEGLLNVIGMDKPLTLPDFLAKFDYYMPAIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVEPIPWNQAEGDLTPDEVVALVGQGLQEGERDFGVKARSILCCMRHQPNWSPKVVELCKKYQQQTVVAIDLAGDETIPGSSLLPGHVQAYQEAVKSGIHRTVHAGEVGSAEVVKEAVDILKTERLGHGYHTLEDQALYNRLRQENMHFEICPWSSYLTGAWKPDTEHAVIRLKNDQANYSLNTDDPLIFKSTLDTDYQMTKRDMGFTEEEFKRLNINAAKSSFLPEDEKRELLDLLYKAYGMPPSASAGQNL. The pKi is 8.0. (5) The compound is CCCC(=O)NCC[C@@H]1CCc2ccc(OC)cc21. The target protein (P16083) has sequence MAGKKVLIVYAHQEPKSFNGSLKNVAVDELSRQGCTVTVSDLYAMNLEPRATDKDITGTLSNPEVFNYGVETHEAYKQRSLASDITDEQKKVREADLVIFQFPLYWFSVPAILKGWMDRVLCQGFAFDIPGFYDSGLLQGKLALLSVTTGGTAEMYTKTGVNGDSRYFLWPLQHGTLHFCGFKVLAPQISFAPEIASEEERKGMVAAWSQRLQTIWKEEPIPCTAHWHFGQ. The pKi is 5.7. (6) The small molecule is NCC(=O)NC(CCc1ccccc1)P(=O)(O)CCC(=O)O. The target protein (P97821) has sequence MGPWTHSLRAVLLLVLLGVCTVRSDTPANCTYPDLLGTWVFQVGPRSSRSDINCSVMEATEEKVVVHLKKLDTAYDELGNSGHFTLIYNQGFEIVLNDYKWFAFFKYEVRGHTAISYCHETMTGWVHDVLGRNWACFVGKKVESHIEKVNMNAAHLGGLQERYSERLYTHNHNFVKAINTVQKSWTATAYKEYEKMSLRDLIRRSGHSQRIPRPKPAPMTDEIQQQILNLPESWDWRNVQGVNYVSPVRNQESCGSCYSFASMGMLEARIRILTNNSQTPILSPQEVVSCSPYAQGCDGGFPYLIAGKYAQDFGVVEESCFPYTAKDSPCKPRENCLRYYSSDYYYVGGFYGGCNEALMKLELVKHGPMAVAFEVHDDFLHYHSGIYHHTGLSDPFNPFELTNHAVLLVGYGRDPVTGIEYWIIKNSWGSNWGESGYFRIRRGTDECAIESIAVAAIPIPKL. The pKi is 3.7. (7) The drug is O=c1ccn(C/C=C\COC(c2ccccc2)(c2ccccc2)c2ccccc2)c(=O)[nH]1. The target protein sequence is MHLKIVCLSDEVREMYKNHKTHHEGDSGLDLFIVKDEVLKPKSTTFVKLGIKAIALQYKSNYYYKCEKSENKKKDDDKSNIVNTSFLLFPRSSISKTPLRLANSIGLIDAGYRGEIIAALDNTSDQEYHIKKNDKLVQLVSFTGEPLSFELVEELDETSRGEGGFGSTSNNKY. The pKi is 5.9.
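
For reference, the pKi definition given in the task description (pKi = -log10(Ki in M)) can be sketched in Python as below. This is only a minimal illustration of the arithmetic: the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers, not part of BindingDB or of any dataset tooling, and the nanomolar conversion assumes the usual 1 M = 1e9 nM scaling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki value in molar units to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in M")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to a Ki expressed in nanomolar (hypothetical helper)."""
    ki_molar = 10.0 ** (-pki)
    return ki_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # A Ki of 10 nM (1e-8 M) corresponds to a pKi of 8.
    print(round(ki_to_pki(1e-8), 3))
    # A pKi of 5.3, as in example (1), is a Ki of roughly 5 micromolar.
    print(round(pki_to_ki_nm(5.3), 1))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, so the pKi 8.3 example binds roughly 1000 times more tightly than the pKi 5.3 example.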