Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is CNC(=O)[C@@]12C[C@@H]1[C@@H](n1cnc3c(NC)cc(C#Cc4ccc(Cl)s4)nc31)[C@H](O)[C@@H]2O. The target protein (Q28309) has sequence MAVNGTALLLANVTYITVEILIGLCAIVGNVLVIWVVKLNPSLQTTTFYFIVSLALADIAVGVLVMPLAIVISLGITIQFYNCLFMTCLLLIFTHASIMSLLAIAVDRYLRVKLTVRYRRVTTQRRIWLALGLCWLVSFLVGLTPMFGWNMKLTSEHQRNVTFLSCQFSSVMRMDYMVYFSFFTWILIPLVVMCAIYLDIFYVIRNKLNQNFSSSKETGAFYGREFKTAKSLFLVLFLFAFSWLPLSIINCITYFHGEVPQIILYLGILLSHANSMMNPIVYAYKIKKFKETYLLIFKTYMICQSSDSLDSSTE. The pKi is 7.1. (2) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)O. The target protein sequence is MATTGTPTADRGDAAATDDPAARFQVQKHSWDGLRSIIHGSRKYSGLIVNKAPHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLLYSEIPKKVRKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGITSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPMKPLEIKTQCSGPRMDPKICPADPAFFSFINNSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEEFDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEAEVIHVPSPALEERKTDSYRYPRTGSKNPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEEQRLASARAVPRNVQPYVVYEEVTNVWINVHDIFYPFPQSEGEDELCFLRANECKTGFCHLYKVTAVLKSQGYDWSEPFSPGEDEFKCPIKEEIALTSGEWEVLARH.... The pKi is 5.0. (3) The small molecule is Cc1cc(OCC2(C)COC2)cc(C)c1-c1cccc(COc2ccc3c(c2)OC[C@H]3CC(=O)O)c1. 
The target protein (Q8K3T4) has sequence MDLPPQLSFALYVSAFALGFPLNLLAIRGAVSHAKLRLTPSLVYTLHLACSDLLLAITLPLKAVEALASGVWPLPLPFCPVFALAHFAPLYAGGGFLAALSAGRYLGAAFPFGYQAIRRPCYSWGVCVAIWALVLCHLGLALGLEAPRGWVDNTTSSLGINIPVNGSPVCLEAWDPDSARPARLSFSILLFFLPLVITAFCYVGCLRALVHSGLSHKRKLRAAWVAGGALLTLLLCLGPYNASNVASFINPDLEGSWRKLGLITGAWSVVLNPLVTGYLGTGPGQGTICVTRTPRGTIQK. The pKi is 7.0. (4) The compound is CN1CCc2nc(C(=O)N[C@@H]3C[C@@H](C(=O)N(C)C)CC[C@@H]3NC(=O)c3cc4cc(Cl)ccc4[nH]3)sc2C1. The pKi is 8.7. The target protein (O19045) has sequence MANPLHLVLLGAALAGLLLSGSSVFISRRAANDVLARTRRANSFLEELKKGNLERECMEENCSYEEALEVFEDREKTNEFWNKYVDGDQCESNPCQNQGTCKDGLGMYTCSCVEGYEGQDCEPVTRKLCSLDNGGCDQFCKEEENSVLCSCASGYTLGDNGKSCISTELFPCGKVTLGRWRRSPATNSSEGPPEAPGPEQQDDGNLTATENPFNLLDSPEPPPEDDSSSLVRIVGGQDCRDGECPWQALLVNEENEGFCGGTILSEYHVLTAAHCLHQAKRFKVRVGDRDTEHEEGNEETHEVEVVVKHNRFVKETYDFDIAVLRLKTPITFRRNVAPACLPQKDWAESTLMAQKTGIVSGFGRTHEMGRLSTTLKMLEVPYVDRNSCKRSSSFTITQNMFCAGYDARPEDACQGDSGGPHVTRFRDTYFVTGIVSWGEGCARKGKFGVYTKVSNFLKWIEKSMRARAVPVAEAAGTPGPTQPTIKGSPS. (5) The pKi is 8.2. The target protein (P35342) has sequence MPVNSTAVSWTSVTYITVEILIGLCAIVGNVLVIWVVKLNPSLQTTTFYFIVSLALADIAVGVLVMPLAIVISLGVTIHFYSCLFMTCLMLIFTHASIMSLLAIAVDRYLRVKLTVRYRRVTTQRRIWLALGLCWLVSFLVGLTPMFGWNMKLSSADENLTFLPCRFRSVMRMDYMVYFSFFLWILVPLVVMCAIYFDIFYIIRNRLSQSFSGSRETGAFYGREFKTAKSLLLVLFLFALCWLPLSIINCILYFDGQVPQTVLYLGILLSHANSMMNPIVYAYKIKKFKETYLLILKACVMCQPSKSMDPSTEQTSE. The compound is CC(Cc1ccccc1)Nc1ncnc2c1ncn2C1OC(CO)C(O)C1O. (6) The compound is COc1ccc(N(C)C(=O)c2ccc(O)cc2O)cc1. The target protein (Q16654) has sequence MKAARFVLRSAGSLNGAGLVPREVEHFSRYSPSPLSMKQLLDFGSENACERTSFAFLRQELPVRLANILKEIDILPTQLVNTSSVQLVKSWYIQSLMDLVEFHEKSPDDQKALSDFVDTLIKVRNRHHNVVPTMAQGIIEYKDACTVDPVTNQNLQYFLDRFYMNRISTRMLMNQHILIFSDSQTGNPSHIGSIDPNCDVVAVVQDAFECSRMLCDQYYLSSPELKLTQVNGKFPDQPIHIVYVPSHLHHMLFELFKNAMRATVEHQENQPSLTPIEVIVVLGKEDLTIKISDRGGGVPLRIIDRLFSYTYSTAPTPVMDNSRNAPLAGFGYGLPISRLYAKYFQGDLNLYSLSGYGTDAIIYLKALSSESIEKLPVFNKSAFKHYQMSSEADDWCIPSREPKNLAKEVAM. The pKi is 5.9. 
(7) The compound is O=C(O)c1cc(-c2ccc(-c3cccc(C(F)(F)F)c3)cc2)on1. The target protein sequence is MAVRELPGAWNFRDVADTATALRPGRLFRSSELSRLDDAGRATLRRLGITDVADLRSSREVARRGPGRVPDGIDVHLLPFPDLADDDADDSAPHETAFKRLLTNDGSNGESGESSQSINDAATRYMTDEYRQFPTRNGAQRALHRVVTLLAAGRPVLTHCFAGKDRTGFVVALVLEAVGLDRDVIVADYLRSNDSVPQLRARISEMIQQRFDTELAPEVVTFTKARLSDGVLGVRAEYLAAARQTIDETYGSLGGYLRDAGISQATVNRMRGVLLG. The pKi is 5.6.
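
The pKi definition in the task description (pKi = -log10(Ki in M)) can be sketched in code as below. This is a minimal illustration, not part of the dataset's own tooling: the function names `ki_to_pki`, `ki_nm_to_pki`, and `pki_to_ki_nm` are hypothetical, and the nanomolar helper assumes the raw Ki values are reported in nM, which should be checked against the actual source files.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki). (Hypothetical helper.)"""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Same conversion for a Ki given in nM (assumed raw unit): 1 nM = 1e-9 M."""
    return ki_to_pki(ki_nm * 1e-9)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the definition: Ki in nM = 10 ** (9 - pKi)."""
    return 10 ** (9 - pki)


if __name__ == "__main__":
    # pKi values quoted in the examples above, converted back to nM
    for pki in (7.1, 5.0, 8.7):
        print(f"pKi {pki} -> Ki of about {pki_to_ki_nm(pki):.3g} nM")
```

Under this definition, each unit increase in pKi corresponds to a tenfold lower Ki, so the pKi of 7.1 in example (1) corresponds to a Ki of roughly 79 nM and the pKi of 5.0 in example (2) to roughly 10 µM.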