Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. Dataset description: drug-target binding data from BindingDB using Ki measurements. (1) The small molecule is C#CC(C)OC(C)(C)C. The target protein (P00178) has sequence MEFSLLLLLAFLAGLLLLLFRGHPKAHGRLPPGPSPLPVLGNLLQMDRKGLLRSFLRLREKYGDVFTVYLGSRPVVVLCGTDAIREALVDQAEAFSGRGKIAVVDPIFQGYGVIFANGERWRALRRFSLATMRDFGMGKRSVEERIQEEARCLVEELRKSKGALLDNTLLFHSITSNIICSIVFGKRFDYKDPVFLRLLDLFFQSFSLISSFSSQVFELFPGFLKHFPGTHRQIYRNLQEINTFIGQSVEKHRATLDPSNPRDFIDVYLLRMEKDKSDPSSEFHHQNLILTVLSLFFAGTETTSTTLRYGFLLMLKYPHVTERVQKEIEQVIGSHRPPALDDRAKMPYTDAVIHEIQRLGDLIPFGVPHTVTKDTQFRGYVIPKNTEVFPVLSSALHDPRYFETPNTFNPGHFLDANGALKRNEGFMPFSLGKRICLGEGIARTELFLFFTTILQNFSIASPVPPEDIDLTPRESGVGNVPPSYQIRFLAR. The pKi is 4.4. (2) The small molecule is COC(=O)C1=C(C)N=C(C)N(CCCCCN2CCC(C(=O)OC)(c3ccccc3)CC2)C1c1ccc(F)c(F)c1. The target protein (P25100) has sequence MTFRDLLSVSFEGPRPDSSAGGSSAGGGGGSAGGAAPSEGPAVGGVPGGAGGGGGVVGAGSGEDNRSSAGEPGSAGAGGDVNGTAAVGGLVVSAQGVGVGVFLAAFILMAVAGNLLVILSVACNRHLQTVTNYFIVNLAVADLLLSATVLPFSATMEVLGFWAFGRAFCDVWAAVDVLCCTASILSLCTISVDRYVGVRHSLKYPAIMTERKAAAILALLWVVALVVSVGPLLGWKEPVPPDERFCGITEEAGYAVFSSVCSFYLPMAVIVVMYCRVYVVARSTTRSLEAGVKRERGKASEVVLRIHCRGAATGADGAHGMRSAKGHTFRSSLSVRLLKFSREKKAAKTLAIVVGVFVLCWFPFFFVLPLGSLFPQLKPSEGVFKVIFWLGYFNSCVNPLIYPCSSREFKRAFLRLLRCQCRRRRRRRPLWRVYGHHWRASTSGLRQDCAPSSGDAPPGAPLALTALPDPDPEPPGTPEMQAPVASRRKPPSAFREWRLL.... The pKi is 6.7.
(3) The small molecule is CCC(C)[C@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CCC[NH+]=C(N)N)NC(=O)[C@H](CCCNC(N)=[NH2+])NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](NC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(=O)O)NC(=O)C(C)NC(=O)[C@@H]1CCCN1C(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@@H](NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1C(=O)C(C)N)C(C)C)C(C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(N)=O)[C@@H](C)O. The target protein sequence is MDLGFKDYTNRTPTKNTSATTKNFSAWEDYKSSVDDIQYFLIGLYTLISLAGFVGNLLVLTALTKRKQKTIINILIGNLAFSDILVVLFCSPFTLTSVLLDRWMFGTVMCHIMPFLQCTSVLVSTLMLISIAAVRYRMVKYPLSSNLTAKHGYFLIVIIWAVGCAICSPLPVFHKIVDLHKTLNLEALENRLLCIESWPSDSYRIAFTISLLLMQYILPLVCLTASHTSVCRSVGSRLSSKEGKFQENEMINLTLHPSKSAGTEAQPSSHTSWSCALVRKHHRRYSKKTSTVMPAILRQQQDADFRDLPETSGTEKSQLSSSSKFIPGVPICFEMKPEENTEIQDMITVSQSIIRIKTRSRRVFCRLTVLILVFGFSWMPLHLFHIVTDFNATLISNRHFKLVYCICHLLGMMSCCLNPILYGFLNNSIKADLMSLIPCCQIL. The pKi is 9.2. (4) The drug is O=S(=O)(O)OC[C@H]1O[C@H](Oc2ccccc2)[C@H](O)[C@@H](O)[C@@H]1O. The target protein (A8NS89) has sequence MTETVTDQGKQRSSKLQKNEAAKDEQVEGKGKETLESGTDKSAEQNSSLLVGQPDVIDNDNVQTVDDFKNLMYKMQETRRAIVFALLNEKDLTKDDVEILKRAYEKLTDNQTHSFQREMCTLTTKLSVNIGDETRGLEKDLKYLDALMNIRREEPNLLWPIIMSRVDLFSILANYHPKGKETFLKEYEDTVKFLKTFISSEAITGKKPIFITDWDGTMKDYCSQYATNLQPVYSAVGMTRFAASFTRISAVLTAGPLRGPGILDLTAMPIDGPVMFSGSWGREWWLSGKRVVHQDGITDEGFNALQRLDDEMKDLLHTSDYAPFALVGSGVQRKVDRLTLGVQTVCHHVTSELSNRYQMAVKERMHRVDPNSQILVFDPSTELEVEVVAHNSGIIWNKGNGVERLIKSLGDSLQSPGKILICGDTLSDIPMVRQAVKQNPDGVLAIFVGAKMSLREEVKQVIGDESRCCFVSCPDVIHAAMSQILNEHCIGK. The pKi is 3.5. (5) The drug is COc1ccccc1N1CCN(Cc2cn([C@@H]3O[C@H](CO)[C@@H](O)[C@H](O)[C@H]3F)nn2)CC1. 
The target protein (Q28998) has sequence MGAGALALGASEPCNLSSAAPLPDGAATAARLLVPASPPASLLTPASEGSVQLSQQWTAGMGLLMALIVLLIVAGNVLVIVAIAKTPRLQTLTNLFIMSLASADLVMGLLVVPFGATIVVWGRWEYGSFFCELWTSVDVLCVTASIETLCVIALDRYLAITSPFRYQSLLTRAARALVCTVWAISALVSFLPILMHWWRDKGAEARRCYNDPKCCDFVTNRAYAIASSVVSFYVPLCIMAFVYLRVFREAQKQVKKIDSCERRFLGSPARPPSPAPSPGSPLPAAAAAAPVANGRTSKRRPSRLVALREQKALKTLGIIMGVFTLCWLPFFLANVVKAFHRDLVPDRLFVFFNWLGYANSAFNPIIYCRSPDFRKAFQRLLCCARRVARGSCAAAGDGPRASGCLAVARPPPSPGAASDDDDDEEDVGAAPPAPLLEPWAGYNGGAARDSDSSLDERTPGGRASESKV. The pKi is 6.3. (6) The small molecule is O=C1c2cc(Br)ccc2-n2c1nc1ccccc1c2=O. The target protein (Q6ZQW0) has sequence MLHFHYYDTSNKIMEPHRPNVKTAVPLSLESYHISEEYGFLLPDSLKELPDHYRPWMEIANKLPQLIDAHQLQAHVDKMPLLSCQFLKGHREQRLAHLVLSFLTMGYVWQEGEAQPAEVLPRNLALPFVEVSRNLGLPPILVHSDLVLTNWTKKDPDGFLEIGNLETIISFPGGESLHGFILVTALVEKEAVPGIKALVQATNAILQPNQEALLQALQRLRLSIQDITKTLGQMHDYVDPDIFYAGIRIFLSGWKDNPAMPAGLMYEGVSQEPLKYSGGSAAQSTVLHAFDEFLGIRHSKESGDFLYRMRDYMPPSHKAFIEDIHSAPSLRDYILSSGQDHLLTAYNQCVQALAELRSYHITMVTKYLITAAAKAKHGKPNHLPGPPQALKDRGTGGTAVMSFLKSVRDKTLESILHPRG. The pKi is 5.2. (7) The target protein (P31390) has sequence MSFANTSSTFEDKMCEGNRTAMASPQLLPLVVVLSSISLVTVGLNLLVLYAVHSERKLHTVGNLYIVSLSVADLIVGAVVMPMNILYLIMTKWSLGRPLCLFWLSMDYVASTASIFSVFILCIDRYRSVQQPLRYLRYRTKTRASATILGAWFFSFLWVIPILGWHHFMPPAPELREDKCETDFYNVTWFKIMTAIINFYLPTLLMLWFYVKIYKAVRRHCQHRQLTNGSLPSFSELKLRSDDTKEGAKKPGRESPWGVLKRPSRDPSVGLDQKSTSEDPKMTSPTVFSQEGERETRPCFRLDIMQKQSVAEGDVRGSKANDQALSQPKMDEQSLNTCRRISETSEDQTLVDQQSFSRTTDSDTSIEPGPGRVKSRSGSNSGLDYIKITWKRLRSHSRQYVSGLHLNRERKAAKQLGFIMAAFILCWIPYFIFFMVIAFCKSCCSEPMHMFTIWLGYINSTLNPLIYPLCNENFKKTFKKILHIRS. The pKi is 5.3. The drug is CN(C)CCON=C(C=Cc1ccc(O)cc1)c1ccccc1F.
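
The pKi definition in the task statement (pKi = -log10(Ki in M)) can be illustrated with a minimal sketch. The function names `ki_to_pki` and `pki_to_ki` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of any dataset tooling, and the sketch assumes Ki is already expressed in mol/L.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (assumed to be in mol/L) to pKi."""
    if ki_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki in mol/L from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A 1 nM inhibitor (1e-9 M) corresponds to a pKi of about 9.
    print(round(ki_to_pki(1e-9), 3))
    # Example (1) above reports pKi 4.4, i.e. a Ki of roughly 40 micromolar.
    print(pki_to_ki(4.4))
```

Because the scale is logarithmic, each unit of pKi corresponds to a tenfold change in Ki, which is why a higher pKi means stronger inhibition.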