Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is COc1ccc2c(C)c(C(=O)N(CCCN3CCCCC3)c3ccccc3)oc2c1. The target protein sequence is MDILCEENTSLSSTTNSLMQLNDDTRLYSNDFNSGEANTSDAFNWTVDSENRTNLSCEGCLSPSCLSLLHLQEKNWSALLTAVVIILTIAGNILVIMAVSLEKKLQNATNYFLMSLAIADMLLGFLVMPVSMLTILYGYRWPLPSKLCAVWIYLDVLFSTASIMHLCAISLDRYVAIQNPIHHSRFNSRTKAFLKIIAVWTISVGISMPIPVFGLQDDSKVFKEGSCLLADDNFVLIGSFVSFFIPLTIMVITYFLTIKSLQKEATLCVSDLGTRAKLASFSFLPQSSLSSEKLFQRSIHREPGSYTGRRTMQSISNEQKACKVLGIVFFLFVVMWCPFFITNIMAVICKESCNEDVIGALLNVFVWIGYLSSAVNPLVYTLFNKTYRSAFSRYIQCQYKENKKPLQLILVNTIPALAYKSSQLQMGQKKNSKQDAKTTDNDCSMVALGKQHSEEASKDNSDGVNEKVSCV. The pKi is 5.6. (2) The drug is N=C(Nc1cccc(/C=C\c2cccc(NC(=N)c3cccs3)c2)c1)c1cccs1. The target protein (P29473) has sequence MGNLKSVGQEPGPPCGLGLGLGLGLCGKQGPASPAPEPSRAPAPATPHAPDHSPAPNSPTLTRPPEGPKFPRVKNWELGSITYDTLCAQSQQDGPCTPRCCLGSLVLPRKLQTRPSPGPPPAEQLLSQARDFINQYYSSIKRSGSQAHEERLQEVEAEVASTGTYHLRESELVFGAKQAWRNAPRCVGRIQWGKLQVFDARDCSSAQEMFTYICNHIKYATNRGNLRSAITVFPQRAPGRGDFRIWNSQLVRYAGYRQQDGSVRGDPANVEITELCIQHGWTPGNGRFDVLPLLLQAPDEAPELFVLPPELVLEVPLEHPTLEWFAALGLRWYALPAVSNMLLEIGGLEFSAAPFSGWYMSTEIGTRNLCDPHRYNILEDVAVCMDLDTRTTSSLWKDKAAVEINLAVLHSFQLAKVTIVDHHAATVSFMKHLDNEQKARGGCPADWAWIVPPISGSLTPVFHQEMVNYILSPAFRYQPDPWKGSATKGAGITRKKTFKE.... The pKi is 3.3. (3) The drug is NC(N)=NCCN1C[C@H](O)[C@H](O)[C@H]1CO. The target protein sequence is MAKNVVLDHDGNLDDFVAMVLLASNTEKVRLIGALCTDADCFVENGFNVTGKIMCLMHNNMNLPLFPIGKSAATAVNPFPKEWRCLAKNMDDMPILNIPENVELWDKIKAENEKYEGQQLLADLVMNSEEKVTICVTGPLSNVAWCIDKYGEKFTSKVEECVIMGGAVDVRGNVFLPSTDGTAEWNIYWDPASAKTVFGCPGLRRIMFSLDSTNTVPVRSPYVQRFGEQTNFLLSILVGTMWAMCTHCELLRDGDGYYAWDALTAAYVVDQKVANVDPVPIDVVVDKQPNEGATVRTDAEKYPLTFVARNPEAEFFLDMLLRSARAC. The pKi is 3.5. 
(4) The compound is CCN(CC)C(=O)c1ccc2cc1CN(C)C(=O)[C@H](Nc1ccc3c(N)nccc3c1)c1ccc(c(C)c1)[C@@H](C)COC(=O)N2. The target protein (P06870) has sequence MWFLVLCLALSLGGTGAAPPIQSRIVGGWECEQHSQPWQAALYHFSTFQCGGILVHRQWVLTAAHCISDNYQLWLGRHNLFDDENTAQFVHVSESFPHPGFNMSLLENHTRQADEDYSHDLMLLRLTEPADTITDAVKVVELPTEEPEVGSTCLASGWGSIEPENFSFPDDLQCVDLKILPNDECKKAHVQKVTDFMLCVGHLEGGKDTCVGDSGGPLMCDGVLQGVTSWGYVPCGTPNKPSVAVRVLSYVKWIEDTIAENS. The pKi is 6.2. (5) The compound is CN(C(=O)Cc1ccc(Cl)c(Cl)c1)C1CCCC[C@H]1N1CCCC1. The target protein sequence is MCFNLTMKKKKECCAPACPSSCFPNTSWLLGWDDHDNVSAYPDLPLNEGNHTSISPTISVIITAVYSMVFVVGLVGNALVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTSFLMNSWPFGDVLCKIVVSIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKCINICIWMLSSSVGISAIVLGGTKISDGSTECALQFPTHYWYWDTVMKMCVFIFAFIIPVFIITICYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFIVCWTPIHIFVLVEALVDVPQSIAVVSIYYFCIALGYTNSSLNPILYAFLDENFKRCFKDFCFPSKHRLDRQPNSRVGNTVQDPACNRHGSQKPV. The pKi is 5.0. (6) The compound is N[C@H]1CCSC1=O. The target protein (Q15822) has sequence MGPSCPVFLSFTKLSLWWLLLTPAGGEEAKRPPPRAPGDPLSSPSPTALPQGGSHTETEDRLFKHLFRGYNRWARPVPNTSDVVIVRFGLSIAQLIDVDEKNQMMTTNVWLKQEWSDYKLRWNPTDFGNITSLRVPSEMIWIPDIVLYNNADGEFAVTHMTKAHLFSTGTVHWVPPAIYKSSCSIDVTFFPFDQQNCKMKFGSWTYDKAKIDLEQMEQTVDLKDYWESGEWAIVNATGTYNSKKYDCCAEIYPDVTYAFVIRRLPLFYTINLIIPCLLISCLTVLVFYLPSDCGEKITLCISVLLSLTVFLLLITEIIPSTSLVIPLIGEYLLFTMIFVTLSIVITVFVLNVHHRSPSTHTMPHWVRGALLGCVPRWLLMNRPPPPVELCHPLRLKLSPSYHWLESNVDAEEREVVVEEEDRWACAGHVAPSVGTLCSHGHLHSGASGPKAEALLQEGELLLSPHMQKALEGVHYIADHLRSEDADSSVKEDWKYVAMVI.... The pKi is 5.0. (7) The drug is Cc1ccc(N(CC2=NCCN2)c2cccc(O)c2)cc1. 
The target protein (Q28838) has sequence MGSLQPDAGNASWNGTEAPGGGARATPYSLQVTLTLVCLAGLLMLFTVFGNVLVIIAVFTSRALKAPQNLFLVSLASADILVATLVIPFSLANEVMGYWYFGKAWCEIYLALDVLFCTSSIVHLCAISLDRYWSITQAIEYNLKRTPRRIKAIIVTVWVISAVISFPPLISFEKKRGRSGQPSAEPRCEINDQKWYVISSSIGSFFAPCLIMILVYVRIYQIAKRRTRVPPSRRGPDATAAELPGSAERRPNGLGPERGGVGPVGAEVESLQVQLNGAPGEPAPAGAGADALDLEESSSSEHAERPPGSRRSERGPRAKGKARASQVKPGDSLPRRGPGATGLGAPTAGPAEERSGGGAKASRWRGRQNREKRFTFVLAVVIGVFVVCWFPFFFTYTLTAIGCPVPPTLFKFFFWFGYCNSSLNPVIYTIFNHDFRRAFKKILCRGDRKRIV. The pKi is 9.1.
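The pKi definition used in the task description (pKi = -log10(Ki in M)) can be illustrated with a minimal sketch. The function names below are hypothetical and not part of the dataset. The nanomolar helper assumes that raw Ki values are supplied in nM, which is an assumption about the upstream data rather than something stated here.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki value in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert a Ki value in nM (assumed unit) to pKi."""
    return ki_to_pki(ki_nm * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: return Ki in mol/L for a given pKi."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A higher pKi corresponds to a smaller Ki, i.e. stronger inhibition.
    for pki in (3.3, 5.6, 9.1):
        print(f"pKi {pki} -> Ki = {pki_to_ki(pki):.3e} M")
```

Under this definition, each unit increase in pKi is a tenfold decrease in Ki, so the pKi 9.1 example binds roughly six orders of magnitude more tightly than the pKi 3.3 example.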