Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is CC(C)(C)NC(=O)[C@@H]1CN(Cc2cccnc2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2C[C@H]1O. The target protein sequence is PQVTLWQRPLVTIKIGGQLREALLDTGADDTIFEEISLPGRWKPKMIGGIGGFIKVRQYDQIPIEICGHKVIGTVLVGPTPANVIGRNLMTQIGCTLNF. The pKi is 6.5. (2) The target protein sequence is MGVEIETISPGDGRTFPKKGQTCVVHYTGMLQNGKKFDSSRDRNKPFRFKIGRQEVIKGFEEGVTQMSLGQRAKLTCTPEMAYGATGHPGVIPPNATLLFDVELLRLE. The pKi is 7.8. The compound is CCC(C)(C)C(=O)C(=O)N1CCCCC1C(=O)OC(CCCc1ccccc1)CCc1ccc(OC)cc1. (3) The target protein (Q5MY95) has sequence MGLSRKEQVFLALLGASGVSGLTALILLLVEATSVLLPTDIKFGIVFDAGSSHTSLFLYQWLANKENGTGVVSQALACQVEGPGISSYTSNAAQAGESLQGCLEEALVLIPEAQHRKTPTFLGATAGMRLLSRKNSSQARDIFAAVTQVLGRSPVDFWGAELLAGQAEGAFGWITVNYGLGTLVKYSFTGEWIQPPEEMLVGALDMGGASTQITFVPGGPILDKSTQADFRLYGSDYSVYTHSYLCFGRDQMLSRLLVGLVQSRPAALLRHPCYLSGYQTTLALGPLYESPCVHATPPLSLPQNLTVEGTGNPGACVSAIRELFNFSSCQGQEDCAFDGVYQPPLRGQFYAFSNFYYTFHFLNLTSRQPLSTVNATIWEFCQRPWKLVEASYPGQDRWLRDYCASGLYILTLLHEGYGFSEETWPSLEFRKQAGGVDIGWTLGYMLNLTGMIPADAPAQWRAESYGVWVAKVVFMVLALVAVVGAALVQLFWLQD. The small molecule is Nc1c(S(=O)(=O)[O-])cc(Nc2ccc(Nc3nc(Cl)nc(Nc4cccc(S(=O)(=O)[O-])c4)n3)c(S(=O)(=O)[O-])c2)c2c1C(=O)c1ccccc1C2=O. The pKi is 4.0. (4) The compound is CCCOc1ccccc1C1(O)OC(=O)c2cccc3cccc1c23. The target protein (P0A884) has sequence MKQYLELMQKVLDEGTQKNDRTGTGTLSIFGHQMRFNLQDGFPLVTTKRCHLRSIIHELLWFLQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHIDQITTVLNQLKNDPDSRRIIVSAWNVGELDKMALAPCHAFFQFYVADGKLSCQLYQRSCDVFLGLPFNIASYALLVHMMAQQCDLEVGDFVWTGGDTHLYSNHMDQTHLQLSREPRPLPKLIIKRKPESIFDYRFEDFEIEGYDPHPGIKAPVAI. The pKi is 3.5. 
(5) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](Cc1ccccc1)NC(=O)CNC(=O)CNC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSYFFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKi is 8.0. (6) The drug is Cc1cc(Cl)c(O)c(Cc2c(O)c(Cl)cc(Cl)c2Cl)c1Cl. The target protein (P53686) has sequence MSVSTASTEMSVRKIAAHMKSNPNAKVIFMVGAGISTSCGIPDFRSPGTGLYHNLARLKLPYPEAVFDVDFFQSDPLPFYTLAKELYPGNFRPSKFHYLLKLFQDKDVLKRVYTQNIDTLERQAGVKDDLIIEAHGSFAHCHCIGCGKVYPPQVFKSKLAEHPIKDFVKCDVCGELVKPAIVFFGEDLPDSFSETWLNDSEWLREKITTSGKHPQQPLVIVVGTSLAVYPFASLPEEIPRKVKRVLCNLETVGDFKANKRPTDLIVHQYSDEFAEQLVEELGWQEDFEKILTAQGGMGDNSKEQLLEIVHDLENLSLDQSEHESADKKDKKLQRLNGHDSDEDGASNSSSSQKAAKE. The pKi is 5.6. (7) The compound is CCOC(=O)/C=C/C(=O)/C=C/C(=O)OCC. The target protein (P0A6L2) has sequence MFTGSIVAIVTPMDEKGNVCRASLKKLIDYHVASGTSAIVSVGTTGESATLNHDEHADVVMMTLDLADGRIPVIAGTGANATAEAISLTQRFNDSGIVGCLTVTPYYNRPSQEGLYQHFKAIAEHTDLPQILYNVPSRTGCDLLPETVGRLAKVKNIIGIKEATGNLTRVNQIKELVSDDFVLLSGDDASALDFMQLGGHGVISVTANVAARDMAQMCKLAAEGHFAEARVINQRLMPLHNKLFVEPNPIPVKWACKELGLVATDTLRLPMTPITDSGRETVRAALKHAGLL. The pKi is 2.3. (8) The compound is CC(C)C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)/C=C/C(=O)NCC(=O)NCC(=O)N[C@@H](Cc1ccccc1)C(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(N)=O)C(C)C. 
The target protein sequence is MPRTEMVRFVRLPVVLLAMAACLASVALGSLHVEESLEMRFAAFKKKYGKVYKDAKEEAFRFRAFEENMEQAKIQAAANPYATFGVTPFSDMTREEFRARYRNGASYFAAAQKRLRKTVNVTTGRAPAAVDWREKGAVTPVKDQGQCGSCWAFSTIGNIEGQWQVAGNPLVSLSEQMLVSCDTIDFGCGGGLMDNAFNWIVNSNGGNVFTEASYPYVSGNGEQPQCQMNGHEIGAAITDHVDLPQDEDAIAAYLAENGPLAIAVDATSFMDYNGGILTSCTSEQLDHGVLLVGYNDSSNPPYWIIKNSWSNMWGEDGYIRIEKGTNQCLMNQAVSSAVVGGPTPPPPPPPPPSATFTQDFCEGKGCTKGCSHATFPTGECVQTTGVGSVIATCGASNLTQIIYPLSRSCSGLSVPITVPLDKCIPILIGSVEYHCSTNPPTKAARLVPHQ. The pKi is 5.6. (9) The compound is CSc1sc(C(=N)N)cc1S(=O)(=O)c1cccc(-c2ccccc2)c1. The target protein (P09871) has sequence MWCIVLFSLLAWVYAEPTMYGEILSPNYPQAYPSEVEKSWDIEVPEGYGIHLYFTHLDIELSENCAYDSVQIISGDTEEGRLCGQRSSNNPHSPIVEEFQVPYNKLQVIFKSDFSNEERFTGFAAYYVATDINECTDFVDVPCSHFCNNFIGGYFCSCPPEYFLHDDMKNCGVNCSGDVFTALIGEIASPNYPKPYPENSRCEYQIRLEKGFQVVVTLRREDFDVEAADSAGNCLDSLVFVAGDRQFGPYCGHGFPGPLNIETKSNALDIIFQTDLTGQKKGWKLRYHGDPMPCPKEDTPNSVWEPAKAKYVFRDVVQITCLDGFEVVEGRVGATSFYSTCQSNGKWSNSKLKCQPVDCGIPESIENGKVEDPESTLFGSVIRYTCEEPYYYMENGGGGEYHCAGNGSWVNEVLGPELPKCVPVCGVPREPFEEKQRIIGGSDADIKNFPWQVFFDNPWAGGALINEYWVLTAAHVVEGNREPTMYVGSTSVQTSRLAKS.... The pKi is 6.6. (10) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(C)C)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(=O)O. The target protein (Q58DD0) has sequence MTGSFWLLLSLVAVTAAQSTTEEQAKTFLEKFNHEAEDLSYQSSLASWNYNTNITDENVQKMNEARAKWSAFYEEQSRMAKTYSLEEIQNLTLKRQLKALQHSGTSALSAEKSKRLNTILNKMSTIYSTGKVLDPNTQECLALEPGLDDIMENSRDYNRRLWAWEGWRAEVGKQLRPLYEEYVVLENEMARANNYEDYGDYWRGDYEVTGAGDYDYSRDQLMKDVERTFAEIKPLYEQLHAYVRAKLMHTYPSYISPTGCLPAHLLGDMWGRFWTNLYSLTVPFEHKPSIDVTEKMENQSWDAERIFKEAEKFFVSISLPYMTQGFWDNSMLTEPGDGRKVVCHPTAWDLGKGDFRIKMCTKVTMDDFLTAHHEMGHIQYDMAYAAQPYLLRNGANEGFHEAVGEIMSLSAATPHYLKALGLLAPDFHEDNETEINFLLKQALTIVGTLPFTYMLEKWRWMVFKGEIPKQQWMEKWWEMKREIVGVVEPLPHDETYCDPA.... The pKi is 7.4.
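The pKi definition stated at the top of this record (pKi = -log10(Ki in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset's own tooling: the function names `ki_to_pki` and `pki_to_ki` and the `UNIT_TO_MOLAR` table are hypothetical helpers introduced here, and the assumption that raw Ki values are often reported in nM is ours.

```python
import math

# Hypothetical lookup table: multiplier that converts each unit to molar (M).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ki_to_pki(ki, unit="nM"):
    """Convert a Ki value in the given unit to pKi = -log10(Ki in M)."""
    if ki <= 0:
        raise ValueError("Ki must be positive to take a logarithm")
    return -math.log10(ki * UNIT_TO_MOLAR[unit])


def pki_to_ki(pki, unit="nM"):
    """Invert the transform: return Ki in the requested unit."""
    return 10.0 ** (-pki) / UNIT_TO_MOLAR[unit]


if __name__ == "__main__":
    # Example (1) above has pKi 6.5; print the Ki it corresponds to in nM.
    print(round(pki_to_ki(6.5, "nM"), 1))
    # A Ki of 1 nM maps to a pKi of about 9.
    print(round(ki_to_pki(1.0, "nM"), 3))
```

Because the scale is logarithmic, a difference of 1.0 in pKi corresponds to a tenfold difference in Ki, which is why higher pKi means stronger inhibition.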