Regression. Given a target protein's amino acid sequence and a drug's SMILES string, predict the binding affinity between them as pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd, drug-target binding data from BindingDB with Kd measurements. (1) The target protein (P23284) has sequence MLRLSERNMKVLLAAALIAGSVFFLLLPGPSAADEKKKGPKVTVKVYFDLRIGDEDVGRVIFGLFGKTVPKTVDNFVALATGEKGFGYKNSKFHRVIKDFMIQGGDFTRGDGTGGKSIYGERFPDENFKLKHYGPGWVSMANAGKDTNGSQFFITTVKTAWLDGKHVVFGKVLEGMEVVRKVESTKTDSRDKPLKDVIIADCGKIEVEKPFAIAKE. The drug is C/C=C/C[C@@H](C)[C@@H](O)[C@H]1C(=O)N[C@@H](CC)C(=O)N(C)[C@@H](SCCN(C)C)C(=O)N(C)[C@@H](CC(C)(C)O)C(=O)N[C@@H](C(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@H](C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](C(C)C)C(=O)N1C. The pKd is 8.5. (2) The drug is CN(C)C/C=C/C(=O)Nc1cc2c(Nc3ccc(F)c(Cl)c3)ncnc2cc1O[C@H]1CCOC1. The target protein (Q9Y6E0) has sequence MDSRAQLWGLALNKRRATLPHPGGSTNLKADPEELFTKLEKIGKGSFGEVFKGIDNRTQKVVAIKIIDLEEAEDEIEDIQQEITVLSQCDSPYVTKYYGSYLKDTKLWIIMEYLGGGSALDLLEPGPLDETQIATILREILKGLDYLHSEKKIHRDIKAANVLLSEHGEVKLADFGVAGQLTDTQIKRNTFVGTPFWMAPEVIKQSAYDSKADIWSLGITAIELARGEPPHSELHPMKVLFLIPKNNPPTLEGNYSKPLKEFVEACLNKEPSFRPTAKELLKHKFILRNAKKTSYLTELIDRYKRWKAEQSHDDSSSEDSDAETDGQASGGSDSGDWIFTIREKDPKNLENGALQPSDLDRNKMKDIPKRPFSQCLSTIISPLFAELKEKSQACGGNLGSIEELRGAIYLAEEACPGISDTMVAQLVQRLQRYSLSGGGTSSH. The pKd is 5.0. (3) The compound is Oc1ccc(-c2nc(-c3ccncc3)c(-c3ccc(F)cc3)[nH]2)cc1. The target protein (Q9UQB9) has sequence MSSPRAVVQLGKAQPAGEELATANQTAQQPSSPAMRRLTVDDFEIGRPLGKGKFGNVYLARLKESHFIVALKVLFKSQIEKEGLEHQLRREIEIQAHLQHPNILRLYNYFHDARRVYLILEYAPRGELYKELQKSEKLDEQRTATIIEELADALTYCHDKKVIHRDIKPENLLLGFRGEVKIADFGWSVHTPSLRRKTMCGTLDYLPPEMIEGRTYDEKVDLWCIGVLCYELLVGYPPFESASHSETYRRILKVDVRFPLSMPLGARDLISRLLRYQPLERLPLAQILKHPWVQAHSRRVLPPCAQMAS. The pKd is 5.0. (4) The compound is CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)NC4. 
The target protein (O00444) has sequence MATCIGEKIEDFKVGNLLGKGSFAGVYRAESIHTGLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLEMCHNGEMNRYLKNRVKPFSENEARHFMHQIITGMLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGLATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLESDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVVLADYEMPSFLSIEAKDLIHQLLRRNPADRLSLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASSSTSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTMERCHSAEMLSVSKRSGGGENEERYSPTDNNANIFNFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTETVQQWFGNLQINAHLRKTTEYDSISP.... The pKd is 9.1. (5) The small molecule is CCN(CCO)CCCOc1ccc2c(Nc3cc(CC(=O)Nc4cccc(F)c4)n[nH]3)ncnc2c1. The target protein (Q8WU08) has sequence MGANTSRKPPVFDENEDVNFDHFEILRAIGKGSFGKVCIVQKNDTKKMYAMKYMNKQKCVERNEVRNVFKELQIMQGLEHPFLVNLWYSFQDEEDMFMVVDLLLGGDLRYHLQQNVHFKEETVKLFICELVMALDYLQNQRIIHRDMKPDNILLDEHGHVHITDFNIAAMLPRETQITTMAGTKPYMAPEMFSSRKGAGYSFAVDWWSLGVTAYELLRGRRPYHIRSSTSSKEIVHTFETTVVTYPSAWSQEMVSLLKKLLEPNPDQRFSQLSDVQNFPYMNDINWDAVFQKRLIPGFIPNKGRLNCDPTFELEEMILESKPLHKKKKRLAKKEKDMRKCDSSQTCLLQEHLDSVQKEFIIFNREKVNRDFNKRQPNLALEQTKDPQGEDGQNNNL. The pKd is 5.0. (6) The small molecule is CC(C)C[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)CCCC[C@@H]1SC[C@@H]2NC(=O)N[C@H]12)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)N[C@@H](CCCC[N+](C)(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)NCC(=O)NCC(=O)O)[C@@H](C)O)[C@@H](C)O. 
The target protein sequence is PLASHRSTSQILPSMSVSICPSSTEVLKACRNPGKNGLSNSCILLDKCPPPRPPTSPYPPLPKDKLNPPTPSIYLENKRDAFFPPLHQFCTNPKNPVTVIRGLAGALKLDLGLFSTKTLVEANNEHMVEVRTQLLQPADENWDPTGTKKIWRCESNRSHTTIAKYAQYQASSFQESLREENEKRTQHKDHSDNESTSSENSGRRRKGPFKTIKFGTNIDLSDNKKWKLQLHELTKLPAFARVVSAGNLLTHVGHTILGMNTVQLYMKVPGSRTPGHQENNNFCSVNINIGPGDCEWFVVPEDYWGVLNDFCEKNNLNFLMSSWWPNLEDLYEANVPVYRFIQRPGDLVWINAGTVHWVQAVGWCNNIAWNVGPLTACQYKLAVERYEWNKLKSVKSPVPMVHLSWNMARNIKVSDPKLFEMIKYCLLKILKQYQTLREALVAAGKEVIWHGRTNDEPAHYCSICEVEVFNLLFVTNESNTQKTYIVHCHDCARKTSKSLE.... The pKd is 4.1.
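
The pKd definition used throughout (pKd = -log10(Kd in M)) can be illustrated with a short conversion sketch. This is a minimal example under stated assumptions, not part of the dataset's own tooling: the function names `kd_to_pkd` and `pkd_to_kd_molar` and the unit table are hypothetical and introduced here only for illustration.

```python
import math

# Assumed unit labels and their conversion factors to molar (M).
# This table is illustrative; it is not taken from BindingDB.
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def kd_to_pkd(kd, unit="M"):
    """Convert a dissociation constant to pKd = -log10(Kd in M)."""
    if kd <= 0:
        raise ValueError("Kd must be positive")
    if unit not in _UNIT_TO_MOLAR:
        raise ValueError(f"unknown unit: {unit!r}")
    kd_molar = kd * _UNIT_TO_MOLAR[unit]
    return -math.log10(kd_molar)


def pkd_to_kd_molar(pkd):
    """Invert the definition: Kd in M = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # Higher pKd means a smaller Kd, i.e. stronger binding.
    print(kd_to_pkd(10, "uM"))
    print(kd_to_pkd(1, "nM"))
    print(pkd_to_kd_molar(8.5))
```

By this definition, a pKd of 5.0, as in example (2), corresponds to a Kd of 10 µM, and each additional pKd unit corresponds to a tenfold smaller Kd.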