From a dataset of Drug-target binding data from BindingDB using Kd measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is CO[C@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O. The target protein (Q05097) has sequence MAWKGEVLANNEAGQVTSIIYNPGDVITIVAAGWASYGPTQKWGPQGDREHPDQGLICHDAFCGALVMKIGNSGTIPVNTGLFRWVAPNNVQGAITLIYNDVPGTYGNNSGSFSVNIGKDQS. The pKd is 4.7. (2) The drug is CC1=NN(C(=O)c2ccc(O)cc2)C(=O)C1/N=N/c1ccc(S(=O)(=O)Nc2ncccn2)cc1. The target protein sequence is MASPPESDGFSDVRKVGYLRKPKSMHKRFFVLRAASEAGGPARLEYYENEKKWRHKSSAPKRSIPLESCFNINKRADSKNKHLVALYTRDEHFAIAADSEAEQDSWYQALLQLHNRAKGHHDGAAALGAGGGGGSCSGSSGLGEAGEDLSYGDVPPGPAFKEVWQVILKPKGLGQTKNLIGIYRLCLTSKTISFVKLNSEAAAVVLQLMNIRRCGHSENFFFIEVGRSAVTGPGEFWMQVDDSVVAQNMHETILEAMRAMSDEF. The pKd is 5.8. (3) The small molecule is COc1cc2c(Oc3ccc(NC(=O)C4(C(=O)NC5=CCC(F)C=C5)CC4)cc3F)ccnc2cc1OCCCN1CCOCC1. The target protein (Q13163) has sequence MLWLALGPFPAMENQVLVIRIKIPNSGAVDWTVHSGPQLLFRDVLDVIGQVLPEATTTAFEYEDEDGDRITVRSDEEMKAMLSYYYSTVMEQQVNGQLIEPLQIFPRACKPPGERNIHGLKVNTRAGPSQHSSPAVSDSLPSNSLKKSSAELKKILANGQMNEQDIRYRDTLGHGNGGTVYKAYHVPSGKILAVKVILLDITLELQKQIMSELEILYKCDSSYIIGFYGAFFVENRISICTEFMDGGSLDVYRKMPEHVLGRIAVAVVKGLTYLWSLKILHRDVKPSNMLVNTRGQVKLCDFGVSTQLVNSIAKTYVGTNAYMAPERISGEQYGIHSDVWSLGISFMELALGRFPYPQIQKNQGSLMPLQLLQCIVDEDSPVLPVGEFSEPFVHFITQCMRKQPKERPAPEELMGHPFIVQFNDGNAAVVSMWVCRALEERRSQQGPP. The pKd is 8.6. (4) The compound is O=C(NOCC1CC1)c1ccc(F)c(F)c1Nc1ccc(I)cc1Cl. 
The target protein (O15111) has sequence MERPPGLRPGAGGPWEMRERLGTGGFGNVCLYQHRELDLKIAIKSCRLELSTKNRERWCHEIQIMKKLNHANVVKACDVPEELNILIHDVPLLAMEYCSGGDLRKLLNKPENCCGLKESQILSLLSDIGSGIRYLHENKIIHRDLKPENIVLQDVGGKIIHKIIDLGYAKDVDQGSLCTSFVGTLQYLAPELFENKPYTATVDYWSFGTMVFECIAGYRPFLHHLQPFTWHEKIKKKDPKCIFACEEMSGEVRFSSHLPQPNSLCSLVVEPMENWLQLMLNWDPQQRGGPVDLTLKQPRCFVLMDHILNLKIVHILNMTSAKIISFLLPPDESLHSLQSRIERETGINTGSQELLSETGISLDPRKPASQCVLDGVRGCDSYMVYLFDKSKTVYEGPFASRSLSDCVNYIVQDSKIQLPIIQLRKVWAEAVHYVSGLKEDYSRLFQGQRAAMLSLLRYNANLTKMKNTLISASQQLKAKLEFFHKSIQLDLERYSEQMTY.... The pKd is 5.0. (5) The small molecule is Cc1cc2c(cc1S(=O)(=O)c1ccc(C(=O)O)cc1)C(C)(C)CCC2(C)C. The target protein (P18911) has sequence MATNKERLFAPGALGPGSGYPGAGFPFAFPGALRGSPPFEMLSPSFRGLGQPDLPKEMASLSVETQSTSSEEMVPSSPSPPPPPRVYKPCFVCNDKSSGYHYGVSSCEGCKGFFRRSIQKNMVYTCHRDKNCIINKVTRNRCQYCRLQKCFEVGMSKEAVRNDRNKKKKEVKEEGSPDSYELSPQLEELITKVSKAHQETFPSLCQLGKYTTNSSADHRVQLDLGLWDKFSELATKCIIKIVEFAKRLPGFTGLSIADQITLLKAACLDILMLRICTRYTPEQDTMTFSDGLTLNRTQMHNAGFGPLTDLVFAFAGQLLPLEMDDTETGLLSAICLICGDRMDLEEPEKVDKLQEPLLEALRLYARRRRPSQPYMFPRMLMKITDLRGISTKGAERAITLKMEIPGPMPPLIREMLENPEMFEDDSSKPGPHPKASSEDEAPGGQGKRGQSPQPDQGP. The pKd is 5.0. (6) The drug is CSCC[C@H](NC=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](Cc1ccccc1)C(=O)O. The target protein (O08790) has sequence METNYSIPLNGSDVVIYDSTISRVLWILSMVVVSITFFLGVLGNGLVIWVAGFRMPHTVTTIWYLNLALADFSFTATLPFLLVEMAMKEKWPFGWFLCKLVHIAVDVNLFGSVFLIAVIALDRCICVLHPVWAQNHRTVSLARNVVVGSWIFALILTLPLFLFLTTVRDARGDVHCRLSFVSWGNSVEERLNTAITFVTTRGIIRFIVSFSLPMSFVAICYGLITTKIHKKAFVNSSRPFRVLTGVVASFFICWFPFQLVALLGTVWLKEMQFSGSYKIIGRLVNPTSSLAFFNSCLNPILYVFMGQDFQERLIHSLSSRLQRALSEDSGHISDTRTNLASLPEDIEIKAI. The pKd is 6.5. (7) The pKd is 8.7. The drug is C#C[C@]1(O)CC[C@H]2[C@@H]3CCC4=CC(=O)CC[C@]4(C)[C@H]3CC[C@@]21C. 
The target protein (P04278) has sequence MESRGPLATSRLLLLLLLLLLRHTRQGWALRPVLPTQSAHDPPAVHLSNGPGQEPIAVMTFDLTKITKTSSSFEVRTWDPEGVIFYGDTNPKDDWFMLGLRDGRPEIQLHNHWAQLTVGAGPRLDDGRWHQVEVKMEGDSVLLEVDGEEVLRLRQVSGPLTSKRHPIMRIALGGLLFPASNLRLPLVPALDGCLRRDSWLDKQAEISASAPTSLRSCDVESNPGIFLPPGTQAEFNLRDIPQPHAEPWAFSLDLGLKQAAGSGHLLALGTPENPSWLSLHLQDQKVVLSSGSGPGLDLPLVLGLPLQLKLSMSRVVLSQGSKMKALALPPLGLAPLLNLWAKPQGRLFLGALPGEDSSTSFCLNGLWAQGQRLDVDQALNRSHEIWTHSCPQSPGNGTDASH. (8) The drug is CC[C@H](CO)Nc1nc(NCc2ccccc2)c2ncn(C(C)C)c2n1. The target protein (P08631) has sequence MGGRSSCEDPGCPRDEERAPRMGCMKSKFLQVGGNTFSKTETSASPHCPVYVPDPTSTIKPGPNSHNSNTPGIREAGSEDIIVVALYDYEAIHHEDLSFQKGDQMVVLEESGEWWKARSLATRKEGYIPSNYVARVDSLETEEWFFKGISRKDAERQLLAPGNMLGSFMIRDSETTKGSYSLSVRDYDPRQGDTVKHYKIRTLDNGGFYISPRSTFSTLQELVDHYKKGNDGLCQKLSVPCMSSKPQKPWEKDAWEIPRESLKLEKKLGAGQFGEVWMATYNKHTKVAVKTMKPGSMSVEAFLAEANVMKTLQHDKLVKLHAVVTKEPIYIITEFMAKGSLLDFLKSDEGSKQPLPKLIDFSAQIAEGMAFIEQRNYIHRDLRAANILVSASLVCKIADFGLARVIEDNEYTAREGAKFPIKWTAPEAINFGSFTIKSDVWSFGILLMEIVTYGRIPYPGMSNPEVIRALERGYRMPRPENCPEELYNIMMRCWKNRPEE.... The pKd is 5.0. (9) The pKd is 8.9. The target protein (P15498) has sequence MELWRQCTHWLIQCRVLPPSHRVTWDGAQVCELAQALRDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTEEKYTDTLGSIQQHFLKPLQRFLKPQDIEIIFINIEDLLRVHTHFLKEMKEALGTPGAANLYQVFIKYKERFLVYGRYCSQVESASKHLDRVAAAREDVQMKLEECSQRANNGRFTLRDLLMVPMQRVLKYHLLLQELVKHTQEAMEKENLRLALDAMRDLAQCVNEVKRDNETLRQITNFQLSIENLDQSLAHYGRPKIDGELKITSVERRSKMDRYAFLLDKALLICKRRGDSYDLKDFVNLHSFQVRDDSSGDRDNKKWSHMFLLIEDQGAQGYELFFKTRELKKKWMEQFE.... The small molecule is CCCC(=O)O[C@@H]1[C@@H](C)[C@@]2(O)[C@@H](C=C(CO)C[C@]3(O)C(=O)C(C)=C[C@@H]23)[C@@H]2C(C)(C)[C@]12OC(=O)CCC. 
(10) The target protein sequence is MTLHSNSTTLPLFPNISTSWIHSPSEAGLPPGTVTHFGSYNISQAAGNFSSLNGTTSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVSFKVNKQLKHVNNYFLLSLADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEAETENFVHPTGSSRSCSSYELQQQSLKHSSRRKYSRCHFWFATKSWKPNAGQMDQDHSSSDSWNNYDAAASLENSASDEEDIGSETRAIYSIVLKLPGHSTILNSTKLPSSDNLQVPEEDLEPMDMERNASKPQTQKSMDDGGSFQKSFSNLPIQLESTMDTAKTSDANSSVSKTMATLPLSFKEATLAKRFALRTRSQITKRKRMSLIKEKRAAQTLSAILLAFIIT.... The compound is CN(CCCCCCCCN(C)CCCCCCNCC(=O)N1c2ccccc2C(=O)Nc2cccnc21)CCCCCCNCC(=O)N1c2ccccc2C(=O)Nc2cccnc21. The pKd is 6.9.
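The pKd values above follow the definition given at the top, pKd = -log10(Kd in M). A minimal sketch of that conversion is shown below; the function names are hypothetical and not part of the dataset, and the nanomolar helper assumes the raw Kd is reported in nM, which should be checked against the actual source units.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: return Kd in mol/L for a given pKd."""
    return 10.0 ** (-pkd)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Same conversion for a Kd given in nM (assumed unit, 1 nM = 1e-9 M)."""
    return kd_to_pkd(kd_nanomolar * 1e-9)
```

Under this definition a pKd of 5.0 corresponds to a Kd of 10 µM, and a pKd of 8.6 to roughly 2.5 nM, so each unit of pKd is a tenfold change in affinity.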