This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC1=CCC[C@H]1NC(=O)Nc1ccc(Cl)c(S(=O)(=O)[C@@]2(C)CCOC2)c1O. The target protein (P32302) has sequence MNYPLTLEMDLENLEDLFWELDRLDNYNDTSLVENHLCPATEGPLMASFKAVFVPVAYSLIFLLGVIGNVLVLVILERHRQTRSSTETFLFHLAVADLLLVFILPFAVAEGSVGWVLGTFLCKTVIALHKVNFYCSSLLLACIAVDRYLAIVHAVHAYRHRRLLSIHITCGTIWLVGFLLALPEILFAKVSQGHHNNSLPRCTFSQENQAETHAWFTSRFLYHVAGFLLPMLVMGWCYVGVVHRLRQAQRRPQRQKAVRVAILVTSIFFLCWSPYHIVIFLDTLARLKAVDNTCKLNGSLPVAITMCEFLGLAHCCLNPMLYTFAGVKFRSDLSRLLTKLGCTGPASLCQLFPSWRRSSLSESENATSLTTF. The pIC50 is 5.0. (2) The drug is O=C(c1ccccc1Cl)c1oc2cc(O)ccc2c1-c1cccc2ccccc12. The target protein (Q62986) has sequence MEIKNSPSSLSSPASYNCSQSILPLEHGPIYIPSSYVDNRHEYSAMTFYSPAVMNYSVPGSTSNLDGGPVRLSTSPNVLWPTSGHLSPLATHCQSSLLYAEPQKSPWCEARSLEHTLPVNRETLKRKLSGSSCASPVTSPNAKRDAHFCPVCSDYASGYHYGVWSCEGCKAFFKRSIQGHNDYICPATNQCTIDKNRRKSCQACRLRKCYEVGMVKCGSRRERCGYRIVRRQRSSSEQVHCLSKAKRNGGHAPRVKELLLSTLSPEQLVLTLLEAEPPNVLVSRPSMPFTEASMMMSLTKLADKELVHMIGWAKKIPGFVELSLLDQVRLLESCWMEVLMVGLMWRSIDHPGKLIFAPDLVLDRDEGKCVEGILEIFDMLLATTSRFRELKLQHKEYLCVKAMILLNSSMYPLASANQEAESSRKLTHLLNAVTDALVWVIAKSGISSQQQSVRLANLLMLLSHVRHISNKGMEHLLSMKCKNVVPVYDLLLEMLNAHTL.... The pIC50 is 7.2. (3) The drug is COC(=O)[C@@H]1CCCN1C(=O)[C@@H](Cc1ccccc1)N(C)C(=O)[C@H](C)NC(=O)[C@@H](NC(=O)C[C@H](O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](OC(=O)[C@@H](OC(=O)[C@H](C(C)C)N(C)C)C(C)C)C(C)C)[C@@H](C)O. The target protein (P31944) has sequence MSNPRSLEEEKYDMSGARLALILCVTKAREGSEEDLDALEHMFRQLRFESTMKRDPTAEQFQEELEKFQQAIDSREDPVSCAFVVLMAHGREGFLKGEDGEMVKLENLFEALNNKNCQALRAKPKVYIIQACRGEQRDPGETVGGDEIVMVIKDSPQTIPTYTDALHVYSTVEGYIAYRHDQKGSCFIQTLVDVFTKRKGHILELLTEVTRRMAEAELVQEGKARKTNPEIQSTLRKRLYLQ. The pIC50 is 5.3. 
(4) The drug is CC(=O)NCc1nc2ccc(NCc3ccccc3OCc3ccccc3)cc2[nH]1. The target protein (P41245) has sequence MSPWQPLLLALLAFGCSSAAPYQRQPTFVVFPKDLKTSNLTDTQLAEAYLYRYGYTRAAQMMGEKQSLRPALLMLQKQLSLPQTGELDSQTLKAIRTPRCGVPDVGRFQTFKGLKWDHHNITYWIQNYSEDLPRDMIDDAFARAFAVWGEVAPLTFTRVYGPEADIVIQFGVAEHGDGYPFDGKDGLLAHAFPPGAGVQGDAHFDDDELWSLGKGVVIPTYYGNSNGAPCHFPFTFEGRSYSACTTDGRNDGTPWCSTTADYDKDGKFGFCPSERLYTEHGNGEGKPCVFPFIFEGRSYSACTTKGRSDGYRWCATTANYDQDKLYGFCPTRVDATVVGGNSAGELCVFPFVFLGKQYSSCTSDGRRDGRLWCATTSNFDTDKKWGFCPDQGYSLFLVAAHEFGHALGLDHSSVPEALMYPLYSYLEGFPLNKDDIDGIQYLYGRGSKPDPRPPATTTTEPQPTAPPTMCPTIPPTAYPTVGPTVGPTGAPSPGPTSSPS.... The pIC50 is 4.0. (5) The small molecule is C/C1=C/C(=O)O[C@@H]2C[C@@H](CC[C@H](C)/C=C\C=C\CC1)O[C@@](O)([C@@H]1CSC(=O)N1)C2. The target protein sequence is MDSEVAALVIDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGIMVGMGQKDSYVGDEAQSKRGILTLRYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPMNPKSNREKMTQIMFETFNVPAFYVSIQAVLSLYSSGRTTGIVLDSGDGVTHVVPIYAGFSLPHAILRIDLAGKDLTDYLMKILSERGYSFSTTAEREIVRDIKEKLCYVALDFEQEMQTAAQSSSIEKSYELPDGQVITIGNERFRAPEALFHPSVLGLESAGIDQTTYNSIMKCDVDVRKELYGNIVMSGGTTMFPGIAERMQKEITALAPSSMKVKIIAPPERKYSVWIGGSILASLTTFQQMWISKQEYDESGPSIVHHKCF. The pIC50 is 7.0. (6) The pIC50 is 4.9. The target protein (O35430) has sequence MNHLEGSAEVEVADEAPGGEVNESVEADLEHPEVEEEQQPSPPPPAGHAPEDHRAHPAPPPPPPPQEEEEERGECLARSASTESGFHNHTDTAEGDVLAAARDGYEAERAQDADDESAYAVQYRPEAEEYTEQAEAEHAEAAQRRALPNHLHFHSLEHEEAMNAAYSGYVYTHRLFHRAEDEPYAEPYADYGGLQEHVYEEIGDAPELEARDGLRLYERERDEAAAYRQEALGARLHHYDERSDGESDSPEKEAEFAPYPRMDSYEQEEDIDQIVAEVKQSMSSQSLDKAAEDMPEAEQDLERAPTPGGGHPDSPGLPAPAGQQQRVVGTPGGSEVGQRYSKEKRDAISLAIKDIKEAIEEVKTRTIRSPYTPDEPKEPIWVMRQDISPTRDCDDQRPVDGDSPSPGSSSPLGAESSITPLHPGDPTEASTNKESRKSLASFPTYVEVPGPCDPEDLIDGIIFAANYLGSTQLLSDKTPSKNVRMMQAQEAVSRIKTAQK.... The drug is N#CC(C(=O)Nc1ccc(Cl)cc1)C(=S)Nc1ccccc1. (7) The small molecule is CCC(=O)Nc1ccc2ncnc(Nc3cccc(Br)c3)c2c1. 
The target protein sequence is GHMQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKGCLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The pIC50 is 5.2. (8) The compound is CCCS(=O)(=O)Nc1ccc(C)c(Nc2ncccc2-c2ncnc3[nH]cnc23)c1. The target is CKENALLRYLLDKDD. The pIC50 is 7.0. (9) The compound is CNC(=O)O[C@@H](CC(C)C)c1nc([C@H]2OC(=O)/C(C)=C/C/C(C)=C/[C@@H](OC(=O)c3ccc(C4(C(F)(F)F)N=N4)cc3)[C@@H](C)/C=C(C)\C=C(C)/C=C/[C@@H](OC(=O)c3ccc(C4(C(F)(F)F)N=N4)cc3)[C@H](C)[C@H](OC)/C(C)=C/C=C/[C@@H]2C)cs1. The target protein (Q9U5N1) has sequence MSEYWLISAPGDKTCQQTWEALNQATKANNLSLNYKFPIPDLKVGTLDQLVGLSDDLGKLDTFVEGVTRKVAQYLGEVLEDQRDKLHENLTANNDDLPHYLTRFQWDMAKYPIKQSLRNIADIISKQVGQIDADLKVKSSAYNALKGNLQNLEKKQTGSLLTRNLADLVKKEHFILDSEYLTTLLVIVPKSMFNDWNANYEKITDMIVPRSTQLIHQDGDYGLFTVTLFKKVVDEFKLHARERKFVVREFAYNEADLVAGKNEITKLLTDKKKQFGPLVRWLKVNFSECFCAWIHVKALRVFVESVLRYGLPVNFQAALLVPSRRSARRLRDTLHALYAHLDHSAHHHANAQQDSVELAGLGFGQSEYYPYVFYKINIDMIEKAA. The pIC50 is 6.2.
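The pIC50 values in the records above follow the stated definition, pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion, assuming IC50 is supplied in molar or nanomolar units: the function names below are hypothetical helpers for illustration and are not part of BindingDB or any dataset loader.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nM (assumed input unit)."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Illustrative round trip using pIC50 values that appear in the records.
    for pic50 in (5.0, 7.0):
        ic50_m = pic50_to_ic50(pic50)
        print(f"pIC50 {pic50} -> IC50 {ic50_m:.3g} M -> pIC50 {ic50_to_pic50(ic50_m):.2f}")
```

Under this definition a pIC50 of 5.0 corresponds to an IC50 of 10 µM and a pIC50 of 7.0 to 100 nM, so each unit increase in pIC50 is a tenfold gain in potency.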