This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CN(Cc1ccco1)C(=O)C(=O)O. The target protein sequence is LIVKKNLGDVVLFDIVKNMPHGKALDTSHTNVMAYSNCKVSGSNTYDDLAGADVVIVTAGFTKAPGKSDKEWNRDDLLPLNNKIMIEIGGHIKKNCPNAFIIVVTNPVDVMVQLLHQHSGVPKNKIIGLGGVLDTSRLKYYISQKLNVCPRDVNAHIVGAHGNKMVLLKRYITVGGIPLQEFINNKLISDAELEAIFDRTVNTALEIVNLHASPYVAPAAAIIEMAESYLKDLKKVLICSTLLEGQYGHSDIFGGTPVVLGANGVEQ. The pIC50 is 3.7. (2) The drug is Cc1cccc(CSc2nnc(NC(=O)CSc3nc4ccc(N5C(=O)c6ccccc6C5=O)cc4s3)s2)c1. The target protein (Q99952) has sequence MSRSLDSARSFLERLEARGGREGAVLAGEFSDIQACSAAWKADGVCSTVAGSRPENVRKNRYKDVLPYDQTRVILSLLQEEGHSDYINGNFIRGVDGSLAYIATQGPLPHTLLDFWRLVWEFGVKVILMACREIENGRKRCERYWAQEQEPLQTGLFCITLIKEKWLNEDIMLRTLKVTFQKESRSVYQLQYMSWPDRGVPSSPDHMLAMVEEARRLQGSGPEPLCVHCSAGCGRTGVLCTVDYVRQLLLTQMIPPDFSLFDVVLKMRKQRPAAVQTEEQYRFLYHTVAQMFCSTLQNASPHYQNIKENCAPLYDDALFLRTPQALLAIPRPPGGVLRSISVPGSPGHAMADTYAVVQKRGAPAGAGSGTQTGTGTGTGARSAEEAPLYSKVTPRAQRPGAHAEDARGTLPGRVPADQSPAGSGAYEDVAGGAQTGGLGFNLRIGRPKGPRDPPAEWTRV. The pIC50 is 4.0. (3) The drug is CC(C)(C)[C@@H]1CC2OC(=O)C34OC5OC(=O)[C@H](O)C51[C@@]23CC1OC(=O)[C@](C)(O)[C@@]14O. The target protein (P23416) has sequence MNRQLVNILTALFAFFLETNHFRTAFCKDHDSRSGKQPSQTLSPSDFLDKLMGRTSGYDARIRPNFKGPPVNVTCNIFINSFGSVTETTMDYRVNIFLRQQWNDSRLAYSEYPDDSLDLDPSMLDSIWKPDLFFANEKGANFHDVTTDNKLLRISKNGKVLYSIRLTLTLSCPMDLKNFPMDVQTCTMQLESFGYTMNDLIFEWLSDGPVQVAEGLTLPQFILKEEKELGYCTKHYNTGKFTCIEVKFHLERQMGYYLIQMYIPSLLIVILSWVSFWINMDAAPARVALGITTVLTMTTQSSGSRASLPKVSYVKAIDIWMAVCLLFVFAALLEYAAVNFVSRQHKEFLRLRRRQKRQNKEEDVTRESRFNFSGYGMGHCLQVKDGTAVKATPANPLPQPPKDGDAIKKKFVDRAKRIDTISRAAFPLAFLIFNIFYWITYKIIRHEDVHKK. The pIC50 is 4.0. (4) The small molecule is O=S(=O)(Nc1ncns1)c1cc(Cl)c(NCC23CCCN2CCC3)cc1F. 
The target protein (Q14524) has sequence MANFLLPRGTSSFRRFTRESLAAIEKRMAEKQARGSTTLQESREGLPEEEAPRPQLDLQASKKLPDLYGNPPQELIGEPLEDLDPFYSTQKTFIVLNKGKTIFRFSATNALYVLSPFHPIRRAAVKILVHSLFNMLIMCTILTNCVFMAQHDPPPWTKYVEYTFTAIYTFESLVKILARGFCLHAFTFLRDPWNWLDFSVIIMAYTTEFVDLGNVSALRTFRVLRALKTISVISGLKTIVGALIQSVKKLADVMVLTVFCLSVFALIGLQLFMGNLRHKCVRNFTALNGTNGSVEADGLVWESLDLYLSDPENYLLKNGTSDVLLCGNSSDAGTCPEGYRCLKAGENPDHGYTSFDSFAWAFLALFRLMTQDCWERLYQQTLRSAGKIYMIFFMLVIFLGSFYLVNLILAVVAMAYEEQNQATIAETEEKEKRFQEAMEMLKKEHEALTIRGVDTVSRSSLEMSPLAPVNSHERRSKRRKRMSSGTEECGEDRLPKSDSE.... The pIC50 is 5.1. (5) The compound is CC1(C)Oc2ccc(C#N)cc2[C@@H](N(Cc2ncc[nH]2)c2ccc(Cl)cc2)[C@@H]1O. The target protein (P05631) has sequence MFSRAGVAGLSAWTVQPQWIQVRNMATLKDITRRLKSIKNIQKITKSMKMVAAAKYARAERELKPARVYGVGSLALYEKADIKTPEDKKKHLIIGVSSDRGLCGAIHSSVAKQMKSEAANLAAAGKEVKIIGVGDKIRSILHRTHSDQFLVTFKEVGRRPPTFGDASVIALELLNSGYEFDEGSIIFNRFRSVISYKTEEKPIFSLDTISSAESMSIYDDIDADVLRNYQEYSLANIIYYSLKESTTSEQSARMTAMDNASKNASEMIDKLTLTFNRTRQAVITKELIEIISGAAALD. The pIC50 is 6.6. (6) The compound is CC[C@H](c1ccc(C(=O)NCc2nn[nH]n2)cc1)N1C(=O)C(c2cc(Cl)cc(Cl)c2)=N[C@]12CC[C@@H](C(C)(C)C)CC2. The target protein (P43220) has sequence MAGAPGPLRLALLLLGMVGRAGPRPQGATVSLWETVQKWREYRRQCQRSLTEDPPPATDLFCNRTFDEYACWPDGEPGSFVNVSCPWYLPWASSVPQGHVYRFCTAEGLWLQKDNSSLPWRDLSECEESKRGERSSPEEQLLFLYIIYTVGYALSFSALVIASAILLGFRHLHCTRNYIHLNLFASFILRALSVFIKDAALKWMYSTAAQQHQWDGLLSYQDSLSCRLVFLLMQYCVAANYYWLLVEGVYLYTLLAFSVLSEQWIFRLYVSIGWGVPLLFVVPWGIVKYLYEDEGCWTRNSNMNYWLIIRLPILFAIGVNFLIFVRVICIVVSKLKANLMCKTDIKCRLAKSTLTLIPLLGTHEVIFAFVMDEHARGTLRFIKLFTELSFTSFQGLMVAILYCFVNNEVQLEFRKSWERWRLEHLHIQRDSSMKPLKCPTSSLSSGATAGSSMYTATCQASCS. The pIC50 is 5.0. (7) The drug is CC[C@@H](COCC1CC1)N1C(=O)[C@@](C)(CC(=O)O)CC(c2cccc(Cl)c2)C1c1ccc(Cl)cc1. The target protein sequence is MCNTNMSVPTDGAVTTSQIPASEQETLVRPKPLLLKLLKSVGAQKDTYTMKEVLFYLGQYIMTKRLYDEKQQHIVYCSNDLLGDLFGVPSFSVKEHRKIYTMIYRNLVVVNQQESSDSGTSVSENRCHLEGGSDQKDLVQELQEEKPSSSHLVSRPSTSSRRRAISETEENSDELSGERQRKRHKSDS. The pIC50 is 7.5.
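The pIC50 definition used above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset's own tooling: the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50` are hypothetical, and the nanomolar helper assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nanomolar (assumed unit, 1 nM = 1e-9 M)."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # Entry (1) above has pIC50 3.7, i.e. an IC50 of roughly 2e-4 M.
    print(f"{pic50_to_ic50(3.7):.2e}")
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50, so entry (7) at 7.5 is several thousand times more potent than entry (1) at 3.7.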