This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COCc1ccc(C(C)=O)c2sc(C(=O)Nc3ccc4c(C(=O)N5CCC5)cccc4n3)c(C)c12. The target protein sequence is MEDGPSNNASCFRRLTECFLSPSLTDEKVKAYLSLHPQVLDEFVSESVSAETVEKWLKRKNNKSEDESAPKEVSRYQDTNMQGVVYELNSYIEQRLDTGGDNQLLLYELSSIIKIATKADGFALYFLGECNNSLCIFTPPGIKEGKPRLIPAGPITQGTTVSAYVAKSRKTLLVEDILGDERFPRGTGLESGTRIQSVLCLPIVTAIGDLIGILELYRHWGKEAFCLSHQEVATANLAWASVAIHQVQVCRGLAKQTELNDFLLDVSKTYFDNIVAIDSLLEHIMIYAKNLVNADRCALFQVDHKNKELYSDLFDIGEEKEGKPVFKKTKEIRFSIEKGIAGQVARTGEVLNIPDAYADPRFNREVDLYTGYTTRNILCMPIVSRGSVIGVVQMVNKISGSAFSKTDENNFKMFAVFCALALHCANMYHRIRHSECIYRVTMEKLSYHSICTSEEWQGLMQFTLPVRLCKEIELFHFDIGPFENMWPGIFVYMVHRSCGT.... The pIC50 is 9.5. (2) The small molecule is NS(=O)(=O)Oc1ccc2nc(C=C3C4CC5CC(C4)CC3C5)oc2c1. The target protein (P08842) has sequence MPLRKMKIPFLLLFFLWEAESHAASRPNIILVMADDLGIGDPGCYGNKTIRTPNIDRLASGGVKLTQHLAASPLCTPSRAAFMTGRYPVRSGMASWSRTGVFLFTASSGGLPTDEITFAKLLKDQGYSTALIGKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVHTALFSSKDFAGKSQHGVYGDAVEEMDWSVGQILNLLDELRLANDTLIYFTSDQGAHVEEVSSKGEIHGGSNGIYKGGKANNWEGGIRVPGILRWPRVIQAGQKIDEPTSNMDIFPTVAKLAGAPLPEDRIIDGRDLMPLLEGKSQRSDHEFLFHYCNAYLNAVRWHPQNSTSIWKAFFFTPNFNPVGSNGCFATHVCFCFGSYVTHHDPP.... The pIC50 is 8.6. (3) The small molecule is Cc1sc(NC(=O)C2=C(C(=O)O)COCC2)c(-c2nc(C3CC3)no2)c1C. The target protein (P15090) has sequence MCDAFVGTWKLVSSENFDDYMKEVGVGFATRKVAGMAKPNMIISVNGDVITIKSESTFKNTEISFILGQEFDEVTADDRKVKSTITLDGGVLVHVQKWDGKSTTIKRKREDDKLVVECVMKGVTSTRVYERA. The pIC50 is 7.0. (4) The compound is C[C@@]1(c2cc(NC(=O)c3ccc(Cl)cn3)ccc2F)C[C@@H](c2ccccc2)OCC(N)=N1. 
The target protein (Q6IE75) has sequence MGALLRALLLPLLAQWLLRAVPVLAPAPFTLPLQVAGAANHRASTVPGLGTPELPRADGLALALEPARATANFLAMVDNLQGDSGRGYYLEMLIGTPPQKVRILVDTGSSNFAVAGAPHSYIDTYFDSESSSTYHSKGFEVTVKYTQGSWTGFVGEDLVTIPKGFNSSFLVNIATIFESENFFLPGIKWNGILGLAYAALAKPSSSLETFFDSLVAQAKIPDIFSMQMCGAGLPVAGSGTNGGSLVLGGIEPSLYKGDIWYTPIKEEWYYQIEILKLEIGGQSLNLDCREYNADKAIVDSGTTLLRLPQKVFDAVVEAVARTSLIPEFSDGFWTGAQLACWTNSETPWAYFPKISIYLRDENASRSFRITILPQLYIQPMMGAGFNYECYRFGISSSTNALVIGATVMEGFYVVFDRAQRRVGFAVSPCAEIAGTTVSEISGPFSTEDIASNCVPAQALNEPILWIVSYALMSVCGAILLVLILLLLFPLHCRHAPRDPE.... The pIC50 is 5.3. (5) The small molecule is CCCS(=O)(=O)N1CCC(CNC(=O)c2c(F)cccc2Cl)(C(=O)N2CCOCC2)CC1. The target protein (P48067) has sequence MSGGDTRAAIARPRMAAAHGPVAPSSPEQVTLLPVQRSFFLPPFSGATPSTSLAESVLKVWHGAYNSGLLPQLMAQHSLAMAQNGAVPSEATKRDQNLKRGNWGNQIEFVLTSVGYAVGLGNVWRFPYLCYRNGGGAFMFPYFIMLIFCGIPLFFMELSFGQFASQGCLGVWRISPMFKGVGYGMMVVSTYIGIYYNVVICIAFYYFFSSMTHVLPWAYCNNPWNTHDCAGVLDASNLTNGSRPAALPSNLSHLLNHSLQRTSPSEEYWRLYVLKLSDDIGNFGEVRLPLLGCLGVSWLVVFLCLIRGVKSSGKVVYFTATFPYVVLTILFVRGVTLEGAFDGIMYYLTPQWDKILEAKVWGDAASQIFYSLGCAWGGLITMASYNKFHNNCYRDSVIISITNCATSVYAGFVIFSILGFMANHLGVDVSRVADHGPGLAFVAYPEALTLLPISPLWSLLFFFMLILLGLGTQFCLLETLVTAIVDEVGNEWILQKKTYV.... The pIC50 is 7.9. (6) The small molecule is O=C1CC(c2ccccc2)c2c(n3ncnc3[nH]c2=O)N1. The target protein sequence is KGKNIQVVVRCRPFNLAERKASAHSIVECDPVRKEVSVRTGGLADKSSRKTYTFDMVFGASTKQIDVYRSVVCPILDEVIMGYNCTIFAYGQTGTGKTFTMEGERSPNEEYTWEEDPLAGIIPRTLHQIFEKLTDNGTEFSVKVSLLEIYNEELFDLLNPSSDVSERLQMFDDPRNKRGVIIKGLEEITVHNKDEVYQILEKGAAKRTTAATLMNAYSSRSHSVFSVTIHMKETTIDGEELVKIGKLNLVDLAGSENIGRSGAVDKRAREAGNINQSLLTLGRVITALVERTPHVPYRESKLTRILQDSLGGRTRTSIIATISPASLNLEETLSTLEYAHRAKNILNKPEVNQK. The pIC50 is 4.2.
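The pIC50 values above follow the stated definition, pIC50 = -log10(IC50 in M). A minimal Python sketch of that conversion is shown here for reference. The function names are hypothetical, and the use of nanomolar input is an assumption made for convenience; the records above state only the pIC50 values, not the units in which the underlying IC50 measurements were reported.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in M).

    Since 1 nM = 1e-9 M, -log10(x * 1e-9) simplifies to 9 - log10(x).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the conversion: return the IC50 in nanomolar for a given pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values quoted in the six records above.
    for pic50 in (9.5, 8.6, 7.0, 5.3, 7.9, 4.2):
        print(f"pIC50 {pic50:>4} -> IC50 {pic50_to_ic50_nm(pic50):.3g} nM")
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50: the pIC50 of 9.5 in record (1) corresponds to an IC50 of about 0.32 nM, while the pIC50 of 4.2 in record (6) corresponds to roughly 63 µM.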