This is drug-target binding data from BindingDB, based on IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CNc1cc(C(=O)N2CCCC(c3ccc(Cl)cc3C)C2)cnn1. The pIC50 is 7.5. The target protein sequence is METTMGFMDDNATNTSTSFLSVLNPHGAHATSFPFNFSYSDYDMPLDEDEDVTNSRTFFAAKIVIGMALVGIMLVCGIGNFIFIAALVRYKKLRNLTNLLIANLAISDFLVAIVCCPFEMDYYVVRQLSWEHGHVLCTSVNYLRTVSLYVSTNALLAIAIDRYLAIVHPLRPRMKCQTATGLIALVWTVSILIAIPSAYFTTETVLVIVKSQEKIFCGQIWPVDQQLYYKSYFLFIFGIEFVGPVVTMTLCYARISRELWFKAVPGFQTEQIRKRLRCRRKTVLVLMCILTAYVLCWAPFYGFTIVRDFFPTVFVKEKHYLTAFYIVECIAMSNSMINTLCFVTVKNDTVKYFKKIMLLHWKASYNGGKSSADLDLKTIGMPATEEVDCIRLK. (2) The drug is COc1ccc2cc1Oc1ccc(cc1)C[C@H]1c3cc(c(OC)cc3CCN1C)Oc1c(O)c(OC)cc3c1[C@@H](C2)N(C)CC3. The target protein (P39040) has sequence MSRAYDLVVIGAGSGGLEAGWNAASLHKKRVAVIDLQKHHGPPHYAALGGTCVNVGCVPKKLMVTGANYMDTIRESAGFGWELDRESVRPNWKALIAAKNKAVSGINDSYEGMFADTEGLTFHQGFGALQDNHTVLVRESADPNSAVLETLDTEYILLATGSWPQHLGIEGDDLCITSNEAFYLDEAPKRALCVGGGYISIEFAGIFNAYKARGGQVDLAYRGDMILRGFDSELRKQLTEQLRANGINVRTHENPAKVTKNADGTRHVVFESGAEADYDVVMLAIGRVPRSQTLQLDKAGVEVAKNGAIKVDAYSKTNVDNIYAIGDVTDRVMLTPVAINEGAAFVDTVFANKPRATDHTKVACAVFSIPPMGVCGYVEEDAAKKYDQVAVYESSFTPLMHNISGSTYKKFMVRIVTNHADGEVLGVHMLGDSSPEIIQSVAICLKMGAKISDFYNTIGVHPTSAEELCSMRTPAYFYQKGKRVEKIDSNL. The pIC50 is 4.0. (3) The drug is CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](CC(=O)O)NC(=O)[C@H](N)CCCNC(=O)[C@H](Cc1ccccc1)NC(C)=O)[C@@H](C)CC)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)O.
The target protein (Q61614) has sequence MSIFCLAAYFWLTMVGGVMADNPERYSANLSSHMEDFTPFPGTEINFLGTTHRPPNLALPSNGSMHGYCPQQTKITTAFKYINTVISCTIFIVGMVGNATLLRIIYQNKCMRNGPNALIASLALGDLIYVVIDLPINVFKLLAGRWPFDHNDFGVFLCKLFPFLQKSSVGITVLNLCALSVDRYRAVASWSRVQGIGIPLITAIEIVSIWILSFILAIPEAIGFVMVPFEYKGELHRTCMLNATSKFMEFYQDVKDWWLFGFYFCMPLVCTAIFYTLMTCEMLNRRNGSLRIALSEHLKQRREVAKTVFCLVVIFALCWFPLHLSRILKKTVYDEMDKNRCELLSFLLLMDYIGINLATMNSCINPIALYFVSKKFKNCFQSCLCCCCHQSKSLMTSVPMNGTSIQWKNQEQNNHNTERSSHKDSMN. The pIC50 is 6.6. (4) The small molecule is Cc1cc(C)cc(NC2=C(NS(=O)(=O)c3ccccc3)C(=O)c3ccccc3C2=O)c1. The target protein (P50293) has sequence MDIEAYFERIGYQNSRNKLDLQTLTEILQHQIRAIPFENLNIHCGESMELSLETIFDQIVRKKRGGWCLQVNHLLYWALTKMGFETTMLGGYVFNTPANKYSSGMIHLLVQVTISDRNYIVDAGFGRSLQMWEPLELVSGKDHPQVPAIFRLTEENETWYLDQIRREQYVPNQAFVNSDLLEKNKYRKIYSFTLEPRTIEDFESMNTYLQTSPASVFTSKSFCSLQTPEGVHCLVGCTLTYRRFSYKDNVDLVEFKSLKEEEIEDVLKTIFGISLEKKLVPKHGDRFFTI. The pIC50 is 4.0. (5) The compound is Nc1nc(NCCCS(=O)(=O)c2ccccc2)c(N=O)c(=O)[nH]1. The pIC50 is 4.7. The target protein (P0AC13) has sequence MKLFAQGTSLDLSHPHVMGILNVTPDSFSDGGTHNSLIDAVKHANLMINAGATIIDVGGESTRPGAAEVSVEEELQRVIPVVEAIAQRFEVWISVDTSKPEVIRESAKVGAHIINDIRSLSEPGALEAAAETGLPVCLMHMQGNPKTMQEAPKYDDVFAEVNRYFIEQIARCEQAGIAKEKLLLDPGFGFGKNLSHNYSLLARLAEFHHFNLPLLVGMSRKSMIGQLLNVGPSERLSGSLACAVIAAMQGAHIIRVHDVKETVEAMRVVEATLSAKENKRYE. (6) The drug is COC1C(OC(=O)c2ccc(C)[nH]2)C(O)C(Oc2ccc3c(O)c(NC(=O)c4ccc(O)c(CC=C(C)C)c4)c(=O)oc3c2Cl)OC1(C)C. The target protein sequence is MRVLVCGGAGYIGSHFVRALLRDTNHSVVIVDSLVGTHGKSDHVETRENVARKLQQSDGPKPPWADRYAALEVGDVRNEDFLNGVFTRHGPIDAVVHMCAFLAVGESVRDPLKYYDNNVVGILRLLQAMLLHKCDKIIFSSSAAIFGNPTMGSVSTNAEPIDINAKKSPESPYGESKLIAERMIRDCAEAYGIKGICLRYFNACGAHEDGDIGEHYQGSTHLIPIILGRVMSDIAPDQRLTIHEDASTDKRMPIFGTDYPTPDGTCVRDYVHVCDLASAHILALDYVEKLGPNDKSKYFSVFNLGTSRGYSVREVIEVARKTTGHPIPVRECGRREGDPAYLVAASDKAREVLGWKPKYDTLEAIMETSWKFQRTHPNGYASQENGTPGGRTTKL. The pIC50 is 5.3. (7) The drug is Cc1ccc2cc(C[C@@H](NC(=O)C3C[C@@H](O)CN3C(=O)c3cn(C)c4ccccc34)C(=O)N(C)Cc3ccccc3)ccc2c1. 
The target protein (P30547) has sequence MDNVLPVDSDLFPNISTNTSEPNQFVQPAWQIVLWAAAYTVIVVTSVVGNVVVMWIILAHKRMRTVTNYFLVNLAFAEASMAAFNTVVNFTYAVHNEWYYGLFYCKFHNFFPIAAVFASIYSMTAVAFDRYMAIIHPLQPRLSATATKVVICVIWVLALLLAFPQGYYSTTETMPGRVVCMIEWPSHPDKIYEKVYHICVTVLIYFLPLLVIGYAYTVVGITLWASEIPGDSSDRYHEQVSAKRKVVKMMIVVVCTFAICWLPFHIFFLLPYINPDLYLKKFIQQVYLAIMWLAMSSTMYNPIIYCCLNDRFRLGFKHAFRCCPFISAADYEGLEMKSTRYFQTQGSVYKVSRLETTISTVVGAHEEDPEEGPKATPSSLDLTSNGSSRSNSKTVTESSSFYSNMLS. The pIC50 is 7.6. (8) The small molecule is O=C1/C(=C/c2ccc(=O)n(O)c2)Oc2cccc(O)c21. The target protein (P14679) has sequence MLLAVLYCLLWSFQTSAGHFPRACVSSKNLMEKECCPPWSGDRSPCGQLSGRGSCQNILLSNAPLGPQFPFTGVDDRESWPSVFYNRTCQCSGNFMGFNCGNCKFGFWGPNCTERRLLVRRNIFDLSAPEKDKFFAYLTLAKHTISSDYVIPIGTYGQMKNGSTPMFNDINIYDLFVWMHYYVSMDALLGGSEIWRDIDFAHEAPAFLPWHRLFLLRWEQEIQKLTGDENFTIPYWDWRDAEKCDICTDEYMGGQHPTNPNLLSPASFFSSWQIVCSRLEEYNSHQSLCNGTPEGPLRRNPGNHDKSRTPRLPSSADVEFCLSLTQYESGSMDKAANFSFRNTLEGFASPLTGIADASQSSMHNALHIYMNGTMSQVQGSANDPIFLLHHAFVDSIFEQWLRRHRPLQEVYPEANAPIGHNRESYMVPFIPLYRNGDFFISSKDLGYDYSYLQDSDPDSFQDYIKSYLEQASRIWSWLLGAAMVGAVLTALLAGLVSLLC.... The pIC50 is 4.5. (9) The drug is Cn1ccc2c1C(c1ccccc1)N(c1cc(Cl)c(=O)n(C)c1)C2=O. The target protein sequence is NPPPPETSNPNKPKRQTNQLQYLLRVVLKTLWKHQFAWPFQQPVDAVKLNLPDYYKIIKTPMDMGTIKKRLENNYYWNAQECIQDFNTMFTNCYIYNKPGDDIVLMAEALEKLFLQKINELPTEETEIMIVQAKGRGRGRKETGTAKPGVSTVPNTTQASTPPQTQTPQPNPPPVQATPHPFPAVTPDLIVQTPVMTVVPPQPLQTPPPVPPQPQPPPAPAPQPVQSHPPIIAATPQPVKTKKGVKRKADTTTPTTIDPIHEPPSLPPEPKTTKLGQRRESSRPVKPPKKDVPDSQQHPAPEKSSKVSEQLKCCSGILKEMFAKKHAAYAWPFYKPVDVEALGLHDYCDIIKHPMDMSTIKSKLEAREYRDAQEFGADVRLMFSNCYKYNPPDHEVVAMARKLQDVFEMRFAKMPDEPEEPVVAVSSPAVPPPT. The pIC50 is 6.4. (10) The drug is CN1CC(CC(C)(C)O)=Nc2c1nc(N)[nH]c2=O. 
The target protein sequence is MCSLKWDYDLRCGEYTLNLNEKTLIMGILNVTPDSFSDGGSYNEVDAAVRHAKEMRDEGAHIIDIGGESTRPGFAKVSVEEEIKRVVPMIQAVSKEVKLPISIDTYKAEVAKQAIEAGAHIINDIWGAKAEPKIAEVAAHYDVPIILMHNRDNMNYRNLMADMIADLYDSIKIAKDAGVRDENIILDPGIGFAKTPEQNLEAMRNLEQLNVLGYPVLLGTSRKSFIGHVLDLPVEERLEGTGATVCLGIEKGCEFVRVHDVKEMSRMAKMMDAMIGKGVK. The pIC50 is 3.7.
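
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset or of any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10**(9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (10) in the record above.
    for label in (7.5, 3.7):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.1f} nM")
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50, so the label 7.5 in example (1) denotes an IC50 of roughly 32 nM, while the label 4.0 in example (2) denotes 100 µM.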