Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)=CCOC(=O)c1ccc(O)cc1. The target protein (Q00024) has sequence MSHLLVSPLGGGVQPRLEINNFVKNDRQFSLYVQALDRMYATPQNETASYFQVAGVHGYPLIPFDDAVGPTEFSPFDQWTGYCTHGSTLFPTWHRPYVLILEQILSGHAQQIADTYTVNKSEWKKAATEFRHPYWDWASNSVPPPEVISLPKVTITTPNGQKTSVANPLMRYTFNSVNDGGFYGPYNQWDTTLRQPDSTGVNAKDNVNRLKSVLKNAQASLTRATYDMFNRVTTWPHFSSHTPASGGSTSNSIEAIHDNIHVLVGGNGHMSDPSVAPFDPIFFLHHANVDRLIALWSAIRYDVWTSPGDAQFGTYTLRYKQSVDESTDLAPWWKTQNEYWKSNELRSTESLGYTYPEFVGLDMYNKDAVNKTISRKVAQLYGPQRGGQRSLVEDLSNSHARRSQRPAKRSRLGQLLKGLFSDWSAQIKFNRHEVGQSFSVCLFLGNVPEDPREWLVSPNLVGARHAFVRSVKTDHVAEEIGFIPINQWIAEHTGLPSFAV.... The pIC50 is 4.8. (2) The drug is CCCNC(=O)[C@@H](NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCCN1C(=O)[C@H](CCCCN)NC(=O)CCC1CC2CCC1C2)[C@@H](C)O. The target protein (P79483) has sequence MVCLKLPGGSSLAALTVTLMVLSSRLAFAGDTRPRFLELRKSECHFFNGTERVRYLDRYFHNQEEFLRFDSDVGEYRAVTELGRPVAESWNSQKDLLEQKRGRVDNYCRHNYGVGESFTVQRRVHPQVTVYPAKTQPLQHHNLLVCSVSGFYPGSIEVRWFRNGQEEKAGVVSTGLIQNGDWTFQTLVMLETVPRSGEVYTCQVEHPSVTSALTVEWRARSESAQSKMLSGVGGFVLGLLFLGAGLFIYFRNQKGHSGLQPTGFLS. The pIC50 is 7.1. (3) The drug is COc1ccc(CCNC(=O)Cn2c(=O)n(-c3ccc(F)cc3)c(=O)c3sccc32)cc1OC. The target protein sequence is AAPYLKTKFICVTPTTCSNTIDLPMSPRTLDSLMQFGNGEGAEPSAGGQF. The pIC50 is 4.2. (4) The drug is NCc1ccnc(Cl)c1. 
The target protein (B5DF27) has sequence MEIPFGSCLYSCLALLVLLPSLSLAQYESWPYQLQYPEYFQQPPPEHHQHQVPSDVVKIQVRLAGQKRKHNEGRVEVYYEGQWGTVCDDDFSIHAAHVVCREVGYVEAKSWTASSSYGPGEGPIWLDNIYCTGKESTLAACSSNGWGVTDCKHPEDVGVVCSEKRIPGFKFDNSLINQIESLNIQVEDIRIRPILSAFRHRKPVTEGYVEVKEGKAWKQICDKHWTAKNSHVVCGMFGFPAEKTYNPKAYKTFASRRKLRYWKFSMNCTGTEAHISSCKLGPPMFRDPVKNATCENGQPAVVSCVPSQIFSPDGPSRFRKAYKPEQPLVRLRGGAQVGEGRVEVLKNGEWGTVCDDKWDLVSASVVCRELGFGTAKEAVTGSRLGQGIGPIHLNEVQCTGTEKSIIDCKLNTESQGCNHEEDAGVRCNIPIMGFQKKVRLNGGRNPYEGRVEVLTERNGSLVWGNVCGQNWGIVEAMVVCRQLGLGFASNAFQETWYWHG.... The pIC50 is 6.9. (5) The compound is CC(COc1ccccc1)N(CCCl)Cc1ccccc1. The target protein (O75751) has sequence MPSFDEALQRVGEFGRFQRRVFLLLCLTGVTFAFLFVGVVFLGTQPDHYWCRGPSAAALAERCGWSPEEEWNRTAPASRGPEPPERRGRCQRYLLEAANDSASATSALSCADPLAAFPNRSAPLVPCRGGWRYAQAHSTIVSEFDLVCVNAWMLDLTQAILNLGFLTGAFTLGYAADRYGRIVIYLLSCLGVGVTGVVVAFAPNFPVFVIFRFLQGVFGKGTWMTCYVIVTEIVGSKQRRIVGIVIQMFFTLGIIILPGIAYFIPNWQGIQLAITLPSFLFLLYYWVVPESPRWLITRKKGDKALQILRRIAKCNGKYLSSNYSEITVTDEEVSNPSFLDLVRTPQMRKCTLILMFAWFTSAVVYQGLVMRLGIIGGNLYIDFFISGVVELPGALLILLTIERLGRRLPFAASNIVAGVACLVTAFLPEGIAWLRTTVATLGRLGITMAFEIVYLVNSELYPTTLRNFGVSLCSGLCDFGGIIAPFLLFRLAAVWLELPL.... The pIC50 is 5.2. (6) The small molecule is COc1cc(Cc2ccccc2)c(O)cc1-c1ccnc2[nH]c3ccc(-c4ccc(N5CCN(C)CC5)cc4)cc3c12. The target protein sequence is MGAIGLLWLLPLLLSTAAVGSGMGTGQRAGSPAAGPPLQPREPLSYSRLQRKSLAVDFVVPSLFRVYARDLLLPPSSSELKAGRPEARGSLALDCAPLLRLLGPAPGVSWTAGSPAPAEARTLSRVLKGGSVRKLRRAKQLVLELGEEAILEGCVGPPGEAAVGLLQFNLSELFSWWIRQGEGRLRIRLMPEKKASEVGREGRLSAAIRASQPRLLFQIFGTGHSSLESPTNMPSPSPDYFTWNLTWIMKDSFPFLSHRSRYGLECSFDFPCELEYSPPLHDLRNQSWSWRRIPSEEASQMDLLDGPGAERSKEMPRGSFLLLNTSADSKHTILSPWMRSSSEHCTLAVSVHRHLQPSGRYIAQLLPHNEAAREILLMPTPGKHGWTVLQGRIGRPDNPFRVALEYISSGNRSLSAVDFFALKNCSEGTSPGSKMALQSSFTCWNGTVLQLGQACDFHQDCAQGEDESQMCRKLPVGFYCNFEDGFCGWTQGTLSPHTPQ.... The pIC50 is 5.9. (7) The small molecule is CCc1n[nH]c2ccc(C(=O)N3CCC4(CC3)CC(=O)c3nn(CC)c(C)c3O4)cc12. 
The target protein (P11497) has sequence MDEPSPLAKTLELNQHSRFIIGSVSEDNSEDEISNLVKLDLEEKEGSLSPASVSSDTLSDLGISALQDGLAFHMRSSMSGLHLVKQGRDRKKIDSQRDFTVASPAEFVTRFGGNKVIEKVLIANNGIAAVKCMRSIRRWSYEMFRNERAIRFVVMVTPEDLKANAEYIKMADHYVPVPGGANNNNYANVELILDIAKRIPVQAVWAGWGHASENPKLPELLLKNGIAFMGPPSQAMWALGDKIASSIVAQTAGIPTLPWSGSGLRVDWQENDFSKRILNVPQDLYEKGYVKDVDDGLKAAEEVGYPVMIKASEGGGGKGIRKVNNADDFPNLFRQVQAEVPGSPIFVMRLAKQSRHLEVQILADQYGNAISLFGRDCSVQRRHQKIIEEAPAAIATPAVFEHMEQCAVKLAKMVGYVSAGTVEYLYSQDGSFYFLELNPRLQVEHPCTEMVADVNLPAAQLQIAMGIPLFRIKDIRMMYGVSPWGDAPIDFENSAHVPCP.... The pIC50 is 7.4. (8) The compound is Cc1ccc(NC(=O)c2cccc(C(F)F)c2)cc1-c1cc(N2CCOCC2)nc(N2CC(O)C2)c1. The target protein sequence is MEHIQGAWKTISNGFGFKDAVFDGSSCISPTIVQQFGYQRRASDDGKLTDPSKTSNTIRVFLPNKQRTVVNVRNGMSLHDCLMKALKVRGLQPECCAVFRLLHEHKGKKARLDWNTDAASLIGEELQVDFLDHVPLTTHNFARKTFLKLAFCDICQKFLLNGFRCQTCGYKFHEHCSTKVPTMCVDWSNIRQLLLFPNSTIGDSGVPALPSLTMRRMRESVSRMPVSSQHRYSTPHAFTFNTSSPSSEGSLSQRQRSTSTPNVHMVSTTLPVDSRMIEDAIRSHSESASPSALSSSPNNLSPTGWSQPKTPVPAQRERAPVSGTQEKNKIRPRGQRDSSEEWEIEASEVMLSTRIGSGSFGTVYKGKWHGDVAVKILKVVDPTPEQFQAFRNEVAVLRKTRHVNILLFMGYMTKDNLAIVTQWCEGSSLYKHLHVQETKFQMFQLIDIARQTAQGMDYLHAKNIIHRDMKSNNIFLHEGLTVKIGDFGLATVKSRWSGSQ.... The pIC50 is 9.2.
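The pIC50 definition stated at the top of this record set (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration only: the function names are hypothetical and are not part of BindingDB or of any particular library.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the records above, converted back to molar IC50.
    for label in (4.8, 7.1, 9.2):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50_molar(label):.3e} M")
```

On this scale each unit of pIC50 is a tenfold change in IC50, so the 9.2 label in record (8) corresponds to a sub-nanomolar IC50, while the 4.8 label in record (1) corresponds to roughly 16 micromolar.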