Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=[N+]([O-])c1ccc(CNc2nc(NCC3CCCCC3)c3ncn(CCO)c3n2)cc1. The target protein (P49891) has sequence METSMPEYYEVFGEFRGVLMDKRFTKYWEDVEMFLARPDDLVIATYPKSGTTWISEVVYMIYKEGDVEKCKEDAIFNRIPYLECRNEDLINGIKQLKEKESPRIVKTHLPPKLLPASFWEKNCKMIYLCRNAKDVAVSYYYFLLMITSYPNPKSFSEFVEKFMQGQVPYGSWYDHVKAWWEKSKNSRVLFMFYEDMKEDIRREVVKLIEFLERKPSAELVDRIIQHTSFQEMKNNPSTNYTMMPEEMMNQKVSPFMRKGIIGDWKNHFPEALRERFDEHYKQQMKDCTVKFRMEL. The pIC50 is 3.7. (2) The compound is COc1ccc2c(c1)C=C1Cn3c-2c(C2CCCCC2)c2ccc(cc23)C(=O)NS(=O)(=O)N(C)C/C=C/CNC1=O. The target protein sequence is SMSYTWTGALITPCAAEESKLPINPLSNSLLRHHNMVYATTSRSASLRQKKVTFDRLQVLDDHYRDVLKEMKAKASTVKAKLLSIEEACKLTPPHSAKSKFGYGAKDVRNLSSRAVNHIRSVWEDLLEDTETPIDTTIMAKSEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGSSYGFQYSPKQRVEFLVNTWKSKKCPMGFSYDTRCFDSTVTESDIRVEESIYQCCDLAPEARQAIRSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKATAACRAAKLQDCTMLVNGDDLVVICESAGTQEDAAALRAFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDASGKRVYYLTRDPTTPLARAAWETARHTPINSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQIIERLHGLSAFTLHSYSPGEINRVASCLRKLGVPPLRTW.... The pIC50 is 6.7. (3) The small molecule is O=C1c2ccc([N+](=O)[O-])cc2C(=O)N1CCNc1ccccc1. The pIC50 is 3.7. 
The target protein (Q02401) has sequence MELPWTALFLSTVLLGLSCQGSDWESDRNFISAAGPLTNDLVLNLNYPPGKQGSDVVSGNTDHLLCQQPLPSFLSQYFSSLRASQVTHYKVLLSWAQLLPTGSSKNPDQEAVQCYRQLLQSLKDAQLEPMVVLCHQTPPTSSAIQREGAFADLFADYATLAFQSFGDLVEIWFTFSDLEKVIMDLPHKDLKASALQTLSNAHRRAFEIYHRKFSSQGGKLSVVLKAEDIPELLPDPALAALVQGSVDFLSLDLSYECQSVATLPQKLSELQNLEPKVKVFIYTLKLEDCPATGTSPSSLLISLLEAINKDQIQTVGFDVNAFLSCTSNSEESPSCSLTDSLALQTEQQQETAVPSSPGSAYQRVWAAFANQSREERDAFLQDVFPEGFLWGISTGAFNVEGGWAEGGRGPSIWDHYGNLNAAEGQATAKVASDSYHKPASDVALLRGIRAQVYKFSISWSGLFPLGQKSTPNRQGVAYYNKLIDRLLDSHIEPMATLFHW.... (4) The compound is Cc1cc(C)cc(-c2cccc3c2N[C@H](C)CC(=O)N3)c1. The target protein (Q92793) has sequence MAENLLDGPPNPKRAKLSSPGFSANDSTDFGSLFDLENDLPDELIPNGGELGLLNSGNLVPDAASKHKQLSELLRGGSGSSINPGIGNVSASSPVQQGLGGQAQGQPNSANMASLSAMGKSPLSQGDSSAPSLPKQAASTSGPTPAASQALNPQAQKQVGLATSSPATSQTGPGICMNANFNQTHPGLLNSNSGHSLINQASQGQAQVMNGSLGAAGRGRGAGMPYPTPAMQGASSSVLAETLTQVSPQMTGHAGLNTAQAGGMAKMGITGNTSPFGQPFSQAGGQPMGATGVNPQLASKQSMVNSLPTFPTDIKNTSVTNVPNMSQMQTSVGIVPTQAIATGPTADPEKRKLIQQQLVLLLHAHKCQRREQANGEVRACSLPHCRTMKNVLNHMTHCQAGKACQVAHCASSRQIISHWKNCTRHDCPVCLPLKNASDKRNQQTILGSPASGIQNTIGSVGTGQQNATSLSNPNPIDPSSMQRAYAALGLPYMNQPQTQL.... The pIC50 is 6.4. (5) The small molecule is CN[C@H](C)C(=O)Nc1cc(-c2ccccc2F)cc(C#Cc2ccccc2)n1. The target protein sequence is TLRFSISNLSMQTHAARMRTFMYWPSSVPVQPEQLASAGFYYVGRNDDVKCFCCDGGLRCWESGDDPWVEHAKWFPRCEFLIRMKGQEFVDEIQGRYPHLLEQLLSTS. The pIC50 is 5.4. (6) The target protein (Q13522) has sequence MEQDNSPRKIQFTVPLLEPHLDPEAAEQIRRRRPTPATLVLTSDQSSPEIDEDRIPNPHLKSTLAMSPRQRKKMTRITPTMKELQMMVEHHLGQQQQGEEPEGAAESTETQESRPPGIPDTEVESRLGTSGTAKKTAECIPKTHERGSKEPSTKEPSTHIPPLDSKGANSV. The compound is COC[C@@H](C(O)[C@H](O)C(=O)NCCC(C)c1nc(/C=C/CC2OC3(C[C@@H](O)[C@@H]2C)O[C@H]([C@H](C[C@H](O)C(C)[C@H](O)C(C)/C=C(C)/C(C)=C/C=C/C(C)=C/C#N)OC)[C@H](OP(=O)(O)O)C3(C)C)co1)N(C)C. The pIC50 is 6.7. (7) The compound is c1cc(Nc2ncc3c4sncc4n(C4CCCC4)c3n2)ncc1N1CCNCC1. 
The target protein (Q00534) has sequence MEKDGLCRADQQYECVAEIGEGAYGKVFKARDLKNGGRFVALKRVRVQTGEEGMPLSTIREVAVLRHLETFEHPNVVRLFDVCTVSRTDRETKLTLVFEHVDQDLTTYLDKVPEPGVPTETIKDMMFQLLRGLDFLHSHRVVHRDLKPQNILVTSSGQIKLADFGLARIYSFQMALTSVVVTLWYRAPEVLLQSSYATPVDLWSVGCIFAEMFRRKPLFRGSSDVDQLGKILDVIGLPGEEDWPRDVALPRQAFHSKSAQPIEKFVTDIDELGKDLLLKCLTFNPAKRISAYSALSHPYFQDLERCKENLDSHLPPSQNTSELNTA. The pIC50 is 5.0. (8) The compound is O=C1[C@@H](N2CC[C@H](c3cc(Cl)cc(Cl)c3)C2)CCN1c1ccc(S(=O)(=O)Nc2nccs2)cc1. The target protein (P04774) has sequence MEQTVLVPPGPDSFNFFTRESLAAIERRIAEEKAKNPKPDKKDDDENGPKPNSDLEAGKNLPFIYGDIPPEMVSEPLEDLDPYYINKKTFIVLNKGKAIFRFSATSALYILTPFNPLRKIAIKILVHSLFSMLIMCTILTNCVFMTMSNPPDWTKNVEYTFTGIYTFESLIKIIARGFCLEDFTFLRDPWNWLDFTVITFAYVTEFVDLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLRNKCVQWPPTNASLEEHSIEKNVTTDYNGTLVNETVFEFDWKSYIQDSRYHYFLEGVLDALLCGNSSDAGQCPEGYMCVKAGRNPNYGYTSFDTFSWAFLSLFRLMTQDFWENLYQLTLRAAGKTYMIFFVLVIFLGSFYLINLILAVVAMAYEEQNQATLEEAEQKEAEFQQMLEQLKKQQEAAQQAAAATASEHSREPSAAGRLSDSSSEASKLSSKSAKERRNR.... The pIC50 is 5.7. (9) The small molecule is COc1cc(-c2ccc(=O)[nH]n2)ccc1OC(F)F. The target protein sequence is DVLSYHATCSKAEVDKFKAANIPLVSELAIDDIHFDDFSLDVDAMITAALRMFMELGMVQKFKIDYETLCRWLLTVRKNYRMVLYHNWRHAFNVCQLMFAMLTTAGFQDILTEVEILAVIVGCLCHDLDHRGTNNAFQAKSGSALAQLYGTSATLEHHHFNHAVMILQSEGHNIFANLSSKEYSDLMQLLKQSILATDLTLYFERRTEFFELVSKGEYDWNIKNHRDIFRSMLMTACDLGAVTKPWEISRQVAELVTSEFFEQGDRERLELKLTPSAIFDRNRKDELPRLQLEWIDSICMPLYQALVKVNVKLKPMLDSVATNRSKWEELHQKRLLASTASSSPASVMVAKEDRN. The pIC50 is 3.9. (10) The drug is O=C(O)Cn1c2c(c3cc(F)ccc31)CCN(Cc1ccc(Br)cc1F)C2=S. The target protein (O60218) has sequence MATFVELSTKAKMPIVGLGTWKSPLGKVKEAVKVAIDAGYRHIDCAYVYQNEHEVGEAIQEKIQEKAVKREDLFIVSKLWPTFFERPLVRKAFEKTLKDLKLSYLDVYLIHWPQGFKSGDDLFPKDDKGNAIGGKATFLDAWEAMEELVDEGLVKALGVSNFSHFQIEKLLNKPGLKYKPVTNQVECHPYLTQEKLIQYCHSKGITVTAYSPLGSPDRPWAKPEDPSLLEDPKIKEIAAKHKKTAAQVLIRFHIQRNVIVIPKSVTPARIVENIQVFDFKLSDEEMATILSFNRNWRACNVLQSSHLEDYPFNAEY. The pIC50 is 6.5.
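The task description defines the label as pIC50 = -log10(IC50 in M). A minimal sketch of that conversion follows, assuming the raw measurement is available as a molar or nanomolar IC50. The function names are hypothetical and are not part of BindingDB or the bindingdb_ic50 dataset.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def nanomolar_to_pic50(ic50_nm: float) -> float:
    """Convenience wrapper (assumption: the raw value is reported in nM)."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Show the molar IC50 implied by a few of the labels quoted above.
    for label in (3.7, 5.0, 6.7):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3e} M")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: a pIC50 of 6.0 corresponds to an IC50 of 1 µM, and higher values mean a more potent compound.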