This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C(O)CC/C=C\CC1COC(c2ccco2)OC1c1cccnc1. The target protein (P24557) has sequence MEALGFLKLEVNGPMVTVALSVALLALLKWYSTSAFSRLEKLGLRHPKPSPFIGNLTFFRQGFWESQMELRKLYGPLCGYYLGRRMFIVISEPDMIKQVLVENFSNFTNRMASGLEFKSVADSVLFLRDKRWEEVRGALMSAFSPEKLNEMVPLISQACDLLLAHLKRYAESGDAFDIQRCYCNYTTDVVASVAFGTPVDSWQAPEDPFVKHCKRFFEFCIPRPILVLLLSFPSIMVPLARILPNKNRDELNGFFNKLIRNVIALRDQQAAEERRRDFLQMVLDARHSASPMGVQDFDIVRDVFSSTGCKPNPSRQHQPSPMARPLTVDEIVGQAFIFLIAGYEIITNTLSFATYLLATNPDCQEKLLREVDVFKEKHMAPEFCSLEEGLPYLDMVIAETLRMYPPAFRFTREAAQDCEVLGQRIPAGAVLEMAVGALHHDPEHWPSPETFNPERFTAEARQQHRPFTYLPFGAGPRSCLGVRLGLLEVKLTLLHVLHKF.... The pIC50 is 6.8. (2) The small molecule is CC(C)(O/N=C(\C(=O)N[C@H]1CON(C2(C(=O)O)C[C@H](N3C(=O)c4cc(O)c(O)cc4C3=O)C(=O)O2)C1=O)c1csc(N)n1)C(=O)O. The target protein (Q07806) has sequence MRLLKFLWWTCVTLICGVLLSFSGAYLYLSPSLPSVEALRNVQLQIPLKVYSEDGKLISEFGEMRRTPIRFADIPQDFIHALLSAEDDNFANHYGVDVKSLMRAAAQLLKSGHIQTGGSTITMQVAKNYFLTNERSFSRKINEILLALQIERQLTKDEILELYVNKIYLGNRAYGIEAAAQVYYGKPIKDLSLAEMAMIAGLPKAPSRYNPLVNPTRSTERRNWILERMLKLGFIDQQRYQAAVEEPINASYHVQTPELNAPYIAEMARAEMVGRYGSEAYTEGYKVITTVRSDLQNAASQSVRDGLIDYDQRHGYRGPETRLPGQTRDAWLKHLGQQRSIGGLEPAIVTQVEKSGIMVMTRDGKEEAVTWDSMKWARPFLSNNSMGPMPRQPADVAQAGDQIRVQRQEDGTLRFVQIPAAQSALISLDPKDGAIRSLVGGFSFEQSNYNRAIQAKRQPGSSFKPFIYSAALDNGFTAASLVNDAPIVFVDEYLDKVWRP.... The pIC50 is 7.5. (3) The drug is Cc1c(Oc2ccc(Cl)cc2O)c(=O)oc2ccccc12. 
The target protein sequence is MSVLHRFYLFFLFTKFFHCYKISYVLKNAKLAPNHAIKNINSLNLLSENKKENYYYCGENKVALVTGAGRGIGREIAKMLAKSVSHVICISRTQKSCDSVVDEIKSFGYESSGYAGDVSKKEEISEVINKILTEHKNVDILVNNAGITRDNLFLRMKNDEWEDVLRTNLNSLFYITQPISKRMINNRYGRIINISSIVGLTGNVGQANYSSSKAGVIGFTKSLAKELASRNITVNAIAPGFISSDMTDKISEQIKKNIISNIPAGRMGTPEEVANLACFLSSDKSGYINGRVFVIDGGLSP. The pIC50 is 4.0. (4) The small molecule is OC[C@H]1O[C@H](CNC2CCCCCC2)[C@@H](O)[C@@H]1O. The target protein (Q27686) has sequence MSQLAHNLTLSIFDPVANYRAARIICTIGPSTQSVEALKGLIQSGMSVARMNFSHGSHEYHQTTINNVRQAAAELGVNIAIALDTKGPEIRTGQFVGGDAVMERGATCYVTTDPAFADKGTKDKFYIDYQNLSKVVRPGNYIYIDDGILILQVQSHEDEQTLECTVTNSHTISDRRGVNLPGCDVDLPAVSAKDRVDLQFGVEQGVDMIFASFIRSAEQVGDVRKALGPKGRDIMIICKIENHQGVQNIDSIIEESDGIMVARGDLGVEIPAEKVVVAQKILISKCNVAGKPVICATQMLESMTYNPRPTRAEVSDVANAVFNGADCVMLSGETAKGKYPNEVVQYMARICLEAQSALNEYVFFNSIKKLQHIPMSADEAVCSSAVNSVYETKAKAMVVLSNTGRSARLVAKYRPNCPIVCVTTRLQTCRQLNITQGVESVFFDADKLGHDEGKEHRVAAGVEFAKSKGYVQTGDYCVVIHADHKVKGYANQTRILLVE. The pIC50 is 4.2. (5) The target protein (P08253) has sequence MEALMARGALTGPLRALCLLGCLLSHAAAAPSPIIKFPGDVAPKTDKELAVQYLNTFYGCPKESCNLFVLKDTLKKMQKFFGLPQTGDLDQNTIETMRKPRCGNPDVANYNFFPRKPKWDKNQITYRIIGYTPDLDPETVDDAFARAFQVWSDVTPLRFSRIHDGEADIMINFGRWEHGDGYPFDGKDGLLAHAFAPGTGVGGDSHFDDDELWTLGEGQVVRVKYGNADGEYCKFPFLFNGKEYNSCTDTGRSDGFLWCSTTYNFEKDGKYGFCPHEALFTMGGNAEGQPCKFPFRFQGTSYDSCTTEGRTDGYRWCGTTEDYDRDKKYGFCPETAMSTVGGNSEGAPCVFPFTFLGNKYESCTSAGRSDGKMWCATTANYDDDRKWGFCPDQGYSLFLVAAHEFGHAMGLEHSQDPGALMAPIYTYTKNFRLSQDDIKGIQELYGASPDIDLGTGPTPTLGPVTPEICKQDIVFDGIAQIRGEIFFFKDRFIWRTVTPR.... The pIC50 is 4.0. The small molecule is COc1cccc(COC(=O)c2cccc(=S)[nH]2)c1. (6) The compound is CCCC[C@H]1C(=O)N(C)[C@@H](CCCC)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)NCC(N)=O)CSCC(=O)N[C@@H](Cc2ccc(O)cc2)C(=O)N(C)[C@@H](C)C(=O)N[C@H]2CC(=O)NCCC[C@H](NC(=O)[C@H](Cc3cnc[nH]3)NC(=O)[C@@H]3CCCN3C2=O)C(=O)N2C[C@H](O)C[C@H]2C(=O)N[C@@H](Cc2c[nH]c3ccccc23)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc2c[nH]c3ccccc23)C(=O)N1C. 
The target protein sequence is FTVTVPKDLYVVEYGSNMTIECKFPVEKQLDLAALIVYWEMEDKNIIQFVHGEEDLKVQHSSYRQRARLLKDQLSLGNAALQITDVKLQDAGVYRCMISYGGADYKRITVKVNAPYNKINQRILVVDPVTSEHELTCQAEGYPKAEVIWTSSDHQVLSGKTTTTNSKREEKLFNVTSTLRINTTTNEIFYCTFRRLDPEENHTAELVIPELPLAHPPNERTGSSETVRFQGHHHHHH. The pIC50 is 7.3. (7) The drug is N[C@H](C(=O)O)C(NS(=O)(=O)c1ccc(Cl)cc1)C(=O)O. The target protein (P43005) has sequence MGKPARKGCEWKRFLKNNWVLLSTVAAVVLGITTGVLVREHSNLSTLEKFYFAFPGEILMRMLKLIILPLIISSMITGVAALDSNVSGKIGLRAVVYYFCTTLIAVILGIVLVVSIKPGVTQKVGEIARTGSTPEVSTVDAMLDLIRNMFPENLVQACFQQYKTKREEVKPPSDPEMNMTEESFTAVMTTAISKNKTKEYKIVGMYSDGINVLGLIVFCLVFGLVIGKMGEKGQILVDFFNALSDATMKIVQIIMCYMPLGILFLIAGKIIEVEDWEIFRKLGLYMATVLTGLAIHSIVILPLIYFIVVRKNPFRFAMGMAQALLTALMISSSSATLPVTFRCAEENNQVDKRITRFVLPVGATINMDGTALYEAVAAVFIAQLNDLDLGIGQIITISITATSASIGAAGVPQAGLVTMVIVLSAVGLPAEDVTLIIAVDWLLDRFRTMVNVLGDAFGTGIVEKLSKKELEQMDVSSEVNIVNPFALESTILDNEDSDTK.... The pIC50 is 6.0. (8) The drug is CN(C)c1ccc2cc(C(=O)N[C@@H](CCCNC(=N)CCl)C(=O)NCc3ccc(NC(=O)CCCNC(=O)CCCCC4SCC5NC(=O)NC54)cc3)ccc2c1. The target protein (Q6TGC4) has sequence MVSVEGRAMSFQSIIHLSLDSPVHAVCVLGTEICLDLSGCAPQKCQCFTIHGSGRVLIDVANTVISEKEDATIWWPLSDPTYATVKMTSPSPSVDADKVSVTYYGPNEDAPVGTAVLYLTGIEVSLEVDIYRNGQVEMSSDKQAKKKWIWGPSGWGAILLVNCNPADVGQQLEDKKTKKVIFSEEITNLSQMTLNVQGPSCILKKYRLVLHTSKEESKKARVYWPQKDNSSTFELVLGPDQHAYTLALLGNHLKETFYVEAIAFPSAEFSGLISYSVSLVEESQDPSIPETVLYKDTVVFRVAPCVFIPCTQVPLEVYLCRELQLQGFVDTVTKLSEKSNSQVASVYEDPNRLGRWLQDEMAFCYTQAPHKTTSLILDTPQAADLDEFPMKYSLSPGIGYMIQDTEDHKVASMDSIGNLMVSPPVKVQGKEYPLGRVLIGSSFYPSAEGRAMSKTLRDFLYAQQVQAPVELYSDWLMTGHVDEFMCFIPTDDKNEGKKGF.... The pIC50 is 5.5.
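The pIC50 definition used above (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration, and the choice of nanomolar as the input unit is an assumption; neither is part of the dataset.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in M).

    Assumes the input is in nM (an illustrative choice, not something
    stated by the dataset).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover IC50 in nanomolar from a pIC50 value."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # A pIC50 of 6.8, as in example (1), corresponds to roughly 158 nM.
    print(round(pic50_to_ic50_nm(6.8), 1))
    # A tenfold drop in IC50 raises pIC50 by one unit.
    print(ic50_nm_to_pic50(1000.0), ic50_nm_to_pic50(100.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean a more potent compound.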