Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC(C)[C@H](NS(=O)(=O)c1ccc(-c2ccc(NC(=O)c3cc4cc([N+](=O)[O-])ccc4o3)cc2)cc1)C(=O)O. The target protein (O77656) has sequence MHPRVLAGFLFFSWTACWSLPLPSDGDSEDLSEEDFQFAESYLKSYYYPQNPAGILKKTAASSVIDRLREMQSFFGLEVTGRLDDNTLDIMKKPRCGVPDVGEYNVFPRTLKWSKMNLTYRIVNYTPDLTHSEVEKAFRKAFKVWSDVTPLNFTRIHNGTADIMISFGTKEHGDFYPFDGPSGLLAHAFPPGPNYGGDAHFDDDETWTSSSKGYNLFLVAAHEFGHSLGLDHSKDPGALMFPIYTYTGKSHFMLPDDDVQGIQSLYGPGDEDPYSKHPKTPDKCDPSLSLDAITSLRGETLIFKDRFFWRLHPQQVEAELFLTKSFGPELPNRIDAAYEHPSHDLIFIFRGRKFWALSGYDILEDYPKKISELGFPKHVKKISAALHFEDSGKTLFFSENQVWSYDDTNHVMDKDYPRLIEEVFPGIGDKVDAVYQKNGYIYFFNGPIQFEYSIWSNRIVRVMTTNSLLWC. The pIC50 is 9.1. (2) The drug is O=C1CCNC(=O)CCc2ccc(c(Br)c2)Oc2cc(cc(Br)c2O)CCN1. The target protein (P21817) has sequence MGDAEGEDEVQFLRTDDEVVLQCSATVLKEQLKLCLAAEGFGNRLCFLEPTSNAQNVPPDLAICCFVLEQSLSVRALQEMLANTVEAGVESSQGGGHRTLLYGHAILLRHAHSRMYLSCLTTSRSMTDKLAFDVGLQEDATGEACWWTMHPASKQRSEGEKVRVGDDIILVSVSSERYLHLSTASGELQVDASFMQTLWNMNPICSRCEEGFVTGGHVLRLFHGHMDECLTISPADSDDQRRLVYYEGGAVCTHARSLWRLEPLRISWSGSHLRWGQPLRVRHVTTGQYLALTEDQGLVVVDASKAHTKATSFCFRISKEKLDVAPKRDVEGMGPPEIKYGESLCFVQHVASGLWLTYAAPDPKALRLGVLKKKAMLHQEGHMDDALSLTRCQQEESQAARMIHSTNGLYNQFIKSLDSFSGKPRGSGPPAGTALPIEGVILSLQDLIIYFEPPSEDLQHEEKQSKLRSLRNRQSLFQEEGMLSMVLNCIDRLNVYTTAA.... The pIC50 is 5.2. (3) The small molecule is O=c1oc2c(O)c(O)cc3c(=O)oc4c(O)c(O)cc1c4c23. 
The target protein (Q07014) has sequence MGCIKSKRKDNLNDDGVDMKTQPVRNTDRTIYVRDPTSNKQQRPVPESQLLPGQRFQAKDPEEQGDIVVALYPYDGIHPDDLSFKKGEKMKVLEEHGEWWKAKSLSSKREGFIPSNYVAKVNTLETEEWFFKDITRKDAERQLLAPGNSAGAFLIRESETLKGSFSLSVRDYDPMHGDVIKHYKIRSLDNGGYYISPRITFPCISDMIKHYQKQSDGLCRRLEKACISPKPQKPWDKDAWEIPRESIKLVKKLGAGQFGEVWMGYYNNSTKVAVKTLKPGTMSAQAFLEEANLMKTLQHDKLVRLYAVVTKEEPIYIITEFMAKGSLLDFLKSDEGSKVLLPKLIDFSAQIAEGMAYIERKNYIHRDLRAANVLVSESLMCKIADFGLARVIEDNEYTAREGAKFPIKWTAPEAINFGCFTIKSDVWSFGILLYEIVTYGKIPYPGRTNADVMTALSQGYRMPRMENCPDELYDIMKMCWKESAEERPTFDYLQSVLDDF.... The pIC50 is 5.5. (4) The small molecule is C[C@]1(/C=C/C#N)[C@H](C(=O)[O-])N2C(=O)C[C@H]2S1(=O)=O. The target protein sequence is MLKRLKEKSNDEIVQNTINKRINFIFGVIVFIFAVLVLRLGYLQIAQGSHYKQIIKNDENITVNESVPRGRILDRNGKVLVDNASKMAITYTRGRKTTQSEMLDTAEKLSKLIKMDTKKITERDKKDFWIQLHPKKAKAMMTKEQAMLADGSIKQDQYDKQLLSKIGKSQLDELSSKDLQVLAIFREMNAGTVLDPQMIKNEDVSEKEYAAVSQQLSKLPGVNTSMDWDRKYPYGDTLRGIFGDVSTPAEGIPKELTEHYLSKGYSRNDRVGKSYLEYQYEDVLRGKKKEMKYTTDKSGKVTSSEVLNPGARGQDLKLTIDIDLQKEVEALLDKQIKKLRSQGAKDMDNAMMVVQNPKNGDILALAGKQINKSGKMTDYDIGTFTSQFAVGSSVKGGTLLAGYQNKAIKVGETMVDEPLHFQGGLTKRSYFNKNGHVTINDKQALMHSSNVYMFKTALKLAGDPYYSGMALPSDISSPAQKLRRGLNQVGLGVKTGIDLP.... The pIC50 is 3.5. (5) The drug is Nc1nc(N)c2c(CCCc3csc(C(=O)N[C@@H](CCC(=O)O)C(=O)O)c3)coc2n1. The target protein (P41440) has sequence MVPSSPAVEKQVPVEPGPDPELRSWRHLVCYLCFYGFMAQIRPGESFITPYLLGPDKNFTREQVTNEITPVLSYSYLAVLVPVFLLTDYLRYTPVLLLQGLSFVSVWLLLLLGHSVAHMQLMELFYSVTMAARIAYSSYIFSLVRPARYQRVAGYSRAAVLLGVFTSSVLGQLLVTVGRVSFSTLNYISLAFLTFSVVLALFLKRPKRSLFFNRDDRGRCETSASELERMNPGPGGKLGHALRVACGDSVLARMLRELGDSLRRPQLRLWSLWWVFNSAGYYLVVYYVHILWNEVDPTTNSARVYNGAADAASTLLGAITSFAAGFVKIRWARWSKLLIAGVTATQAGLVFLLAHTRHPSSIWLCYAAFVLFRGSYQFLVPIATFQIASSLSKELCALVFGVNTFFATIVKTIITFIVSDVRGLGLPVRKQFQLYSVYFLILSIIYFLGAMLDGLRHCQRGHHPRQPPAQGLRSAAEEKAAQALSVQDKGLGGLQPAQSP.... The pIC50 is 7.3.
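The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion utility. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the nanomolar back-conversion is only an assumed convenience for reading the labels.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 label back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Labels taken from the examples above: higher pIC50 means lower IC50.
    for label in (9.1, 5.2, 5.5, 3.5, 7.3):
        print(f"pIC50 {label} corresponds to IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so example (1) at 9.1 is sub-nanomolar while example (4) at 3.5 is in the high-micromolar range.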