From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P40261) has sequence MESGFTSKDTYLSHFNPRDYLEKYYKFGSRHSAESQILKHLLKNLFKIFCLDGVKGDLLIDIGSGPTIYQLLSACESFKEIVVTDYSDQNLQELEKWLKKEPEAFDWSPVVTYVCDLEGNRVKGPEKEEKLRQAVKQVLKCDVTQSQPLGAVPLPPADCVLSTLCLDAACPDLPTYCRALRNLGSLLKPGGFLVIMDALKSSYYMIGEQKFSSLPLGREAVEAAVKEAGYTIEWFEVISQSYSSTMANNEGLFSLVARKLSRPL. The pIC50 is 5.8. The drug is Nc1ncnc2c1c(Br)cn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)O)[C@@H](O)[C@H]1O. (2) The compound is O=C(Cc1ccccc1)NNC(=O)Cc1ccccc1. The target protein (Q9Y5S8) has sequence MGNWVVNHWFSVLFLVVWLGLNVFLFVDAFLKYEKADKYYYTRKILGSTLACARASALCLNFNSTLILLPVCRNLLSFLRGTCSFCSRTLRKQLDHNLTFHKLVAYMICLHTAIHIIAHLFNFDCYSRSRQATDGSLASILSSLSHDEKKGGSWLNPIQSRNTTVEYVTFTSIAGLTGVIMTIALILMVTSATEFIRRSYFEVFWYTHHLFIFYILGLGIHGIGGIVRGQTEESMNESHPRKCAESFEMWDDRDSHCRRPKFEGHPPESWKWILAPVILYICERILRFYRSQQKVVITKVVMHPSKVLELQMNKRGFSMEVGQYIFVNCPSISLLEWHPFTLTSAPEEDFFSIHIRAAGDWTENLIRAFEQQYSPIPRIEVDGPFGTASEDVFQYEVAVLVGAGIGVTPFASILKSIWYKFQCADHNLKTKKIYFYWICRETGAFSWFNNLLTSLEQEMEELGKVGFLNYRLFLTGWDSNIVGHAALNFDKATDIVTGLK.... The pIC50 is 4.8. (3) The target protein sequence is MRFKKISCLLLSPLFIFSTSIYAGNTPKDQEIKKLVDQNFKPLLEKYDVPGMAVGVIQNNKKYEIYYGLQSVQDKKAVNSSTIFELGSVSKLFTATAGGYAKTKGTISFKDTPGKYWKELKNTPIDQVNLLQLATYTSGNLALQFPDEVQTDQQVLTFFKDWKPKNPIGEYRQYSNPSIGLFGKVVALSMNKPFDQVLEKTIFPGLSLKHSYVNVPKTQMQNYAFGYNQENQPIRVNPGPLDAPAYGVKSTLPDMLKFINANLNPQKYPADIQRAINETHQGFYQVGTMYQALGWEEFSYPAPLQTLLDSNSEQIVMKPNKVTAISKEPSVKMFHKTGSTNGFGTYVVFIPKENIGLVMLTNKRIPNEERFKAAYAVLNAIKK. The pIC50 is 3.6. The compound is O=C(O)[C@H]1/C(=C/CO)O[C@@H]2CC(=O)N21. (4) The compound is Cc1ccnc2c1NC(=O)c1cccnc1N2C1CC1. 
The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGIKKNKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFKKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLSKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNKGRQKVVPLTNTTNQKTELQAIYLALQDSGLEVNIVTDSQ.... The pIC50 is 4.1. (5) The compound is O=C(Cc1ccccc1)NC(CSSCC(NC(=O)Cc1ccccc1)C(=O)O)C(=O)O. The target protein (Q9JJX7) has sequence MASGSSSDAAEPAGPAGRAASAPEAAQAEEDRVKRRRLQCLGFALVGGCDPTMVPSVLRENDWQTQKALSAYFELPENDQGWPRQPPTSFKSEAYVDLTNEDANDTTILEASPSGTPLEDSSTISFITWNIDGLDGCNLPERARGVCSCLALYSPDVVFLQEVIPPYCAYLKKRAASYTIITGNEEGYFTAILLKKGRVKFKSQEIIPFPNTKMMRNLLCVNVSLGGNEFCLMTSHLESTREHSAERIRQLKTVLGKMQEAPDSTTVIFAGDTNLRDQEVIKCGGLPDNVFDAWEFLGKPKHCQYTWDTKANNNLRIPAAYKHRFDRIFFRAEEGHLIPQSLDLVGLEKLDCGRFPSDHWGLLCTLNVVL. The pIC50 is 4.0. (6) The drug is C=C1OC(=O)C(C(=O)CCCCCCCCCCCCCCC)C1=O. The target protein (P30306) has sequence MEVPLQKSAPGSALSPARVLGGIQRPRHLSVFEFESDGFLGSPEPTASSSPVTTLTQTMHNLAGLGSEPPKAQVGSLSFQNRLADLSLSRRTSECSLSSESSESSDAGLCMDSPSPVDPQMAERTFEQAIQAASRVIQNEQFTIKRFRSLPVRLLEHSPVLQSITNSRALDSWRKTEAGYRAAANSPGEDKENDGYIFKMPQELPHSSSAQALAEWVSRRQAFTQRPSSAPDLMCLTTEWKMEVEELSPVAQSSSLTPVERASEEDDGFVDILESDLKDDEKVPAGMENLISAPLVKKLDKEEEQDLIMFSKCQRLFRSPSMPCSVIRPILKRLERPQDRDVPVQSKRRKSVTPLEEQQLEEPKARVFRSKSLCHEIENILDSDHRGLIGDYSKAFLLQTVDGKHQDLKYISPETMVALLTGKFSNIVEKFVIVDCRYPYEYEGGHIKNAVNLPLERDAETFLLQRPIMPCSLDKRIILIFHCEFSSERGPRMCRFIRER.... The pIC50 is 6.0. (7) The compound is FC1CCNCCC1Oc1cccc2ccc(-c3nnc4ccccn34)nc12. 
The target protein (O70444) has sequence MLLSKFGSLAHLCGPGGVDHLPVKILQPAKADKESFEKVYQVGAVLGSGGFGTVYAGSRIADGLPVAVKHVVKERVTEWGSLGGMAVPLEVVLLRKVGAAGGARGVIRLLDWFERPDGFLLVLERPEPAQDLFDFITERGALDEPLARRFFAQVLAAVRHCHNCGVVHRDIKDENLLVDLRSGELKLIDFGSGAVLKDTVYTDFDGTRVYSPPEWIRYHRYHGRSATVWSLGVLLYDMVCGDIPFEQDEEILRGRLFFRRRVSPECQQLIEWCLSLRPSERPSLDQIAAHPWMLGTEGSVPENCDLRLCALDTDDGASTTSSSESL. The pIC50 is 7.4. (8) The compound is C=C1C(=O)O[C@H]2C[C@H](C)C(C(CC(C)=O)OC(C)=O)=CC[C@H]12. The target protein (P01120) has sequence MPLNKSNIREYKLVVVGGGGVGKSALTIQLTQSHFVDEYDPTIEDSYRKQVVIDDEVSILDILDTAGQEEYSAMREQYMRNGEGFLLVYSITSKSSLDELMTYYQQILRVKDTDYVPIVVVGNKSDLENEKQVSYQDGLNMAKQMNAPFLETSAKQAINVEEAFYTLARLVRDEGGKYNKTLTENDNSKQTSQDTKGSGANSVPRNSGGHRKMSNAANGKNVNSSTTVVNARNASIESKTGLAGNQATNGKTQTDRTNIDNSTGQAGQANAQSANTVNNRVNNNSKAGQVSNAKQARKQQAAPGGNTSEASKSGSGGCCIIS. The pIC50 is 3.8. (9) The compound is Cc1ccc(S(=O)(=O)N2C(=O)NC(=O)C23c2ccccc2-c2ccccc23)cc1. The target protein sequence is AHNIVLYTGAKMPILGLGTWKSPPGKVTEAVKVAIDLGYRHIDCAHVYQNENEVGLALQAKLQEQVVKREDLFIVSKLWCTYHDKDLVKGACQKTLSDLKLDYLDLYLIHWPTGFKPGKDFFPLDEDGNVIPSEKDFVDTWTAMEELVDEGLVKAIGVSNFNHLQVEKILNKPGLKYKPAVNQIECHPYLTQEKLIQYCNSKGIVVTAYSPLGSPDRPWAKPEDPSILEDPRIKAIADKYNKTTAQVLIRFPIQRNLIVIPKSVTPERIAENFQVFDFELDKEDMNTLLSYNRDWRACALVSCASHRDYPFHEEF. The pIC50 is 5.6. (10) The target protein (Q8NB78) has sequence MATPRGRTKKKASFDHSPDSLPLRSSGRQAKKKATETTDEDEDGGSEKKYRKCEKAGCTATCPVCFASASERCAKNGYTSRWYHLSCGEHFCNECFDHYYRSHKDGYDKYTTWKKIWTSNGKTEPSPKAFMADQQLPYWVQCTKPECRKWRQLTKEIQLTPQIAKTYRCGMKPNTAIKPETSDHCSLPEDLRVLEVSNHWWYSMLILPPLLKDSVAAPLLSAYYPDCVGMSPSCTSTNRAAATGNASPGKLEHSKAALSVHVPGMNRYFQPFYQPNECGKALCVRPDVMELDELYEFPEYSRDPTMYLALRNLILALWYTNCKEALTPQKCIPHIIVRGLVRIRCVQEVERILYFMTRKGLINTGVLSVGADQYLLPKDYHNKSVIIIGAGPAGLAAARQLHNFGIKVTVLEAKDRIGGRVWDDKSFKGVTVGRGAQIVNGCINNPVALMCEQLGISMHKFGERCDLIQEGGRITDPTIDKRMDFHFNALLDVVSEWRKD.... The compound is O=C(N[C@@H](CCCCNC1CC1c1ccccc1)C(=O)NCc1ccccc1)c1ccc(-c2ccccc2)cc1. The pIC50 is 3.6.
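
Note on the label: the header defines pIC50 = -log10(IC50 in M). The short sketch below shows that conversion in both directions, assuming the IC50 is already expressed in mol/L. The function names are illustrative only and are not part of the dataset or of any BindingDB tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 (1e-6 M) corresponds to a pIC50 of about 6.
    print(ic50_to_pic50(1e-6))
    # The pIC50 of 5.8 in example (1) corresponds to roughly 1.6e-6 M.
    print(pic50_to_ic50(5.8))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50, which is why a higher value means a more potent compound.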