From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CCOC(=O)c1ccc2ncc(C(=O)OCC)c(Nc3ccc(NC(C)=O)cc3)c2c1. The target protein (Q61214) has sequence MHTGGETSACKPSSVRLAPSFSFHAAGLQMAAQMPHSHQYSDRRQPSISDQQVSALPYSDQIQQPLTNQVMPDIVMLQRRMPQTFRDPATAPLRKLSVDLIKTYKHINEVYYAKKKRRHQQGQGDDSSHKKERKVYNDGYDDDNYDYIVKNGEKWMDRYEIDSLIGKGSFGQVVKAYDRVEQEWVAIKIIKNKKAFLNQAQIEVRLLELMNKHDTEMKYYIVHLKRHFMFRNHLCLVFEMLSYNLYDLLRNTNFRGVSLNLTRKFAQQMCTALLFLATPELSIIHCDLKPENILLCNPKRSAIKIVDFGSSCQLGQRIYQYIQSRFYRSPEVLLGMPYDLAIDMWSLGCILVEMHTGEPLFSGANEVDQMNKIVEVLGIPPAHILDQAPKARKFFEKLPDGTWSLKKTKDGKREYKPPGTRKLHNILGVETGGPGGRRAGESGHTVADYLKFKDLILRMLDYDPKTRIQPYYALQHSFFKKTADEGTNTSNSVSTSPAME.... The pIC50 is 5.0. (2) The small molecule is CCc1cc(C2CCN(C)CC2)ccc1Nc1ncc(C(F)(F)F)c(CCc2ccccc2CC(N)=O)n1. The target protein sequence is CNMRRPAHADIKTGYLSIIMDPGEVPLEEQCEYLSYDASQWEFPRERLHLGRVLGYGAFGKVVEASAFGIHKGSSCDTVAVKMLKEGATASEHRALMSELKILIHIGNHLNVVNLLGACTKPQGPLMVIVEFCKYGNLSNFLRAKRDAFSPCAEKSPEQRGRFRAMVELARLDRRRPGSSDRVLFARFSKTEGGARRASPDQEAEDLWLSPLTMEDLVCYSFQVARGMEFLASRKCIHRDLAARNILLSESDVVKICDFGLARDIYKDPDYVRKGSARLPLKWMAPESIFDKVYTTQSDVWSFGVLLWEIFSLGASPYPGVQINEEFCQRLRDGTRMRAPELATPAIRRIMLNCWSGDPKARPAFSELVEILGDLLQGRGLQEEEEVCMAPRSSQSSEEGSFSQVSTMALHIAQADAEDSPPSLQRHSLAARYYNWVSFPGCLARGAETRGSSRMKTFEEFPMTPTTYKGSVDNQTDSGMVLASEEFEQIE. The pIC50 is 6.0. (3) The drug is CCc1ccccc1-n1nc(-c2ccccc2)c2c(N)ncnc21. 
The target protein (Q8KRU5) has sequence MTSRYRSSEAHQGLASFSPRRRTVVKAAAATAVLAGPLAAALPARATTGTPAFLHGVASGDPLPDGVLLWTRVTPTADATPGSGLGPDTEVGWTVATDKAFTNVVAKGSTTATAASDHTVKADIRGLAPATDHWFRFSAGGTDSPAGRARTAPAADAAVAGLRFGVVSCANWEAGYFAAYRHLAARGDLDAWLHLGDYIYEYGAGEYGTRGTSVRSHAPAHEILTLADYRVRHGRYKTDPDLQALHAAAPVVAIWDDHEIANDTWSGGAENHTEGVEGAWAARQAAAKQAYFEWMPVRPAIAGTTYRRLRFGKLADLSLLDLRSFRAQQVSLGDGDVDDPDRTLTGRAQLDWLKAGLKSSDTTWRLVGNSVMIAPFAIGSLSAELLKPLAKLLGLPQEGLAVNTDQWDGYTDDRRELLAHLRSNAIRNTVFLTGDIHMAWANDVPVNAGTYPLSASAATEFVVTSVTSDNLDDLVKVPEGTVSALASPVIRAANRHVHWV.... The pIC50 is 4.5. (4) The compound is CC(=O)OC[C@@]1(C)[C@H](C(=O)[O-])N2C(=O)/C(=C/c3ccccn3)C2S1(=O)=O. The target protein (P00807) has sequence MKKLIFLIVIALVLSACNSNSSHAKELNDLEKKYNAHIGVYALDTKSGKEVKFNSDKRFAYASTSKAINSAILLEQVPYNKLNKKVHINKDDIVAYSPILEKYVGKDITLKALIEASMTYSDNTANNKIIKEIGGIKKVKQRLKELGDKVTNPVRYEIELNYYSPKSKKDTSTPAAFGKTLNKLIANGKLSKENKKFLLDLMLNNKSGDTLIKDGVPKDYKVADKSGQAITYASRNDVAFVYPKGQSEPIVLVIFTNKDNKSDKPNDKLISETAKSVMKEF. The pIC50 is 8.4. (5) The drug is O=C(Nc1cccc(Nc2ccc3c(c2)NC(=O)/C3=C\c2ccc[nH]2)c1)Nc1cccc(C(F)(F)F)c1. The target protein (P25911) has sequence MGCIKSKRKDNLNDDEVDSKTQPVRNTDRTIYVRDPTSNKQQRPVPEFHLLPGQRFQTKDPEEQGDIVVALYPYDGIHPDDLSFKKGEKMKVLEEHGEWWKAKSLSSKREGFIPSNYVAKVNTLETEEWFFKDITRKDAERQLLAPGNSAGAFLIRESETLKGSFSLSVRDYDPMHGDVIKHYKIRSLDNGGYYISPRITFPCISDMIKHYQKQSDGLCRRLEKACISPKPQKPWDKDAWEIPRESIKLVKKLGAGQFGEVWMGYYNNSTKVAVKTLKPGTMSVQAFLEEANLMKTLQHDKLVRLYAVVTKEEPIYIITEFMAKGSLLDFLKSDEGGKVLLPKLIDFSAQIAEGMAYIERKNYIHRDLRAANVLVSESLMCKIADFGLARVIEDNEYTAREGAKFPIKWTAPEAINFGCFTIKSDVWSFGILLYEIVTYGKIPYPGRTNADVMSALSQGYRMPRMENCPDELYDIMKMCWKEKAEERPTFDYLQSVLDDF.... The pIC50 is 5.7. (6) The small molecule is Cc1nnc2n1-c1sc(CCc3ccc4c(c3)CCC4)cc1C(c1ccccc1Cl)=NC2. 
The target protein (P25105) has sequence MEPHDSSHMDSEFRYTLFPIVYSIIFVLGVIANGYVLWVFARLYPCKKFNEIKIFMVNLTMADMLFLITLPLWIVYYQNQGNWILPKFLCNVAGCLFFINTYCSVAFLGVITYNRFQAVTRPIKTAQANTRKRGISLSLVIWVAIVGAASYFLILDSTNTVPDSAGSGNVTRCFEHYEKGSVPVLIIHIFIVFSFFLVFLIILFCNLVIIRTLLMQPVQQQRNAEVKRRALWMVCTVLAVFIICFVPHHVVQLPWTLAELGFQDSKFHQAINDAHQVTLCLLSTNCVLDPVIYCFLTKKFRKHLTEKFYSMRSSRKCSRATTDTVTEVVVPFNQIPGNSLKN. The pIC50 is 7.7. (7) The small molecule is CC(=O)[C@@]1(O)CCC2C3CCC4=C(Cl)C(=O)CCC4(C)C3CCC21C. The target protein sequence is MEHAAQPWRWQRRRGWRRSACWPRSPTCSAPWAARSSRGIVRRTQCTAATRCPATGSECRRGPPGWCRSCPRWPCRSTSTPASPPRVSAARPTASSWPCSSSTTGIGFGLWLTGMLINIHSDHILRNLRKPGDTGYKIPRGGLFEYVTAANYFGEIMEWCGYALASWSVQGAAFAFFTFCFLSGRAKEHHEWYLRKFEEYPKFRKIIIPFLF. The pIC50 is 7.2. (8) The drug is CNC1=Nc2ccccc2C(c2ccccc2)=NC1c1cccs1. The target protein (Q836J0) has sequence MSNQEAIGLIDSGVGGLTVLKEALKQLPNERLIYLGDTARCPYGPRPAEQVVQFTWEMADFLLKKRIKMLVIACNTATAVALEEIKAALPIPVVGVILPGARAAVKVTKNNKIGVIGTLGTIKSASYEIAIKSKAPTIEVTSLDCPKFVPIVESNQYRSSVAKKIVAETLQALQLKGLDTLILGCTHYPLLRPVIQNVMGSHVTLIDSGAETVGEVSMLLDYFDIAHTPEAPTQPHEFYTTGSAKMFEEIASSWLGIENLKAQQIHLGGNEND. The pIC50 is 3.4. (9) The drug is CCOC(=O)c1cc(C2NCC(O)C2O)oc1C. The target protein (P29853) has sequence MKLSSACAIALLAAQAAGASIKHRINGFTLTEHSDPAKRELLQKYVTWDDKSLFINGERIMIFSGEFHPFRLPVKELQLDIFQKVKALGFNCVSFYVDWALVEGKPGEYRADGIFDLEPFFDAASEAGIYLLARPGPYINAESSGGGFPGWLQRVNGTLRSSDKAYLDATDNYVSHVAATIAKYQITNGGPIILYQPENEYTSGCSGVEFPDPVYMQYVEDQARNAGVVIPLINNDASASGNNAPGTGKGAVDIYGHDSYPLGFDCANPTVWPSGDLPTNFRTLHLEQSPTTPYAIVEFQGGSYDPWGGPGFAACSELLNNEFERVFYKNDFSFQIAIMNLYMIFGGTNWGNLGYPNGYTSYDYGSAVTESRNITREKYSELKLLGNFAKVSPGYLTASPGNLTTSGYADTTDLTVTPLLGNSTGSFFVVRHSDYSSEESTSYKLRLPTSAGSVTIPQLGGTLTLNGRDSKIHVTDHNVSGTNIIYSTAEVFTWKKFADG.... The pIC50 is 3.4.
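
The pIC50 definition used in these records, pIC50 = -log10(IC50 in M), can be sketched as a small conversion helper. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and are not part of BindingDB or of the bindingdb_ic50 dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M = 10 ** (-pIC50), and 1 M = 1e9 nM,
    so IC50 in nM = 10 ** (9 - pIC50).
    """
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values taken from the labels in the records above.
    for label in (5.0, 8.4, 3.4):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, a label of 5.0 corresponds to an IC50 of 10 µM, while a label of 8.4 corresponds to roughly 4 nM, which is why a higher pIC50 means a more potent compound.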