Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB using IC50 measurements.
(1) The compound is O=C1c2cccc3c(NCCO)ccc(c23)C(=O)N1c1cccc(Br)c1. The target protein sequence is SMSYTWTGALITPCAAEESKLPINALSNSLLRHHNMVYATTSRSAGLRQKKVTFDRLQVLDDHYRDVLKEMKAKASTVKAKLLSVEEACKLTPPHSAKSKFGYGAKDVRNLSSKAVNHIHSVWKDLLEDTVTPIDTTIMAKNEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQVVMGSSYGFQYSPGQRVEFLVNTWKSKKNPMGFSYDTRCFDSTVTENDIRVEESIYQCCDLAPEARQAIKSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKASAACRAAKLQDCTMLVNGDDLVVICESAGTQEDAASLRVFTEAMTRYSAPPGDPPQPEYDLELITSCSSNVSVAHDASGKRVYYLTRDPTTPLARAAWETARHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQIIERLHGLSAFSLHSYSPGEINRVASCLRKLGVAPLRVW.... The pIC50 is 7.7.
(2) The drug is COc1cccc2c1C(=O)c1c(O)c3c(c(O)c1C2=O)C[C@@](O)(C(=O)CO)C[C@@H]3O[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1. The target protein (Q7YR26) has sequence MSGDHLHNDSQIEADFRLNDSHKHKDKHKDREHRHKEHKKDKEKDREKSKHSNSEHKDSEKKHKEKEKTKHKDGSSEKHKDKHKDRDKEKRKEEKVRASGDAKIKKEKENGFSSPPQIKDEPEDDGYFVPPKEDIKPLKRPRDEDDADYKPKKIKTEDIKKEKKRKLEEEEDGKLRKPKNKDKDKKVPEPDNKKKKPKKEEEQKWKWWEEERYPEGIKWKFLEHKGPVFAPPYEPLPDSVKFYYDGKVMKLSPKAEEVATFFAKMLDHEYTTKEIFRKNFFKDWRKEMTNEEKNIITNLSKCDFTQMSQYFKAQTEARKQMSKEEKLKIKEENEKLLKEYGFCIMDNHKERIANFKIEPPGLFRGRGNHPKMGMLKRRIMPEDIIINCSKDAKVPSPPPGHKWKEVRHDNKVTWLVSWTENIQGSIKYIMLNPSSRIKGEKDWQKYETARRLKKCVDKIRNQYREDWKSKEMKVRQRAVALYFIDKLALRAGNEKEEGET.... The pIC50 is 3.7.
(3) The drug is CNC(=O)c1ccc(-c2[nH]c3ncnc(NC[C@@H]4CCCO4)c3c2-c2ccccc2)cc1. The target protein (O54967) has sequence MQPEEGTGWLLELLSEVQLQQYFLRLRDDLNITRLSHFEYVKNEDLEKIGMGRPGQRRLWEAVKRRKAMCKRKSWMSKVFSGKRLEAEFPSQHSQSTFRKPSPTPGSLPGEGTLQSLTCLIGEKDLRLLEKLGDGSFGVVRRGEWDAPAGKTVSVAVKCLKPDVLSQPEAMDDFIREVNAMHSLDHRNLIRLYGVVLTLPMKMVTELAPLGSLLDRLRKHQGHFLLGTLSRYAVQVAEGMAYLESKRFIHRDLAARNLLLATRDLVKIGDFGLMRALPQNDDHYVMQEHRKVPFAWCAPESLKTRTFSHASDTWMFGVTLWEMFTYGQEPWIGLNGSQILHKIDKEGERLPRPEDCPQDIYNVMVQCWAHKPEDRPTFVALRDFLLEAQPTDMRALQDFEEPDKLHIQMNDVITVIEGRAENYWWRGQNTRTLCVGPFPRNVVTSVAGLSAQDISQPLQNSFIHTGHGDSDPRHCWGFPDRIDELYLGNPMDPPDLLSVE.... The pIC50 is 7.4.
(4) The compound is CCCCn1c(NC(=O)c2ccc(Br)o2)c(C#N)c2nc3ccccc3nc21. The target protein (P9WIS7) has sequence MAFSVQMPALGESVTEGTVTRWLKQEGDTVELDEPLVEVSTDKVDTEIPSPAAGVLTKIIAQEDDTVEVGGELAVIGDAKDAGEAAAPAPEKVPAAQPESKPAPEPPPVQPTSGAPAGGDAKPVLMPELGESVTEGTVIRWLKKIGDSVQVDEPLVEVSTDKVDTEIPSPVAGVLVSISADEDATVPVGGELARIGVAADIGAAPAPKPAPKPVPEPAPTPKAEPAPSPPAAQPAGAAEGAPYVTPLVRKLASENNIDLAGVTGTGVGGRIRKQDVLAAAEQKKRAKAPAPAAQAAAAPAPKAPPAPAPALAHLRGTTQKASRIRQITANKTRESLQATAQLTQTHEVDMTKIVGLRARAKAAFAEREGVNLTFLPFFAKAVIDALKIHPNINASYNEDTKEITYYDAEHLGFAVDTEQGLLSPVIHDAGDLSLAGLARAIADIAARARSGNLKPDELSGGTFTITNIGSQGALFDTPILVPPQAAMLGTGAIVKRPRVV.... The pIC50 is 6.0.
(5) The small molecule is Nc1nc(F)nc2c1ncn2C1CC(O)C(CO)O1. The target protein (Q04400) has sequence MSGSKSVSPPGYAAQTAASPAPRGGPEHRAAWGEADSRANGYPHAPGGSTRGSTKRSGGAVTPQQQQRLASRWRGGDDDEDPPLSGDDPLVGGFGFSFRSKSAWQERGGDDGGRGSRRQRRGAAGGGSTRAPPAGGSGSSAAAAAAAGGTEVRPRSVEVGLEERRGKGRAAEELEPGTGTVEDGDGSEDGGSSVASGSGTGTVLSLGACCLALLQIFRSKKFPSDKLERLYQRYFFRLNQSSLTMLMAVLVLVCLVMLAFHAARPPLQVVYLAVLAAAVGVILIMAVLCNRAAFHQDHMGLACYALIAVVLAVQVVGLLLPQPRSASEGIWWTVFFIYTIYTLLPVRMRAAVLSGVLLSALHLAISLHTNAQDQFLLKQLVSNVLIFSCTNIVGVCTHYPAEVSQRQAFQETRECIQARLHSQRENQQQERLLLSVLPRHVAMEMKADINAKQEDMMFHKIYIQKHDNVSILFADIEGFTSLASQCTAQELVMTLNELFA.... The pIC50 is 4.0.
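
The pIC50 definition stated in the task description, pIC50 = -log10(IC50 in M), can be sketched in Python as follows. This is a minimal illustration only: the function names pic50_from_ic50 and ic50_from_pic50 are hypothetical and are not part of the dataset or any BindingDB tooling, and the input is assumed to be already expressed in mol/L.

```python
import math


def pic50_from_ic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50).

    Hypothetical helper for illustration; assumes the IC50 is in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def ic50_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 100 nM inhibitor (1e-7 M) has a pIC50 close to 7.
    print(pic50_from_ic50(1e-7))
    # Converting a label back to a molar concentration.
    print(ic50_from_pic50(7.7))
```

Under this definition, a label of 7.7 corresponds to an IC50 of roughly 20 nM and a label of 4.0 to 100 µM, which is why higher values indicate greater potency.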