Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc2c(c1)C[C@@H]1Cn3c-2c(C2CCCCC2)c2ccc(cc23)C(=O)NS(=O)(=O)N(C)CCOCCN(C)C1=O. The target protein sequence is SMSYTWTGALITPCAAEETKLPINALSNSLLRHHNLVYATTSRSASLRQKKVTFDRLQVLDDHYRDVLKEMKAKASTVKAKLLSVEEACKLTPPHSARSKFGYGAKDVRNLSSKAVNHIRSVWKDLLEDTETPIDTTIMAKNEVFCVQPEKGGRKPARLIVFPDLGVRVCEKMALYDVVSTLPQAVMGSSYGFQYSPGQRVEFLVNAWKAKKCPMGFAYDTRCFDSTVTENDIRVEESIYQCCDLAPEARQAIRSLTERLYIGGPLTNSKGQNCGYRRCRASGVLTTSCGNTLTCYLKAAAACRAAKLQDCTMLVCGDDLVVICESAGTQEDEASLRAFTEAMTRYSAPPGDPPKPEYDLELITSCSSNVSVAHDASGKRVYYLTRDPTTPLARAAWETARHTPVNSWLGNIIMYAPTLWARMILMTHFFSILLAQEQLEKALDCQIYGACYSIEPLDLPQIIQRLHGLSAFSLHSYSPGEINRVASCLRKLGVPPLRVW.... The pIC50 is 6.3. (2) The pIC50 is 7.3. The compound is CCOC(=O)CC(O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)OC(C)(C)C)C1CCCCC1)[C@@H](C)CC. The target protein (P42893) has sequence MGSLRPPQGLGLQWSSFFLGKKGPGLTVSLPLLASSLQVNFRSPRSGQRCWAARTSVEKRLVVLVTLLAAGLVACLAALGIQYRTRTPPVCLTEACVSVTSSILNSMDPTVDPCQDFFSYACGGWIKANPVPDGHSRWGTFSNLWEHNQAIIKHLLENSTASASEAEKKAQVYYRACMNETRIEELRAKPLMELIEKLGGWNITGPWAKDNFQDTLQVVTAHYRTSPFFSVYVSADSKNSNSNVIQVDQSGLGLPSRDYYLNKTENEKVLTGYLNYMVQLGKLLGGGDEDSIRPQMQQILDFETALANITIPQEKRRDEELIYHKVTAAELQTLAPAINWLPFLNAIFYPVEINESEPIVVYDKEYLRQVSTLINSTDKCLLNNYMMWNLVRKTSSFLDQRFQDADEKFMEVMYGTKKTCLPRWKFCVSDTENNLGFALGPMFVKATFAEDSKNIASEIILEIKKAFEESLSTLKWMDEDTRRSAKEKADAIYNMIGYPN.... (3) The small molecule is CC1(C)CN(C(=O)c2ccc(-c3cccc4nc(NC(=O)C5CC5)nn34)cc2)C1. The target protein sequence is CRYDPLQDNTGEVVAVKKLQHSTEEHLRDFEREIEILKSLQHDNIVKYKGVCYSAGRRNLKLIMEYLPYGSLRDYLQKHKERIDHIKLLQYTSQICKGMEYLGTKRYIHRDLATRNILVENENRVKIGDFGLTKVLPQDKEYYKVKEPGESPIFWYAPESLTESKFSVASDVWSFGVVLYELFTYIEKSK. The pIC50 is 7.2. 
(4) The compound is O=P(O)(OCC1NC[C@H](O)[C@@H]1O)OP(=O)(O)OC[C@H]1OC(n2cnc3c(NCc4ccccc4)ncnc32)[C@H](O)[C@@H]1O. The target protein (O02776) has sequence MSAGPGCEPCTKRPRWDAAATSPPAASDARSFPGRQRRVLDSKDAPVQFRVPPSSSGCALGRAGQHRGSATSLVFKQKTITSWMDTKGIKTVESESLHSKENNNTREESMMSSVQKDNFYQHNMEKLENVSQLGFDKSPVEKGTQYLKQHQTAAMCKWQNEGPHSERLLESEPPAVTLVPEQFSNANVDQSSPKDDHSDTNSEESRDNQQFLTHVKLANAKQTMEDEQGREARSHQKCGKACHPAEACAGCQQEETDVVSESPLSDTGSEDVGTGLKNANRLNRQESSLGNSPPFEKESEPESPMDVDNSKNSCQDSEADEETSPGFDEQEDSSSAQTANKPSRFQPREADTELRKRSSAKGGEIRLHFQFEGGESRAGMNDVNAKRPGSTSSLNVECRNSKQHGRKDSKITDHFMRVPKAEDKRKEQCEMKHQRTERKIPKYIPPHLSPDKKWLGTPIEEMRRMPRCGIRLPPLRPSANHTVTIRVDLLRIGEVPKPFP.... The pIC50 is 3.0. (5) The small molecule is COc1ccc(C(=O)NC(=S)Nc2ccc3[nH]c(=O)[nH]c3c2)c(OC)c1. The target protein (P04578) has sequence MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNISRAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTK.... The pIC50 is 5.2. (6) The compound is COCc1nc2c(C(=O)Nc3cccc(Cl)c3C)cc(NC(=O)c3ccc(F)cc3Cl)cc2[nH]1. The target protein (O14684) has sequence MPAHSLVMSSPALPAFLLCSTLLVIKMYVVAIITGQVRLRKKAFANPEDALRHGGPQYCRSDPDVERCLRAHRNDMETIYPFLFLGFVYSFLGPNPFVAWMHFLVFLVGRVAHTVAYLGKLRAPIRSVTYTLAQLPCASMALQILWEAARHL. The pIC50 is 8.4. (7) The compound is O=C(Nc1ccc(C(=O)c2ccccc2)cc1)c1ccc(-c2ccc(C(=O)O)o2)cc1. 
The target protein (P00563) has sequence MPFGNTHNKYKLNYKSEEEYPDLSKHNNHMAKVLTPDLYKKLRDKETPSGFTLDDVIQTGVDNPGHPFIMTVGCVAGDEESYTVFKDLFDPIIQDRHGGFKPTDKHKTDLNHENLKGGDDLDPHYVLSSRVRTGRSIKGYTLPPHCSRGERRAVEKLSVEALNSLTGEFKGKYYPLKSMTEQEQQQLIDDHFLFDKPVSPLLLASGMARDWPDARGIWHNDNKSFLVWVNEEDHLRVISMEKGGNMKEVFRRFCVGLQKIEEIFKKAGHPFMWNEHLGYVLTCPSNLGTGLRGGVHVKLAHLSKHPKFEEILTRLRLQKRGTGGVDTAAVGSVFDISNADRLGSSEVEQVQLVVDGVKLMVEMEKKLEKGQSIDDMIPAQK. The pIC50 is 4.3. (8) The small molecule is CC1=C(CCC(=O)O)/C2=C/c3c(CCC(=O)O)c(C)c4n3[Cu]n3/c(c(C)c(C(COC(=O)C5=CC5)OC(=O)C5=CC5)/c3=C/C3=N/C(=C\4)C(C(COC(=O)C4=CC4)OC(=O)C4=CC4)=C3C)=C\C1=N2. The target protein sequence is MTGDTPINIFGRNILTALGMSLNLPVARIEPIKITLKPGKDGPRLKQWPLTKEKVEALKEICEKMEKEGQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNRVTQDFTEIQLGIPHPAGLAKKKRITVLDVGDAYFSIPLYEDFRPYTAFTLPSVNNVEPGKRYIYKVLPQGWKGSPAIFQYTMRQILEPFRKANPDVILIQYMDDILIASDRTGLEHDKVVLQLKELLNGLGFSTPEEKFQKDPPFQWMGYELWPTKWKLQKIQLPQKETWTVNDIQKLVGILNWAAQIYPGIKTKHLCRLIRGKMTLTEEVQWTELAEAELEENRIILDQEQEGHYYQEEKELEATIQKSQDNQWTYKIHQEEKILKVGKYAKIKNTHTNGVRLLAQVVQKIGKEALVIWGRIPKFHLPVERETWEQWWDNYWQVTWIPEWDFVSTPPLVRLTFNLVGDPIPGTETFYTDGSCNRQSKEGKAGYVTDRGRDKVRVLEQTTNQQA.... The pIC50 is 5.7. (9) The compound is Oc1ccc(-c2csc(Nc3ccccc3)n2)c(O)c1. The target protein (P00636) has sequence MTDQAAFDTNIVTLTRFVMEEGRKARGTGEMTQLLNSLCTAVKAISTAVRKAGIAHLYGIAGSTNVTGDQVKKLDVLSNDLVINVLKSSFATCVLVSEEDKNAIIVEPEKRGKYVVCFDPLDGSSNIDCLVSIGTIFGIYRKNSTDEPSEKDALQPGRNLVAAGYALYGSATMLVLAMVNGVNCFMLDPAIGEFILVDRDVKIKKKGSIYSINEGYAKEFDPAITEYIQRKKFPPDNSAPYGARYVGSMVADVHRTLVYGGIFMYPANKKSPKGKLRLLYECNPMAYVMEKAGGLATTGKEAVLDIVPTDIHQRAPIILGSPEDVTELLEIYQKHAAK. The pIC50 is 3.5. (10) The compound is S=c1[nH]nc(-c2ccc(Br)cc2)o1. 
The target protein (Q8XB35) has sequence MFLAQEIIRKKRDGHALSDEEIRFFINGIRDNTISEGQIAALAMTIFFHDMTMPERVSLTMAMRDSGTVLDWKSLHLNGPIVDKHSTGGVGDVTSLMLGPMVAACGGYIPMISGRGLGHTGGTLDKLESIPGFDIFPDDNRFREIIKDVGVAIIGQTSSLAPADKRFYATRDITATVDSIPLITASILAKKLAEGLDALVMDVKVGSGAFMPTYELSEALAEAIVGVANGAGVRTTALLTDMNQVLASSAGNAVEVREAVQFLTGEYRNPRLFDVTMALCVEMLISGKLAKDDAEARAKLQAVLDNGKAAEVFGRMVAAQKGPTDFVENYAKYLPTAMLTKAVYADTEGFVSEMDTRALGMAVVAMGGGRRQASDTIDYSVGFTDMARLGDQVDGQRPLAVIHAKDENSWQEAAKAVKAAIKLADKAPESTPTVYRRISE. The pIC50 is 4.2.
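
Note on the label scale: the task description defines pIC50 = -log10(IC50 in M). The sketch below shows that conversion in both directions. The function names (ic50_to_pic50, pic50_to_ic50, ic50_nm_to_pic50) are hypothetical helpers written for illustration and are not part of the dataset or of BindingDB. The nanomolar variant assumes the raw IC50 is reported in nM, which is an assumption about the upstream units rather than something stated above.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed unit).

    1 nM = 1e-9 M, so pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


if __name__ == "__main__":
    # A 1 micromolar IC50 and its pIC50 under the definition above.
    print(ic50_to_pic50(1e-6))
    # IC50 in mol/L for a pIC50 label of 3.0, as in entry (4).
    print(pic50_to_ic50(3.0))
```

Read this way, the label of 3.0 in entry (4) corresponds to an IC50 of about 1 mM, while the label of 8.4 in entry (6) corresponds to roughly 4 nM. Each unit increase in pIC50 is a tenfold drop in IC50, which is why higher values mean more potent.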