This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C(NCc1ccc(S(=O)(=O)c2cc(F)cc(F)c2)cc1)c1ccc2nccn2c1. The target protein (Q80Z29) has sequence MNAAAEAEFNILLATDSYKVTHYKQYPPNTSKVYSYFECREKKTENSKVRKVKYEETVFYGLQYILNKYLKGKVVTKEKIQEAKEVYREHFQDDVFNERGWNYILEKYDGHLPIEVKAVPEGSVIPRGNVLFTVENTDPECYWLTNWIETILVQSWYPITVATNSREQKKILAKYLLETSGNLDGLEYKLHDFGYRGVSSQETAGIGASAHLVNFKGTDTVAGIALIKKYYGTKDPVPGYSVPAAEHSTITAWGKDHEKDAFEHIVTQFSSVPVSVVSDSYDIYNACEKIWGEDLRHLIVSRSTEAPLIIRPDSGNPLDTVLKVLDILGKKFPVSENSKGYKLLPPYLRVIQGDGVDINTLQEIVEGMKQKKWSIENVSFGSGGALLQKLTRDLLNCSFKCSYVVTNGLGVNVFKDPVADPNKRSKKGRLSLHRTPAGTFVTLEEGKGDLEEYGHDLLHTVFKNGKVTKSYSFDEVRKNAQLNMEQDVAPH. The pIC50 is 7.6. (2) The small molecule is O=C(O)CCNc1cc(N2CCc3ccccc3CC2)nc(-c2ccccn2)n1. The target protein sequence is MEPGSDDFLPPPECPVFEPSWAEFRDPLGYIAKIRPIAEKSGICKIRPPADWQPPFAVEVDNFRFTPRIQRLNELEAQTRVKLNYLDQIAKFWEIQGSSLKIPNVERRILDLYSLSKIVVEEGGYEAICKDRRWARVAQRLNYPPGKNIGSLLRSHYERIVYPYEMYQSGANLVQCNTRPFDNEEKDKEYKPHSIPLRQSVQPSKFNSYGRRAKRLQPDPEPTEEDIEKNPELKKLQIYGAGPKMMGLGLMAKDKTLRKKDKEGPECPPTVVVKEELGGDVKVESTSPKTFLESKEELSHSPEPCTKMTMRLRRNHSNAQFIESYVCRMCSRGDEDDKLLLCDGCDDNYHIFCLLPPLPEIPKGVWRCPKCVMAECKRPPEAFGFEQATREYTLQSFGEMADSFKADYFNMPVHMVPTELVEKEFWRLVNSIEEDVTVEYGADIHSKEFGSGFPVSDSKRHLTPEEEEYATSGWNLNVMPVLEQSVLCHINADISGMKVP.... The pIC50 is 5.8. (3) The drug is COc1ccc2c(c1)N(C)/C(=C1/S/C(=N\Cc3ccccc3)N(c3ccccc3)C1=O)S2. 
The target protein (O14786) has sequence MERGLPLLCAVLALVLAPAGAFRNDKCGDTIKIESPGYLTSPGYPHSYHPSEKCEWLIQAPDPYQRIMINFNPHFDLEDRDCKYDYVEVFDGENENGHFRGKFCGKIAPPPVVSSGPFLFIKFVSDYETHGAGFSIRYEIFKRGPECSQNYTTPSGVIKSPGFPEKYPNSLECTYIVFVPKMSEIILEFESFDLEPDSNPPGGMFCRYDRLEIWDGFPDVGPHIGRYCGQKTPGRIRSSSGILSMVFYTDSAIAKEGFSANYSVLQSSVSEDFKCMEALGMESGEIHSDQITASSQYSTNWSAERSRLNYPENGWTPGEDSYREWIQVDLGLLRFVTAVGTQGAISKETKKKYYVKTYKIDVSSNGEDWITIKEGNKPVLFQGNTNPTDVVVAVFPKPLITRFVRIKPATWETGISMRFEVYGCKITDYPCSGMLGMVSGLISDSQITSSNQGDRNWMPENIRLVTSRSGWALPPAPHSYINEWLQIDLGEEKIVRGIII.... The pIC50 is 4.0. (4) The compound is O=C(CSc1nnc(-c2ccco2)c(-c2ccco2)n1)Nc1ccccc1[N+](=O)[O-]. The target protein sequence is MAPTWGPGMVSVVGPMGLLVVLLVGGCAAEEPPRFIKEPKDQIGVSGRVASFVCQATGDPKPRVTWNKKGKKVNSQRFETIEFDESAGAVLRIQPLRTPRDENVYECVAQNSVGEITVHAKLTVLREDQLPSGFPNIDMGPQLKVVERTRTATMLCAASGNPDPEITWFKDFLPVDPSASNGRIKQLRSGALQIESSEETDQGKYECVATNSAGVRYSSPANLYVRVRRVAPRFSILPMSHEIMPGGNVNITCVAVGSPMPYVKWMQGAEDLTPEDDMPVGRNVLELTDVKDSANYHPCVAMSSLGVIEAVAQITVKSLPKAPGTPMVTENTATSITITWDSGNPDPVSYYVIEYKSKSQDGPYQIKEDITTTRYSIGGLSPNSEYEIWVSAVNSIGQGPPSESVVTRTGEQAPARPPRNVQARMLSATTMIVQWEEPVEPNGLIRGYRVYYTMEPEHPVGNWQKHNVDDSLLTTVGSLLEDETYTVRVLAFTSVGDGPL.... The pIC50 is 5.9. (5) The small molecule is O=[N+]([O-])c1ccc(C2CC(c3ccco3)=Nc3ccccc3S2)cc1. The target protein (P59071) has sequence SLLEFGKMILEETGKLAIPSYSSYGCYCGWGGKGTPKDATDRCCFVHDCCYGNLPDCNPKSDRYKYKRVNGAIVCEKGTSCENRICECDKAAAICFRQNLNTYSKKYMLYPDFLCKGELKC. The pIC50 is 4.2. (6) The small molecule is Oc1c(C(Nc2cccc(F)c2)c2ccccn2)ccc2cccnc12. The target protein (P14174) has sequence MPMFIVNTNVPRASVPDGFLSELTQQLAQATGKPPQYIAVHVVPDQLMAFGGSSEPCALCSLHSIGKIGGAQNRSYSKLLCGLLAERLRISPDRVYINYYDMNAANVGWNNSTFA. The pIC50 is 5.6. (7) The small molecule is CSC[C@@H]1CO[C@@](CCc2ccc(Cl)cc2)(Cn2ccnc2)O1. 
The target protein (P04800) has sequence MDLLSALTLETWVLLAVVLVLLYGFGTRTHGLFKKQGIPGPKPLPFFGTVLNYYMGLWKFDVECHKKYGKIWGLFDGQMPLFAITDTEMIKNVLVKECFSVFTNRRDFGPVGIMGKAVSVAKDEEWKRYRALLSPTFTSGRLKEMFPIIEQYGDILVKYLKQEAETGKPVTMKKVFGAYSMDVITSTSFGVNVDSLNNPKDPFVEKTKKLLRFDFFDPLFLSVVLFPFLTPIYEMLNICMFPKDSIEFFKKFVYRMKETRLDSVQKHRVDFLQLMMNAHNDSKDKESHTALSDMEITAQSIIFIFAGYEPTSSTLSFVLHSLATHPDTQKKLQEEIDRALPNKAPPTYDTVMEMEYLDMVLNETLRLYPIGNRLERVCKKDVEINGVFMPKGSVVMIPSYALHRDPQHWPEPEEFRPERFSKENKGSIDPYVYLPFGNGPRNCIGMRFALMNMKLALTKVLQNFSFQPCKETQIPLKLSRQGLLQPTKPIILKVVPRDEI.... The pIC50 is 5.7.
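The pIC50 definition given above (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names are hypothetical, and the choice of nanomolar as the input unit is an assumption made for illustration; the records here state only the final pIC50 values.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: pIC50 -> IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 values taken from records (1) and (3) above.
    for value in (7.6, 4.0):
        print(f"pIC50 {value} -> IC50 of about {pic50_to_ic50_nm(value):.4g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a higher value means a more potent compound.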