This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P21397) has sequence MENQEKASIAGHMFDVVVIGGGISGLSAAKLLTEYGVSVLVLEARDRVGGRTYTIRNEHVDYVDVGGAYVGPTQNRILRLSKELGIETYKVNVSERLVQYVKGKTYPFRGAFPPVWNPIAYLDYNNLWRTIDNMGKEIPTDAPWEAQHADKWDKMTMKELIDKICWTKTARRFAYLFVNINVTSEPHEVSALWFLWYVKQCGGTTRIFSVTNGGQERKFVGGSGQVSERIMDLLGDQVKLNHPVTHVDQSSDNIIIETLNHEHYECKYVINAIPPTLTAKIHFRPELPAERNQLIQRLPMGAVIKCMMYYKEAFWKKKDYCGCMIIEDEDAPISITLDDTKPDGSLPAIMGFILARKADRLAKLHKEIRKKKICELYAKVLGSQEALHPVHYEEKNWCEEQYSGGCYTAYFPPGIMTQYGRVIRQPVGRIFFAGTETATKWSGYMEGAVEAGERAAREVLNGLGKVTEKDIWVQEPESKDVPAVEITHTFWERNLPSVSG.... The compound is O=C(/C=C/c1ccccc1C(F)(F)F)Oc1ccccc1. The pIC50 is 4.0. (2) The small molecule is C[C@@H]1O[C@H](C)CN2c3c(cc4c(N5C(=O)OC[C@@H]5c5cnccn5)noc4c3F)CC3(C(=O)NC(=O)NC3=O)[C@@H]12. The pIC50 is 6.0. The target protein (P0AES4) has sequence MSDLAREITPVNIEEELKSSYLDYAMSVIVGRALPDVRDGLKPVHRRVLYAMNVLGNDWNKAYKKSARVVGDVIGKYHPHGDSAVYDTIVRMAQPFSLRYMLVDGQGNFGSIDGDSAAAMRYTEIRLAKIAHELMADLEKETVDFVDNYDGTEKIPDVMPTKIPNLLVNGSSGIAVGMATNIPPHNLTEVINGCLAYIDDEDISIEGLMEHIPGPDFPTAAIINGRRGIEEAYRTGRGKVYIRARAEVEVDAKTGRETIIVHEIPYQVNKARLIEKIAELVKEKRVEGISALRDESDKDGMRIVIEVKRDAVGEVVLNNLYSQTQLQVSFGINMVALHHGQPKIMNLKDIIAAFVRHRREVVTRRTIFELRKARDRAHILEALAVALANIDPIIELIRHAPTPAEAKTALVANPWQLGNVAAMLERAGDDAARPEWLEPEFGVRDGLYYLTEQQAQAILDLRLQKLTGLEHEKLLDEYKELLDQIAELLRILGSADRLME.... (3) The compound is O=C(CCc1ccc(Cl)c(Cl)c1)N[C@@H](Cc1ccccc1)C(=O)NC(CCS)C(=O)O. 
The target protein (P14925) has sequence MAGRARSGLLLLLLGLLALQSSCLAFRSPLSVFKRFKETTRSFSNECLGTIGPVTPLDASDFALDIRMPGVTPKESDTYFCMSMRLPVDEEAFVIDFKPRASMDTVHHMLLFGCNMPSSTGSYWFCDEGTCTDKANILYAWARNAPPTRLPKGVGFRVGGETGSKYFVLQVHYGDISAFRDNHKDCSGVSVHLTRVPQPLIAGMYLMMSVDTVIPPGEKVVNADISCQYKMYPMHVFAYRVHTHHLGKVVSGYRVRNGQWTLIGRQNPQLPQAFYPVEHPVDVTFGDILAARCVFTGEGRTEATHIGGTSSDEMCNLYIMYYMEAKYALSFMTCTKNVAPDMFRTIPAEANIPIPVKPDMVMMHGHHKEAENKEKSALMQQPKQGEEEVLEQGDFYSLLSKLLGEREDVHVHKYNPTEKTESGSDLVAEIANVVQKKDLGRSDAREGAEHEEWGNAILVRDRIHRFHQLESTLRPAESRAFSFQQPGEGPWEPEPSGDFH.... The pIC50 is 7.3. (4) The compound is Cc1cc(C(O)(c2ccc(Nc3ccccc3)cc2)c2ccc(Nc3ccc(S(=O)(=O)[O-])cc3)cc2)ccc1N. The target protein sequence is MSTNGKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTWERSQPRGRRQPIPKARQPEGRAWAQPGYPWPLYGNEGLGWAGWLVSPRGSRPNWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGVARALAHGVRVLEDGVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVHNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCVPCVRENNSSRCWVALTPTLAARNNSVPTATIRRHVDLLVGAAAFCSAMYVGDLCGSVFLVSQLFTFSPRRYETVQDCNCSIYPGHVTGHRMAWDMMMNWSPTTALVVSQLLRIPQAVVDMVGGAHWGVLAGLAYYSMVGNWAKVLIVMLLFAGVDGSTIVSGGTVARTTHSLASLFTQGASQKIQLINTNGSWHINRTALNCNDSLQTGFLASLFYAHRFNASGCPERMASCRSIDKFDQGWGPITYTEADIQDQRPYCWHYAPRPCGIVPAS.... The pIC50 is 4.9. (5) The drug is CNC(=O)c1ccc(-c2nccnc2C2CCN(c3ccc4cc(F)ccc4n3)CC2)cc1F. The target protein (O00408) has sequence MGQACGHSILCRSQQYPAARPAEPRGQQVFLKPDEPPPPPQPCADSLQDALLSLGSVIDISGLQRAVKEALSAVLPRVETVYTYLLDGESQLVCEDPPHELPQEGKVREAIISQKRLGCNGLGFSDLPGKPLARLVAPLAPDTQVLVMPLADKEAGAVAAVILVHCGQLSDNEEWSLQAVEKHTLVALRRVQVLQQRGPREAPRAVQNPPEGTAEDQKGGAAYTDRDRKILQLCGELYDLDASSLQLKVLQYLQQETRASRCCLLLVSEDNLQLSCKVIGDKVLGEEVSFPLTGCLGQVVEDKKSIQLKDLTSEDVQQLQSMLGCELQAMLCVPVISRATDQVVALACAFNKLEGDLFTDEDEHVIQHCFHYTSTVLTSTLAFQKEQKLKCECQALLQVAKNLFTHLDDVSVLLQEIITEARNLSNAEICSVFLLDQNELVAKVFDGGVVDDESYEIRIPADQGIAGHVATTGQILNIPDAYAHPLFYRGVDDSTGFRTR.... The pIC50 is 5.0. (6) The compound is O=c1cc(N2CCOCC2)oc2c(-c3cccc4c3sc3ccccc34)cccc12. 
The target protein sequence is MAGSGAGVRCSLLRLQETLSAADRCGAALAGHQLIRGLGQECVLSSSPAVLALQTSLVFSRDFGLLVFVRKSLNSIEFRECREEILKFLCIFLEKMGQKIAPYSVEIKNTCTSVYTKDRAAKCKIPALDLLIKLLQTFRSSRLMDEFKIGELFSKFYGELALKKKIPDTVLEKVYELLGLLGEVHPSEMINNAENLFRAFLGELKTQMTSAVREPKLPVLAGCLKGLSSLLCNFTKSMEEDPQTSREIFNFVLKAIRPQIDLKRYAVPSAGLRLFALHASQFSTCLLDNYVSLFEVLLKWCAHTNVELKKAALSALESFLKQVSNMVAKNAEMHKNKLQYFMEQFYGIIRNVDSNNKELSIAIRGYGLFAGPCKVINAKDVDFMYVELIQRCKQMFLTQTDTGDDRVYQMPSFLQSVASVLLYLDTVPEVYTPVLEHLVVMQIDSFPQYSPKMQLVCCRAIVKVFLALAAKGPVLRNCISTVVHQGLIRICSKPVVLPKG.... The pIC50 is 7.9. (7) The drug is COCc1cc(C(=O)NC(Cc2ccc(-c3ccccc3)cc2)C(=O)NCCN(C)C)ccc1OC. The pIC50 is 4.0. The target protein (Q9BW19) has sequence MDPQRSPLLEVKGNIELKRPLIKAPSQLPLSGSRLKRRPDQMEDGLEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTAQKVSKKTGPRCSTAIATGLKNQKPVPAVPVQKSGTSGVPPMAGGKKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQENQQLQDQLRDAQQQVKALGTERTTLEGHLAKVQAQAEQGQQELKNLRACVLELEERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEAALSSSQAEVASLRQETVAQAALLTEREERLHGLEMERRRLHNQLQELKGNIRVFCRVRPVLPGEPTPPPGLLLFPSGPGGPSDPPTRLSLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIAMLVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQLEGLIPRALRHLFSVAQELSGQGWTYSFVASYVEIYNETVRDLLATGTRKGQGGECEIRRAGPGSEELTVT.... (8) The compound is CCCCCOc1ccc(-c2ccc(-c3ccc(C(=O)N[C@@H]4C[C@@H](O)[C@@H](O)NC(=O)C5[C@@H](OC)[C@@H](OC)CN5C(=O)[C@H]([C@@H](C)O)NC(=O)[C@H]([C@H](O)[C@@H](O)c5ccc(O)cc5)NC(=O)[C@@H]5C[C@@H](O)CN5C(=O)[C@H]([C@@H](C)O)NC4=O)cc3)cc2)cc1. The target protein sequence is MSYNDNNNHYYDPNQQGGMPPHQGGEGYYQQQYDDMGQQPHQQDYYDPNAQYQQQPYDMDGYQDQANYGGQPMNAQGYNADPEAFSDFSYGGQTPGTPGYDQYGTQYTPSQMSYGGDPRSSGASTPIYGGQGQGYDPTQFNMSSNLPYPAWSADPQAPIKIEHIEDIFIDLTNKFGFQRDSMRNMFDYFMTLLDSRSSRMSPAQALLSLHADYIGGDNANYRKWYFSSQQDLDDSLGFANMTLGKIGRKARKASKKSKKARKAAEEHGQDVDALANELEGDYSLEAAEIRWKAKMNSLTPEERVRDLALYLLIWGEANQVRFTPECLCYIYKSATDYLNSPLCQQRQEPVPEGDYLNRVITPLYRFIRSQVYEIYDGRFVKREKDHNKVIGYDDVNQLFWYPEGISRIIFEDGTRLVDIPQEERFLKLGEVEWKNVFFKTYKEIRTWLHFVTNFNRIWIIHGTIYWMYTAYNSPTLYTKHYVQTINQQPLASSRWAACAI.... The pIC50 is 7.7.
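The pIC50 definition used in this dataset, pIC50 = -log10(IC50 in M), can be sketched in Python as below. This is a minimal illustration only. The function names are hypothetical and not part of BindingDB or of any dataset loader, and the nanomolar helper assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nM (assumed unit; 1 nM = 1e-9 M)."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels taken from the examples above: 4.0, 6.0, 7.3
    for label in (4.0, 6.0, 7.3):
        print(f"pIC50 {label} -> IC50 of {ic50_molar_from_pic50(label):.3g} M")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: the label 4.0 in example (1) corresponds to an IC50 of about 100 micromolar, while 6.0 in example (2) corresponds to about 1 micromolar.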