This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCOC(=O)c1ccc(-c2[nH]c(-c3ccccc3)cc2-c2ccncc2)cc1. The target protein (P70618) has sequence MSQERPTFYRQELNKTVWEVPERYQNLSPVGSGAYGSVCAAFDTKTGHRVAVKKLSRPFQSIIHAKRTYRELRLLKHMKHENVIGLLDVFTPARSLEEFNDVYLVTHLMGADLNNIVKCQKLTDDHVQFLIYQILRGLKYIHSADIIHRDLKPSNLAVNEDCELKILDFGLARHTDDEMTGYVATRWYRAPEIMLNWMHYNQTVDIWSVGCIMAELLTGRTLFPGTDHIDQLKLILRLVGTPGAELLKKISSESARNYIQSLAQMPKMNFANVFIGANPLAVDLLEKMLVLDSDKRITAAQALAHAYFAQYHDPDDEPVAEPYDQSFESRDFLIDEWKSLTYDEVISFVPPPLDQEEMES. The pIC50 is 5.0. (2) The small molecule is Cn1cnc(-c2cc(-c3c[nH]nn3)ccn2)c1-c1ccc(F)cc1. The target protein sequence is MASVPVYCLCRLPYDVTRFMIECDMCQDWFHGSCVGVEEEKAADIDLYHCPNCEVLHGPSIMKKRRGSSKGHDTHKGKPVKTGSPTFVRELRSRTFDSSDEVILKPTGNQLTVEFLEENSFSVPILVLKKDGLGMTLPSPSFTVRDVEHYVGSDKEIDVIDVTRQADCKMKLGDFVKYYYSGKREKVLNVISLEFSDTRLSNLVETPKIVRKLSWVENLWPEECVFERPNVQKYCLMSVRDSYTDFHIDFGGTSVWYHVLKGEKIFYLIRPTNANLTLFECWSSSSNQNEMFFGDQVDKCYKCSVKQGQTLFIPTGWIHAVLTPVDCLAFGGNFLHSLNIEMQLKAYEIEKRLSTADLFRFPNFETICWYVGKHILDIFRGLRENRRHPASYLVHGGKALNLAFRAWTRKEALPDHEDEIPETVRTVQLIKDLAREIRLVEDIFQQNVGKTSNIFGLQRIFPAGSIPLTRPAHSTSVSMSRLSLPSKNGSKKKGLKPKEL.... The pIC50 is 6.3. (3) The small molecule is COc1ccc2cc(C3=CN4CCC3CC4)ccc2c1. The target protein sequence is MGFFSDSVAMMRVKWQMRSVKIQVPPEETDLRFCYDIMNDVSRSFAVVVAQLADQQLRDAICIFYLVLRALDTLEDDMSVPVDVKLKELPKFHTHTSDMSWCMSGVGEGRERELLAKYPCVSREFKKLKKEYQDVIANICERMANGMCEFLKRPVVTKDDYNQYCHYVAGLVGHGLTQLFARCGFEDPSLDDDLTSSNHMGLFLQKTNIIRDYYEDIREEPPRMFWPKEIWGTYVTELKELKSESNNAAAVQCLNAMVADALVHVPYIVDYLSALRDPSVFRFCAIPQVMAIATLKEVYNNPDTFQVKVKVSRPESCRIMLKATTLYSSLSMFRDYCVELQEKLDMQDASSVSIANSLAAAIERIDLQLKKCQDVSYTRSLLARYPGLGGQFLLTVMDTVAGFFGGRKEIAGHA. The pIC50 is 6.3.
(4) The compound is CC1=C(CO)C2=C(C)C3(CC3)[C@@](C)(O)C(=O)C2=C1. The target protein (P97584) has sequence MVQAKTWTLKKHFEGFPTDSNFELRTTELPPLNNGEVLLEALFLSVDPYMRVAAKKLKEGDSMMGEQVARVVESKNSAFPTGTIVVALLGWTSHSISDGNGLRKLPAEWPDKLPLSLALGTVGMPGLTAYFGLLDICGLKGGETVLVNAAAGAVGSVVGQIAKLKGCKVVGTAGSDEKVAYLKKLGFDVAFNYKTVKSLEEALRTASPDGYDCYFDNVGGEFSNTVILQMKTFGRIAICGAISQYNRTGPCPPGPSPEVIIYQQLRMEGFIVTRWQGEVRQKALTDLMNWVSEGKIRYHEYITEGFEKMPAAFMGMLKGDNLGKTIVKA. The pIC50 is 7.3. (5) The compound is CC(C)c1n[nH]c(O[C@@H]2O[C@H](CO)[C@@H](O)[C@H](O)[C@H]2O)c1Cc1ccccc1OCc1ccccc1. The target protein (P53790) has sequence MDSSTLSPAVTATDAPIQSYERIRNAADISVIVIYFVVVMAVGLWAMFSTNRGTVGGFFLAGRSMVWWPIGASLFASNIGSGHFVGLAGTGAAAGIAMGGFEWNALVFVVVLGWLFVPIYIKAGVVTMPEYLRKRFGGKRIQIYLSVLSLLLYIFTKISADIFSGAIFINLALGLDIYLAIFILLAITALYTITGGLAAVIYTDTLQTAIMLVGSFILTGFAFREVGGYEAFMDKYMKAIPTLVSDGNITVKEECYTPRADSFHIFRDPITGDMPWPGLIFGLSILALWYWCTDQVIVQRCLSAKNMSHVKAGCTLCGYLKLLPMFLMVMPGMISRILYTDKIACVLPSECKKYCGTPVGCTNIAYPTLVVELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIYTKIRKGASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFCKRVNEPGAFWGLILGFLIGISRM.... The pIC50 is 6.5. (6) The drug is O=c1c(O)c(-c2ccc(O)cc2O)oc2cc(O)cc(O)c12. The target protein (P03126) has sequence MHQKRTAMFQDPQERPRKLPQLCTELQTTIHDIILECVYCKQQLLRREVYDFAFRDLCIVYRDGNPYAVCDKCLKFYSKISEYRHYCYSLYGTTLEQQYNKPLCDLLIRCINCQKPLCPEEKQRHLDKKQRFHNIRGRWTGRCMSCCRSSRTRRETQL. The pIC50 is 5.4. (7) The drug is Cc1ccc(NC(=O)c2ccnc(N3CCOCC3)c2)cc1-c1ccc(C(=O)NCC2CC2)cc1. The target protein (P04792) has sequence MTERRVPFSLLRGPSWDPFRDWYPHSRLFDQAFGLPRLPEEWSQWLGGSSWPGYVRPLPPAAIESPAVAAPAYSRALSRQLSSGVSEIRHTADRWRVSLDVNHFAPDELTVKTKDGVVEITGKHEERQDEHGYISRCFTRKYTLPPGVDPTQVSSSLSPEGTLTVEAPMPKLATQSNEITIPVTFESRAQLGGPEAAKSDETAAK. The pIC50 is 7.6.
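
The pIC50 values in the examples follow the definition given in the task description, pIC50 = -log10(IC50 in M). A minimal Python sketch of that conversion is shown here for reference. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced for illustration; they are not part of BindingDB or of any dataset loader.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration only.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    Hypothetical helper for illustration only.
    """
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Example (1) reports pIC50 = 5.0, i.e. an IC50 of about 10 micromolar.
    print(pic50_to_ic50_nm(5.0))
    # Example (7) reports pIC50 = 7.6, i.e. an IC50 of roughly 25 nanomolar.
    print(pic50_to_ic50_nm(7.6))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.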