This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1ccc(-c2c(-c3nn(C)c4ncnc(N5CC[C@H](N6CCCCC6)C5)c34)cnn2C)cc1. The target protein sequence is MDLIPNLAVETWLLLAVSLILLYLYGTRTHGLFKKLGIPGPTPLPFLGNALSFRKGYWTFDMECYKKYRKVWGIYDCQQPMLAITDPDMIKTVLVKECYSVFTNRRPFGPVGFMKNAISIAEDEEWKRIRSLLSPTFTSGKLKEMVPIIAQYGDVLVRNLRREAETGKPVTLKHVFGAYSMDVITSTSFGVSIDSLNNPQDPFVENTKKLLRFNPLDPFVLSIKVFPFLTPILEALNITVFPRKVISFLTKSVKQIKEGRLKETQKHRVDFLQLMIDSQNSKDSETHKALSDLELMAQSIIFIFAGYETTSSVLSFIIYELATHPDVQQKVQKEIDTVLPNKAPPTYDTVLQLEYLDMVVNETLRLFPVAMRLERVCKKDVEINGMFIPKGVVVMIPSYVLHHDPKYWREPEKFLPERFSKKNKDNIDPYIYTPFGSGPRNCIGMRFALVNMKLALVRVLQNFSFKPCKETQIPLKLRFGGLLLTEKPIVLKAESRDETV.... The pIC50 is 4.5. (2) The small molecule is O=C(NCC(F)(F)F)[C@@H]1CN(Cc2cc3ccccc3o2)CCN1C[C@@H](O)C[C@@H](Cc1cccnc1)C(=O)N[C@H]1c2ccccc2OC[C@H]1O. The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEMSLPGKWKPKIIGGIGGFVKVRQYDQVPIEICGHKVIGTVLIGPTPANIIGRNLMTQLGCTLNF. The pIC50 is 8.6. (3) The drug is CCCC(=O)c1cnn(-c2ccc(NC(=O)c3cn(CC(=O)N4C[C@@H]5C[C@H]4CN5C)c4ccc(C)cc34)cc2)c1C. The target protein (Q9EPX4) has sequence MEVPGANATSANTTSIPGTSTLCSRDYKITQVLFPLLYTVLFFAGLITNSLAMRIFFQIRSKSNFIIFLKNTVISDLLMILTFPFKILSDAKLGAGHLRTLVCQVTSVTFYFTMYISISFLGLITIDRYLKTTRPFKTSSPSNLLGAKILSVAIWAFMFLLSLPNMILTNRRPKDKDITKCSFLKSEFGLVWHEIVNYICQVIFWINFLIVIVCYSLITKELYRSYVRTRGSAKAPKKRVNIKVFIIIAVFFICFVPFHFARIPYTLSQTRAVFDCNAENTLFYVKESTLWLTSLNACLDPFIYFFLCKSFRNSLMSMLRCSTSGANKKKGQEGGDPSEETPM. The pIC50 is 5.9. 
(4) The target protein (P0A1V9) has sequence MAIRIFAILFSIFSLATFAHAQEGTLERSDWRKFFSEFQAKGTIVVADERQADRAMLVFDPVRSKKRYSPASTFKIPHTLFALDAGAVRDEFQIFRWDGVNRGFAGHNQDQDLRSAMRNSTVWVYELFAKEIGDDKARRYLKKIDYGNADPSTSNGDYWIEGSLAISAQEQIAFLRKLYRNELPFRVEHQRLVKDLMIVEAGRNWILRAKTGWEGRMGWWVGWVEWPTGSVFFALNIDTPNRMDDLFKREAIVRAILRSIEALPPNPAVNSDAAR. The compound is C[C@]1(/C=C/C#N)[C@H](C(=O)[O-])N2C(=O)C[C@H]2S1(=O)=O. The pIC50 is 6.7. (5) The compound is COc1cc(-c2cnc3[nH]c(=O)n(-c4ccc5[nH]ccc5c4)c3n2)cc(OC)c1OC. The target protein (Q80XI6) has sequence MEPLKNLFLKSPLGSWNGSGSGGGGGTGGVRPEGSPKATAAYANPVWTALFDYEPNGQDELALRKGDRVEVLSRDAAISGDEGWWAGQVGGQVGIFPSNYVSRGGGPPPCEVASFQELRLEEVIGIGGFGKVYRGSWRGELVAVKAARQDPDEDISVTAESVRQEARLFAMLAHPNIIALKAVCLEEPNLCLVMEYAAGGPLSRALAGRRVPPHVLVNWAVQIARGMHYLHCEALVPVIHRDLKSNNILLLQPIEGDDMEHKTLKITDFGLAREWHKTTQMSAAGTYAWMAPEVIKASTFSKGSDVWSFGVLLWELLTGEVPYRGIDCLAVAYGVAVNKLTLPIPSTCPEPFAQLMADCWAQDPHRRPDFASILQQLEALEAQVLREMPRDSFHSMQEGWKREIQGLFDELRAKEKELLSREEELTRAAREQRSQAEQLRRREHLLAQWELEVFERELTLLLQQVDRERPHVRRRRGTFKRSKLRARDGGERISMPLDFK.... The pIC50 is 4.6. (6) The drug is Cc1nnc2sc(-c3ccc(N)cc3)nn12. The target protein (P06493) has sequence MEDYTKIEKIGEGTYGVVYKGRHKTTGQVVAMKKIRLESEEEGVPSTAIREISLLKELRHPNIVSLQDVLMQDSRLYLIFEFLSMDLKKYLDSIPPGQYMDSSLVKSYLYQILQGIVFCHSRRVLHRDLKPQNLLIDDKGTIKLADFGLARAFGIPIRVYTHEVVTLWYRSPEVLLGSARYSTPVDIWSIGTIFAELATKKPLFHGDSEIDQLFRIFRALGTPNNEVWPEVESLQDYKNTFPKWKPGSLASHVKNLDENGLDLLSKMLIYDPAKRISGKMALNHPYFNDLDNQIKKM. The pIC50 is 3.9. (7) The small molecule is CC(=O)Nc1ccc(S(=O)(=O)Nc2nncs2)cc1. The target protein sequence is MSDVAIVKEGWLHKRGEYIKTWRPRYFLLKNDGTFIGYKERPQDVDQREAPLNNFSVAQCQLMKTERPRPNTFIIRCLQWTTVIERTFHVETPEEREEWTTAIQTVADGLKKQEEEEMDFRSG. The pIC50 is 4.3.
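
The pIC50 definition used in this dataset (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset tooling; the function names ic50_to_pic50 and pic50_to_ic50 are hypothetical and chosen here only for the example.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: recover IC50 in mol/L from a pIC50 value."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # A pIC50 of 4.5, as in example (1), is an IC50 of roughly 3.2e-5 M.
    print(f"{pic50_to_ic50(4.5):.2e}")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean a more potent compound.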