This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CN(Cc1ccccc1)c1ccc(-c2cccc3nc(NC(=O)C4CC4)nn23)cc1. The target protein sequence is EQNPDIVSEKKPATEVDPTHFEKRFLKRIRDLGEGHFGKVELCRYDPEGDNTGEQVAVKSLKPESGGNHIADLKKEIEILRNLYHENIVKYKGICTEDGGNGIKLIMEFLPSGSLKEYLPKNKNKINLKQQLKYAVQICKGMDYLGSRQYVHRDLAARNVLVESEHQVKIGDFGLTKAIETDKEYYTVKDDRDSPVFWYAPECLMQSKFYIASDVWSFGVTLHELLTYCDSDSSPMALFLKMIGPTHGQMTVTRLVNTLKEGKRLPCPPNCPDEVYQLMRKCWEFQPSNRTSFQNLIEGFEALLK. The pIC50 is 7.0. (2) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CCSC)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@@H]1CCC(=O)N1)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)CC)[C@@H](C)O)C(=O)O. 
The target protein sequence is MDAALLHSLLEANCSLALAEELLLDGWGMSLDPEGRYFYCNTTLDQIGTCWPRSAAGALVERPCPEYFNGIKYNTTRNAYRECLENGTWASRINYSQCEPILDDKQRKYDLHYRIALVVNYLGHCVSVAALVAAFLIFLALRSIRCLRNVIHWNLIATFILRNVLWFLLQLIDHEVHESNEVWCRCITTVFNYFVVTNFFWMFAEGCYLHTAIVMTYSTERLRKWLFLFIGWCVPCPIIIAWAIGKLYYENKQCWFGKEPGDLVDYIYQGPIILVLLINFIFLFNIVRILMTKLRASTTSETIQYRKAVKATLVLLPLLGITYMLFFVSPGEDELSQIVFIYFNSFLQSFQGFFVSVFYCFFNGEVRAAVRKRWHRWQDHHSLRVPVARAMSIPTSPTRISFHSIKQTAAV. The pIC50 is 9.2. (3) The drug is CCC(C)N1C(=O)C2(N=C1N)c1cc(C#N)ccc1C[C@]21CC[C@@H](OC)CC1. The target protein sequence is MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWCCLRCLRQQHDDFADDISLLK. The pIC50 is 6.6. (4) The compound is Cc1c(C(=O)Nc2ccc(Oc3ccnc4c3CNC(=O)N4)c(F)c2)cnn1-c1ccccc1. The target protein sequence is RRKQLVLPPNLNDLASLDQTAGATPLPILYSGSDYRSGLALPAIDGLDSTTCVHGASFSDSEDESCVPLLRKESIQLRDLDSALLAEVKDVLIPHERVVTHSDRVIGKGHFGVVYHGEYIDQAQNRIQCAIKSLSRITEMQQVEAFLREGLLMRGLNHPNVLALIGIMLPPEGLPHVLLPYMCHGDLLQFIRSPQRNPTVKDLISFGLQVARSMEYLAEQKFVHRDLAARNCMLDESFTVKVADFGLARDILDREYYSVQQHRHARLPVKWMALESLQTYRFTTKSDVWSFGVLLWELLTRGAPPYRHIDPFDLTHFLAQGRRLPQPEYCPDSLYQVMQQCWEADPAVRPTFRVLVGEVEQIVSALLGDHYVQLPATYMNLGPSTSHEMNVRPEQPQFSPMPGNVRRPRPLSEPPRPT. The pIC50 is 8.0. (5) The compound is O=C(N[C@@H](c1ccc(C(F)(F)F)c(F)c1)[C@@H]1CCCN1)c1ccc2cnc(NC3CCOCC3)nc2c1. 
The target protein sequence is GAGPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNVNKVRVAIKKISPFEHQTYCQRTLREIKILLRFRHENIIGINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDLKPSNLLLNTTCDLKICDFGLARVADPDHDHTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLSNRPIFPGKHYLDQLNHILGILGSPSQEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHKRIEVEQALAHPYLEQYYDPSDEPIAEAPFKFDMELDDLPKEKLKELIFEETARFQPGYRS. The pIC50 is 8.7. (6) The compound is Cc1nnc(C)n1-c1ccc(O[C@H]2c3cc(Cl)cc(Cl)c3C[C@@H]2N2CC[C@@H](O)C2)c(F)c1.Cl. The target protein (P26433) has sequence MWHPALGPGWKPLLALAVAVTSLRGVRGIEEEPNSGGSFQIVTFKWHHVQDPYIIALWILVASLAKIVFHLSHKVTSVVPESALLIVLGLVLGGIVWAADHIASFTLTPTLFFFYLLPPIVLDAGYFMPNRLFFGNLGTILLYAVIGTIWNAATTGLSLYGVFLSGLMGELKIGLLDFLLFGSLIAAVDPVAVLAVFEEVHVNEVLFIIVFGESLLNDAVTVVLYNVFESFVTLGGDAVTGVDCVKGIVSFFVVSLGGTLVGVIFAFLLSLVTRFTKHVRIIEPGFVFVISYLSYLTSEMLSLSAILAITFCGICCQKYVKANISEQSATTVRYTMKMLASGAETIIFMFLGISAVDPVIWTWNTAFVLLTLVFISVYRAIGVVLQTWILNRYRMVQLETIDQVVMSYGGLRGAVAYALVVLLDEKKVKEKNLFVSTTLIVVFFTVIFQGLTIKPLVQWLKVKRSEQREPKLNEKLHGRAFDHILSAIEDISGQIGHNYL.... The pIC50 is 8.0. (7) The compound is C=C(CCCC)[C@@H](O)[C@H](C)C(=O)N1C(=O)OC[C@@H]1C(C)C. The target protein (P40189) has sequence MLTLQTWLVQALFIFLTTESTGELLDPCGYISPESPVVQLHSNFTAVCVLKEKCMDYFHVNANYIVWKTNHFTIPKEQYTIINRTASSVTFTDIASLNIQLTCNILTFGQLEQNVYGITIISGLPPEKPKNLSCIVNEGKKMRCEWDGGRETHLETNFTLKSEWATHKFADCKAKRDTPTSCTVDYSTVYFVNIEVWVEAENALGKVTSDHINFDPVYKVKPNPPHNLSVINSEELSSILKLTWTNPSIKSVIILKYNIQYRTKDASTWSQIPPEDTASTRSSFTVQDLKPFTEYVFRIRCMKEDGKGYWSDWSEEASGITYEDRPSKAPSFWYKIDPSHTQGYRTVQLVWKTLPPFEANGKILDYEVTLTRWKSHLQNYTVNATKLTVNLTNDRYLATLTVRNLVGKSDAAVLTIPACDFQATHPVMDLKAFPKDNMLWVEWTTPRESVKKYILEWCVLSDKAPCITDWQQEDGTVHRTYLRGNLAESKCYLITVTPVY.... The pIC50 is 5.0. (8) The small molecule is O=C(Nc1ccc(Oc2ncnc3n[nH]cc23)cc1)c1ccc(Br)cc1. 
The target protein (Q5GIT4) has sequence MAKTSYALLLLDILLTFNVAKAIELRFVPDPPTLNITEKTIKINASDTLQITCRGRQILEWSTPHNRTSSETRLTISDCSGDGLFCSTLTLSKAVANETGEYRCFYKSLPKEDGKTSVAVYVFIQDYRTPFVRIAQDYDVVFIREGEQVVIPCLVSVEDLNVTLYTKYPVKELSTDGKEVIWDSRRGFILPSRVVSYAGVVYCQTTIRNETFQSSPYIVAVVGYKIYDLTLSPQHERLTVGERLILNCTAHTELNVGIDFQWTFPHEKRSVNGSMSTSRYKTSSNKKKLWNSLELSNTLTVENVTLNDTGEYICTASSGQMQKIAQASLIVYEKPFIALSDQLWQTVEAKAGDAEAKILVKYYAYPEPAVRWYKNDQLIVLRDEYRMKFYRGVHLTIYGVTEKDAGNYTVVMTNKITKEEQRRTFQLVVNDLPRIFEKDVSLDRDVHMYGSSPTLTCTASGGSSPVTIKWQWMPREDCPVRFLPKSDTRMAKCDKWREMS.... The pIC50 is 4.1.
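
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion routine. This is an illustrative sketch, not BindingDB's own processing code: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar input is an assumption, since the record does not state the unit of the raw IC50 values.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in M).

    Assumption: the input is in nM; 1 nM = 1e-9 M, so
    -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Inverse conversion: pIC50 back to an IC50 in nanomolar."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    for label, pic50 in [("example 1", 7.0), ("example 4", 8.0)]:
        print(f"{label}: pIC50 {pic50} ~ IC50 {pic50_to_ic50_nm(pic50):g} nM")
```

Under this convention, one pIC50 unit corresponds to a tenfold change in IC50, so the pIC50 of 8.0 reported for example (4) reflects a tenfold lower IC50 than the 7.0 reported for example (1).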