From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1ccc(NC(=O)c2ccnc(C(C)(C)C#N)c2)cc1-c1cnc(OCCO)c(N2CCOCC2)c1. The target protein sequence is MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADPAIPEEVWNIKQMIKLTQEHIEALLDKFGGEHNPPSIYLEAYEEYTSKLDALQQREQQLLESLGNGTDFSVSSSASMDTVTSSSSSSLSVLPSSLSVFQNPTDVARSNPKSPQKPIVRVFLPNKQRTVVPARCGVTVRDSLKKALMMRGLIPECCAVYRIQDGEKKPIGWDTDISWLTGEELHVEVLENVPLTTHNFVRKTFFTLAFCDFCRKLLFQGFRCQTCGYKFHQRCSTEVPLMCVNYDQLDLLFVSKFFEHHPIPQEEASLAETALTSGSSPSAPASDSIGPQILTSPSPSKSIPIPQPFRPADEDHRNQFGQRDRSSSAPNVHINTIEPVNIDDLIRDQGFRGDGGSTTGLSATPPASLPGSLTNVKALQKSPGPQRERKSSSSSEDRNRMKTLGRRDSSDDWEIPDGQITVGQRIGSGSFGTVYKGKWHGDVAVKMLNVTAPTPQQLQAFKN.... The pIC50 is 8.5. (2) The compound is c1ccc(-c2cccc3[nH]c(-c4n[nH]c5cnc(-c6cncc(CNCC7CCCC7)c6)cc45)nc23)nc1. The target protein (P04628) has sequence MGLWALLPGWVSATLLLALAALPAALAANSSGRWWGIVNVASSTNLLTDSKSLQLVLEPSLQLLSRKQRRLIRQNPGILHSVSGGLQSAVRECKWQFRNRRWNCPTAPGPHLFGKIVNRGCRETAFIFAITSAGVTHSVARSCSEGSIESCTCDYRRRGPGGPDWHWGGCSDNIDFGRLFGREFVDSGEKGRDLRFLMNLHNNEAGRTTVFSEMRQECKCHGMSGSCTVRTCWMRLPTLRAVGDVLRDRFDGASRVLYGNRGSNRASRAELLRLEPEDPAHKPPSPHDLVYFEKSPNFCTYSGRLGTAGTAGRACNSSSPALDGCELLCCGRGHRTRTQRVTERCNCTFHWCCHVSCRNCTHTRVLHECL. The pIC50 is 6.2. (3) The drug is COC(=O)[C@H](NC(=O)[C@H](Cc1ccccc1)NS(=O)(=O)N1CCOCC1)C(=O)N[C@@H](CC1CCCCC1)[C@@H](O)[C@@H](O)CC(C)C. 
The target protein (P80209) has sequence VIRIPLHKFTSIRRTMSEAAGVLIAKGPISKYATGEPAVRQGPIPELLKNYMDAQYYGEIGIGTPPQCFTVVFDTGSANLWVPSIHCKLLDIACWTHRKYNSDKSSTYVKNGTTFDIHYGSGSLSGYLSQDTVSVPCNPSSSSPGGVTVQRQTFGEAIKQPGVVFIAAKFDGILGMAYPRISVNNVLPVFDNLMQQKLVDKNVFSFFLNRDPKAQPGGELMLGGTDSKYYRGSLMFHNVTRQAYWQIHMDQLDVGSSLTVCKGGCEAIVDTGTSLIVGPVEEVRELQKAIGAVPLIQGEYMIPCEKVSSLPEVTVKLGGKDYALSPEDYALKVSQAETTVCLSGFMGMDIPPPGGPLWILGDVFIGRYYTVFDRDQNRVGLAEAARL. The pIC50 is 7.5. (4) The compound is O=C(O)CCC(=O)N1N=C(c2c(-c3cc(F)cc(F)c3)c3ccccc3[nH]c2=O)CC1c1ccc(Cl)cc1. The target protein (Q62645) has sequence MRGAGGPRGPRGPAKMLLLLALACASPFPEEVPGPGAVGGGTGGARPLNVALVFSGPAYAAEAARLGPAVAAAVRSPGLDVRPVALVLNGSDPRSLVLQLCDLLSGLRVHGVVFEDDSRAPAVAPILDFLSAQTSLPIVAVHGGAALVLTPKEKGSTFLQLGSSTEQQLQVIFEVLEEYDWTSFVAVTTRAPGHRAFLSYIEVLTDGSLVGWEHRGALTLDPGAGEAVLGAQLRSVSAQIRLLFCAREEAEPVFRAAEEAGLTGPGYVWFMVGPQLAGGGGSGVPGEPLLLPGGSPLPAGLFAVRSAGWRDDLARRVAAGVAVVARGAQALLRDYGFLPELGHDCRTQNRTHRGESLHRYFMNITWDNRDYSFNEDGFLVNPSLVVISLTRDRTWEVVGSWEQQTLRLKYPLWSRYGRFLQPVDDTQHLTVATLEERPFVIVEPADPISGTCIRDSVPCRSQLNRTHSPPPDAPRPEKRCCKGFCIDILKRLAHTIGFSY.... The pIC50 is 6.1. (5) The drug is [NH3+][Pt]([NH3+])(Cl)[n+]1ccccc1. The target protein (P0A7G6) has sequence MAIDENKQKALAAALGQIEKQFGKGSIMRLGEDRSMDVETISTGSLSLDIALGAGGLPMGRIVEIYGPESSGKTTLTLQVIAAAQREGKTCAFIDAEHALDPIYARKLGVDIDNLLCSQPDTGEQALEICDALARSGAVDVIVVDSVAALTPKAEIEGEIGDSHMGLAARMMSQAMRKLAGNLKQSNTLLIFINQIRMKIGVMFGNPETTTGGNALKFYASVRLDIRRIGAVKEGENVVGSETRVKVVKNKIAAPFKQAEFQILYGEGINFYGELVDLGVKEKLIEKAGAWYSYKGEKIGQGKANATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF. The pIC50 is 4.3. (6) The small molecule is CCCCCCCP(=O)(O)OCC. The target protein (P9WQN9) has sequence MTFFEQVRRLRSAATTLPRRLAIAAMGAVLVYGLVGTFGGPATAGAFSRPGLPVEYLQVPSASMGRDIKVQFQGGGPHAVYLLDGLRAQDDYNGWDINTPAFEEYYQSGLSVIMPVGGQSSFYTDWYQPSQSNGQNYTYKWETFLTREMPAWLQANKGVSPTGNAAVGLSMSGGSALILAAYYPQQFPYAASLSGFLNPSEGWWPTLIGLAMNDSGGYNANSMWGPSSDPAWKRNDPMVQIPRLVANNTRIWVYCGNGTPSDLGGDNIPAKFLEGLTLRTNQTFRDTYAADGGRNGVFNFPPNGTHSWPYWNEQLVAMKADIQHVLNGATPPAAPAAPAA. 
The pIC50 is 6.0. (7) The compound is O=c1[nH]c(=O)n([C@H]2C[C@H](O)[C@@H](COP(=O)([O-])[O-])O2)cc1-c1ccc(F)cc1. The target protein (P9WFR9) has sequence MTPYEDLLRFVLETGTPKSDRTGTGTRSLFGQQMRYDLSAGFPLLTTKKVHFKSVAYELLWFLRGDSNIGWLHEHGVTIWDEWASDTGELGPIYGVQWRSWPAPSGEHIDQISAALDLLRTDPDSRRIIVSAWNVGEIERMALPPCHAFFQFYVADGRLSCQLYQRSADLFLGVPFNIASYALLTHMMAAQAGLSVGEFIWTGGDCHIYDNHVEQVRLQLSREPRPYPKLLLADRDSIFEYTYEDIVVKNYDPHPAIKAPVAV. The pIC50 is 4.3. (8) The drug is C/C=C/C1=CC2=CC(=O)[C@@](C)(O)[C@@H](OC(=O)c3c(C)cc(O)cc3O)[C@@H]2CO1. The target protein (P24547) has sequence MADYLISGGTSYVPDDGLTAQQLFNCGDGLTYNDFLILPGYIDFTADQVDLTSALTKKITLKTPLVSSPMDTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDIDFLKEEEHDRFLEEIMTKREDLVVAPAGVTLKEANEILQRSKKGKLPIVNENDELVAIIARTDLKKNRDYPLASKDAKKQLLCGAAIGTHEDDKYRLDLLALAGVDVVVLDSSQGNSIFQINMIKYIKEKYPSLQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQATAVYKVSEYARRFGVPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAMDKHLSSQNRYFSEADKIKVAQGVSGAVQDKGSIHKFVPYLIAGIQHSCQDIGAKSLTQVRAMMYSGELKFEKRTSSAQVE.... The pIC50 is 3.7.
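
The pIC50 definition used in the task description above, pIC50 = -log10(IC50 in M), can be sketched in Python as follows. The function names are hypothetical and are not part of BindingDB or of the bindingdb_ic50 dataset; the only assumption taken from the text is that IC50 is expressed in mol/L before the logarithm is applied.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 labels taken from examples (1) and (8) above.
    for label in (8.5, 3.7):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, each unit of pIC50 is a tenfold change in IC50: the label 8.5 in example (1) corresponds to an IC50 of roughly 3.2 nM, while the label 3.7 in example (8) corresponds to roughly 200 µM.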