From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)(C)c1cc(NC(=O)Nc2ccc(-n3nc(-c4ccc(OCCN5CCOCC5)cc4)c4c(N)ncnc43)cc2)no1. The target protein sequence is MGDEKDSWKVKTLDEILQEKKRRKEQEEKAEIKRLKNSDDRDSKRDSLEEGELRDHRMEITIRNSPYRREDSMEDRGEEDDSLAIKPPQQMSRKEKAHHRKDEKRKEKRRHRSHSAEGGKHARVKEKEREHERRKRHREEQDKARREWERQKRREMAREHSRRERDRLEQLERKRERERKMREQQKEQREQKERERRAEERRKEREARREVSAHHRTMREDYSDKVKASHWSRSPPRPPRERFELGDGRKPGEARPAPAQKPAQLKEEKMEERDLLSDLQDISDSERKTSSAESSSAESGSGSEEEEEEEEEEEEEGSTSEESEEEEEEEEEEEEETGSNSEEASEQSAEEVSEEEMSEDEERENENHLLVVPESRFDRDSGESEEAEEEVGEGTPQSSALTEGDYVPDSPALSPIELKQELPKYLPALQGCRSVEEFQCLNRIEEGTYGVVYRAKDKKTDEIVALKRLKMEKEKEGFPITSLREINTILKAQHPNIVTV.... The pIC50 is 7.4. (2) The pIC50 is 5.0. The target protein sequence is MQTVGVHSIVQQLHRNSIQFTDGYEVKEDIGVGSYSVVKRCIHKATNMEFAVKIIDKSKRDPTEEIEILLRYGQHPNIITLKDVYDDGKYVYVVTELMKGGELLDKILRQKFFSEREASAVLFTITKTVEYLHAQGVVHRDLKPSNILYVDESGNPESIRICDFGFAKQLRAENGLLMTPCYTANFVAPEVLKRQGYDAACDIWSLGVLLYTMLTGYTPFANGPDDTPEEILARIGSGKFSLSGGYWNSVSDTAKDLVSKMLHVDPHQRLTAALVLRHPWIVHWDQLPQYQLNRQDAPHLVKGAMAATYSALNRNQSPVLEPVGRSTLAQRRGIKKITSTAL. The drug is Cc1ccc(-c2c(/C=C(\C#N)C(=O)OC(C)(C)C)n(CCCO)c3ncnc(N)c23)cc1. (3) The drug is C/C=C(C)/C=C(\C)[C@H]1C(C)=C[C@@]2(C)C[C@@H](C)CC[C@@H]2[C@H]1C(=O)C1=CC(O)(Cc2ccccc2)NC1=O. 
The target protein (O60883) has sequence MRWLWPLAVSLAVILAVGLSRVSGGAPLHLGRHRAETQEQQSRSKRGTEDEEAKGVQQYVPEEWAEYPRPIHPAGLQPTKPLVATSPNPGKDGGTPDSGQELRGNLTGAPGQRLQIQNPLYPVTESSYSAYAIMLLALVVFAVGIVGNLSVMCIVWHSYYLKSAWNSILASLALWDFLVLFFCLPIVIFNEITKQRLLGDVSCRAVPFMEVSSLGVTTFSLCALGIDRFHVATSTLPKVRPIERCQSILAKLAVIWVGSMTLAVPELLLWQLAQEPAPTMGTLDSCIMKPSASLPESLYSLVMTYQNARMWWYFGCYFCLPILFTVTCQLVTWRVRGPPGRKSECRASKHEQCESQLNSTVVGLTVVYAFCTLPENVCNIVVAYLSTELTRQTLDLLGLINQFSTFFKGAITPVLLLCICRPLGQAFLDCCCCCCCEECGGASEASAANGSDNKLKTEVSSSIYFHKPRESPPLLPLGTPC. The pIC50 is 2.6. (4) The compound is CNC(=O)NC(=N)c1ccc(OCCCCCCCOc2ccc(C(=N)NC(=O)NC)cc2)cc1. The target protein (P04631) has sequence MSELEKAMVALIDVFHQYSGREGDKHKLKKSELKELINNELSHFLEEIKEQEVVDKVMETLDEDGDGECDFQEFMAFVSMVTTACHEFFEHE. The pIC50 is 3.0. (5) The compound is COc1ccc2oc(CS(=O)c3ccccc3)c(C(=O)O)c2c1. The target protein (Q9Y239) has sequence MEEQGHSEMEIIPSESHPHIQLLKSNRELLVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPTQPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAYVDLRPWLLEIGFSPSLLTQSKVVVNTDPVSRYTQQLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIMELVGFSNESLGSLNSLACLLDHTTGILNEQGETIFILGDAGVGKSMLLQRLQSLWATGRLDAGVKFFFHFRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEEVFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDSSCPWEPAHPLVLLANLLSGKLLKGASKLLTARTGIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPERALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQHFRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRMQPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHRGMEKSLFVFTQEEVQASGLQERDMQLGFLRA.... The pIC50 is 4.7. (6) The small molecule is N#CC(CCCN1CCC(C(N)=O)CC1)(c1ccccc1)c1ccccc1. The target protein (P69332) has sequence MTPTTTTAELTTEFDYDEDATPCVFTDVLNQSKPVTLFLYGVVFLFGSIGNFLVIFTITWRRRIQCSGDVYFINLAAADLLFVCTLPLWMQYLLDHNSLASVPCTLLTACFYVAMFASLCFITEIALDRYYAIVYMRYRPVKQACLFSIFWWIFAVIIAIPHFMVVTKKDNQCMTDYDYLEVSYPIILNVELMLGAFVIPLSVISYCYYRISRIVAVSQSRHKGRIVRVLIAVVLVFIIFWLPYHLTLFVDTLKLLKWISSSCEFERSLKRALILTESLAFCHCCLNPLLYVFVGTKFRQELHCLLAEFRQRLFSRDVSWYHSMSFSRRSSPSRRETSSDTLSDEVCRVSQIIP. The pIC50 is 4.0.
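
The pIC50 definition used in these records (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal conversion sketch. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of any dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 values taken from records (1) and (2) above.
    for pic50 in (7.4, 5.0):
        print(f"pIC50 {pic50} corresponds to IC50 of about {pic50_to_ic50_nm(pic50):.1f} nM")
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why a higher value means a more potent compound.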