This is drug-target binding data from BindingDB, based on IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q8TDV5) has sequence MESSFSFGVILAVLASLIIATNTLVAVAVLLLIHKNDGVSLCFTLNLAVADTLIGVAISGLLTDQLSSPSRPTQKTLCSLRMAFVTSSAAASVLTVMLITFDRYLAIKQPFRYLKIMSGFVAGACIAGLWLVSYLIGFLPLGIPMFQQTAYKGQCSFFAVFHPHFVLTLSCVGFFPAMLLFVFFYCDMLKIASMHSQQIRKMEHAGAMAGGYRSPRTPSDFKALRTVSVLIGSFALSWTPFLITGIVQVACQECHLYLVLERYLWLLGVGNSLLNPLIYAYWQKEVRLQLYHMALGVKKVLTSFLLFLSARNCGPERPRESSCHIVTISSSEFDG. The pIC50 is 8.1. The drug is N#Cc1cnccc1COc1cnc(N2CC3CCC(C2)N3C(=O)OCC(F)(F)F)nc1. (2) The drug is CN(C)c1nc(-c2cccn2-c2ccccc2)nc(N2CCCCCC2)n1. The target protein (Q6GFD7) has sequence MAKTYIFGHKNPDTDAISSAIIMAEFEQLRGNSGAKAYRLGDVSAETQFALDTFNVPAPELLTDDLDGQDVILVDHNEFQQSSDTIASATIKHVIDHHRIANFETAGPLCYRAEPVGCTATILYKMFRERGFEIKPEIAGLMLSAIISDSLLFKSPTCTQQDVKAAEELKDIAKVDIQKYGLDMLKAGASTTDKSVEFLLNMDAKSFTMGDYVTRIAQVNAVDLDEVLNRKEDLEKEMLAVSAQEKYDLFVLVVTDIINSDSKILVVGAEKDKVGEAFNVQLEDDMAFLSGVVSRKKQIVPQITEALTK. The pIC50 is 3.5. (3) The small molecule is C[C@H](c1cccc2ccccc12)N1CCC(C(=O)NCc2ccc3c(c2)OCO3)CC1. The target protein sequence is EVKTIKVFTTVDNTNLHTQLVDMSMTYGQQFGPTYLDGADVTKIKPHVNHEGKTFFVLPSDDTLRSEAFEYYHTLDESFLGRYMSALNHTKKWKFPQVGGLTSIKWADNNCYLSSVLLALQQLEVKFNAPALQEAYYRARAGDAANFCALILAYSNKTVGELGDVRETMTHLLQHANLESAKRVLNVVCKHCGQKTTTLTGVEAVMYMGTLSYDNLKTGVSIPCVCGRDATQYLVQQESSFVMMSAPPAEYKLQQGTFLCANEYTGNYQCGHYTHITAKETLYRIDGAHLTKMSEYKGPVTDVFYKETSYTTTIK. The pIC50 is 5.9. (4) The drug is Nc1ncc(-c2ccc(Cl)cc2)c(N)n1. 
The target protein (O02604) has sequence MEDLSDVFDIYAICACCKVAPTSEGTKNEPFSPRTFRGLGNKGTLPWKCNSVDMKYFSSVTTYVDESKYEKLKWKRERYLRMEASQGGGDNTSGGDNTHGGDNADKLQNVVVMGRSSWESIPKQYKPLPNRINVVLSKTLTKEDVKEKVFIIDSIDDLLLLLKKLKYYKCFIIGGAQVYRECLSRNLIKQIYFTRINGAYPCDVFFPEFDESQFRVTSVSEVYNSKGTTLDFLVYSKVGGGVDGGASNGSTATALRRTAMRSTAMRRNVAPRTAAPPMGPHSRANGERAPPRARARRTTPRQRKTTSCTSALTTKWGRKTRSTCKILKFTTASRLMQHPEYQYLGIIYDIIMNGNKQGDRTGVGVMSNFGYMMKFNLSEYFPLLTTKKLFLRGIIEELLWFIRGETNGNTLLNKNVRIWEANGTREFLDNRKLFHREVNDLGPIYGFQWRHFGAEYTNMHDNYEDKGVDQLKNVIHLIKNEPTSRRIILCAWNVKDLDQM.... The pIC50 is 5.8. (5) The drug is N=C(N)c1ccc(CNC(=O)CC2OCCN(Nc3ccc(-c4ccccc4)cc3)C2=O)cc1. The target protein (P00734) has sequence MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARSLLQRVRRANTFLEEVRKGNLERECVEETCSYEEAFEALESSTATDVFWAKYTACETARTPRDKLAACLEGNCAEGLGTNYRGHVNITRSGIECQLWRSRYPHKPEINSTTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRRQECSIPVCGQDQVTVAMTPRSEGSSVNLSPPLEQCVPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKHQDFNSAVQLVENFCRNPDGDEEGVWCYVAGKPGDFGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATSEYQTFFNPRTFGSGEADCGLRPLFEKKSLEDKTERELLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGR.... The pIC50 is 5.3. (6) The compound is C=CS(=O)(=O)Nc1ccc(-c2nn(C(C)C)c3ncnc(N)c23)cc1. The target protein sequence is QTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVCEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The pIC50 is 6.8. (7) The drug is CCCCn1c(=O)[nH]c2nc3cc(NC(C)=O)ccc3nc2c1=O. The target protein sequence is MAGRSGDSDEELIRTVRLIKLLYQSNPPPNPEGTRQARRNRRRRWRERQRQIHSISERILGTYLGRSAEPVPLQLPPLERLTLDCNEDCGTSGTQGVGSPQILVESPTVLESGTKE. The pIC50 is 5.2. (8) The compound is N[C@@H](Cc1cnc[nH]1)C(=O)Cc1ccc(Cl)cc1. 
The target protein (Q8G2R2) has sequence MVTTLRQTDPDFEQKFAAFLSGKREVSEDVDRAVREIVDRVRREGDSALLDYSRRFDRIDLEKTGIAVTEAEIDAAFDAAPASTVEALKLARDRIEKHHARQLPKDDRYTDALGVELGSRWTAIEAVGLYVPGGTASYPSSVLMNAMPAKVAGVDRIVMVVPAPDGNLNPLVLVAARLAGVSEIYRVGGAQAIAALAYGTETIRPVAKIVGPGNAYVAAAKRIVFGTVGIDMIAGPSEVLIVADKDNNPDWIAADLLAQAEHDTAAQSILMTNDEAFAHAVEEAVERQLHTLARTETASASWRDFGAVILVKDFEDAIPLANRIAAEHLEIAVADAEAFVPRIRNAGSIFIGGYTPEVIGDYVGGCNHVLPTARSARFSSGLSVLDYMKRTSLLKLGSEQLRALGPAAIEIARAEGLDAHAQSVAIRLNL. The pIC50 is 8.2. (9) The small molecule is C[C@@]1(c2cc(NC(=O)c3ccc(F)cn3)ccc2F)CC2(CCC2)SC(N)=N1. The target protein (P56817) has sequence MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWRCLRCLRQQHDDFADDISLL.... The pIC50 is 8.1.
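
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the function names ic50_to_pic50 and pic50_to_ic50_nm are hypothetical, and the nanomolar conversion assumes only that 1 M = 1e9 nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and express the IC50 in nanomolar.

    IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10**(9 - pIC50).
    """
    return 10 ** (9 - pic50)
```

Under this definition, the pIC50 of 8.1 reported for examples (1) and (9) corresponds to an IC50 of roughly 8 nM, while the pIC50 of 3.5 in example (2) corresponds to roughly 316 µM, which is why a higher value means a more potent compound.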