This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 2.3. The target protein (P20456) has sequence MADPWQECMDYAVTLAGQAGEVVREALKNEMNIMVKSSPADLVTATDQKVEKMLITSIKEKYPSHSFIGEESVAAGEKSILTDNPTWIIDPIDGTTNFVHGFPFVAVSIGFVVNKKMEFGIVYSCLEDKMYTGRKGKGAFCNGQKLQVSHQEDITKSLLVTELGSSRTPETVRIILSNIERLLCLPIHGIRGVGTAALNMCLVAAGAADAYYEMGIHCWDVAGAGIIVTEAGGVLLDVTGGPFDLMSRRVIASSNKTLAERIAKEIQIIPLQRDDED. The compound is O=P(O)(O)O[C@H]1CCC[C@H]1O. (2) The small molecule is Cc1ccc(C(=O)Nc2cccc(C(F)(F)F)c2)cc1-c1cc(N2CCOCC2)nc(N2CCOCC2)n1. The target protein sequence is QEKNKIRPRGQRDSSEEWEIEASEVMLSTRIGSGSFGTVYKGKWHGDVAVKILKVVDPTPEQFQAFRNEVAVLRKTRHVNILLFMGYMTKDNLAIVTQWCEGSSLYKHLHVQETKFQMFQLIDIARQTAQGMDYLHAKNIIHRDMKSNNIFLHEGLTVKIGDFGLATVKSRWSGSQQVEQPTGSVLWMAPEVIRMQDNNPFSFQSDVYSYGIVLYELMTGELPYSHINNRDQIIFMVGRGYASPDLSKLYKNCPKAMKRLVADCVKKVKEERPLFPQILSSIELLQHSLPKINRSASEPSLHRAAHTEDINACTLTTSPRLPVF. The pIC50 is 9.2. (3) The drug is O=C(Nc1ccc(S(=O)(=O)NC2CCOC2=O)cc1)c1sccc1Cl. The target protein sequence is MKYLPILVVEDDADLREAIVDTLSLAGYPTLEAADGGSALQKLAQEPVGLIISDAQMAPMDGYDLFEEAKKRYPGVPFILMTAYGVIERAIELLRAGAAHYLLKPFEPQSLLAEVDKHLLAMPGDGGGEVVAESAAMRQLFALAGRVAQSDASVMISGPSGSGKEVLARYIHRHSKRGSGPFVAVNCAAIPDNLLESTLFGHERGAFTGAAQALPGKFEQAQGGTILLDEVTEMPLPLQAKLLRVLQEREVERIGATRTIKLDIRVLATSNRDLQAAVEAGNFREDLYFRLNVFPLRIPALAERPEDILPLARFLLKKHAEAAGRASLVFSRDAERHLTAYSWEGNIRELDNVVQRAVILAAGAEILAADLMLGDIAGVGQFTRAESESDSSVSGETDMKTLEKRHILETLAAVGGVKKLAAEKLGISERTLRYKLQRYRDEDAADAGGNVPEGSGTE. The pIC50 is 5.0. (4) The drug is CCc1cccc(Oc2cccc(N(Cc3cccc(OC(F)(F)C(F)F)c3)CC(O)C(F)(F)F)c2)c1. 
The target protein (P22687) has sequence ACPKGASYEAGIVCRITKPALLVLNQETAKVVQTAFQRAGYPDVSGERAVMLLGRVKYGLHNLQISHLSIASSQVELVDAKTIDVAIQNVSVVFKGTLNYSYTSAWGLGINQSVDFEIDSAIDLQINTELTCDAGSVRTNAPDCYLAFHKLLLHLQGEREPGWLKQLFTNFISFTLKLILKRQVCNEINTISNIMADFVQTRAASILSDGDIGVDISVTGAPVITATYLESHHKGHFTHKNVSEAFPLRAFPPGLLGDSRMLYFWFSDQVLNSLARAAFQEGRLVLSLTGDEFKKVLETQGFDTNQEIFQELSRGLPTGQAQVAVHCLKVPKISCQNRGVVVSSSVAVTFRFPRPDGREAVAYRFEEDIITTVQASYSQKKLFLHLLDFQCVPASGRAGSSANLSVALRTEAKAVSNLTESRSESLQSSLRSLIATVGIPEVMSRLEVAFTALMNSKGLDLFEIINPEIITLDGCLLLQMDFGFPKHLLVDFLQSLS. The pIC50 is 7.2. (5) The drug is Cc1ncc(-c2cc(C(=O)Nc3ccc(OC(F)(F)F)cc3)cnc2N2CC[C@@H](O)C2)cc1F. The target protein sequence is NLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQESSISDEVEKELGKQGV. The pIC50 is 9.2. (6) The drug is Cn1cc2c(F)c(C(F)(F)c3nnc4ccc(-c5ccc(NC(=O)C6CC6)c(F)c5)nn34)ccc2n1. The target protein sequence is GDSDISSPLLQNTVHIDLSALNPELVQAVQHVVIGPSSLIVHFNEVIGRGHFGCVYHGTLLDNDGKKIHCAVKSLNRITDIGEVSQFLTEGIIMKDFSHPNVLSLLGICLRSEGSPLVVLPYMKHGDLRNFIRNETHNPTVKDLIGFGLQVAKGMKYLASKKFVHRDLAARNCMLDEKFTVKVADFGLARDMYDKEYYSVHNKTGAKLPVKWMALESLQTQKFTTKSDVWSFGVLLWELMTRGAPPYPDVNTFDITVYLLQGRRLLQPEYCPDPLYEVMLKCWHPKAEMRPSFSELVSRISAIFSTFIG. The pIC50 is 7.1. (7) The pIC50 is 3.8. The compound is CC(C)CC(N)C(=O)N1CCCC1P(=O)(O)O. 
The target protein (P14137) has sequence MLAKGLCLRSVLVKSCQPFLSPVWQGPGLATGNGAGISSTNSPRSFNEIPSPGDNGWINLYHFLRENGTHRIHYHHMQNFQKYGPIYREKLGNMESVYILDPKDAATLFSCEGPNPERYLVPPWVAYHQYYQRPIGVLFKSSDAWRKDRIVLNQEVMAPDSIKNFVPLLEGVAQDFIKVLHRRIKQQNSGKFSGDISDDLFRFAFESITSVVFGERLGMLEEIVDPESQRFIDAVYQMFHTSVPMLNMPPDLFRLFRTKTWKDHAAAWDVIFSKADEYTQNFYWDLRQKRDFSKYPGVLYSLLGGNKLPFKNIQANITEMLAGGVDTTSMTLQWNLYEMAHNLKVQEMLRAEVLAARRQAQGDMAKMVQLVPLLKASIKETLRLHPISVTLQRYIVNDLVLRNYKIPAKTLVQVASYAMGRESSFFPNPNKFDPTRWLEKSQNTTHFRYLGFGWGVRQCLGRRIAELEMTIFLINVLENFRIEVQSIRDVGTKFNLILMP.... (8) The small molecule is COc1cc2c(Oc3ccc(NC(=O)C4(C(=O)Nc5ccc(F)cc5)CC4)cc3F)ccnc2cc1OCCCN1CCOCC1. The target protein sequence is MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKT.... The pIC50 is 8.2. (9) The small molecule is Cc1cc2nc(N)[nH]c(=O)c2n1Cc1ccc(Cl)cc1. The target protein (P0A884) has sequence MKQYLELMQKVLDEGTQKNDRTGTGTLSIFGHQMRFNLQDGFPLVTTKRCHLRSIIHELLWFLQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHIDQITTVLNQLKNDPDSRRIIVSAWNVGELDKMALAPCHAFFQFYVADGKLSCQLYQRSCDVFLGLPFNIASYALLVHMMAQQCDLEVGDFVWTGGDTHLYSNHMDQTHLQLSREPRPLPKLIIKRKPESIFDYRFEDFEIEGYDPHPGIKAPVAI. The pIC50 is 5.8.
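The labels above follow the stated definition pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion (the function names `ic50_to_pic50` and `pic50_to_ic50` are illustrative assumptions, not part of the dataset or of any BindingDB tooling), the relationship can be written as:

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in molar (M) = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 values taken from records (1), (2) and (9) above
    for pic50 in (2.3, 9.2, 5.8):
        print(f"pIC50 {pic50} -> IC50 {pic50_to_ic50(pic50):.3e} M")
```

Under this definition, the pIC50 of 9.2 in record (2) corresponds to an IC50 of roughly 0.63 nM, while the pIC50 of 2.3 in record (1) corresponds to roughly 5 mM, which is why higher values mean more potent binding.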