This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCN(CC)CC1N(CCO)C(=O)c2c(O)c(=O)ccn2N1C1c2ccccc2CCc2ccccc21. The target protein (P21675) has sequence MGPGCDLLLRTAATITAAAIMSDTDSDEDSAGGGPFSLAGFLFGNINGAGQLEGESVLDDECKKHLAGLGALGLGSLITELTANEELTGTDGALVNDEGWVRSTEDAVDYSDINEVAEDESRRYQQTMGSLQPLCHSDYDEDDYDADCEDIDCKLMPPPPPPPGPMKKDKDQDSITGEKVDFSSSSDSESEMGPQEATQAESEDGKLTLPLAGIMQHDATKLLPSVTELFPEFRPGKVLRFLRLFGPGKNVPSVWRSARRKRKKKHRELIQEEQIQEVECSVESEVSQKSLWNYDYAPPPPPEQCLSDDEITMMAPVESKFSQSTGDIDKVTDTKPRVAEWRYGPARLWYDMLGVPEDGSGFDYGFKLRKTEHEPVIKSRMIEEFRKLEENNGTDLLADENFLMVTQLHWEDDIIWDGEDVKHKGTKPQRASLAGWLPSSMTRNAMAYNVQQGFAATLDDDKPWYSIFPIDNEDLVYGRWEDNIIWDAQAMPRLLEPPVL.... The pIC50 is 7.4. (2) The drug is Cc1ccc(OC(C)(Cc2ccc(OCCc3nc(-c4ccccc4)oc3C)cc2)C(=O)O)cc1. The target protein (P23204) has sequence MVDTESPICPLSPLEADDLESPLSEEFLQEMGNIQEISQSIGEESSGSFGFADYQYLGSCPGSEGSVITDTLSPASSPSSVSCPVIPASTDESPGSALNIECRICGDKASGYHYGVHACEGCKGFFRRTIRLKLVYDKCDRSCKIQKKNRNKCQYCRFHKCLSVGMSHNAIRFGRMPRSEKAKLKAEILTCEHDLKDSETADLKSLGKRIHEAYLKNFNMNKVKARVILAGKTSNNPPFVIHDMETLCMAEKTLVAKMVANGVEDKEAEVRFFHCCQCMSVETVTELTEFAKAIPGFANLDLNDQVTLLKYGVYEAIFTMLSSLMNKDGMLIAYGNGFITREFLKNLRKPFCDIMEPKFDFAMKFNALELDDSDISLFVAAIICCGDRPGLLNIGYIEKLQEGIVHVLKLHLQSNHPDDTFLFPKLLQKMVDLRQLVTEHAQLVQVIKKTESDAALHPLLQEIYRDMY. The pIC50 is 5.9. (3) The compound is C[C@]12CC[C@@H]3c4ccc(OS(N)(=O)=O)cc4CC[C@H]3[C@@H]1CC[C@@]2(O)Cc1ccccc1. 
The target protein (P08842) has sequence MPLRKMKIPFLLLFFLWEAESHAASRPNIILVMADDLGIGDPGCYGNKTIRTPNIDRLASGGVKLTQHLAASPLCTPSRAAFMTGRYPVRSGMASWSRTGVFLFTASSGGLPTDEITFAKLLKDQGYSTALIGKWHLGMSCHSKTDFCHHPLHHGFNYFYGISLTNLRDCKPGEGSVFTTGFKRLVFLPLQIVGVTLLTLAALNCLGLLHVPLGVFFSLLFLAALILTLFLGFLHYFRPLNCFMMRNYEIIQQPMSYDNLTQRLTVEAAQFIQRNTETPFLLVLSYLHVHTALFSSKDFAGKSQHGVYGDAVEEMDWSVGQILNLLDELRLANDTLIYFTSDQGAHVEEVSSKGEIHGGSNGIYKGGKANNWEGGIRVPGILRWPRVIQAGQKIDEPTSNMDIFPTVAKLAGAPLPEDRIIDGRDLMPLLEGKSQRSDHEFLFHYCNAYLNAVRWHPQNSTSIWKAFFFTPNFNPVGSNGCFATHVCFCFGSYVTHHDPP.... The pIC50 is 8.8. (4) The pIC50 is 3.4. The compound is O=C(NC[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O)c1cccc([N+](=O)[O-])c1. The target protein (P23739) has sequence MAKKKFSALEISLIVLFIIVTAIAIALVTVLATKVPAVEEIKSPTPTSNSTPTSTPTSTSTPTSTSTPSPGKCPPEQGEPINERINCIPEQHPTKAICEERGCCWRPWNNTVIPWCFFADNHGYNAESITNENAGLKATLNRIPSPTLFGEDIKSVILTTQTQTGNRFRFKITDPNNKRYEVPHQFVKEETGIPAADTLYDVQVSENPFSIKVIRKSNNKVLCDTSVGPLLYSNQYLQISTRLPSEYIYGFGGHIHKRFRHDLYWKTWPIFTRDEIPGDNNHNLYGHQTFFMGIGDTSGKSYGVFLMNSNAMEVFIQPTPIITYRVTGGILDFYIFLGDTPEQVVQQYQEVHWRPAMPAYWNLGFQLSRWNYGSLDTVSEVVRRNREAGIPYDAQVTDIDYMEDHKEFTYDRVKFNGLPEFAQDLHNHGKYIIILDPAISINKRANGAEYQTYVRGNEKNVWVNESDGTTPLIGEVWPGLTVYPDFTNPQTIEWWANECN.... (5) The small molecule is CCCCNCC(P(=O)(O)O)P(=O)(O)O. The target protein sequence is MLKTGLCRRAAATTTITSTVPSNLLTEDGRPFAMVAREVRMMQQNMAGLVSNSNNAVLNHIAKYVFSVSGKMLRPTLVAMMAHALLPPHVSEQIRAESIGSIDDISSGAIRPFLRLGEITELLHTATLVHDDVMDNSNTRRGQPTVHCLYDTKRAVLAGDFLLARASIWIAALGHSRVVLLMSTALEDLAAGEMMQMDGCFDIESYEQKSYCKTASLIANSLASTAVLAGLPNTAYEEAAAKFGKHLGIAFQIVDDCLDITGDDKNLGKPKMADMAEGIATLPVLLAAREETRVYEAVRRRFKNPGDTEMCMEAVERHGCVAEALEHAGEHCRRGVEALHALHTSPARDCLEAAMGLILTRQV. The pIC50 is 5.0. (6) The drug is CCCCC(=O)C=C(C)C=CCCC(=O)N1CCCC1=O. 
The target protein (P06855) has sequence MKKIILTIGCPGSGKSTWAREFIAKNPGFYNINRDDYRQSIMAHEERDEYKYTKKKEGIVTGMQFDTAKSILYGGDSVKGVIISDTNLNPERRLAWETFAKEYGWKVEHKVFDVPWTELVKRNSKRGTKAVPIDVLRSMYKSMREYLGLPVYNGTPGKPKAVIFDVDGTLAKMNGRGPYDLEKCDTDVINPMVVELSKMYALMGYQIVVVSGRESGTKEDPTKYYRMTRKWVEDIAGVPLVMQCQREQGDTRKDDVVKEEIFWKHIAPHFDVKLAIDDRTQVVEMWRRIGVECWQVASGDF. The pIC50 is 3.7. (7) The small molecule is O=C(CN1CCc2ccccc21)Nc1nnc(-c2cccs2)s1. The target protein sequence is MSEDAGLPVPRSQWVERGVSCATCGKRFSLFTAKSNCPCCGKLCCSDCVQAECAIVGGSAPSKVCIDCFSMLQSRRRVEPDEGSSFREFNAASAFPLQTRLLADGRVESGETSRVSPPNDGRVQHVSRANGYSNSLPVLDEYVDDLLRKSELLRMENDVLLNRLREQEAEIHALRLERDRAVARIVPDGGSMAGRSGLPQVSDEIVKELRGELAVAHLRIESVKRELKNALDRAKSSETMVRNLKQGLCNYKEEVVRPLQSREEVEMLPGVNGRRDMISTRRLPPSIVQDTILAVVPPKSCAAIGTDVDLRDWGFDTFEVASRVPSVLQSVAMHVALAWNFFASQEEAQKWAFLVAAVENNYRPNPYHNAIHAADVLQGTFSLVSAAKPLMEHLTPLECKAAAFAALTHDVCHPGRTNAFLAAVQDPVSFKFSGKGTLEQLHTVTAFELLNVTEFDFTSSMDNASFLEFKNIVSHLIGHTDMSLHSETIAKHGAKLSAGG.... The pIC50 is 5.0. (8) The small molecule is COc1ccc(CNc2c(-c3ccccc3O)nc3cnccn23)cc1. The target protein (P02554) has sequence MREIVHIQAGQCGNQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERINVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGPFGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELVDSVLDVVRKESESCDCLQGFQLTHSLGGGTGSGMGTLLISKIREEYPDRIMNTFSVVPSPKVSDTVVEPYNATLSVHQLVENTDETYCIDNEALYDICFRTLKLTTPTYGDLNHLVSATMSGVTTCLRFPGQLNADLRKLAVNMVPFPRLHFFMPGFAPLTSRGSQQYRALTVPELTQQMFDAKNMMAACDPRHGRYLTVAAVFRGRMSMKEVDEQMLNVQNKNSSYFVEWIPNNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRISEQFTAMFRRKAFLHWYTGEGMDEMEFTEAESNMNDLVSEYQQYQDATADEQGEFEEEGEEDEA. The pIC50 is 5.2. (9) The compound is O=C(O)c1cc(F)c(CCNC(=O)[C@H](CC(F)F)NC(=O)[C@@H]2C[C@@H](c3ccccc3)CN2C(=O)C(O)Cc2ccccc2)c(F)c1. 
The target protein (P26664) has sequence MSTNPKPQKKNKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKTSERSQPRGRRQPIPKARRPEGRTWAQPGYPWPLYGNEGCGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLEDGVNYATGNLPGCSFSIFLLALLSCLTVPASAYQVRNSTGLYHVTNDCPNSSIVYEAADAILHTPGCVPCVREGNASRCWVAMTPTVATRDGKLPATQLRRHIDLLVGSATLCSALYVGDLCGSVFLVGQLFTFSPRRHWTTQGCNCSIYPGHITGHRMAWDMMMNWSPTTALVMAQLLRIPQAILDMIAGAHWGVLAGIAYFSMVGNWAKVLVVLLLFAGVDAETHVTGGSAGHTVSGFVSLLAPGAKQNVQLINTNGSWHLNSTALNCNDSLNTGWLAGLFYHHKFNSSGCPERLASCRPLTDFDQGWGPISYANGSGPDQRPYCWHYPPKPCGIVPAK.... The pIC50 is 4.0.
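
The definition stated at the top of this record, pIC50 = -log10(IC50 in M), can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here, and the only assumption is that IC50 is expressed in mol/L before taking the logarithm.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration; not part of the dataset.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    ic50_molar = 10 ** (-pic50)  # back to mol/L
    return ic50_molar * 1e9      # mol/L -> nmol/L


if __name__ == "__main__":
    # pIC50 values quoted for examples (1) and (4) in this record.
    for label, value in [("(1)", 7.4), ("(4)", 3.4)]:
        print(f"{label} pIC50 {value} ~ IC50 {pic50_to_ic50_nm(value):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50, so the 7.4 reported for example (1) is roughly 10,000 times more potent than the 3.4 reported for example (4).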