This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1sc2ncnc(N3CCC(C(=O)Nc4ccc(C#N)cc4)CC3)c2c1C. The target protein (P04578) has sequence MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNISRAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFNSTWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNITGLLLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTK.... The pIC50 is 5.1. (2) The compound is CC[C@@H](C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(=N)N)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCCN)C(N)=O. The target protein (P11799) has sequence MGDVKLVTSTRVSKTSLTLSPSVPAEAPAFTLPPRNIRVQLGATARFEGKVRGYPEPQITWYRNGHPLPEGDHYVVDHSIRGIFSLVIKGVQEGDSGKYTCEAANDGGVRQVTVELTVEGNSLKKYSLPSSAKTPGGRLSVPPVEHRPSIWGESPPKFATKPNRVVVREGQTGRFSCKITGRPQPQVTWTKGDIHLQQNERFNMFEKTGIQYLEIQNVQLADAGIYTCTVVNSAGKASVSAELTVQGPDKTDTHAQPLCMPPKPTTLATKAIENSDFKQATSNGIAKELKSTSTELMVETKDRLSAKKETFYTSREAKDGKQGQNQEANAVPLQESRGTKGPQVLQKTSSTITLQAVKAQPEPKAEPQTTFIRQAEDRKRTVQPLMTTTTQENPSLTGQVSPRSRETENRAGVRKSVKEEKREPLGIPPQFESRPQSLEASEGQEIKFKSKVSGKPKPDVEWFKEGVPIKTGEGIQIYEEDGTHCLWLKKACLGDSGSYS.... The pIC50 is 6.8. (3) The pIC50 is 8.7. 
The target protein (Q08493) has sequence MENLGVGEGAEACSRLSRSRGRHSMTRAPKHLWRQPRRPIRIQQRFYSDPDKSAGCRERDLSPRPELRKSRLSWPVSSCRRFDLENGLSCGRRALDPQSSPGLGRIMQAPVPHSQRRESFLYRSDSDYELSPKAMSRNSSVASDLHGEDMIVTPFAQVLASLRTVRSNVAALARQQCLGAAKQGPVGNPSSSNQLPPAEDTGQKLALETLDELDWCLDQLETLQTRHSVGEMASNKFKRILNRELTHLSETSRSGNQVSEYISRTFLDQQTEVELPKVTAEEAPQPMSRISGLHGLCHSASLSSATVPRFGVQTDQEEQLAKELEDTNKWGLDVFKVAELSGNRPLTAIIFSIFQERDLLKTFQIPADTLATYLLMLEGHYHANVAYHNSLHAADVAQSTHVLLATPALEAVFTDLEILAALFASAIHDVDHPGVSNQFLINTNSELALMYNDASVLENHHLAVGFKLLQAENCDIFQNLSAKQRLSLRRMVIDMVLATD.... The drug is Cc1ccc(-c2c(C(=O)N[C@@H]3C[C@@H]3F)nc3cccnn23)cc1Cl. (4) The compound is CC(C)C[C@H](NC(=O)Nc1cccc2ccccc12)C(=O)NO. The target protein (P97449) has sequence MAKGFYISKTLGILGILLGVAAVCTIIALSVVYAQEKNRNAENSATAPTLPGSTSATTATTTPAVDESKPWNQYRLPKTLIPDSYRVILRPYLTPNNQGLYIFQGNSTVRFTCNQTTDVIIIHSKKLNYTLKGNHRVVLRTLDGTPAPNIDKTELVERTEYLVVHLQGSLVEGRQYEMDSQFQGELADDLAGFYRSEYMEGDVKKVVATTQMQAADARKSFPCFDEPAMKAMFNITLIYPNNLIALSNMLPKESKPYPEDPSCTMTEFHSTPKMSTYLLAYIVSEFKNISSVSANGVQIGIWARPSAIDEGQGDYALNVTGPILNFFAQHYNTSYPLPKSDQIALPDFNAGAMENWGLVTYRESSLVFDSQSSSISNKERVVTVIAHELAHQWFGNLVTVAWWNDLWLNEGFASYVEYLGADYAEPTWNLKDLMVLNDVYRVMAVDALASSHPLSSPADEIKTPDQIMELFDSITYSKGASVIRMLSSFLTEDLFKKGLS.... The pIC50 is 5.7. (5) The target protein (Q9Z0J5) has sequence MAAIVAALRGSSGRFRPQTRVLTRGTRGAAGAASAAGGQQNFDLLVIGGGSGGLACAKEAAQLGRKVAVADYVEPSPRGTKWGLGGTCVNVGCIPKKLMHQAALLGGMIRDAQHYGWEVAQPVQHNWKAMAEAVQNHVKSLNWGHRVQLQDRKVKYFNIKASFVNEHTVHGVDKAGKVTQLSAKHIVIATGGRPKYPTQVKGALEHGITSDDIFWLKESPGKTLVVGASYVALECAGFLTGIGLDTTVMMRSVPLRGFDQQMASLVTEHMESHGTRFLKGCVPSLIRKLPTNQLQVTWEDLASGKEDVGTFDTVLWAIGRVPETRNLNLEKAGVNTNPKNQKIIVDAQEATSVPHIYAIGDVAEGRPELTPTAIKAGKLLAQRLFGKSSTLMNYSNVPTTVFTPLEYGCVGLSEEEAVALHGQEHIEVYHAYYKPLEFTVADRDASQCYIKMVCMREPPQLVLGLHFLGPNAGEVTQGFALGIQCGASYAQVMQTVGIHP.... The drug is COc1cc(O)c2c(c1)C(=O)c1cc(C)cc(O)c1C2=O. The pIC50 is 3.9. (6) The small molecule is [C-]#[N+]c1ccc(N[C@@H](c2nnc(-c3ccc(C#N)cc3)o2)[C@H](C)OC(=O)c2ccccc2)c(C)c1Cl. 
The target protein (P15207) has sequence MEVQLGLGRVYPRPPSKTYRGAFQNLFQSVREAIQNPGPRHPEAASIAPPGACLQQRQETSPRRRRRQQHPEDGSPQAHIRGTTGYLALEEEQQPSQQQSASEGHPESGCLPEPGAATAPGKGLPQQPPAPPDQDDSAAPSTLSLLGPTFPGLSSCSADIKDILSEAGTMQLLQQQQQQQQQQQQQQQQQQQQQQEVISEGSSSVRAREATGAPSSSKDSYLGGNSTISDSAKELCKAVSVSMGLGVEALEHLSPGEQLRGDCMYASLLGGPPAVRPTPCAPLAECKGLSLDEGPGKGTEETAEYSSFKGGYAKGLEGESLGCSGSSEAGSSGTLEIPSSLSLYKSGAVDEAAAYQNRDYYNFPLALSGPPHPPPPTHPHARIKLENPSDYGSAWAAAAAQCRYGDLASLHGGSVAGPSTGSPPATASSSWHTLFTAEEGQLYGPGGGGGSSSPSDAGPVAPYGYTRPPQGLASQEGDFSASEVWYPGGVVNRVPYPSPS.... The pIC50 is 6.9.
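
The pIC50 definition used in these records (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is only an illustration of the stated formula, not code from the dataset itself; the function names `ic50_molar_to_pic50` and `pic50_to_ic50_molar` are hypothetical, and IC50 is assumed to be given in molar units.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of about 6.
    print(ic50_molar_to_pic50(1e-6))
    # Record (1) reports pIC50 = 5.1, i.e. an IC50 of roughly 7.9e-6 M.
    print(pic50_to_ic50_molar(5.1))
```

Because the scale is logarithmic, each additional pIC50 unit corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.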