Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)(C)C(O)CNC(=O)[C@@H]1C[C@H]2C[C@H]2N1C(=O)Cn1nc(C(N)=O)c2ccccc21. The target protein (P00746) has sequence MHSWERLAVLVLLGAAACAAPPRGRILGGREAEAHARPYMASVQLNGAHLCGGVLVAEQWVLSAAHCLEDAADGKVQVLLGAHSLSQPEPSKRLYDVLRAVPHPDSQPDTIDHDLLLLQLSEKATLGPAVRPLPWQRVDRDVAPGTLCDVAGWGIVNHAGRRPDSLQHVLLPVLDRATCNRRTHHDGAITERLMCAESNRRDSCKGDSGGPLVCGGVLEGVVTSGSRVCGNRKKPGIYTRVASYAAWIDSVLA. The pIC50 is 6.6. (2) The compound is CC(C)C(OC(=O)N1CCN(C(=O)N2C(=O)[C@H](CCCN=C(N)N)[C@H]2C(=O)O)CC1)C(C)C. The target protein (Q9BZJ3) has sequence MLLLAPQMLSLLLLALPVLASPAYVAPAPGQALQQTGIVGGQEAPRSKWPWQVSLRVRGPYWMHFCGGSLIHPQWVLTAAHCVEPDIKDLAALRVQLREQHLYYQDQLLPVSRIIVHPQFYIIQTGADIALLELEEPVNISSHIHTVTLPPASETFPPGMPCWVTGWGDVDNNVHLPPPYPLKEVEVPVVENHLCNAEYHTGLHTGHSFQIVRDDMLCAGSENHDSCQGDSGGPLVCKVNGT. The pIC50 is 8.4. (3) The small molecule is CSCCC(NC(=O)NCc1ccc(N)cc1)C(=O)N1CCCC1c1ccccc1SC. The target protein (Q08752) has sequence MSHPSPQAKPSNPSNPRVFFDVDIGGERVGRIVLELFADIVPKTAENFRALCTGEKGIGHTTGKPLHFKGCPFHRIIKKFMIQGGDFSNQNGTGGESIYGEKFEDENFHYKHDREGLLSMANAGRNTNGSQFFITTVPTPHLDGKHVVFGQVIKGIGVARILENVEVKGEKPAKLCVIAECGELKEGDDGGIFPKDGSGDSHPDFPEDADIDLKDVDKILLITEDLKNIGNTFFKSQNWEMAIKKYAEVLRYVDSSKAVIETADRAKLQPIALSCVLNIGACKLKMSNWQGAIDSCLEALELDPSNTKALYRRAQGWQGLKEYDQALADLKKAQGIAPEDKAIQAELLKVKQKIKAQKDKEKAVYAKMFA. The pIC50 is 6.2. (4) The small molecule is CCc1nc2ccc(N3CCC[C@@H]3C(=O)NCc3ccccc3)nc2c(=O)n1C. 
The target protein (Q8N5Z0) has sequence MNYARFITAASAARNPSPIRTMTDILSRGPKSMISLAGGLPNPNMFPFKTAVITVENGKTIQFGEEMMKRALQYSPSAGIPELLSWLKQLQIKLHNPPTIHYPPSQGQMDLCVTSGSQQGLCKVFEMIINPGDNVLLDEPAYSGTLQSLHPLGCNIINVASDESGIVPDSLRDILSRWKPEDAKNPQKNTPKFLYTVPNGNNPTGNSLTSERKKEIYELARKYDFLIIEDDPYYFLQFNKFRVPTFLSMDVDGRVIRADSFSKIISSGLRIGFLTGPKPLIERVILHIQVSTLHPSTFNQLMISQLLHEWGEEGFMAHVDRVIDFYSNQKDAILAAADKWLTGLAEWHVPAAGMFLWIKVKGINDVKELIEEKAVKMGVLMLPGNAFYVDSSAPSPYLRASFSSASPEQMDVAFQVLAQLIKESL. The pIC50 is 7.0. (5) The compound is Cc1cc(-c2c(C(=O)N3CCC3)nc3cccnn23)c(F)cc1C#N. The target protein sequence is SGAAPRARPRPPALALPPTGPESLTHFPFSDEDTRRHPPGRSVSFEAENGPTPSPGRSPLDSQASPGLVLHAGAATSQRRESFLYRSDSDYDMSPKTMSRNSSVTSEAHAEDLIVTPFAQVLASLRSVRSNFSLLTNVPVPSNKRSPLGGPTPVCKATLSEETCQQLARETLEELDWCLEQLETMQTYRSVSEMASHKFKRMLNRELTHLSEMSRSGNQVSEYISTTFLDKQNEVEIPSPTMKEREKQQAPRPRPSQPPPPPVPHLQPMSQITGLKKLMHSNSLNNSNIPRFGVKTDQEELLAQELENLNKWGLNIFCVSDYAGGRSLTCIMYMIFQERDLLKKFRIPVDTMVTYMLTLEDHYHADVAYHNSLHAADVLQSTHVLLATPALDAVFTDLEILAALFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEDNCDIFQNLSKRQRQSLRKMVIDMVLATDMSKHMTLLADLKTMVETK.... The pIC50 is 7.3. (6) The target protein (P21451) has sequence MQSSASRCGRALVALLLACGLLGVWGEKRGFPPAQATPSLLGTKEVMTPPTKTSWTRGSNSSLMRSSAPAEVTKGGRVAGVPPRSFPPPCQRKIEINKTFKYINTIVSCLVFVLGIIGNSTLLRIIYKNKCMRNGPNILIASLALGDLLHIIIDIPINAYKLLAGDWPFGAEMCKLVPFIQKASVGITVLSLCALSIDRYRAVASWSRIKGIGVPKWTAVEIVLIWVVSVVLAVPEAIGFDVITSDYKGKPLRVCMLNPFQKTAFMQFYKTAKDWWLFSFYFCLPLAITAIFYTLMTCEMLRKKSGMQIALNDHLKQRREVAKTVFCLVLVFALCWLPLHLSRILKLTLYDQSNPQRCELLSFLLVLDYIGINMASLNSCINPIALYLVSKRFKNCFKSCLCCWCQTFEEKQSLEEKQSCLKFKANDHGYDNFRSSNKYSSS. The pIC50 is 5.6. The compound is CC[C@H](C)[C@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](N)Cc1ccccc1)C(=O)N[C@H](C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)O)[C@@H](C)CC. (7) The compound is COCc1ccc(NC(=O)c2c(C#Cc3cnc4[nH]ccc4c3)n(C3(C)CC3)c3ncnc(N)c23)cc1. 
The target protein sequence is MAKATSGAAGLRLLLLLLLPLLGKVALGLYFSRDAYWEKLYVDQAAGTPLLYVHALRDAPEEVPSFRLGQHLYGTYRTRLHENNWICIQEDTGLLYLNRSLDHSSWEKLSVRNRGFPLLTVYLKVFLSPTSLREGECQWPGCARVYFSFFNTSFPACSSLKPRELCFPETRPSFRIRENRPPGTFHQFRLLPVQFLCPNISVAYRLLEGEGLPFRCAPDSLEVSTRWALDREQREKYELVAVCTVHAGAREEVVMVPFPVTVYDEDDSAPTFPAGVDTASAVVEFKRKEDTVVATLRVFDADVVPASGELVRRYTSTLLPGDTWAQQTFRVEHWPNETSVQANGSFVRATVHDYRLVLNRNLSISENRTMQLAVLVNDSDFQGPGAGVLLLHFNVSVLPVSLHLPSTYSLSVSRRARRFAQIGKVCVENCQAFSGINVQYKLHSSGANCSTLGVVTSAEDTSGILFVNDTKALRRPKCAELHYMVVATDQQTSRQAQAQL.... The pIC50 is 8.5. (8) The small molecule is C[C@@H]1COCCN1c1nc(-c2ccc(NC(=O)Nc3ccc(N4CCC(N(C)C)CC4)cc3)cc2)nc(N2CCOC[C@H]2C)n1. The target protein (P42346) has sequence MLGTGPATATAGAATSSNVSVLQQFASGLKSRNEETRAKAAKELQHYVTMELREMSQEESTRFYDQLNHHIFELVSSSDANERKGGILAIASLIGVEGGNSTRIGRFANYLRNLLPSSDPVVMEMASKAIGRLAMAGDTFTAEYVEFEVKRALEWLGADRNEGRRHAAVLVLRELAISVPTFFFQQVQPFFDNIFVAVWDPKQAIREGAVAALRACLILTTQREPKEMQKPQWYRHTFEEAEKGFDETLAKEKGMNRDDRIHGALLILNELVRISSMEGERLREEMEEITQQQLVHDKYCKDLMGFGTKPRHITPFTSFQAVQPQQSNALVGLLGYSSHQGLMGFGASPSPTKSTLVESRCCRDLMEEKFDQVCQWVLKCRSSKNSLIQMTILNLLPRLAAFRPSAFTDTQYLQDTMNHVLSCVKKEKERTAAFQALGLLSVAVRSEFKVYLPRVLDIIRAALPPKDFAHKRQKTVQVDATVFTCISMLARAMGPGIQQD.... The pIC50 is 8.7.
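The task description defines the label as pIC50 = -log10(IC50 in M). The sketch below shows that conversion in both directions so the values in the examples can be read as concentrations. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced for illustration and are not part of the dataset. The code assumes IC50 is given in mol/L, as stated in the definition.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 using pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Labels taken from examples (1), (2) and (6) above.
    for label in (6.6, 8.4, 5.6):
        print(f"pIC50 {label} corresponds to an IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean a more potent compound.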