This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCCN(CCc1ccc(C(C)C)cc1)C(=O)[C@@H]1OC(C(=O)O)=C[C@H](N)[C@H]1NC(C)=O. The pIC50 is 5.2. The target protein (P16207) has sequence MLPSTIQTLTLFLTSGGVLLSLYVSASLSYLLYSDILLKFSPKITAPTMTLDCTNASNVQAVNRSATKEMTFLLPEPEWTYPRLSCQGSTFQKALLISPHRFGEARGNSAPLIIREPFIACGPKECKHFALTHYAAQPGGYYNGTREDRNKLRHLISVKLGKIPTVENSIFHMAAWSGSACHDGREWTYIGVDGPDSNALIKIKYGEAYTDTYHSYANNILRTQESACNCIGGDCYLMITDGSASGISKCRFLKIREGRIIKEIFPTGRVEHTEECTCGFASNKTIECACRDNNYTAKRPFVKLNVETDTAEIRLMCTETYLDTPRPDDGSITGPCESNGDKGRGGIKGGFVHQRMASKIGRWYSRTMSKTERMGMELYVKYDGDPWTDSDALDPSGVMVSIKEPGWYSFGFEIKDKKCDVPCIGIEMVHDGGKKTWHSAATAIYCLMGSGQLLWDTVTGVDMAL. (2) The compound is N=C(N)NCCC[C@@H](NC(=O)[C@@H](CCCNC(=N)N)NC(=O)[C@@H](CCCNC(=N)N)NC(=O)[C@@H](CCCNC(=N)N)NC(=O)[C@@H](CCCNC(=N)N)NC(=O)[C@@H](CCCNC(=N)N)NC(=O)CCCCCCCNC(=O)[C@H]1OC(n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1O)C(=O)O. The target protein sequence is MRRGGAGAPPDLGSVLGHTTPNLRDLYALGRKLGQGQFGTTYLCTELATGIDYACKSISKRKLITKEDVDDVRREIQIMHHLSGHKNVVAIKGAYEDQVYVHIVMELCAGGELFDRIIQRGHYSERKAAALTRIIVGVVEACHSLGVMHRDLKPENFLLANRDDDLSLKAIDFGLSVFFKPGQVFTDVVGSPYYVAPEVLLKSYGPAADVWTAGVILYILLSGVPPFWAETQQGIFDAVLKGAIDFDSDPWPVISDSAKDLIRRMLNPRPAERLTAHEVLCHPWIRDHGVAPDRPLDPAVLSRIKQFSAMNKLKKMALRVIAESLSEEEIAGLKEMFQTMDTDNSGAITYDELKEGLRKYGSTLKDTEIRDLMDAADIDNSGTIDYIEFIAATLHLNKLEREEHLVAAFSYFDKDGSGYITVDELQLACKEHNMPDAFLDDVINEADQDNDGRIDYGEFVAMMTKGNMGVGRRTMRNSLNISMRDDLVCSET. The pIC50 is 5.5. (3) The drug is COc1cc(Cc2cnc(N)nc2N)c2cc(Cc3c(C(=O)N(C)C)[nH]c4ccc(Cl)cc34)oc2c1OC.
The target protein sequence is MTKKIVAIWAQDEEGVIGKENRLPWHLPAELQHFKETTLNHAILMGRVTFDGMGRRLLPKRETLILTRNPEEKIDGVATFQDVQSVLDWYQAQEKNLYILGGKQIFQAFEPYLDEVIVTHIHARVEGDTYFPEELDLSLFETVSSKFYAKDEKNPYDFTIQYRKRKEV. The pIC50 is 7.0. (4) The small molecule is Cc1cc(C(=O)Nc2ccc(-c3ccccc3S(N)(=O)=O)cc2)n(-c2ccc3cc(Br)ccc3c2)n1. The target protein (P04070) has sequence MWQLTSLLLFVATWGISGTPAPLDSVFSSSERAHQVLRIRKRANSFLEELRHSSLERECIEEICDFEEAKEIFQNVDDTLAFWSKHVDGDQCLVLPLEHPCASLCCGHGTCIDGIGSFSCDCRSGWEGRFCQREVSFLNCSLDNGGCTHYCLEEVGWRRCSCAPGYKLGDDLLQCHPAVKFPCGRPWKRMEKKRSHLKRDTEDQEDQVDPRLIDGKMTRRGDSPWQVVLLDSKKKLACGAVLIHPSWVLTAAHCMDESKKLLVRLGEYDLRRWEKWELDLDIKEVFVHPNYSKSTTDNDIALLHLAQPATLSQTIVPICLPDSGLAERELNQAGQETLVTGWGYHSSREKEAKRNRTFVLNFIKIPVVPHNECSEVMSNMVSENMLCAGILGDRQDACEGDSGGPMVASFHGTWFLVGLVSWGEGCGLLHNYGVYTKVSRYLDWIHGHIRDKEAPQKSWAP. The pIC50 is 5.0. (5) The compound is CCN(CC)CC(=O)Nc1c(C)cccc1C. The target protein (Q7Z418) has sequence MEVSGHPQARRCCPEALGKLFPGLCFLCFLVTYALVGAVVFSAIEDGQVLVAADDGEFEKFLEELCRILNCSETVVEDRKQDLQGHLQKVKPQWFNRTTHWSFLSSLFFCCTVFSTVGYGYIYPVTRLGKYLCMLYALFGIPLMFLVLTDTGDILATILSTSYNRFRKFPFFTRPLLSKWCPKSLFKKKPDPKPADEAVPQIIISAEELPGPKLGTCPSRPSCSMELFERSHALEKQNTLQLPPQAMERSNSCPELVLGRLSYSIISNLDEVGQQVERLDIPLPIIALIVFAYISCAAAILPFWETQLDFENAFYFCFVTLTTIGFGDTVLEHPNFFLFFSIYIIVGMEIVFIAFKLVQNRLIDIYKNVMLFFAKGKFYHLVKK. The pIC50 is 4.0. (6) The compound is CCC(OC(=O)C(NC(=O)c1ccccc1)c1ccccc1)C(=O)Nc1cc(C)on1. The target protein (P49862) has sequence MARSLLLPLQILLLSLALETAGEEAQGDKIIDGAPCARGSHPWQVALLSGNQLHCGGVLVNERWVLTAAHCKMNEYTVHLGSDTLGDRRAQRIKASKSFRHPGYSTQTHVNDLMLVKLNSQARLSSMVKKVRLPSRCEPPGTTCTVSGWGTTTSPDVTFPSDLMCVDVKLISPQDCTKVYKDLLENSMLCAGIPDSKKNACNGDSGGPLVCRGTLQGLVSWGTFPCGQPNDPGVYTQVCKFTKWINDTMKKHR. The pIC50 is 6.2. (7) The compound is COc1ccccc1/C=C/C(=O)c1cc(Br)ccc1OC(=O)c1ccco1. 
The target protein sequence is SGFKKLVSPSSAVEKCIVSVSYRGNNLNGLWLGDSIYCPRHVLGKFSGDQWGDVLNLANNHEFEVVTQNGVTLNVVSRRLKGAVLILQTAVANAETPKYKFVKANCGDSFTIACSYGGTVIGLYPVTMRSNGTIRASFLAGACGSVGFNIEKGVVNFFYMHHLELPNALHTGTDLMGEFYGGYVDEEVAQRVPPDNLVTNNIVAWLYAAIISVKESSFSQPKWLESTTVSIEDYNRWASDNGFTPFSTSTAITKLSAITGVDVCKLLRTIMVKSAQWGSDPILGQYNFEDELTPESVFNQVGGVRLQ. The pIC50 is 5.0. (8) The drug is C/C=C/C[C@@H](C)[C@@H](O)[C@H]1C(=O)N[C@@H](CC)C(=O)N(C)CC(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@H](C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](CC(C)C)C(=O)N(C)[C@@H](C(C)C)C(=O)N1C. The target protein (P19224) has sequence MACLLRSFQRISAGVFFLALWGMVVGDKLLVVPQDGSHWLSMKDIVEVLSDRGHEIVVVVPEVNLLLKESKYYTRKIYPVPYDQEELKNRYQSFGNNHFAERSFLTAPQTEYRNNMIVIGLYFINCQSLLQDRDTLNFFKESKFDALFTDPALPCGVILAEYLGLPSVYLFRGFPCSLEHTFSRSPDPVSYIPRCYTKFSDHMTFSQRVANFLVNLLEPYLFYCLFSKYEELASAVLKRDVDIITLYQKVSVWLLRYDFVLEYPRPVMPNMVFIGGINCKKRKDLSQEFEAYINASGEHGIVVFSLGSMVSEIPEKKAMAIADALGKIPQTVLWRYTGTRPSNLANNTILVKWLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTLNVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVEPLDLAVFWVEFVMRHKGAPHLRPAAHDLTWYQYHSLDVIGFLLAVVLT.... The pIC50 is 3.5.
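The pIC50 definition given in the task description, pIC50 = -log10(IC50 in M), can be sketched in Python as follows. This is a minimal illustration that assumes IC50 values are supplied in molar units; the function names are hypothetical and are not part of BindingDB or of any tooling for this dataset.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in molar units = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Assumed helper for values reported in nM (1 nM = 1e-9 M)."""
    return ic50_nm * 1e-9
```

Under this definition, the pIC50 of 7.0 in entry (3) corresponds to an IC50 of 1e-7 M (100 nM), and the pIC50 of 4.0 in entry (5) corresponds to 1e-4 M (100 µM), which is why a higher value means a more potent compound.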