Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC[C@@H](NC(=O)c1cc(C(=O)NCc2c(C)noc2C)c2n1CCOC2)c1ccccc1. The target protein (O14649) has sequence MKRQNVRTLALIVCTFTYLLVGAAVFDALESEPELIERQRLELRQQELRARYNLSQGGYEELERVVLRLKPHKAGVQWRFAGSFYFAITVITTIGYGHAAPSTDGGKVFCMFYALLGIPLTLVMFQSLGERINTLVRYLLHRAKKGLGMRRADVSMANMVLIGFFSCISTLCIGAAAFSHYEHWTFFQAYYYCFITLTTIGFGDYVALQKDQALQTQPQYVAFSFVYILTGLTVIGAFLNLVVLRFMTMNAEDEKRDAEHRALLTRNGQAGGGGGGGSAHTTDTASSTAAAGGGGFRNVYAEVLHFQSMCSCLWYKSREKLQYSIPMIIPRDLSTSDTCVEQSHSSPGGGGRYSDTPSRRCLCSGAPRSAISSVSTGLHSLSTFRGLMKRRSSV. The pIC50 is 5.5. (2) The target protein sequence is MRMPTGSELWPIAIFTIIFLLLVDLMHRRQRWTSRYPPGPVPWPVLGNLLQIDFQNMPAGFQKLRCRFGDLFSLQLAFESVVVLNGLPALREALVKYSEDTADRPPLHFNDQSGFGPRSQGVVLARYGPAWRQQRRFSVSTFRHFGLGKKSLEQWVTEEARCLCAAFADHSGFPFSPNTLLDKAVCNVIASLLFACRFEYNDPRFIRLLDLLKDTLEEESGFLPMLLNVFPMLLHIPGLLGKVFSGKKAFVAMLDELLTEHKVTWDPAQPPRDLTDAFLAEVEKAKGNPESSFNDENLRVVVADLFMAGMVTTSTTLTWALLFMILHPDVQCRVQQEIDEVIGQVRRPEMADQARMPFTNAVIHEVQRFADILPLGVPHKTSRDIEVQGFLIPKGTTLITNLSSVLKDETVWEKPLRFHPEHFLDAQGNFVKHEAFMPFSAGRRACLGEPLARMELFLFFTCLLQRFSFSVPTGQPRPSDYGIFGALTTPRPYQLCASPR.... The pIC50 is 3.1. The drug is N=C(N)N1CCc2ccccc2C1. (3) The target protein sequence is MTPLTPEQTHAYLHHIGIDDPGPPSLANLDRLIDAHLRRVAFENLDVLLDRPIEIDADKVFAKVVEGSRGGYCFELNSLFARLLLALGYELELLVARVRWGLPDDAPLTQQSHLMLRLYLAEGEFLVDVGFGSANPPRALPLPGDEADAGQVHCVRLVDPHAGLYESAVRGRSGWLPLYRFDLRPQLWIDYIPRNWYTSTHPHSVFRQGLKAAITEGDLRLTLADGLFGQRAGNGETLQRQLRDVEELLDILQTRFRLRLDPASEVPALARRLAGLISA. The drug is CC1(C)NC(=O)N(CC(O)COc2ccccc2C2CCCC2)C1=O. The pIC50 is 4.8. (4) The drug is O=c1cc(-c2ccccc2)oc2cc(O)c(O)c(O)c12. 
The target protein (P14410) has sequence MARKKFSGLEISLIVLFVIVTIIAIALIVVLATKTPAVDEISDSTSTPATTRVTTNPSDSGKCPNVLNDPVNVRINCIPEQFPTEGICAQRGCCWRPWNDSLIPWCFFVDNHGYNVQDMTTTSIGVEAKLNRIPSPTLFGNDINSVLFTTQNQTPNRFRFKITDPNNRRYEVPHQYVKEFTGPTVSDTLYDVKVAQNPFSIQVIRKSNGKTLFDTSIGPLVYSDQYLQISTRLPSDYIYGIGEQVHKRFRHDLSWKTWPIFTRDQLPGDNNNNLYGHQTFFMCIEDTSGKSFGVFLMNSNAMEIFIQPTPIVTYRVTGGILDFYILLGDTPEQVVQQYQQLVGLPAMPAYWNLGFQLSRWNYKSLDVVKEVVRRNREAGIPFDTQVTDIDYMEDKKDFTYDQVAFNGLPQFVQDLHDHGQKYVIILDPAISIGRRANGTTYATYERGNTQHVWINESDGSTPIIGEVWPGLTVYPDFTNPNCIDWWANECSIFHQEVQYD.... The pIC50 is 8.4. (5) The target protein sequence is KCGRRNKFGINRPAVLAPEDGLAMSLHFMTLGGSSLSPTEGKGSGLQGHIIENPQYFSDACVHHIKRRDIVLKWELGEGAFGKVFLAECHNLLPEQDKMLVAVKALKEASESARQDFQREAELLTMLQHQHIVRFFGVCTEGRPLLMVFEYMRHGDLNRFLRSHGPDAKLLAGGEDVAPGPLGLGQLLAVASQVAAGMVYLAGLHFVHRDLATRNCLVGQGLVVKIGDFGMSRDIYSTDYYRVGGRTMLPIRWMPPESILYRKFTTESDVWSFGVVLWEIFTYGKQPWYQLSNTEAIDCITQGRELERPRACPPEVYAIMRGCWQREPQQRHSIKDVHARLQALAQAPPVYLDVLG. The drug is CN(C)[C@H]1CC[C@H](NC(=O)c2cc(NC(=O)c3cc(-c4ncccc4F)ccc3Cl)n(-c3ccccc3)n2)CC1. The pIC50 is 7.1.
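Note on the label scale: the header defines pIC50 = -log10(IC50 in M), so each unit of pIC50 is a tenfold change in IC50. The following is a minimal sketch of that conversion, not part of the dataset itself. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here for illustration, and the nanomolar output unit is an assumption chosen for readability.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (assumed unit).

    IC50 [M] = 10 ** (-pIC50), and 1 M = 1e9 nM, so IC50 [nM] = 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 labels taken from the five examples above.
    for label in (5.5, 3.1, 4.8, 8.4, 7.1):
        print(f"pIC50 {label} corresponds to IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Read this way, the label 8.4 in example (4) corresponds to an IC50 of a few nanomolar, while the label 3.1 in example (2) corresponds to an IC50 in the high micromolar range, which matches the header's statement that higher values mean more potent.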