Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CC(C)c1ccc(NC(=O)N2CCC[C@H]2C(=O)N2CC[C@H]3[C@H]2[C@H](C)C(=O)N3C(=O)[C@H]2[C@@H](C)[C@H]2C)cc1. The pIC50 is 6.5. The target protein (P16753) has sequence MTMDEQQSQAVAPVYVGGFLARYDQSPDEAELLLPRDVVEHWLHAQGQGQPSLSVALPLNINHDDTAVVGHVAAMQSVRDGLFCLGCVTSPRFLEIVRRASEKSELVSRGPVSPLQPDKVVEFLSGSYAGLSLSSRRCDDVEAATSLSGSETTPFKHVALCSVGRRRGTLAVYGRDPEWVTQRFPDLTAADRDGLRAQWQRCGSTAVDASGDPFRSDSYGLLGNSVDALYIRERLPKLRYDKQLVGVTERESYVKASVSPEAACDIKAASAERSGDSRSQAATPAAGARVPSSSPSPPVEPPSPVQPPALPASPSVLPAESPPSLSPSEPAEAASMSHPLSAAVPAATAPPGATVAGASPAVSSLAWPHDGVYLPKDAFFSLLGASRSAVPVMYPGAVAAPPSASPAPLPLPSYPASYGAPVVGYDQLAARHFADYVDPHYPGWGRRYEPAPSLHPSYPVPPPPSPAYYRRRDSPGGMDEPPSGWERYDGGHRGQSQKQH.... (2) The compound is CN1CCCCCNC(=O)Cn2c(-c3ccc(Cl)cc3)c(C3CCCCC3)c3ccc(cc32)C(=O)NS1(=O)=O. The target protein (O92972) has sequence MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKASERSQPRGRRQPIPKARRPEGRAWAQPGYPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLEDGVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGIYHVTNDCSNSSIVYEAADVIMHTPGCVPCVREGNSSRCWVALTPTLAARNASVPTTTIRRHVDLLVGTAAFCSAMYVGDLCGSIFLVSQLFTFSPRRHETVQDCNCSIYPGHVSGHRMAWDMMMNWSPTTALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVALLFAGVDGETHTTGRVAGHTTSGFTSLFSSGASQKIQLVNTNGSWHINRTALNCNDSLQTGFFAALFYAHKFNSSGCPERMASCRPIDWFAQGWGPITYTKPNSSDQRPYCWHYAPRPCGVVPAS.... The pIC50 is 6.7. (3) The drug is CCCCCCCCCCSCC(P(=O)(O)O)P(=O)(O)O. 
The target protein sequence is MTAFACFPHSLFYSTRLPFFFFFFCVCVHCCLRYLCLLKCAYCCSDKNYFRPLNYFFYCLYLAMASMERFLSVYDEVQAFLLDQLQSKYEIDPNRARYLRIMMDTTCLGGKYFRGMTVVNVAEGFLAVTQHDEATKERILHDACVGGWMIEFLQAHYLVEDDIMDGSVMRRGKPCWYRFPGVTTQCAINDGIILKSWTQIMAWHYFADRPFLKDLLCLFQKVDYATAVGQMYDVTSMCDSNKLDPEVAQPMTTDFAEFTPAIYKRIVKYKTTFYTYLLPLVMGLLVSEAAASVEMNLVERVAHLIGEYFQVQDDVMDCFTPPEQLGKVGTDIEDAKCSWLAVTFLGKANAAQVAEFKANYGEKDPAKVAVVKRLYSKANLQADFAAYEAEVVREVESLIEQLKVKSPTFAESVAVVWEKTHKRKK. The pIC50 is 7.0. (4) The small molecule is COc1cnc2ccc(=O)n(C[C@H](N)[C@H]3CC[C@H](NCc4ccc5c(n4)NC(=O)CO5)CC3)c2c1. The target protein (P51787) has sequence MAAASSPPRAERKRWGWGRLPGARRGSAGLAKKCPFSLELAEGGPAGGALYAPIAPGAPGPAPPASPAAPAAPPVASDLGPRPPVSLDPRVSIYSTRRPVLARTHVQGRVYNFLERPTGWKCFVYHFAVFLIVLVCLIFSVLSTIEQYAALATGTLFWMEIVLVVFFGTEYVVRLWSAGCRSKYVGLWGRLRFARKPISIIDLIVVVASMVVLCVGSKGQVFATSAIRGIRFLQILRMLHVDRQGGTWRLLGSVVFIHRQELITTLYIGFLGLIFSSYFVYLAEKDAVNESGRVEFGSYADALWWGVVTVTTIGYGDKVPQTWVGKTIASCFSVFAISFFALPAGILGSGFALKVQQKQRQKHFNRQIPAAASLIQTAWRCYAAENPDSSTWKIYIRKAPRSHTLLSPSPKPKKSVVVKKKKFKLDKDNGVTPGEKMLTVPHITCDPPEERRLDHFSVDGYDSSVRKSPTLLEVSMPHFMRTNSFAEDLDLEGETLLTPI.... The pIC50 is 3.5. (5) The drug is COC(=O)[C@@H]1c2cc3c(c(O)c2[C@@H](O[C@@H]2O[C@@H](C)[C@H](OC)[C@@](C)(OC)[C@H]2OC)C[C@]1(C)O)C(=O)c1c(O)cc2c(c1C3=O)O[C@@H]1O[C@@]2(C)[C@H](O)[C@@H](N(C)C)[C@@H]1O. The target protein sequence is NFSEEQQRIIEIPMNVNLCIIACPGSGKTSTLTARIIKSIIEEKQSIVCITFTNYAASDLKDKIMKKINCLIDICVDNKINQKLFNNKNNKINFSLKNKCTLNNKMNKSIFKVLNTVMFIGTIHSFCRYILYKYKGTFKILTDFINTNIIKLAFNNFYSSMMSKTKGTQPGFSTILERKSNKASTQNCDPDKINTHNNDDNINNKNDYINNKNKNDYNNINNYDNINNYDNINNDDNINNDDNINNDDNINNDDNINNDDDINNCGNCNQPKGIPSQLAYFINCMKNAEIKEDEEKEFYEEEHDIQNDILNNDDNNNDEDDDDDDEFYNYLYNFKHSYEQTNDYFANEQVQSVLKKKNIIFLKKKIKLMKYIELYNIKIEINDVEKMFYEEYKKIFKKAKNIYYDFDDLLIETYRLMKDN. The pIC50 is 5.5. (6) The drug is CCc1cn([C@@H]2O[C@H](CNC(=O)C3c4ccccc4Oc4c3cccc4C(F)(F)F)[C@@H](O)[C@@H]2F)c(=O)[nH]c1=O. 
The target protein sequence is MASHAGQQHAPAFGQAARASGPTDGRAASRPSHRQGASGARGDPELPTLLRVYIDGPHGVGKTTTSAQLMEALGPRDNIVYVPEPMTYWQVLGASETLTNIYNTQHRLDRGEISAGEAAVVMTSAQITMSTPYAATDAVLAPHIGGEAVGPQAPPPALTLVFDRHPIASLLCYPAARYLMGSMTPQAVLAFVALMPPTAPGTNLVLGVLPEAEHADRLARRQRPGERLDLAMLSAIRRVYDLLANTVRYLQRGGRWREDWGRLTGVAAATPRPDPEDGAGSLPRIEDTLFALFRVPELLAPNGDLYHIFAWVLDVLADRLLPMHLFVLDYDQSPVGCRDALLRLTAGMIPTRVTTAGSIAEIRDLARTFAREVGGV. The pIC50 is 9.6. (7) The small molecule is Cc1cc(C2CN(C(=O)/C=C/c3cnc4c(c3)CCC(=O)N4)C2)nc2ccccc12. The target protein sequence is YVIMGIANKRSIAFGVAKVLDQLGAKLVFTYRKERSRKELEKLLEQLNQPEAHLYQIDVQSDEEVINGFEQIGKDVGNIDGVYHSIAFANMEDLRGRFSETSREGFLLAQDISSYSLTIVAHEAKKLMPEGGSIVATTYLGGEFAVQNYNVMGVAKASLEANVKYLALDLGPDNIRVNAISAGPIRTLSAKGVGGFNTILKEIEERAPLKRNVDQVEVGKTAAYLLSDLSSGVTGENIHVDSG. The pIC50 is 6.6. (8) The drug is CCOC(=O)/C=C/C(=O)N(CC(N)=O)NC(=O)[C@@H]1CCCN1C(C)=O. The target protein sequence is MTWRVAVLLSLVLGAGAVPVGVDDPEDGGKHWVVIVAGSNGWYNYRHQADACHAYQIIHRNGIPDEQIIVMMYDDIANSEENPTPGVVINRPNGTDVYKGVLKDYTGEDVTPENFLAVLRGDAEAVKGKGSGKVLKSGPRDHVFIYFTDHGATGILVFPNDDLHVKDLNKTIRYMYEHKMYQKMVFYIEACESGSMMNHLPDDINVYATTAANPKESSYACYYDEERGTYLGDWYSVNWMEDSDVEDLTKETLHKQYHLVKSHTNTSHVMQYGNKSISTMKVMQFQGMKHRASSPISLPPVTHLDLTPSPDVPLTILKRKLLRTNDVKESQNLIGQIQQFLDARHVIEKSVHKIVSLLAGFGETAERHLSERTMLTAHDCYQEAVTHFRTHCFNWHSVTYEHALRYLYVLANLCEAPYPIDRIEMAMDKVCLSHY. The pIC50 is 8.1.
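
The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and it assumes the raw IC50 is reported in nanomolar, which the text above does not state.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in M). Since 1 nM = 1e-9 M, this equals
    9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the conversion: return the IC50 in nanomolar for a given pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 labels quoted in the records above.
    for label in (3.5, 6.5, 9.6):
        print(f"pIC50 {label} -> IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under that unit assumption, the pIC50 of 3.5 in record (4) corresponds to an IC50 of about 316,000 nM (roughly 316 µM), while the pIC50 of 9.6 in record (6) corresponds to about 0.25 nM, which illustrates why higher values mean more potent binding.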