Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)C[C@@H](C=O)NC(=O)[C@@H](NC(=O)c1cc(C=O)c[nH]1)C(C)C. The target protein sequence is MAGIAAKLAKDREAAEGLGSHERAVKYLNQDYAALRDECLEAGALFQDPSFPALPSSLGFKELGPYSSKTRGIEWKRPTEICDDPQFITGGATRTDICQGALGDCWLLAAIASLTLNEEILARVVPLDQSFQENYAGIFHFQFWQYGEWVEVVVDDRLPTKDGELLFVHSAEGSEFWSALLEKAYAKINGCYEALSGGATTEGFEDFTGGIAEWYELRKAPPNLFRIIQKALQKGSLLGCSIDITSAADSEAITFQKLVKGHAYSVTGAEEVESRGSLQKLIRIRNPWGEVEWTGQWNDNCPNWNTVDPEVRESLTRRHEDGEFWMSFSDFLRHYSRLEICNLTPDTLTSDTYKKWKLTKMDGNWRRGSTAGGCRNYPNTFWMNPQYLIKLEEEDEDQEDGESGCTFLVGLIQKHRRRQRKMGEDMHTIGFGIYEVPEEFTGQTNIHLSKKFFLTTRARERSDTFINLREVLNRFKLPPGEYIVVPSTFEPNKDGDFCIR.... The pIC50 is 7.0. (2) The compound is Cc1nc2ccc3ccc(CNc4ccc5c(c4)CN([C@@H](CCC(=O)O)C(=O)O)C5=O)cc3c2c(=O)[nH]1. The target protein sequence is MSLLLNREHTNGQVTNASYAKVIETVLKSGVQADDRTGTGTLSTCYVPSYYMLTGGTVPLISGKAVNLKPLLVELEWYLKGTGNIQFLKDNGVKIWDAWADENGDLGPVYGKQWRRWEDTRIVSHSEYLSKIATFRERGYKVEGYLGISEDRVVLSREIDQLQRIVDTLRTNPTDRRIMLNAWNVGELEDMKLPPCHFVFSLWSRELDFETRLTMATDIGLQHSRLGYESIYTKMLYDLEMDGSVTEAELDELGIPKRILNSCLVQRSVDTFVGMPFNIAGYGILTHFLAKITGHMAGAFVHFGFDVHLYNNHMEGVCELMKRQAPEHSDPVVIFPHEWSELDDFKWDEVLILGYDPLPWIKVPVAV. The pIC50 is 7.6. (3) The drug is CC(C)CN(NC(=O)[C@@H](Cc1c[nH]c2ccccc12)NC(=O)[C@@H](N)Cc1cnc[nH]1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@H](Cc1ccccc1)C(=O)N[C@@H](C)C(N)=O. 
The target protein (Q07969) has sequence MGCDRNCGLITGAVIGAVLAVFGGILMPVGDLLIEKTIKREVVLEEGTIAFKNWVKTGTTVYRQFWIFDVQNPEEVAKNSSKIKVKQRGPYTYRVRYLAKENITQDPKDSTVSFVQPNGAIFEPSLSVGTENDNFTVLNLAVAAAPHIYTNSFVQGVLNSLIKKSKSSMFQTRSLKELLWGYKDPFLSLVPYPISTTVGVFYPYNNTVDGVYKVFNGKDNISKVAIIDTYKGKRNLSYWESYCDMINGTDAASFPPFVEKSQTLRFFSSDICRSIYAVFESEVNLKGIPVYRFVLPANAFASPLQNPDNHCFCTEKVISNNCTSYGVLDIGKCKEGKPVYISLPHFLHASPDVSEPIEGLNPNEDEHRTYLDVEPITGFTLQFAKRLQVNILVKPARKIEALKNLKRPYIVPILWLNETGTIGDEKAEMFRNQVTGKIKLLGLVEMVLLGVGVVMFVAFMISYCACRSKNGK. The pIC50 is 5.1. (4) The drug is O=C(Nc1ccc(Cl)c(Cl)c1)Nc1nnc(-c2ccncc2)s1. The target protein (Q14191) has sequence MSEKKLETTAQQRKCPEWMNVQNKRCAVEERKACVRKSVFEDDLPFLEFTGSIVYSYDASDCSFLSEDISMSLSDGDVVGFDMEWPPLYNRGKLGKVALIQLCVSESKCYLFHVSSMSVFPQGLKMLLENKAVKKAGVGIEGDQWKLLRDFDIKLKNFVELTDVANKKLKCTETWSLNSLVKHLLGKQLLKDKSIRCSNWSKFPLTEDQKLYAATDAYAGFIIYRNLEILDDTVQRFAINKEEEILLSDMNKQLTSISEEVMDLAKHLPHAFSKLENPRRVSILLKDISENLYSLRRMIIGSTNIETELRPSNNLNLLSFEDSTTGGVQQKQIREHEVLIHVEDETWDPTLDHLAKHDGEDVLGNKVERKEDGFEDGVEDNKLKENMERACLMSLDITEHELQILEQQSQEEYLSDIAYKSTEHLSPNDNENDTSYVIESDEDLEMEMLKHLSPNDNENDTSYVIESDEDLEMEMLKSLENLNSGTVEPTHSKCLKMERN.... The pIC50 is 5.5.
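
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a short conversion sketch. The function names below are hypothetical helpers, not part of BindingDB or of any dataset tooling, and the nanomolar conversion is an added convenience that assumes IC50 values are often reported in nM.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition and return the IC50 in nanomolar (hypothetical helper)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A pIC50 of 7.0, as in example (1), corresponds to an IC50 of about 100 nM.
    print(round(ic50_nm_from_pic50(7.0), 3))
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_from_ic50_molar(1e-7) > pic50_from_ic50_molar(1e-5))
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50.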