This is drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is Nc1ncnc2nc(-c3ccc(N4CCOCC4)nc3)cc(-c3cccc(Br)c3)c12. The target protein (Q06432) has sequence MSQTKMLKVRVTLFCILAGIVLAMTAVVTDHWAVLSPHMEHHNTTCEAAHFGLWRICTKRIPMDDSKTCGPITLPGEKNCSYFRHFNPGESSEIFEFTTQKEYSISAAAIAIFSLGFIILGSLCVLLSLGKKRDYLLRPASMFYAFAGLCILVSVEVMRQSVKRMIDSEDTVWIEYYYSWSFACACAAFILLFLGGLALLLFSLPRMPRNPWESCMDAEPEH. The pKi is 5.0. (2) The drug is N#Cc1ccc(C2CN3CCCC3c3ccccc32)cc1. The target protein (Q61327) has sequence MSKSKCSVGPMSSVVAPAKEPNAVGPREVELILVKEQNGVQLTNSTLINPPQTPVEVQERETWSKKIDFLLSVIGFAVDLANVWRFPYLCYKNGGGAFLVPYLLFMVIAGMPLFYMELALGQFNREGAAGVWKICPVLKGVGFTVILISFYVGFFYNVIIAWALHYFFSSFTMDLPWIHCNNTWNSPNCSDAHSSNSSDGLGLNDTFGTTPAAEYFERGVLHLHQSRGIDDLGPPRWQLTACLVLVIVLLYFSLWKGVKTSGKVVWITATMPYVVLTALLLRGVTLPGAMDGIRAYLSVDFYRLCEASVWIDAATQVCFSLGVGFGVLIAFSSYNKFTNNCYRDAIITTSINSLTSFSSGFVVFSFLGYMAQKHNVPIRDVATDGPGLIFIIYPEAIATLPLSSAWAAVFFLMLLTLGIDSAMGGMESVITGLVDEFQLLHRHRELFTLGIVLATFLLSLFCVTNGGIYVFTLLDHFAAGTSILFGVLIEAIGVAWFYGV.... The pKi is 8.1. (3) The drug is COCCCC/C(=N\OCCN)c1ccc(C(F)(F)F)cc1. The target protein (P04799) has sequence MAFSQYISLAPELLLATAIFCLVFWVLRGTRTQVPKGLKSPPGPWGLPFIGHMLTLGKNPHLSLTKLSQQYGDVLQIRIGSTPVVVLSGLNTIKQALVKQGDDFKGRPDLYSFTLITNGKSMTFNPDSGPVWAARRRLAQDALKSFSIASDPTSVSSCYLEEHVSKEANHLISKFQKLMAEVGHFEPVNQVVESVANVIGAMCFGKNFPRKSEEMLNLVKSSKDFVENVTSGNAVDFFPVLRYLPNPALKRFKNFNDNFVLFLQKTVQEHYQDFNKNSIQDITGALFKHSENYKDNGGLIPQEKIVNIVNDIFGAGFETVTTAIFWSILLLVTEPKVQRKIHEELDTVIGRDRQPRLSDRPQLPYLEAFILEIYRYTSFVPFTIPHSTTRDTSLNGFHIPKECCIFINQWQVNHDEKQWKDPFVFRPERFLTNDNTAIDKTLSEKVMLFGLGKRRCIGEIPAKWEVFLFLAILLHQLEFTVPPGVKVDLTPSYGLTMKPR.... The pKi is 4.2. (4) The small molecule is C[n+]1ccc(-c2ccccc2)cc1. The target is MLLARMKPQVQPELGGADQ. 
The pKi is 6.1. (5) The compound is CCOC(=O)CNC(=O)NCc1ccccc1. The target protein (Q08752) has sequence MSHPSPQAKPSNPSNPRVFFDVDIGGERVGRIVLELFADIVPKTAENFRALCTGEKGIGHTTGKPLHFKGCPFHRIIKKFMIQGGDFSNQNGTGGESIYGEKFEDENFHYKHDREGLLSMANAGRNTNGSQFFITTVPTPHLDGKHVVFGQVIKGIGVARILENVEVKGEKPAKLCVIAECGELKEGDDGGIFPKDGSGDSHPDFPEDADIDLKDVDKILLITEDLKNIGNTFFKSQNWEMAIKKYAEVLRYVDSSKAVIETADRAKLQPIALSCVLNIGACKLKMSNWQGAIDSCLEALELDPSNTKALYRRAQGWQGLKEYDQALADLKKAQGIAPEDKAIQAELLKVKQKIKAQKDKEKAVYAKMFA. The pKi is 4.2. (6) The drug is COc1nc(F)nc(OC)c1NC(=O)c1ccc(Oc2cc3c(cc2C)CCC3(C)C)o1. The target protein (P30969) has sequence MANNASLEQDQNHCSAINNSIPLTQGKLPTLTLSGKIRVTVTFFLFLLSTAFNASFLVKLQRWTQKRKKGKKLSRMKVLLKHLTLANLLETLIVMPLDGMWNITVQWYAGEFLCKVLSYLKLFSMYAPAFMMVVISLDRSLAVTQPLAVQSKSKLERSMTSLAWILSIVFAGPQLYIFRMIYLADGSGPAVFSQCVTHCSFPQWWHEAFYNFFTFSCLFIIPLLIMLICNAKIIFALTRVLHQDPRKLQLNQSKNNIPRARLRTLKMTVAFGTSFVICWTPYYVLGIWYWFDPEMLNRVSEPVNHFFFLFAFLNPCFDPLIYGYFSL. The pKi is 7.4. (7) The drug is CS(=O)(=O)c1ccc(C2=C(c3ccccc3)C(=O)OC2)cc1. The pKi is 7.4. The target protein (P34980) has sequence MAGVWAPEHSVEAHSNQSSAADGCGSVSVAFPITMMVTGFVGNALAMLLVVRSYRRRESKRKKSFLLCIGWLALTDLVGQLLTSPVVILVYLSQRRWEQLDPSGRLCTFFGLTMTVFGLSSLLVASAMAVERALAIRAPHWYASHMKTRATPVLLGVWLSVLAFALLPVLGVGRYSVQWPGTWCFISTGPAGNETDSAREPGSVAFASAFACLGLLALVVTFACNLATIKALVSRCRAKAAASQSSAQWGRITTETAIQLMGIMCVLSVCWSPLLIMMLKMIFNQMSVEQCKTQMGKEKECNSFLIAVRLASLNQILDPWVYLLLRKILLRKFCQIRDHTNYASSSTSLPCPGSSVLMWSDQLER. (8) The small molecule is NC[C@@H]1O[C@H](c2ccccc2)Cc2c(O)cccc21. The target protein (P35406) has sequence MAVLDLNLTTVIDSGFMESDRSVRVLTGCFLSVLILSTLLGNTLVCAAVTKFRHLRSKVTNFFVISLAVSDLLVAVLVMPWKAVTEVAGFWPFGAFCDIWVAFDIMCSTASILNLCVISVDRYWAISSPFRYERKMTPRVAFVMISGAWTLSVLISFIPVQLKWHKAQPIGFLEVNASRRDLPTDNCDSSLNRTYAISSSLISFYIPVAIMIVTYTQIYRIAQKQIRRISALERAAESAQIRHDSMGSGSNMDLESSFKLSFKRETKVLKTLSVIMGVFVCCWLPFFILNCMVPFCKRTSNGLPCISPTTFDVFVWFGWANSSLNPIIYAFNADFRRAFAILLGCQRLCPGSISMETPSLNKN. The pKi is 7.3.
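
The pKi definition stated at the top of this record (pKi = -log10(Ki in M)) can be illustrated with a short conversion sketch. The function names below are hypothetical helpers introduced only for illustration, and the nanomolar variant assumes Ki values are supplied in nM, which this record does not itself state.

```python
import math


def ki_molar_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki in M)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert a Ki given in nM to pKi (assumption: input unit is nM).

    1 nM = 1e-9 M, so pKi = -log10(ki_nm * 1e-9) = 9 - log10(ki_nm).
    """
    if ki_nm <= 0:
        raise ValueError("Ki must be a positive concentration")
    return 9.0 - math.log10(ki_nm)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the transform: recover Ki in nM from a pKi value."""
    return 10.0 ** (9.0 - pki)


if __name__ == "__main__":
    # A Ki of 10 micromolar (1e-5 M) corresponds to a pKi of about 5.
    print(round(ki_molar_to_pki(1e-5), 3))
    # A Ki of 10 nM corresponds to a pKi of about 8.
    print(round(ki_nm_to_pki(10.0), 3))
```

Because the scale is logarithmic, each unit increase in pKi corresponds to a tenfold lower Ki.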