This is drug-target binding data from BindingDB, based on IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)Cc1c(C(=O)C(N)=O)c2c(OCC(=O)NS(=O)(=O)c3ccccc3)cccc2n1Cc1ccccc1. The target protein (Q9QZT4) has sequence MKKFFAIAVLAGSVVTTAHSSLLNLKSMVEAITHRNSILSFVGYGCYCGLGGRGHPMDEVDWCCHAHDCCYEKLFEQGCRPYVDHYDHRIENGTMIVCTELNETECDKQTCECDKSLTLCLKDHPYRNKYRGYFNVYCQGPTPNCSIYDPYPEEVTCGHGLPATPVST. The pIC50 is 5.8. (2) The pIC50 is 7.4. The target protein sequence is MLRHSCLREKGNLIGGSLLRGISAQLRAAGGGTFRSFHSYRSFCSFCIFRRANKADANWYGCFPGHVAGGATCGPLRGDCAERHKLMAHVRRFSGESTRAKGGDKREGDIEGNRTNGSDKTKQLEEEMKKLNEQIAREKGNHKKALLFIFTCVVALYMYFESYDPEFFLYDVFLKMLLKYVDGETCHELFLLMGKYKLLPYDTGKDNIYSCSEIKGLNFINPFGVAAGFDKNGVCIDGILKLGFSFIEIGTITPKAQKGNERPRIFRDLETRSIINSCGFNNMGCDEVCKNLKRFRERQKTDKLLQRHLVGVSLGKNKDSPDILQDLSYCIGKIGRYADYIAINVSSPNTPGLRDHQKGERLHGIIQRVKEEVAKLDGGGAPLGGATTGGAAMGGATTGEAVVGKAPPDEAATGGEPWANTTKRRPLIFVKLAPDLEEGERKSIANVLLNAEVDGMIICNTTTQKFNIKSFEDKKGGVSGEKLKGVSTHMISQMYNYTNG.... The small molecule is CCOC(=O)c1[nH]c(C)c(Cc2ccc(OCC)cc2)c1C. (3) The small molecule is O=C(NCc1ccc(-c2ccc(F)c(C(F)(F)F)c2)cc1)C(O)(c1ccc(-c2ccc3cccnc3n2)cc1)C(F)(F)F. The target protein (Q9BXA5) has sequence MLGIMAWNATCKNWLAAEAALEKYYLSIFYGIEFVVGVLGNTIVVYGYIFSLKNWNSSNIYLFNLSVSDLAFLCTLPMLIRSYANGNWIYGDVLCISNRYVLHANLYTSILFLTFISIDRYLIIKYPFREHLLQKKEFAILISLAIWVLVTLELLPILPLINPVITDNGTTCNDFASSGDPNYNLIYSMCLTLLGFLIPLFVMCFFYYKIALFLKQRNRQVATALPLEKPLNLVIMAVVIFSVLFTPYHVMRNVRIASRLGSWKQYQCTQVVINSFYIVTRPLAFLNSVINPVFYFLLGDHFRDMLMNQLRHNFKSLTSFSRWAHELLLSFREK. The pIC50 is 7.5. (4) The drug is CC[N+]1(C/C2=C/CCCCCC2)CCC(NC(=O)C2c3cc(Cl)ccc3Oc3ccc(Cl)cc32)CC1.[I-]. 
The target protein (P51678) has sequence MAFNTDEIKTVVESFETTPYEYEWAPPCEKVRIKELGSWLLPPLYSLVFIIGLLGNMMVVLILIKYRKLQIMTNIYLFNLAISDLLFLFTVPFWIHYVLWNEWGFGHYMCKMLSGFYYLALYSEIFFIILLTIDRYLAIVHAVFALRARTVTFATITSIITWGLAGLAALPEFIFHESQDSFGEFSCSPRYPEGEEDSWKRFHALRMNIFGLALPLLIMVICYSGIIKTLLRCPNKKKHKAIRLIFVVMIVFFIFWTPYNLVLLFSAFHSTFLETSCQQSKHLDLAMQVTEVIAYTHCCINPVIYAFVGERFRKHLRLFFHRNVAVYLGKYIPFLPGEKMERTSSVSPSTGEQEISVVF. The pIC50 is 6.5. (5) The small molecule is N[C@@H](c1ccccc1)[C@@H](O)C(=O)N[C@@H](CO)C(=O)NO. The target protein sequence is MTEARGARGALAGPLRALCVLGCLLGRAAAAPSPIIKFPGDVAPKTDKELAVQYLNTFYGCPKESCNLFVLKDTLKKMQKFFGLPQTGELDQSTIETMRKPRCGNPDVANYNFFPRKPKWDKTQITYRIIGYTPDLDPETVDDAFARAFRVWSDVTPLRFSRIHDGEADIMINFGRWEHGDGYPFDGKDGLLAHAFAPGPGVGGDSHFDDDELWTLGEGQVVRVKYGNADGEYCKFPFSFNGKEYNSCTDTGRSDGFLWCSTTYNFDKDGKYGFCPHEALFTMGGNADGQPCKFPFRFQGTSYNSCTTEGRTDGYRWCGTTEDYDRDKKYGFCPETAMSTVGGNSEGAPCVFPFTFLGNKHESCTSAGRSDGKLWCATTANYDDDRKWGFCPDQGYSLFLVAAHEFGHAMGLEHSEDPGALMAPIYTYTKNFRLSHDDVKGIQELYGASPDIDTGTGPTPTLGPVTPEICKQDIVFDGISQIRGEIFFFKDRFIWRTVTP.... The pIC50 is 3.5. (6) The small molecule is C[C@]1(CSC2=NCCS2)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The target protein (P0AD64) has sequence MRYIRLCIISLLATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARGIVALLGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAALIEHWQR. The pIC50 is 6.4. (7) The small molecule is Cc1ccc(NC(=O)c2ccc3c(c2)OCCO3)cc1NC(=O)c1ccc2nc(OCCN3CCCC3)ccc2c1. 
The target protein (Q00613) has sequence MDLPVGPGAAGPSNVPAFLTKLWTLVSDPDTDALICWSPSGNSFHVFDQGQFAKEVLPKYFKHNNMASFVRQLNMYGFRKVVHIEQGGLVKPERDDTEFQHPCFLRGQEQLLENIKRKVTSVSTLKSEDIKIRQDSVTKLLTDVQLMKGKQECMDSKLLAMKHENEALWREVASLRQKHAQQQKVVNKLIQFLISLVQSNRILGVKRKIPLMLNDSGSAHSMPKYSRQFSLEHVHGSGPYSAPSPAYSSSSLYAPDAVASSGPIISDITELAPASPMASPGGSIDERPLSSSPLVRVKEEPPSPPQSPRVEEASPGRPSSVDTLLSPTALIDSILRESEPAPASVTALTDARGHTDTEGRPPSPPPTSTPEKCLSVACLDKNELSDHLDAMDSNLDNLQTMLSSHGFSVDTSALLDLFSPSVTVPDMSLPDLDSSLASIQELLSPQEPPRPPEAENSSPDSGKQLVHYTAQPLFLLDPGSVDTGSNDLPVLFELGEGSYF.... The pIC50 is 7.4. (8) The compound is O=C(CCCc1nc2ccccc2c(=O)[nH]1)N1CCC(n2c(=O)sc3ccccc32)CC1. The target protein sequence is SGTILIDLSPDDKEFQSVEEEMQSTVREHRDGGHAGGIFNRYNILKIQKVCNKKLWERYTHRRKEVSEENHNHANERMLFHGSPFVNAIIHKGFDERHAYIGGMFGAGIYFAENSSKSNQYVYGIGGGTGCPVHKDRSCYICHRQLLFCRVTLGKSFLQFSAMKMAHSPPGHHSVTGRPSVNGLALAEYVIYRGEQAYPEYLITYQIMRPEG. The pIC50 is 8.7. (9) The compound is CCCCCCCCCCOc1ccc(OCC(=O)CSCC(=O)O)cc1. The target protein (P0C869) has sequence MAVAEVSRTCLLTVRVLQAHRLPSKDLVTPSDCYVTLWLPTACSHRLQTRTVKNSSSPVWNQSFHFRIHRQLKNVMELKVFDQDLVTGDDPVLSVLFDAGTLRAGEFRRESFSLSPQGEGRLEVEFRLQSLADRGEWLVSNGVLVARELSCLHVQLEETGDQKSSEHRVQLVVPGSCEGPQEASVGTGTFRFHCPACWEQELSIRLQDAPEEQLKAPLSALPSGQVVRLVFPTSQEPLMRVELKKEAGLRELAVRLGFGPCAEEQAFLSRRKQVVAAALRQALQLDGDLQEDEIPVVAIMATGGGIRAMTSLYGQLAGLKELGLLDCVSYITGASGSTWALANLYEDPEWSQKDLAGPTELLKTQVTKNKLGVLAPSQLQRYRQELAERARLGYPSCFTNLWALINEALLHDEPHDHKLSDQREALSHGQNPLPIYCALNTKGQSLTTFEFGEWCEFSPYEVGFPKYGAFIPSELFGSEFFMGQLMKRLPESRICFLEGI.... The pIC50 is 5.8. (10) The compound is Cc1ccc(S(=O)(=O)NC2c3ccccc3Oc3ccccc32)cc1. The target protein (Q2AC31) has sequence MSPECAQTTGPGPSRTPDQVNRTHFPFFSDVKGDHRLVLSVLETTVLGLIFVVSLLGNVCALVLVVRRRRRGATVSLVLNLFCADLLFTSAIPLVLVVRWTEAWLLGPVVCHLLFYVMTMSGSVTILTLAAVSLERMVCIVRLRRGLSGPGRRTQAALLAFIWGYSALAALPLCILFRVVPQRLPGGDQEIPICTLDWPNRIGEISWDVFFVTLNFLVPGLVIVISYSKILQITKASRKRLTLSLAYSESHQIRVSQQDYRLFRTLFLLMVSFFIMWSPIIITILLILIQNFRQDLVIWPSLFFWVVAFTFANSALNPILYNMSLFRSEWRKIFCCFFFPEKGAIFTETSIRRNDLSVIST. 
The pIC50 is 8.1.
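
The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling; the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical, and the input is assumed to be already expressed in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50).

    Hypothetical helper for illustration; assumes the value is in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: recover the IC50 in mol/L from a pIC50 value."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of 6.
    print(ic50_to_pic50(1e-6))
    # A pIC50 reported in the records above, converted back to nanomolar.
    print(pic50_to_ic50(8.1) * 1e9)
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: the pIC50 of 5.8 in example (1) corresponds to roughly 1.6 µM, while the 8.1 in example (10) corresponds to roughly 7.9 nM.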