This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)CNC(=O)[C@H]1N(C(=O)[C@@H](O)[C@H](Cc2ccccc2)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)CCc2ccccc2)c2ccccc2)C(C)(C)C)CSC1(C)C. The target protein sequence is MTVLPIALFSSNTPLRNTSVLGAGGQTQDHFKLTSLPVLIRLPFRTTPIVLTSCLVDTKNNWAIIGRDALQQCQGALYLPEAKGPPVILPIQAPAVLGLEHLPRPPEISQFPLNQNGSRPCNTWSGRPWRQAISNPTPGQEITQYSQLKRPMEPGDSSTTCGPLTL. The pIC50 is 7.0. (2) The small molecule is C[C@H](NCCc1ccc(NCC(=O)O)c(I)c1)[C@H](O)c1ccc(O)cc1. The target protein (P10608) has sequence MEPHGNDSDFLLAPNGSRAPGHDITQERDEAWVVGMAILMSVIVLAIVFGNVLVITAIAKFERLQTVTNYFITSLACADLVMGLAVVPFGASHILMKMWNFGNFWCEFWTSIDVLCVTASIETLCVIAVDRYVAITSPFKYQSLLTKNKARVVILMVWIVSGLTSFLPIQMHWYRATHKQAIDCYAKETCCDFFTNQAYAIASSIVSFYVPLVVMVFVYSRVFQVAKRQLQKIDKSEGRFHAQNLSQVEQDGRSGHGLRSSSKFCLKEHKALKTLGIIMGTFTLCWLPFFIVNIVHVIRANLIPKEVYILLNWLGYVNSAFNPLIYCRSPDFRIAFQELLCLRRSSSKTYGNGYSSNSNGRTDYTGEQSAYQLGQEKENELLCEEAPGMEGFVNCQGTVPSLSIDSQGRNCNTNDSPL. The pIC50 is 5.7. (3) The small molecule is C/C(=C\C(C)/C=C/C(=O)NO)C(=O)c1ccc(N(C)C)cc1. The target protein sequence is SSPITGLVYDQRMMLHHNMWDSHHPELPQRISRIFSRHEELRLLSRCHRIPARLATEEELALCHSSKHISIIKSSEHMKPRDLNRLGDEYNSIFISNESYTCALLAAGSCFNSAQAILTGQVRNAVAIVRPPGHHAEKDTACGFCFFNTAALTARYAQSITRESLRVLIVDWDVHHGNGTQHIFEEDDSVLYISLHRYEDGAFFPNSEDANYDKVGLGKGRGYNVNIPWNGGKMGDPEYMAAFHHLVMPIAREFAPELVLVSAGFDAARGDPLGGFQVTPEGYAHLTHQLMSLAAGRVLIILEGGYNLTSISESMSMCTSMLLGDSPPSLDHLTPLKTSATVSINNVLRAHAPFWSSLR. The pIC50 is 8.8. (4) The small molecule is Cc1cc(/C=C2/S/C(=N/c3ccccc3)NC2=O)c(C)n1-c1cc(C(=O)O)cc(C(=O)O)c1. 
The target protein (P22188) has sequence MADRNLRDLLAPWVPDAPSRALREMTLDSRVAAAGDLFVAVVGHQADGRRYIPQAIAQGVAAIIAEAKDEATDGEIREMHGVPVIYLSQLNERLSALAGRFYHEPSDNLRLVGVTGTNGKTTTTQLLAQWSQLLGEISAVMGTVGNGLLGKVIPTENTTGSAVDVQHELAGLVDQGATFCAMEVSSHGLVQHRVAALKFAASVFTNLSRDHLDYHGDMEHYEAAKWLLYSEHHCGQAIINADDEVGRRWLAKLPDAVAVSMEDHINPNCHGRWLKATEVNYHDSGATIRFSSSWGDGEIESHLMGAFNVSNLLLALATLLALGYPLADLLKTAARLQPVCGRMEVFTAPGKPTVVVDYAHTPDALEKALQAARLHCAGKLWCVFGCGGDRDKGKRPLMGAIAEEFADVAVVTDDNPRTEEPRAIINDILAGMLDAGHAKVMEGRAEAVTCAVMQAKENDVVLVAGKGHEDYQIVGNQRLDYSDRVTVARLLGVIA. The pIC50 is 3.3. (5) The small molecule is C=C[C@@H]1C[C@]1(NC(=O)[C@@H]1C[C@@H](Oc2cc(-c3csc(NC(=O)C(C)C)n3)nc3c(Br)c(OC)ccc23)CN1C(=O)[C@@H](NC(=O)OC1CCCC1)C(C)(C)C)C(=O)O. The target protein (Q7TQM4) has sequence MEPKAPQLRRRERQGEEQENGACGEGNTRTHRAPDLVQWTRHMEAVKTQCLEQAQRELAELMDRAIWEAVQAYPKQDRPLPSTASDSTRKTQELHPGKRKVFITRKSLLDELMGVQHFRTIYHMFIAGLCVLIISTLAIDFIDEGRLMLEFDLLLFSFGQLPLALMMWVPMFLSTLLLPYQTLRLWARPRSGGAWTLGASLGCVLLAAHAAVLCVLPVHVSVKHELPPASRCVLVFEQVRFLMKSYSFLRETVPGIFCVRGGKGICTPSFSSYLYFLFCPTLIYRETYPRTPSIRWNYVAKNFAQALGCLLYACFILGRLCVPVFANMSREPFSTRALLLSILHATGPGIFMLLLIFFAFLHCWLNAFAEMLRFGDRMFYRDWWNSTSFSNYYRTWNVVVHDWLYSYVYQDGLWLLGRQGRGAAMLGVFLVSALVHEYIFCFVLGFFYPVMLILFLVVGGLLNFTMNDRHTGPAWNILMWTFLFLGQGIQVSLYCQEWYA.... The pIC50 is 5.6. (6) The small molecule is COP(=O)(OC)OC=C(Cl)Cl. The target protein (Q869C3) has sequence MEIRGLLMGRLRLGRRMVPLGLLGVTALLLILPPFALVQGRHHELNNGAAIGSHQLSAAAGVGLASQSAQSGSLASGVMSSVPAAGASSSSSSSLLSSSAEDDVARITLSKDADAFFTPYIGHGESVRIIDAELGTLEHVHSGATPRRRGLTRRESNSDANDNDPLVVNTDKGRIRGITVDAPSGKKVDVWLGIPYAQPPVGPLRFRHPRPAEKWTGVLNTTTPPNSCVQIVDTVFGDFPGATMWNPNTPLSEDCLYINVVAPRPRPKNAAVMLWIFGGGFYSGTATLDVYDHRALASEENVIVVSLQYRVASLGFLFLGTPEAPGNAGLFDQNLALRWVRDNIHRFGGDPSRVTLFGESAGAVSVSLHLLSALSRDLFQRAILQSGSPTAPWALVSREEATLRALRLAEAVGCPHEPSKLSDAVECLRGKDPHVLVNNEWGTLGICEFPFVPVVDGAFLDETPQRSLASGRFKKTEILTGSNTEEGYYFIIYYLTELLR.... The pIC50 is 7.2. (7) The compound is CC(=O)N[C@@H]1[C@@H](N=C(N)N)C=C(C(=O)O)O[C@H]1[C@H](O)[C@H](O)CO. 
The target protein (Q9Y3R4) has sequence MASLPVLQKESVFQSGAHAYRIPALLYLPGQQSLLAFAEQRASKKDEHAELIVLRRGDYDAPTHQVQWQAQEVVAQARLDGHRSMNPCPLYDAQTGTLFLFFIAIPGQVTEQQQLQTRANVTRLCQVTSTDHGRTWSSPRDLTDAAIGPAYREWSTFAVGPGHCLQLHDRARSLVVPAYAYRKLHPIQRPIPSAFCFLSHDHGRTWARGHFVAQDTLECQVAEVETGEQRVVTLNARSHLRARVQAQSTNDGLDFQESQLVKKLVEPPPQGCQGSVISFPSPRSGPGSPAQWLLYTHPTHSWQRADLGAYLNPRPPAPEAWSEPVLLAKGSCAYSDLQSMGTGPDGSPLFGCLYEANDYEEIVFLMFTLKQAFPAEYLPQ. The pIC50 is 4.8. (8) The small molecule is O=C(CCc1cccc([N+](=O)[O-])c1)NC(Cn1cncn1)CP(=O)(O)O. The target protein (P0CO23) has sequence MSERIASVERTTSETHISCTIDLDHIPGVTEQKINVSTGIGFLDHMFTALAKHGGMSLQLQCKGDLHIDDHHTAEDCALALGEAFKKALGERKGIKRYGYAYAPLDESLSRAVIDISSRPYFMCHLPFTREKVGDLSTEMVSHLLQSFAFAAGVTLHIDSIRGENNHHIAESAFKALALAIRMAISRTGGDDVPSTKGVLAL. The pIC50 is 5.8. (9) The drug is C=CC(=O)Nc1ccc(S(=O)(=O)N2CC3CN(C(=O)OCc4ccccc4)CC3C2)cc1. The target protein (Q08188) has sequence MAALGVQSINWQTAFNRQAHHTDKFSSQELILRRGQNFQVLMIMNKGLGSNERLEFIVSTGPYPSESAMTKAVFPLSNGSSGGWSAVLQASNGNTLTISISSPASAPIGRYTMALQIFSQGGISSVKLGTFILLFNPWLNVDSVFMGNHAEREEYVQEDAGIIFVGSTNRIGMIGWNFGQFEEDILSICLSILDRSLNFRRDAATDVASRNDPKYVGRVLSAMINSNDDNGVLAGNWSGTYTGGRDPRSWNGSVEILKNWKKSGFSPVRYGQCWVFAGTLNTALRSLGIPSRVITNFNSAHDTDRNLSVDVYYDPMGNPLDKGSDSVWNFHVWNEGWFVRSDLGPSYGGWQVLDATPQERSQGVFQCGPASVIGVREGDVQLNFDMPFIFAEVNADRITWLYDNTTGKQWKNSVNSHTIGRYISTKAVGSNARMDVTDKYKYPEGSDQERQVFQKALGKLKPNTPFAATSSMGLETEEQEPSIIGKLKVAGMLAVGKEVN.... The pIC50 is 4.1. (10) The target protein (P00743) has sequence MAGLLHLVLLSTALGGLLRPAGSVFLPRDQAHRVLQRARRANSFLEEVKQGNLERECLEEACSLEEAREVFEDAEQTDEFWSKYKDGDQCEGHPCLNQGHCKDGIGDYTCTCAEGFEGKNCEFSTREICSLDNGGCDQFCREERSEVRCSCAHGYVLGDDSKSCVSTERFPCGKFTQGRSRRWAIHTSEDALDASELEHYDPADLSPTESSLDLLGLNRTEPSAGEDGSQVVRIVGGRDCAEGECPWQALLVNEENEGFCGGTILNEFYVLTAAHCLHQAKRFTVRVGDRNTEQEEGNEMAHEVEMTVKHSRFVKETYDFDIAVLRLKTPIRFRRNVAPACLPEKDWAEATLMTQKTGIVSGFGRTHEKGRLSSTLKMLEVPYVDRSTCKLSSSFTITPNMFCAGYDTQPEDACQGDSGGPHVTRFKDTYFVTGIVSWGEGCARKGKFGVYTKVSNFLKWIDKIMKARAGAAGSRGHSEAPATWTVPPPLPL. The pIC50 is 4.2. 
The compound is C[C@H](NC(=O)[C@@H](CO)NS(=O)(=O)c1ccc(Br)cc1)C(=O)NC(Cc1ccc(N=C(N)N)cc1)P(=O)(Oc1ccccc1)Oc1ccccc1.
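The pIC50 definition stated at the top of this record, pIC50 = -log10(IC50 in M), can be illustrated with a minimal sketch. The function names `ic50_to_pic50` and `pic50_to_ic50` and the constant `NM_TO_M` are hypothetical helpers introduced only for illustration; they are not part of the dataset or of any BindingDB tooling, and the sketch assumes the IC50 has already been converted to molar units.

```python
import math

# Hypothetical conversion factor: nanomolar to molar.
NM_TO_M = 1e-9


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 100 nM, expressed in M before taking the logarithm.
    print(round(ic50_to_pic50(100 * NM_TO_M), 2))
    # Going the other way for a pIC50 of 7.0 gives the IC50 in M.
    print(pic50_to_ic50(7.0))
```

Under this definition, a pIC50 of 7.0, as in entry (1), corresponds to an IC50 of 100 nM, and each one-unit increase in pIC50 means a tenfold lower IC50, which is why higher values indicate greater potency.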