The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. The data are drug-target binding measurements from BindingDB, reported as IC50 values. (1) The small molecule is Nc1ccc(CC(C(=O)O)c2cn(Cc3ccc(OC(F)(F)F)cc3)cn2)cn1. The target protein (P15086) has sequence MLALLVLVTVALASAHHGGEHFEGEKVFRVNVEDENHINIIRELASTTQIDFWKPDSVTQIKPHSTVDFRVKAEDTVTVENVLKQNELQYKVLISNLRNVVEAQFDSRVRATGHSYEKYNKWETIEAWTQQVATENPALISRSVIGTTFEGRAIYLLKVGKAGQNKPAIFMDCGFHAREWISPAFCQWFVREAVRTYGREIQVTELLDKLDFYVLPVLNIDGYIYTWTKSRFWRKTRSTHTGSSCIGTDPNRNFDAGWCEIGASRNPCDETYCGPAAESEKETKALADFIRNKLSSIKAYLTIHSYSQMMIYPYSYAYKLGENNAELNALAKATVKELASLHGTKYTYGPGATTIYPAAGGSDDWAYDQGIRYSFTFELRDTGRYGFLLPESQIRATCEETFLAIKYVASYVLEHLY. The pIC50 is 5.0. (2) The target protein sequence is NFSEEQQRIIEIPMNVNLCIIACPGSGKTSTLTARIIKSIIEEKQSIVCITFTNYAASDLKDKIMKKINCLIDICVDNKINQKLFNNKNNKINFSLKNKCTLNNKMNKSIFKVLNTVMFIGTIHSFCRYILYKYKGTFKILTDFINTNIIKLAFNNFYSSMMSKTKGTQPGFSTILERKSNKASTQNCDPDKINTHNNDDNINNKNDYINNKNKNDYNNINNYDNINNYDNINNDDNINNDDNINNDDNINNDDNINNDDDINNCGNCNQPKGIPSQLAYFINCMKNAEIKEDEEKEFYEEEHDIQNDILNNDDNNNDEDDDDDDEFYNYLYNFKHSYEQTNDYFANEQVQSVLKKKNIIFLKKKIKLMKYIELYNIKIEINDVEKMFYEEYKKIFKKAKNIYYDFDDLLIETYRLMKDN. The compound is CC[n+]1c(-c2ccccc2)c2cc(N)ccc2c2ccc(N)cc21. The pIC50 is 5.5. (3) The compound is O=C(c1cc2cc(F)ccc2[nH]1)N1C[C@]2(CCN(C3CNC3)C2)c2ccccc21. 
The target protein sequence is MAGSAVDSANHLTYLFGNITREEAEDYLVQGGMTDGLYLLRQSRNYLGGFALSVAHNRKAHHYTIERELNGTYAISGGRAHASPADLCHYHSQEPDGLICLLKKPFNRPPGVQPKTGPFEDLKENLIREYVKQTWNLQGQALEQAIISQKPQLEKLIATTAHEKMPWFHGNISRDESEQTVLIGSKTNGKFLIRARDNSGSYALCLLHEGKVLHYRIDRDKTGKLSIPEGKKFDTLWQLVEHYSYKPDGLLRVLTVPCQKIGAQMGHPGSPNAHPVTWSPGGIISRIKSYSFPKPGHKKPAPPQGSRPESTVSFNPYEPTGGPWGPDRGLQREALPMDTEVYESPYADPEEIRPKEVYLDRSLLTLEDNELGSGNFGTVKKGYYQMKKVVKTVAVKILKNEANDPALKDELLAEANVMQQLDNPYIVRMIGICEAESWMLVMEMAELGPLNKYLQQNRHIKDKNIIELVHQVSMGMKYLEESNFVHRDLAARNVLLVTQH.... The pIC50 is 5.7. (4) The small molecule is Cc1[nH]nc(Nc2cc(N3CCCCC3)nc(Sc3ccc(NC(=O)C4CCCCC4)cc3)n2)c1C. The target protein (Q60855) has sequence MQPDMSLDNIKMASSDLLEKTDLDSGGFGKVSLCYHRSHGFVILKKVYTGPNRAEYNEVLLEEGKMMHRLRHSRVVKLLGIIIEEGNYSLVMEYMEKGNLMHVLKTQIDVPLSLKGRIIVEAIEGMCYLHDKGVIHKDLKPENILVDRDFHIKIADLGVASFKTWSKLTKEKDNKQKEVSSTTKKNNGGTLYYMAPEHLNDINAKPTEKSDVYSFGIVLWAIFAKKEPYENVICTEQFVICIKSGNRPNVEEILEYCPREIISLMERCWQAIPEDRPTFLGIEEEFRPFYLSHFEEYVEEDVASLKKEYPDQSPVLQRMFSLQHDCVPLPPSRSNSEQPGSLHSSQGLQMGPVEESWFSSSPEYPQDENDRSVQAKLQEEASYHAFGIFAEKQTKPQPRQNEAYNREEERKRRVSHDPFAQQRARENIKSAGARGHSDPSTTSRGIAVQQLSWPATQTVWNNGLYNQHGFGTTGTGVWYPPNLSQMYSTYKTPVPETNIP.... The pIC50 is 5.5. (5) The drug is Cc1c(-c2cncc(N)n2)nc(Nc2ccc(N3CCN(C4COC4)CC3)cc2)c2nccn12. The target protein (P08962) has sequence MAVEGGMKCVKFLLYVLLLAFCACAVGLIAVGVGAQLVLSQTIIQGATPGSLLPVVIIAVGVFLFLVAFVGCCGACKENYCLMITFAIFLSLIMLVEVAAAIAGYVFRDKVMSEFNNNFRQQMENYPKNNHTASILDRMQADFKCCGAANYTDWEKIPSMSKNRVPDSCCINVTVGCGINFNEKAIHKEGCVEKIGGWLRKNVLVVAAAALGIAFVEVLGIVFACCLVKSIRSGYEVM. The pIC50 is 6.8. (6) The small molecule is O=C(Nc1ccc(Oc2cccnc2)c(C(=O)NCc2ccc(Cl)c(Cl)c2)c1)OCc1ccccc1. 
The target protein (Q9Z0R9) has sequence MGKGGNQGEGSTERQAPMPTFRWEEIQKHNLRTDRWLVIDRKVYNVTKWSQRHPGGHRVIGHYSGEDATDAFRAFHLDLDFVGKFLKPLLIGELAPEEPSLDRGKSSQITEDFRALKKTAEDMNLFKTNHLFFFLLLSHIIVMESLAWFILSYFGTGWIPTLVTAFVLATSQAQAGWLQHDYGHLSVYKKSIWNHVVHKFVIGHLKGASANWWNHRHFQHHAKPNIFHKDPDIKSLHVFVLGEWQPLEYGKKKLKYLPYNHQHEYFFLIGPPLLIPMYFQYQIIMTMISRRDWVDLAWAISYYMRFFYTYIPFYGILGALVFLNFIRFLESHWFVWVTQMNHLVMEIDLDHYRDWFSSQLAATCNVEQSFFNDWFSGHLNFQIEHHLFPTMPRHNLHKIAPLVKSLCAKHGIEYQEKPLLRALIDIVSSLKKSGELWLDAYLHK. The pIC50 is 4.5. (7) The small molecule is COc1ccc2cc(C(C)C(=O)N3CCSCC3)ccc2c1. The target protein (P24095) has sequence MFGIFDKGQKIKGTVVLMPKNVLDFNAITSIGKGGVIDTATGILGQGVSLVGGVIDTATSFLGRNISMQLISATQTDGSGNGKVGKEVYLEKHLPTLPTLGARQDAFSIFFEWDASFGIPGAFYIKNFMTDEFFLVSVKLEDIPNHGTIEFVCNSWVYNFRSYKKNRIFFVNDTYLPSATPAPLLKYRKEELEVLRGDGTGKRKDFDRIYDYDVYNDLGNPDGGDPRPILGGSSIYPYPRRVRTGRERTRTDPNSEKPGEVYVPRDENFGHLKSSDFLTYGIKSLSHDVIPLFKSAIFQLRVTSSEFESFEDVRSLYEGGIKLPTDILSQISPLPALKEIFRTDGENVLQFPPPHVAKVSKSGWMTDEEFAREVIAGVNPNVIRRLQEFPPKSTLDPTLYGDQTSTITKEQLEINMGGVTVEEALSTQRLFILDYQDAFIPYLTRINSLPTAKAYATRTILFLKDDGTLKPLAIELSKPHPDGDNLGPESIVVLPATEGV.... The pIC50 is 4.9. (8) The compound is CCCCCCCCC(C)C1(C)SC(=O)C(C)C1=O. The target protein (P9WNG3) has sequence MTEIATTSGARSVGLLSVGAYRPERVVTNDEICQHIDSSDEWIYTRTGIKTRRFAADDESAASMATEACRRALSNAGLSAADIDGVIVTTNTHFLQTPPAAPMVAASLGAKGILGFDLSAGCAGFGYALGAAADMIRGGGAATMLVVGTEKLSPTIDMYDRGNCFIFADGAAAVVVGETPFQGIGPTVAGSDGEQADAIRQDIDWITFAQNPSGPRPFVRLEGPAVFRWAAFKMGDVGRRAMDAAGVRPDQIDVFVPHQANSRINELLVKNLQLRPDAVVANDIEHTGNTSAASIPLAMAELLTTGAAKPGDLALLIGYGAGLSYAAQVVRMPKG. The pIC50 is 3.0. (9) The drug is O=C(O)c1cc2ccc(N(Cc3ccccc3)Cc3ccccc3)cc2oc1=O. 
The target protein (P53987) has sequence MPPAIGGPVGYTPPDGGWGWAVVVGAFISIGFSYAFPKSITVFFKEIEIIFSATTSEVSWISSIMLAVMYAGGPISSILVNKYGSRPVMIAGGCLSGCGLIAASFCNTVQELYFCIGVIGGLGLAFNLNPALTMIGKYFYKKRPLANGLAMAGSPVFLSTLAPLNQAFFGIFGWRGSFLILGGLLLNCCVAGSLMRPIGPQQGKVEKLKSKESLQEAGKSDANTDLIGGSPKGEKLSVFQTVNKFLDLSLFTHRGFLLYLSGNVVMFFGLFTPLVFLSNYGKSKHFSSEKSAFLLSILAFVDMVARPSMGLAANTRWIRPRVQYFFAASVVANGVCHLLAPLSTTYVGFCIYAGVFGFAFGWLSSVLFETLMDLVGPQRFSSAVGLVTIVECCPVLLGPPLLGRLNDMYGDYKYTYWACGVILIIAGLYLFIGMGINYRLVAKEQKAEEKKRDGKEDETSTDVDEKPKKTMKETQSPAPLQNSSGDPAEEESPV. The pIC50 is 7.3. (10) The compound is CCOc1ccc(-n2c([C@@H](C)N(Cc3cccnc3)C(=O)Cc3ccc(C(F)(F)F)cn3)nc3ccccc3c2=O)cc1. The target protein (P49682) has sequence MVLEVSDHQVLNDAEVAALLENFSSSYDYGENESDSCCTSPPCPQDFSLNFDRAFLPALYSLLFLLGLLGNGAVAAVLLSRRTALSSTDTFLLHLAVADTLLVLTLPLWAVDAAVQWVFGSGLCKVAGALFNINFYAGALLLACISFDRYLNIVHATQLYRRGPPARVTLTCLAVWGLCLLFALPDFIFLSAHHDERLNATHCQYNFPQVGRTALRVLQLVAGFLLPLLVMAYCYAHILAVLLVSRGQRRLRAMRLVVVVVVAFALCWTPYHLVVLVDILMDLGALARNCGRESRVDVAKSVTSGLGYMHCCLNPLLYAFVGVKFRERMWMLLLRLGCPNQRGLQRQPSSSRRDSSWSETSEASYSGL. The pIC50 is 7.9.
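
The pIC50 labels in the records above follow the definition given in the task statement, pIC50 = -log10(IC50 in M). The short Python sketch below illustrates that conversion in both directions. The function names are hypothetical and chosen only for illustration; the snippet assumes the raw IC50 values are available in molar or nanomolar units and says nothing about how the dataset itself was processed.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 (assumes 1 nM = 1e-9 M)."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 5.0 corresponds to an IC50 of 10 micromolar (1e-5 M).
    print(pic50_from_ic50_molar(1e-5))
    # Convert a label back to a concentration, here expressed in nM.
    print(ic50_molar_from_pic50(7.9) * 1e9)
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.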