Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them as pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The small molecule is CCNc1ncnc2c1ncn2C1CC(OP(=O)(O)O)C(COP(=O)(O)O)O1. The target protein (P49652) has sequence MTEALISAALNGTQPELLAGGWAAGNASTKCSLTKTGFQFYYLPTVYILVFITGFLGNSVAIWMFVFHMRPWSGISVYMFNLALADFLYVLTLPALIFYYFNKTDWIFGDVMCKLQRFIFHVNLYGSILFLTCISVHRYTGVVHPLKSLGRLKKKNAVYVSSLVWALVVAVIAPILFYSGTGVRRNKTITCYDTTADEYLRSYFVYSMCTTVFMFCIPFIVILGCYGLIVKALIYKDLDNSPLRRKSIYLVIIVLTVFAVSYLPFHVMKTLNLRARLDFQTPQMCAFNDKVYATYQVTRGLASLNSCVDPILYFLAGDTFRRRLSRATRKSSRRSEPNVQSKSEEMTLNILTEYKQNGDTSL. The pIC50 is 6.0. (2) The drug is O=S(=O)(c1cccc2cnccc12)N1CCCNCC1. The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTLGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYMPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIKVADFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 5.7. (3) The small molecule is CCCc1nn(C)c2c(=O)[nH]c(-c3cc(C(=O)CN4CCOCC4)ccc3OCC)nc12. The target protein (P54827) has sequence MNLEPPKAEIRSATRVIGGPVTPRKGPPKFKQRQTRQFKSKPPKKGVQGFGDDIPGMEGLGTDITVICPWEAFNHLELHELAQYGII. The pIC50 is 8.8. (4) The small molecule is C=CC(=O)NC[C@H](NC(=O)[C@@H](NC(=O)c1cnccn1)C1CCCCC1)C(=O)N1C[C@@H]2CCC[C@@H]2[C@H]1C(=O)N[C@@H](CCC)C(OC(=O)C(F)(F)F)C(=O)NC1CC1. The target protein sequence is APITAYAQQTRGLLGCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDQDLVGWPAPQGARSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGLFRTAVCTRGVAKAVDFIPVENLETTMRS. The pIC50 is 6.0. (5) The drug is CSC[C@H](NC(=O)[C@H](Cc1ccccc1)OC(=O)N1CCC(N)CC1)C(=O)N[C@@H](CC1CCCCC1)[C@@H](O)Cn1ccc(=O)cc1. 
The target protein (P80209) has sequence VIRIPLHKFTSIRRTMSEAAGVLIAKGPISKYATGEPAVRQGPIPELLKNYMDAQYYGEIGIGTPPQCFTVVFDTGSANLWVPSIHCKLLDIACWTHRKYNSDKSSTYVKNGTTFDIHYGSGSLSGYLSQDTVSVPCNPSSSSPGGVTVQRQTFGEAIKQPGVVFIAAKFDGILGMAYPRISVNNVLPVFDNLMQQKLVDKNVFSFFLNRDPKAQPGGELMLGGTDSKYYRGSLMFHNVTRQAYWQIHMDQLDVGSSLTVCKGGCEAIVDTGTSLIVGPVEEVRELQKAIGAVPLIQGEYMIPCEKVSSLPEVTVKLGGKDYALSPEDYALKVSQAETTVCLSGFMGMDIPPPGGPLWILGDVFIGRYYTVFDRDQNRVGLAEAARL. The pIC50 is 5.0. (6) The compound is O=C(NO)c1cc(O)c(O)c(O)c1. The target protein sequence is IPQETGRQTALFLLKLASRWPITHLHTDNGSNFTSQEVKMVAWWIGIEQSFGVPYNPQSQGVVEAMNHHLKNQISRIREQANTVETIVLVAVHCM. The pIC50 is 5.8. (7) The small molecule is C#CCOCCSc1nc2c(c(=O)n1CC=C)C(C)(C)Cc1ccc(N=[N+]=[N-])cc1-2. The target protein (P96471) has sequence MKNYLSFGMFALLFALTFGTVKPVQAIAGPEWLLGRPSVNNSQLVVSVAGTVEGTNQEISLKFFEIDLTSRPAQGGKTEQGLRPKSKPLATDKGAMSHKLEKADLLKAIQEQLIANVHSNDGYFEVIDFASDATITDRNGKVYFADRDDSVTLPTQPVQEFLLSGHVRVRPYRPKAVHNSAERVNVNYEVSFVSETGNLDFTPSLKEQYHLTTLAVGDSLSSQELAAIAQFILSKKHPDYIITKRDSSIVTHDNDIFRTILPMDQEFTYHIKDREQAYKANSKTGIEEKTNNTDLISEKYYILKKGEKPYDPFDRSHLKLFTIKYVDVDTKALLKSEQLLTASERNLDFRDLYDPRDKAKLLYNNLDAFGIMGYTLTGKVEDNHDDTNRIITVYMGKRPEGENASYHLAYDKDRYTEEEREVYSYLRDTGTPIPDNPKDK. The pIC50 is 6.7. (8) The small molecule is O=C(NN=Cc1ccc(F)cc1)c1cc(-c2ccco2)nc2ccccc12. The target is TRQARRNRRRRWRERQR. The pIC50 is 4.1. (9) The target protein sequence is MLSLRSILSLLALASLFLVASGTSVPTSKSQASADAKLWALLVAGSNGYYNYRHQADICHAYHVLHNHGIPDERIVVMMYDDIAHDPSNPTPGIIINHLNGSNVYAGVPKDYTGDLVTPKNFLSILQGKKIKGGSGKVIASGPNDHVFVFFADHGAPGLIAFPNDDLQATNLSRVIKRMHKQKKFGKLVFYVEACESGSMFENLLPDDINVYATTAANSDESSYACYYDDLRQTYLGDVYSVNWMEDSDREDLHKETLLKQFKIVRSETNTSHVMEFGDLKIANLKVSEFQGAKSTPPIVLPKAPLDAVDSRDVPIAIVRKKLQKATDPQIKLSLKHELDQMLRNRAFLKEKMVEIVSFVALGDAEKTEQLLKAKIPLRDHTCYEQAVRYFDTTCFELSANPHALAHLRLLVNMCEEKISVSEIREAMDNVCTHPTVIGIV. The compound is C[C@H](NC(=O)OCc1ccccc1)C(=O)N[C@@H](C)C(=O)NN(CC(N)=O)C(=O)C=CC(=O)N(Cc1ccccc1)Cc1ccc2ccccc2c1. The pIC50 is 8.1. 
(10) The small molecule is OC[C@H]1NC[C@H](O)[C@@H](O)[C@@H]1O. The target protein (P04062) has sequence MEFSSPSREECPKPLSRVSIMAGSLTGLLLLQAVSWASGARPCIPKSFGYSSVVCVCNATYCDSFDPPTFPALGTFSRYESTRSGRRMELSMGPIQANHTGTGLLLTLQPEQKFQKVKGFGGAMTDAAALNILALSPPAQNLLLKSYFSEEGIGYNIIRVPMASCDFSIRTYTYADTPDDFQLHNFSLPEEDTKLKIPLIHRALQLAQRPVSLLASPWTSPTWLKTNGAVNGKGSLKGQPGDIYHQTWARYFVKFLDAYAEHKLQFWAVTAENEPSAGLLSGYPFQCLGFTPEHQRDFIARDLGPTLANSTHHNVRLLMLDDQRLLLPHWAKVVLTDPEAAKYVHGIAVHWYLDFLAPAKATLGETHRLFPNTMLFASEACVGSKFWEQSVRLGSWDRGMQYSHSIITNLLYHVVGWTDWNLALNPEGGPNWVRNFVDSPIIVDITKDTFYKQPMFYHLGHFSKFIPEGSQRVGLVASQKNDLDAVALMHPDGSAVVVVL.... The pIC50 is 3.7.
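The pIC50 definition stated in the task (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is only an illustration of the unit conversion: the function names are hypothetical, and it assumes the raw IC50 is given in nanomolar, which the task description does not specify.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in M) and 1 nM = 1e-9 M, so
    pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the conversion: return the IC50 in nanomolar for a pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # A pIC50 of 6.0 corresponds to an IC50 of about 1 micromolar (1000 nM).
    print(pic50_to_ic50_nm(6.0))
    print(ic50_nm_to_pic50(1000.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, with higher pIC50 meaning a lower IC50 and hence a more potent compound.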