Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is COc1ncc(C)cc1CC[C@@](O)(CC(=O)O)C(=O)O. The target protein (Q86YT5) has sequence MASALSYVSKFKSFVILFVTPLLLLPLVILMPAKFVRCAYVIILMAIYWCTEVIPLAVTSLMPVLLFPLFQILDSRQVCVQYMKDTNMLFLGGLIVAVAVERWNLHKRIALRTLLWVGAKPARLMLGFMGVTALLSMWISNTATTAMMVPIVEAILQQMEATSAATEAGLELVDKGKAKELPGSQVIFEGPTLGQQEDQERKRLCKAMTLCICYAASIGGTATLTGTGPNVVLLGQMNELFPDSKDLVNFASWFAFAFPNMLVMLLFAWLWLQFVYMRFNFKKSWGCGLESKKNEKAALKVLQEEYRKLGPLSFAEINVLICFFLLVILWFSRDPGFMPGWLTVAWVEGETKYVSDATVAIFVATLLFIVPSQKPKFNFRSQTEEERKTPFYPPPLLDWKVTQEKVPWGIVLLLGGGFALAKGSEASGLSVWMGKQMEPLHAVPPAAITLILSLLVAVFTECTSNVATTTLFLPIFASMSRSIGLNPLYIMLPCTLSASF.... The pIC50 is 6.1. (2) The pIC50 is 4.7. The compound is CCCCCCCC(O)(P(=O)(O)O)P(=O)(O)O. The target protein sequence is MAHMERFQKVYEEVQEFLLGDAEKRFEMDVHRKGYLKSMMDTTCLGGKYNRGLCVVDVAEAMAKDTKMDAAAMERVLHDACVCGWMIEMLQAHFLVEDDIMDHSKTRRGKPCWYLHPGVTTQVAINDGLILLAWATQMALHYFADRPFLAEVLRVFHDVDLTTTIGQLYDVTSMVDSAKLDANVAHANTTDYIEYTPFNHRRIVVYKTAYYTYWLPLVMGLLVSGTVEKVDKEATHKVAMVMGEYFQVQDDVMDCFTPPEKLGKIGTDIEDAKCSWLAVTFLTTAPAEKVAEFKANYGSTDPAKVAVIKQLYTEQNLLARFEEYEKAVVAEIEQLIAALEAQNTAFAASVKVLWSKTYKRQK. (3) The drug is CN(Cc1ccc(Cl)cc1)c1nc(C(F)(F)F)cc(=O)[nH]1. The target protein (P05413) has sequence MVDAFLGTWKLVDSKNFDDYMKSLGVGFATRQVASMTKPTTIIEKNGDILTLKTHSTFKNTEISFKLGVEFDETTADDRKVKSIVTLDGGKLVHLQKWDGQETTLVRELIDGKLILTLTHGTAVCTRTYEKEA. The pIC50 is 4.0. (4) The drug is C=CC[C@H](NC(=O)[C@@H]1C[C@@H](Cc2cccc(C)c2)c2c(Cl)nc(NCc3cccc(C(F)(F)F)c3)c(=O)n21)B1OC2CC3CC(C3(C)C)[C@@]2(C)O1. 
The target protein sequence is APITAYAQQTRGLLGCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASSKGPVIQMYTNVDQDLVGWPAPQGARSLTPCTCGSSDLYLVTRHADVIPVRRRGDGRGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVCTRGVAKAVDFIPVEGLETTMRSPVFSDNSSPPAVPQSYQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRTGVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSILGIGTVLDQAETAGARLTVLATATPPGSVTVPHPNIEEVALSTTGEIPFYGKAIPLEAIKGGRHLIFCHSKKKCDELAAKLVALGVNAVAYYRGLDVSVIPASGDVVVVATDALMTGFTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTLPQDAVSRTQRRGRTGRGKPGIYRFVTPGERPSGMFDSSVLCECYDAGCA.... The pIC50 is 6.4. (5) The target protein (P24942) has sequence MTKSNGEEPRMGSRMERFQQGVRKRTLLAKKKVQNITKEDVKSYLFRNAFVLLTVSAVIVGTILGFALRPYKMSYREVKYFSFPGELLMRMLQMLVLPLIISSLVTGMAALDSKASGKMGMRAVVYYMTTTIIAVVIGIIIVIIIHPGKGTKENMYREGKIVQVTAADAFLDLIRNMFPPNLVEACFKQFKTSYEKRSFKVPIQANETLLGAVINNVSEAMETLTRIREEMVPVPGSVNGVNALGLVVFSMCFGFVIGNMKEQGQALREFFDSLNEAIMRLVAVIMWYAPLGILFLIAGKILEMEDMGVIGGQLAMYTVTVIVGLLIHAVIVLPLLYFLVTRKNPWVFIGGLLQALITALGTSSSSATLPITFKCLEENNGVDKRITRFVLPVGATINMDGTALYEALAAIFIAQVNNFDLNFGQIITISITATAASIGAAGIPQAGLVTMVIVLTSVGLPTDDITLIIAVDWFLDRLRTTTNVLGDSLGAGIVEHLSRH.... The small molecule is COc1ccc(C2C3=C(CC(c4cccc5ccccc45)CC3=O)OC(=N)C2C#N)cc1. The pIC50 is 6.0. (6) The drug is CC[C@H](C)[C@H](NC(C)=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccc(O)cc1)P(=O)(O)O. The target protein (P24171) has sequence MTTMNPFLVQSTLPYLAPHFDQIANHHYRPAFDEGMQQKRAEIAAIALNPQMPDFNNTILALEQSGELLTRVTSVFFAMTAAHTNDELQRLDEQFSAELAELANDIYLNGELFARVDAVWQRRESLGLDSESIRLVEVIHQRFVLAGAKLAQADKAKLKVLNTEAATLTSQFNQRLLAANKSGGLVVNDIAQLAGMSEQEIALAAEAAREKGLDNKWLIPLLNTTQQPALAEMRDRATREKLFIAGWTRAEKNDANDTRAIIQRLVEIRAQQATLLGFPHYAAWKIADQMAKTPEAALNFMREIVPAARQRASDELASIQAVIDKQQGGFSAQPWDWAFYAEQVRREKFDLDEAQLKPYFELNTVLNEGVFWTANQLFGIKFVERFDIPVYHPDVRVWEIFDHNGVGLALFYGDFFARDSKSGGAWMGNFVEQSTLNKTHPVIYNVCNYQKPAAGEPALLLWDDVITLFHEFGHTLHGLFARQRYATLSGTNTPRDFVEF.... The pIC50 is 3.8. (7) The drug is C[C@H](CCC(=O)O)C1CC[C@H]2C3CC[C@@H]4C[C@H](OC(=O)CCC(=O)O)CC[C@]4(C)C3CC[C@]12C. 
The target protein (Q11205) has sequence MKCSLRVWFLSMAFLLVFIMSLLFTYSHHSMATLPYLDSGTLGGTHRVKLVPGYTGQQRLVKEGLSGKSCTCSRCMGDAGTSEWFDSHFDSNISPVWTRDNMNLTPDVQRWWMMLQPQFKSHNTNEVLEKLFQIVPGENPYRFRDPQQCRRCAVVGNSGNLRGSGYGQEVDSHNFIMRMNQAPTVGFEKDVGSRTTHHFMYPESAKNLPANVSFVLVPFKALDLMWIASALSTGQIRFTYAPVKSFLRVDKEKVQIYNPAFFKYIHDRWTEHHGRYPSTGMLVLFFALHVCDEVNVYGFGADSRGNWHHYWENNRYAGEFRKTGVHDADFEAHIIDILAKASKIEVYRGN. The pIC50 is 4.9. (8) The drug is COc1ccc(C2(C#N)CCC(C(=O)O)CC2)cc1OC1CCCC1. The target protein sequence is QAPLHLLDEDYLGQARHMLSKVGMWDFDIFLFDRLTNGNSLVTLLCHLFNTHGLIHHFKLDMVTLHRFLVMVQEDYHSQNPYHNAVHAADVTQAMHCYLKEPKLASFLTPLDIMLGLLAAAAHDVDHPGVNQPFLIKTNHHLANLYQNMSVLENHHWRSTIGMLRESRLLAHLPKEMTQDIEQQLGSLILATDINRQNEFLTRLKAHLHNKDLRLEDAQDRHFMLQIALKCADICNPCRIWEMSKQWSERVCEEFYRQGELEQKFELEISPLCNQQKDSIPSIQIGFMSYIVEPLFREWAHFTGNSTLSENMLGHLAHNKAQWKSLLPRQHRSRGSSGSGPDHDHAGQGTESEEQEGDSP. The pIC50 is 4.4. (9) The compound is CNC(=O)Oc1ccccc1. The target protein sequence is MARSVRTPISPSSSSSSRSSWSSPSSSSFYSLLSSFKASLTRPSSSSSVAHHLAARNNDICRGLFATLVILLRMSALTSAMTDHLTVQTTSGPVRGRSVTVQGRDVHVFTGIPYAKPPVDDLRFRKPVPAEPWHGVLDATRLPATCVQERYEYFPGFSGEEMWNPNTNVSEDCLFMNIWAPAKARLRHGRGTNGGEHSSKTDQDHLIHSATPQNTTNGLPILIWIYGGGFMTGSATLDIYNAEIMSAVGNVIVASFQYRVGAFGFLHLSPVMPGFEEEAPGNVGLWDQALALRWLKENARAFGGNPEWMTLFGGSSGSSSVNAQLMSPVTRGLVKRGMMQSATMNAPWSHMTSEKAVEIGKALVNDCNCNASLLPENPQAVMACMRQVDAKTISVQQWNSYSGILSYPSAPTIDGAFLPADPMTLLKTADLSGYDILIGNVKDEGAYFLLYDFIDYFDKDDATSLPRDKYLEIMNNIFQKASQAEREAIIFQYTSWEGNP.... The pIC50 is 3.1.
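The task description defines the label as pIC50 = -log10(IC50 in M). Below is a minimal sketch of that conversion and its inverse, for relating the values above to raw IC50 measurements. The function names are hypothetical and not part of the dataset. It also assumes raw IC50 values are reported in nM, which the record itself does not state.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in M).

    Assumption: the input is in nM; 1 nM = 1e-9 M.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # -log10(ic50_nm * 1e-9) rewritten as 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: return the IC50 in nanomolar for a given pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # Labels taken from the examples above; higher pIC50 means more potent.
    for label in (6.1, 4.7, 3.1):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.0f} nM")
```

Under this definition, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, so the label 6.1 in example (1) is roughly a 794 nM IC50.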