Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 7.3. The compound is CCN(CC)C(=O)c1cc2cc(O)c(O)c([N+](=O)[O-])c2cc1C(=O)N(CC)CC. The target protein (P48450) has sequence MTEGTCLRRRGGPYKTEPATDLTRWRLHNELGRQRWTYYQAEEDPGREQTGLEAHSLGLDTTSYFKNLPKAQTAHEGALNGVTFYAKLQAEDGHWAGDYGGPLFLLPGLLITCHIAHIPLPAGYREEMVRYLRSVQLPDGGWGLHIEDKSTVFGTALSYVSLRILGIGPDDPDLVRARNILHKKGGAVAIPSWGKFWLAVLNVYSWEGINTLFPEMWLLPEWFPAHPSTLWCHCRQVYLPMSYCYATRLSASEDPLVQSLRQELYVEDYASIDWPAQKNNVCPDDMYTPHSWLLHVVYGLLNLYERFHSTSLRKWAIQLLYEHVAADDRFTKCISIGPISKTVNMLIRWSVDGPSSPAFQEHVSRIKDYLWLGLDGMKMQGTNGSQTWDTSFAVQALLEAGAHRRPEFLPCLQKAHEFLRLSQVPDNNPDYQKYYRHMHKGGFPFSTLDCGWIVADCTAEALKAVLLLQERCPSITEHVPRERLYDAVAVLLSMRNSDGG.... (2) The compound is OCC1OC(OCc2ccccc2)C(O)C(O)C1O. The target protein (P08191) has sequence MKRVITLFAVLLMGWSVNAWSFACKTANGTAIPIGGGSANVYVNLAPVVNVGQNLVVDLSTQIFCHNDYPETITDYVTLQRGSAYGGVLSNFSGTVKYSGSSYPFPTTSETPRVVYNSRTDKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQFVWNIYANNDVVVPTGGCDVSARDVTVTLPDYPGSVPIPLTVYCAKSQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ. The pIC50 is 6.4. (3) The small molecule is CCCCCCCCCC(=O)N[C@@H](CNC1CC(CO)C(O)C(O)C1O)[C@H](O)c1ccccc1. The target protein (P17439) has sequence MAARLIGFFLFQAVSWAYGAQPCIPKSFGYSSVVCVCNASYCDSLDPVTLPALGTFSRYESTRRGRRMELSVGAIQANRTGTGLLLTLQPEKKFQKVKGFGGAMTDATALNILALSPPTQKLLLRSYFSTNGIEYNIIRVPMASCDFSIRVYTYADTPNDFQLSNFSLPEEDTKLKIPLIHQALKMSSRPISLFASPWTSPTWLKTNGRVNGKGSLKGQPGDIFHQTWANYFVKFLDAYAKYGLRFWAVTAENEPTAGLFTGYPFQCLGFTPEHQRDFISRDLGPALANSSHDVKLLMLDDQRLLLPRWAEVVLSDPEAAKYVHGIAVHWYMDFLAPAKATLGETHRLFPNTMLFASEACVGSKFWEQSVRLGSWDRGMQYSHSIITNLLYHVTGWTDWNLALNPEGGPNWVRNFVDSPIIVDIPKDAFYKQPMFYHLGHFSKFIPEGSQRVALVASESTDLETVALLRPDGSAVVVVLNRSSEDVPLTISDPDLGFLET.... 
The pIC50 is 5.3. (4) The drug is Nc1nc(F)nc2c1ncn2C1O[C@H](CO)[C@@H](O)[C@H]1O. The target protein sequence is GRQKARGAATRARQKQRASLETMDKAVQRFRLQNPDLDSEALLTLPLLQLVQKLQSGELSPEAVFFTYLGKAWEVNKGTNCVTSYLTDCETQLSQAPRQGLLYGVPVSLKECFSYKGHDSTLGLSLNEGMPSESDCVVVQVLKLQGAVPFVHTNVPQSMLSFDCSNPLFGQTMNPWKSSKSPGGSSGGEGALIGSGGSPLGLGTDIGGSIRFPSAFCGICGLKPTGNRLSKSGLKGCVYGQTAVQLSLGPMARDVESLALCLKALLCEHLFTLDPTVPPLPFREEVYRSSRPLRVGYYETDNYTMPSPAMRRALIETKQRLEAAGHTLIPFLPNNIPYALEVLSAGGLFSDGGRSFLQNFKGDFVDPCLGDLILILRLPSWFKRLLSLLLKPLFPRLAAFLNSMRPRSAEKLWKLQHEIEMYRQSVIAQWKAMNLDVLLTPMLGPALDLNTPGRATGAISYTVLYNCLDFPAGVVPVTTVTAEDDAQMELYKGYFGDIWD.... The pIC50 is 3.0. (5) The compound is CC[C@H](C)[C@H](OCc1ccccc1)[C@H]1[C@H]([N+](=O)[O-])[C@H](c2ccccc2)N[C@]1(C)C(=O)NCCC(=O)O. The target protein (P24063) has sequence MSFRIAGPRLLLLGLQLFAKAWSYNLDTRPTQSFLAQAGRHFGYQVLQIEDGVVVGAPGEGDNTGGLYHCRTSSEFCQPVSLHGSNHTSKYLGMTLATDAAKGSLLACDPGLSRTCDQNTYLSGLCYLFPQSLEGPMLQNRPAYQECMKGKVDLVFLFDGSQSLDRKDFEKILEFMKDVMRKLSNTSYQFAAVQFSTDCRTEFTFLDYVKQNKNPDVLLGSVQPMFLLTNTFRAINYVVAHVFKEESGARPDATKVLVIITDGEASDKGNISAAHDITRYIIGIGKHFVSVQKQKTLHIFASEPVEEFVKILDTFEKLKDLFTDLQRRIYAIEGTNRQDLTSFNMELSSSGISADLSKGHAVVGAVGAKDWAGGFLDLREDLQGATFVGQEPLTSDVRGGYLGYTVAWMTSRSSRPLLAAGAPRYQHVGQVLLFQAPEAGGRWNQTQKIEGTQIGSYFGGELCSVDLDQDGEAELLLIGAPLFFGEQRGGRVFTYQRRQS.... The pIC50 is 4.6. (6) The target protein sequence is MSDPLHVTFVCTGNICRSPMAEKMFAQQLRHRGLGDAVRVTSAGTGNWHVGSCADERAAGVLRAHGYPTDHRAAQVGTEHLAADLLVALDRNHARLLRQLGVEAARVRMLRSFDPRSGTHALDVEDPYYGDHSDFEEVFAVIESALPGLHDWVDERLARNGPS. The pIC50 is 4.3. The drug is COc1cc(C(=O)/C=C/c2ccc3ccccc3c2)ccc1O. (7) The small molecule is NS(=O)(=O)OC[C@@H]1C[C@@H](N2CCc3c(N[C@H]4CCc5ccccc54)ncnc32)C[C@@H]1O. 
The target protein (Q13564) has sequence MAQLGKLLKEQKYDRQLRLWGDHGQEALESAHVCLINATATGTEILKNLVLPGIGSFTIIDGNQVSGEDAGNNFFLQRSSIGKNRAEAAMEFLQELNSDVSGSFVEESPENLLDNDPSFFCRFTVVVATQLPESTSLRLADVLWNSQIPLLICRTYGLVGYMRIIIKEHPVIESHPDNALEDLRLDKPFPELREHFQSYDLDHMEKKDHSHTPWIVIIAKYLAQWYSETNGRIPKTYKEKEDFRDLIRQGILKNENGAPEDEENFEEAIKNVNTALNTTQIPSSIEDIFNDDRCINITKQTPSFWILARALKEFVAKEGQGNLPVRGTIPDMIADSGKYIKLQNVYREKAKKDAAAVGNHVAKLLQSIGQAPESISEKELKLLCSNSAFLRVVRCRSLAEEYGLDTINKDEIISSMDNPDNEIVLYLMLRAVDRFHKQQGRYPGVSNYQVEEDIGKLKSCLTGFLQEYGLSVMVKDDYVHEFCRYGAAEPHTIAAFLGGA.... The pIC50 is 8.3. (8) The compound is O=C(O)c1cc2c(s1)CCC2. The target protein (P14920) has sequence MRVVVIGAGVIGLSTALCIHERYHSVLQPLDIKVYADRFTPLTTTDVAAGLWQPYLSDPNNPQEADWSQQTFDYLLSHVHSPNAENLGLFLISGYNLFHEAIPDPSWKDTVLGFRKLTPRELDMFPDYGYGWFHTSLILEGKNYLQWLTERLTERGVKFFQRKVESFEEVAREGADVIVNCTGVWAGALQRDPLLQPGRGQIMKVDAPWMKHFILTHDPERGIYNSPYIIPGTQTVTLGGIFQLGNWSELNNIQDHNTIWEGCCRLEPTLKNARIIGERTGFRPVRPQIRLEREQLRTGPSNTEVIHNYGHGGYGLTIHWGCALEAAKLFGRILEEKKLSRMPPSHL. The pIC50 is 4.5. (9) The drug is Cc1ccc(S(=O)(=O)Nc2ccc(C(=O)/C=C/c3ccc(O)cc3)cc2)cc1. The target protein (P16098) has sequence MEVNVKGNYVQVYVMLPLDAVSVNNRFEKGDELRAQLRKLVEAGVDGVMVDVWWGLVEGKGPKAYDWSAYKQLFELVQKAGLKLQAIMSFHQCGGNVGDAVNIPIPQWVRDVGTRDPDIFYTDGHGTRNIEYLTLGVDNQPLFHGRSAVQMYADYMTSFRENMKDFLDAGVIVDIEVGLGPAGEMRYPSYPQSHGWSFPGIGEFICYDKYLQADFKAAAAAVGHPEWEFPNDVGQYNDTPERTQFFRDNGTYLSEKGRFFLAWYSNNLIKHGDRILDEANKVFLGYKVQLAIKISGIHWWYKVPSHAAELTAGYYNLHDRDGYRTIARMLKRHRASINFTCAEMRDLEQSSQAMSAPEELVQQVLSAGWREGLNVACENALPRYDPTAYNTILRNARPHGINQSGPPEHKLFGFTYLRLSNQLVEGQNYVNFKTFVDRMHANLPRDPYVDPMAPLPRSGPEISIEMILQAAQPKLQPFPFQEHTDLPVGPTGGMGGQAEG.... The pIC50 is 3.6.
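
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration only; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and are not part of BindingDB or of this dataset's tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # Hypothetical input: an IC50 of 100 nM expressed in mol/L.
    print(round(ic50_to_pic50(100e-9), 3))
    # pIC50 values taken from examples (1) and (4) in this record.
    for value in (7.3, 3.0):
        print(value, round(pic50_to_ic50_nm(value), 1), "nM")
```

Under this definition each unit of pIC50 is a tenfold change in IC50, so the 7.3 reported for example (1) corresponds to roughly 50 nM, while the 3.0 reported for example (4) corresponds to 1 mM.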