From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is C=CS(=O)(=O)Nc1cccc(Cc2nn(C(C)C)c3ncnc(N)c23)c1. The target protein sequence is QTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLSQLYAVVSEEPIYIVCEYMSKGSLLDFLKGEMGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQCWRKDPEERPTFEYLQAFLEDYFTSTEPQYQPGENL. The pIC50 is 6.4. (2) The pIC50 is 4.0. The small molecule is CC[C@H](C)[C@H](NC(C)=O)C(=O)N[C@H](C(=O)N[C@@H](C)C(=O)NC(Cc1ccccc1)C(=O)C(=O)NCCC(=O)O)[C@@H](C)O. The target protein sequence is MMLNKKVVALCTLTLHLFCIFLCLGKEVRSEENGKIQDDAKKIVSELRFLEKVEDVIEKSNIGGNEVDADENSFNPDTEVPIEEIEEIKMRELKDVKEEKNKNDNHNNNNNNNNISSSSSSSSNTFGEEKEEVSKKKKKLRLIVSENHATTPSFFQESLLEPDVLSFLESKGNLSNLKNINSMIIELKEDTTDDELISYIKILEEKGALIESDKLVSADNIDISGIKDAIRRGEENIDVNDYKSMLEVENDAEDYDKMFGMFNESHAATSKRKRHSTNERGYDTFSSPSYKTYSKSDYLYDDDNNNNNYYYSHSSNGHNSSSRNSSSSRSRPGKYHFNDEFRNLQWGLDLSRLDETQELINEHQVMSTRICVIDSGIDYNHPDLKDNIELNLKELHGRKGFDDDNNGIVDDIYGANFVNNSGNPMDDNYHGTHVSGIISAIGNNNIGVVGVDVNSKLIICKALDEHKLGRLGDMFKCLDYCISRNAHMINGSFSFDEYSG.... (3) The drug is O=C1NC(C(=O)O)CS(=O)(=O)N1. The target protein (B2RQC6) has sequence MAALVLEDGSVLQGRPFGAAVSTAGEVVFQTGMVGYPEALTDPSYKAQILVLTYPLIGNYGIPSDEEDEFGLSKWFESSEIHVAGLVVGECCPTPSHWSANCTLHEWLQQRGIPGLQGVDTRELTKKLREQGSLLGKLVQKGTEPSALPFVDPNARPLAPEVSIKTPRVFNAGGAPRICALDCGLKYNQIRCLCQLGAEVTVVPWDHELDSQKYDGLFLSNGPGDPASYPGVVSTLSRVLSEPNPRPVFGICLGHQLLALAIGAKTYKMRYGNRGHNQPCLLVGTGRCFLTSQNHGFAVDADSLPAGWAPLFTNANDCSNEGIVHDSLPFFSVQFHPEHRAGPSDMELLFDVFLETVREAAAGNIGGQTVRERLAQRLCPPELPIPGSGLPPPRKVLILGSGGLSIGQAGEFDYSGSQAIKALKEENIQTLLINPNIATVQTSQGLADKVYFLPITLHYVTQVIRNERPDGVLLTFGGQTALNCGVELTKAGVLARYGVR.... 
The pIC50 is 3.3. (4) The compound is O=C(CCCCCCCc1nnc(-c2cccnc2)[nH]1)Nc1ccccc1-c1ccccc1. The target protein (Q99KQ4) has sequence MNAAAEAEFNILLATDSYKVTHYKQYPPNTSKVYSYFECREKKTENSKVRKVKYEETVFYGLQYILNKYLKGKVVTKEKIQEAKEVYREHFQDDVFNERGWNYILEKYDGHLPIEVKAVPEGSVIPRGNVLFTVENTDPECYWLTNWIETILVQSWYPITVATNSREQKKILAKYLLETSGNLDGLEYKLHDFGYRGVSSQETAGIGASAHLVNFKGTDTVAGIALIKKYYGTKDPVPGYSVPAAEHSTITAWGKDHEKDAFEHIVTQFSSVPVSVVSDSYDIYNACEKIWGEDLRHLIVSRSTEAPLIIRPDSGNPLDTVLKVLDILGKKFPVTENSKGYKLLPPYLRVIQGDGVDINTLQEIVEGMKQKKWSIENVSFGSGGALLQKLTRDLLNCSFKCSYVVTNGLGVNVFKDPVADPNKRSKKGRLSLHRTPAGNFVTLEEGKGDLEEYGHDLLHTVFKNGKVTKSYSFDEVRKNAQLNIEQDVAPH. The pIC50 is 6.6. (5) The drug is C[C@@H]1C[C@@]2(NC(=O)CS2)C2(O)O[C@@H]3C[C@@]4(CO)[C@@H](CC[C@@H]5[C@@H]4CC[C@]4(C)[C@@H](C6=CC(=O)OC6)CC[C@]54CO)C[C@H]3O[C@@H]2O1. The target protein (P05024) has sequence MGKGVGRDKYEPAAVSEHGDKKKAKKERDMDELKKEVSMDDHKLSLDELHRKYGTDLSRGLTPARAAEILARDGPNALTPPPTTPEWVKFCRQLFGGFSMLLWIGAILCFLAYGIQAATEEEPQNDNLYLGVVLSAVVIITGCFSYYQEAKSSKIMESFKNMVPQQALVIRNGEKMSINAEEVVVGDLVEVKGGDRIPADLRIISANGCKVDNSSLTGESEPQTRSPDFTNENPLETRNIAFFSTNCVEGTARGIVVYTGDRTVMGRIATLASGLEGGQTPIAAEIEHFIHIITGVAVFLGVSFFILSLILEYTWLEAVIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAHMWSDNQIHEADTTENQSGVSFDKTSATWLALSRIAGLCNRAVFQANQENLPILKRAVAGDASESALLKCIELCCGSVKEMRERYTKIVEIPFNSTNKYQLSIHKNPNTAEPR.... The pIC50 is 7.1. (6) The drug is C=CCOc1ccc(Br)cc1C=C1SC(Nc2ccc(Cl)c(C(=O)O)c2)=NC1=O. The target protein (Q61214) has sequence MHTGGETSACKPSSVRLAPSFSFHAAGLQMAAQMPHSHQYSDRRQPSISDQQVSALPYSDQIQQPLTNQVMPDIVMLQRRMPQTFRDPATAPLRKLSVDLIKTYKHINEVYYAKKKRRHQQGQGDDSSHKKERKVYNDGYDDDNYDYIVKNGEKWMDRYEIDSLIGKGSFGQVVKAYDRVEQEWVAIKIIKNKKAFLNQAQIEVRLLELMNKHDTEMKYYIVHLKRHFMFRNHLCLVFEMLSYNLYDLLRNTNFRGVSLNLTRKFAQQMCTALLFLATPELSIIHCDLKPENILLCNPKRSAIKIVDFGSSCQLGQRIYQYIQSRFYRSPEVLLGMPYDLAIDMWSLGCILVEMHTGEPLFSGANEVDQMNKIVEVLGIPPAHILDQAPKARKFFEKLPDGTWSLKKTKDGKREYKPPGTRKLHNILGVETGGPGGRRAGESGHTVADYLKFKDLILRMLDYDPKTRIQPYYALQHSFFKKTADEGTNTSNSVSTSPAME.... The pIC50 is 4.7. 
(7) The drug is COC(=O)C1CC(=O)c2oc(=O)c3cc(O)c(OC)c(OC)c3c21. The target protein (P60174) has sequence MAEDGEEAEFHFAALYISGQWPRLRADTDLQRLGSSAMAPSRKFFVGGNWKMNGRKQSLGELIGTLNAAKVPADTEVVCAPPTAYIDFARQKLDPKIAVAAQNCYKVTNGAFTGEISPGMIKDCGATWVVLGHSERRHVFGESDELIGQKVAHALAEGLGVIACIGEKLDEREAGITEKVVFEQTKVIADNVKDWSKVVLAYEPVWAIGTGKTATPQQAQEVHEKLRGWLKSNVSDAVAQSTRIIYGGSVTGATCKELASQPDVDGFLVGGASLKPEFVDIINAKQ. The pIC50 is 3.0. (8) The small molecule is CC(C)C[C@H](NC(=O)[C@H](C[Si](C)(C)C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@@H]1C[Si](C)(C)CN1C(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCCN)C(=O)O. The target protein (O95665) has sequence METSSPRPPRPSSNPGLSLDARLGVDTRLWAKVLFTALYALIWALGAAGNALSAHVVLKARAGRAGRLRHHVLSLALAGLLLLLVGVPVELYSFVWFHYPWVFGDLGCRGYYFVHELCAYATVLSVAGLSAERCLAVCQPLRARSLLTPRRTRWLVALSWAASLGLALPMAVIMGQKHELETADGEPEPASRVCTVLVSRTALQVFIQVNVLVSFVLPLALTAFLNGVTVSHLLALCSQVPSTSTPGSSTPSRLELLSEEGLLSFIVWKKTFIQGGQVSLVRHKDVRRIRSLQRSVQVLRAIVVMYVICWLPYHARRLMYCYVPDDAWTDPLYNFYHYFYMVTNTLFYVSSAVTPLLYNAVSSSFRKLFLEAVSSLCGEHHPMKRLPPKPQSPTLMDTASGFGDPPETRT. The pIC50 is 5.8. (9) The small molecule is CC(=O)N[C@@H]1[C@@H](N)C=C(C(=O)O)O[C@H]1CNc1ccccc1. The target protein sequence is MSIKMTSQRRRASIHKETDSNIKGVDMRFKNVKKTALMLAMFGMATSSNAALFDYNATGDTEFDSPAKQGWMQDNTNNGSGVLTNADGMPAWLVQGNGGRAQWTYSLSTNQHAQASSFGWRMTTEMKVLSGGMITNYYANGTQRVLPIISLDSSGNLVVEFEGQTGRTILATGTAATEYHKFELVFLPGSNPSASFYFDGKLIRDNIQPTASKQNMIVWGNGSSNTDGVAAYRDIKFEIQGDVIFRGPDRIPSIVASSVTPGVVTAFAEKRVGGGDPGALSNTNDIITRTSRDGGITWDTELNLTEQINVSDEFDFSDPRPIYDPSTNTVLVSYARWPTDAAQNGDRIKPWMPNGIFYSVYDVASGNWRAPIDVTDQVKERSFQIAGWGGSELYRRNTNLNSQQDWQSNAKIRIVDGAANQIQVADGGRKYVFTLSIDESGSLVANLNGVSDPIILQSERAKVHSFHDYELQYSALNRSTTLFVDGQAITTWTGEVSQEN.... The pIC50 is 3.0.
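The definition given above, pIC50 = -log10(IC50 in M), can be sketched in code as follows. This is a minimal illustration only. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical and not part of BindingDB or the bindingdb_ic50 dataset. The input is assumed to be an IC50 already expressed in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration; assumes the value is in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of 6.
    print(ic50_to_pic50(1e-6))
    # The pIC50 of 6.4 in example (1) corresponds to roughly 398 nM.
    print(pic50_to_ic50(6.4) * 1e9)
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the pIC50 of 7.1 in example (5) is a much lower inhibitory concentration than the 3.0 in example (7).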