From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is [N-]=[N+]=NC[C@H]1O[C@@H](n2c(SCC(=O)NCc3ccnc(Cl)c3)nc3c(N)ncnc32)[C@H](O)[C@@H]1O. The target protein (Q5HH78) has sequence MRYTILTKGDSKSNALKHKMMNYMKDFRMIEDSENPEIVISVGGDGTLLQAFHQYSHMLSKVAFVGVHTGHLGFYADWLPHEVEKLIIEINNSEFQVIEYPLLEIIMRYNDNGYETRYLALNEATMKTENGSTLVVDVNLRGKHFERFRGDGLCVSTPSGSTAYNKALGGALIHPSLEAMQITEIASINNRVFRTVGSPLVLPKHHTCLISPVNHDTIRMTIDHVSIKHKNVNSIQYRVANEKVRFARFRPFPFWKRVHDSFISSDEER. The pIC50 is 3.4. (2) The compound is O=C(c1nc2ccc(O)cc2[nH]1)N1CCC(Oc2ccc(Cl)cc2)CC1. The target protein (Q00960) has sequence MKPSAECCSPKFWLVLAVLAVSGSKARSQKSPPSIGIAVILVGTSDEVAIKDAHEKDDFHHLSVVPRVELVAMNETDPKSIITRICDLMSDRKIQGVVFADDTDQEAIAQILDFISAQTLTPILGIHGGSSMIMADKDESSMFFQFGPSIEQQASVMLNIMEEYDWYIFSIVTTYFPGYQDFVNKIRSTIENSFVGWELEEVLLLDMSLDDGDSKIQNQLKKLQSPIILLYCTKEEATYIFEVANSVGLTGYGYTWIVPSLVAGDTDTVPSEFPTGLISVSYDEWDYGLPARVRDGIAIITTAASDMLSEHSFIPEPKSSCYNTHEKRIYQSNMLNRYLINVTFEGRNLSFSEDGYQMHPKLVIILLNKERKWERVGKWKDKSLQMKYYVWPRMCPETEEQEDDHLSIVTLEEAPFVIVESVDPLSGTCMRNTVPCQKRIISENKTDEEPGYIKKCCKGFCIDILKKISKSVKFTYDLYLVTNGKHGKKINGTWNGMIGE.... The pIC50 is 7.8. (3) The drug is COC1CCC2(C=C(c3cc(-c4ccc(Cl)c(Cl)c4)ccc3C)C(=O)N2)CC1. The target protein sequence is ANLIPSQEPFPASDNSGETPQRNGEGHTLPKTPSQAEPASHKGPKDAGRRRNSLPPSHQKPPRNPLSSSDAAPSPELQANGTGTQGLEATDTNGLSSSARPQGQQAGSPSKEDKKQANIKRQLMTNFILGSFDDYSSDEDSVAGSSRESTRKGSRASLGALSLEAYLTTGEAETRVPTMRPSMSGLHLVKRGREHKKLDLHRDFTVASPAEFVTRFGGDRVIEKVLIANNGIAAVKCMRSIRRWAYEMFRNERAIRFVVMVTPEDLKANAEYIKMADHYVPVPGGPNNNNYANVELIVDIAKRIPVQAVWAGWGHASENPKLPELLCKNGVAFLGPPSEAMWALGDKIASTVVAQTLQVPTLPWSGSGLTVEWTEDDLQQGKRISVPEDVYDKGCVKDVDEGLEAAERIGFPLMIKASEGGGGKGIRKAESAEDFPILFRQVQSEIPGSPIFLMKLAQHARHLEVQILADQYGNAVSLFGRDCSIQRRHQKIVEEAPATI.... 
The pIC50 is 5.8. (4) The compound is O=C(COc1ccccc1)N1CCCC[C@@H]1c1nc(-c2ccccn2)cs1. The target protein (Q63704) has sequence MAEAHQAVAFQFTVTPDGVDFRLSREALRHIYLSGINSWKKRLIRIKNGILRGVYPGSPTSWLVVVMATVGSNYCKVDISMGLVHCIQRCLPTRYGSYGTPQTETLLSMVIFSTGVWATGIFLFRQTLKLLLSYHGWMFEMHSKTSHATKIWAICVRLLSSRRPMLYSFQTSLPKLPVPSVPATIHRYLDSVRPLLDDEAYFRMESLAKEFQDKIAPRLQKYLVLKSWWATNYVSDWWEEYVYLRGRSPIMVNSNYYAMDFVLIKNTSQQAARLGNTVHAMIMYRRKLDREEIKPVMALGMVPMCSYQMERMFNTTRIPGKETDLLQHLSESRHVAVYHKGRFFKVWLYEGSCLLKPRDLEMQFQRILDDTSPPQPGEEKLAALTAGGRVEWAEARQKFFSSGKNKMSLDTIERAAFFVALDEDSHCYNPDDEASLSLYGKSLLHGNCYNRWFDKSFTLISCKNGQLGLNTEHSWADAPIIGHLWEFVLATDTFHLGYTE.... The pIC50 is 6.9. (5) The small molecule is CCCCC(=O)C=C(C)C=CCCC(=O)N1CCCC1=O. The target protein (P54358) has sequence MDGKRKFNGTSNGHAKKPRNPDDDEEMGFEAELAAFENSEDMDQTLLMGDGPENQTTSERWSRPPPPELDPSKHNLEFQQLDVENYLGQPLPGMPGAQIGPVPVVRMFGVTMEGNSVCCHVHGFCPYFYIEAPSQFEEHHCEKLQKALDQKVIADIRNNKDNVQEAVLMVELVEKLNIHGYNGDKKQRYIKISVTLPRFVAAASRLLKKEVIMSEIDFQDCRAFENNIDFDIRFMVDTDVVGCNWIELPMGHWRIRNSHSKPLPESRCQIEVDVAFDRFISHEPEGEWSKVAPFRILSFDIECAGRKGIFPEAKIDPVIQIANMVIRQGEREPFIRNVFTLNECAPIIGSQVLCHDKETQMLDKWSAFVREVDPDILTGYNINNFDFPYLLNRAAHLKVRNFEYLGRIKNIRSVIKEQMLQSKQMGRRENQYVNFEGRVPFDLLFVLLRDYKLRSYTLNAVSYHFLQEQKEDVHHSIITDLQNGDEQTRRRLAMYCLKDA.... The pIC50 is 3.7. (6) The compound is N#Cc1c(N)nc2sc(C(=O)c3cccc(Cl)c3)c(N)c2c1-c1ccccc1I. The target protein sequence is MSLNAAAAADERSRKEMDRFQVERMAGQGTFGTVQLGKEKSTGMSVAIKKVIQDPRFRNRELQIMQDLAVLHHPNIVQLQSYFYTLGERDRRDIYLNVVMEYVPDTLHRCCRNYYRRQVAPPPILIKVFLFQLIRSIGCLHLPSVNVCHRDIKPHNVLVNEADGTLKLCDFGSAKKLSPSEPNVAYICSRYYRAPELIFGNQHYTTAVDIWSVGCIFAEMMLGEPIFRGDNSAGQLHEIVRVLGCPSREVLRKLNPSHTDVDLYNSKGIPWSNVFSDHSLKDAKEAYDLLSALLQYLPEERMKPYEALCHPYFDELHDPATKLPNNKDLPEDLFRFLPNEIEVMSEAQKAKLVRK. The pIC50 is 4.0. (7) The drug is O=C(Cn1sc2ccccc2c1=O)Nc1ccc(Cl)cc1. 
The target protein (P11411) has sequence MVSEIKTLVTFFGGTGDLAKRKLYPSVFNLYKKGYLQKHFAIVGTARQALNDDEFKQLVRDSIKDFTDDQAQAEAFIEHFSYRAHDVTDAASYAVLKEAIEEAADKFDIDGNRIFYMSVAPRFFGTIAKYLKSEGLLADTGYNRLMIEKPFGTSYDTAAELQNDLENAFDDNQLFRIDHYLGKEMVQNIAALRFGNPIFDAAWNKDYIKNVQVTLSEVLGVEERAGYYDTAGALLDMIQNHTMQIVGWLAMEKPESFTDKDIRAAKNAAFNALKIYDEAEVNKYFVRAQYGAGDSADFKPYLEELDVPADSKNNTFIAGELQFDLPRWEGVPFYVRSGKRLAAKQTRVDIVFKAGTFNFGSEQEAQEAVLSIIIDPKGAIELKLNAKSVEDAFNTRTIDLGWTVSDEDKKNTPEPYERMIHDTMNGDGSNFADWNGVSIAWKFVDAISAVYTADKAPLETYKSGSMGPEASDKLLAANGDAWVFKG. The pIC50 is 4.6. (8) The small molecule is CC(=O)Oc1ccccc1C(=O)O. The target protein (P19440) has sequence MKKKLVVLGLLAVVLVLVIVGLCLWLPSASKEPDNHVYTRAAVAADAKQCSKIGRDALRDGGSAVDAAIAALLCVGLMNAHSMGIGGGLFLTIYNSTTRKAEVINAREVAPRLAFATMFNSSEQSQKGGLSVAVPGEIRGYELAHQRHGRLPWARLFQPSIQLARQGFPVGKGLAAALENKRTVIEQQPVLCEVFCRDRKVLREGERLTLPQLADTYETLAIEGAQAFYNGSLTAQIVKDIQAAGGIVTAEDLNNYRAELIEHPLNISLGDVVLYMPSAPLSGPVLALILNILKGYNFSRESVESPEQKGLTYHRIVEAFRFAYAKRTLLGDPKFVDVTEVVRNMTSEFFAAQLRAQISDDTTHPISYYKPEFYTPDDGGTAHLSVVAEDGSAVSATSTINLYFGSKVRSPVSGILFNNEMDDFSSPSITNEFGVPPSPANFIQPGKQPLSSMCPTIMVGQDGQVRMVVGAAGGTQITTATALAIIYNLWFGYDVKRAVE.... The pIC50 is 4.4. (9) The target protein (P9WJN1) has sequence MSELRLMAVHAHPDDESSKGAATLARYADEGHRVLVVTLTGGERGEILNPAMDLPDVHGRIAEIRRDEMTKAAEILGVEHTWLGFVDSGLPKGDLPPPLPDDCFARVPLEVSTEALVRVVREFRPHVMTTYDENGGYPHPDHIRCHQVSVAAYEAAGDFCRFPDAGEPWTVSKLYYVHGFLRERMQMLQDEFARHGQRGPFEQWLAYWDPDHDFLTSRVTTRVECSKYFSQRDDALRAHATQIDPNAEFFAAPLAWQERLWPTEEFELARSRIPARPPETELFAGIEP. The small molecule is Cc1cccc(C)c1SSCCNC(=O)C(Cc1cccc(Br)c1)N=O. The pIC50 is 4.2. (10) The small molecule is CCC(C)c1ccc(OC(CCCCN)c2ccccc2)cc1. 
The target protein (Q01650) has sequence MAGAGPKRRALAAPAAEEKEEAREKMLAAKSADGSAPAGEGEGVTLQRNITLLNGVAIIVGTIIGSGIFVTPTGVLKEAGSPGLALVVWAACGVFSIVGALCYAELGTTISKSGGDYAYMLEVYGSLPAFLKLWIELLIIRPSSQYIVALVFATYLLKPLFPTCPVPEEAAKLVACLCVLLLTAVNCYSVKAATRVQDAFAAAKLLALALIILLGFVQIGKGDVSNLDPNFSFEGTKLDVGNIVLALYSGLFAYGGWNYLNFVTEEMINPYRNLPLAIIISLPIVTLVYVLTNLAYFTTLSTEQMLSSEAVAVDFGNYHLGVMSWIIPVFVGLSCFGSVNGSLFTSSRLFFVGSREGHLPSILSMIHPQLLTPVPSLVFTCVMTLLYAFSKDIFSVINFFSFFNWLCVALAIIGMIWLRHRKPELERPIKVNLALPVFFILACLFLIAVSFWKTPVECGIGFTIILSGLPVYFFGVWWKNKPKWLLQGIFSTTVLCQKLM.... The pIC50 is 4.2.
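The definition used in this record, pIC50 = -log10(IC50 in M), can be sketched as a pair of conversion helpers. This is only an illustration of the formula stated in the task description, not part of the dataset. The function names (`ic50_to_pic50`, `pic50_to_ic50`, `ic50_nm_to_pic50`) are hypothetical, and the nanomolar variant assumes the raw IC50 is reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in molar units."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


if __name__ == "__main__":
    # Labels taken from entries (2) and (10) of the record above.
    for label in (7.8, 4.2):
        ic50_m = pic50_to_ic50(label)
        print(f"pIC50 {label} -> IC50 of about {ic50_m:.3g} M")
```

Under this definition, the pIC50 of 7.8 in entry (2) corresponds to an IC50 of roughly 16 nM, while the pIC50 of 4.2 in entry (10) corresponds to roughly 63 µM, which is why higher values mean more potent binding.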