Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 6.0. The target protein sequence is MALRVTADVWARPWQCLHRTRALGSTATQAPKTLKPFEAIPQYSRNKWLKMIQILREQGQENLHLEMHQAFQELGPIFRHSAGGAQIVSVMLPEDAEKLHQVESILPRRMTLESWVAHRELRGLRRGVFLLNGADWRFNRLQLNPNMLSPKAVQSFVPFVDVVARDFVENLKKRMLENVHGSMSMDIQSNVFNYTMEASHFVISGERLGLTGHDLNPESLKFIHALHSMFKSTTQLMFLPKNLTRWTSTQVWKGHFESWDIISEYVTNVSRNVYRELAEGRQQSWSVISEMVAQSTLSMDAIHANSMELIAGSVDTTAISLVMTLFELARNPDVQQALRQESLAAEASIAANPQKAMSDLPLLRAALKETLRLYPIGSSLERIVDSDLVLQNYHVPAGTLVIIYLYSMGRNPAVFPRPERYMPQRWLERKRSFQHLAFGFGVRQCLGRRLAEVEVLLLLHHMLKIFQVETLRQEDVQMAYRFVLMPNPRLVLTIRPVS. The compound is CC(C)[C@](O)(c1ccc2c3c(ccc2c1)C(=O)N(C)C3)c1cnc[nH]1. (2) The drug is C[C@@H](N)[C@H]1CC[C@H](C(=O)Nc2ccncc2)CC1. The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTLGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYVPGGDMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIQVTDFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 4.8. (3) The compound is Cc1cccc(C)c1C(=O)N1CCC(C)(N2CCC(N(Cc3ccccc3)c3ccccc3)CC2)CC1. The target protein (P61814) has sequence MDYQVSSPTYDIDYYTSEPCQKINVKQIAARLLPPLYSLVFIFGFVGNILVVLILINCKRLKSMTDIYLLNLAISDLLFLLTVPFWAHYAAAQWDFGNTMCQLLTGLYFIGFFSGIFFIILLTIDRYLAIVHAVFALKARTVTFGVVTSVITWVVAVFASLPGIIFTRSQREGLHYTCSSHFPYSQYQFWKNFQTLKMVILGLVLPLLVMVICYSGILKTLLRCRNEKKRHRAVRLIFTIMIVYFLFWAPYNIVLLLNTFQEFFGLNNCSSSNRLDQAMQVTETLGMTHCCINPIIYAFVGEKFRNYLLVFFQKHIAKRFCKCCSIFQQEAPERASSVYTRSTGEQEISVGL. The pIC50 is 8.5. (4) The drug is Cc1c(Cc2ccccc2)c(=O)oc2cc(OS(C)(=O)=O)ccc12. 
The target protein (P30837) has sequence MLRFLAPRLLSLQGRTARYSSAAALPSPILNPDIPYNQLFINNEWQDAVSKKTFPTVNPTTGEVIGHVAEGDRADVDRAVKAAREAFRLGSPWRRMDASERGRLLNLLADLVERDRVYLASLETLDNGKPFQESYALDLDEVIKVYRYFAGWADKWHGKTIPMDGQHFCFTRHEPVGVCGQIIPWNFPLVMQGWKLAPALATGNTVVMKVAEQTPLSALYLASLIKEAGFPPGVVNIITGYGPTAGAAIAQHVDVDKVAFTGSTEVGHLIQKAAGDSNLKRVTLELGGKSPSIVLADADMEHAVEQCHEALFFNMGQCCCAGSRTFVEESIYNEFLERTVEKAKQRKVGNPFELDTQQGPQVDKEQFERVLGYIQLGQKEGAKLLCGGERFGERGFFIKPTVFGGVQDDMRIAKEEIFGPVQPLFKFKKIEEVVERANNTRYGLAAAVFTRDLDKAMYFTQALQAGTVWVNTYNIVTCHTPFGGFKESGNGRELGEDGLK.... The pIC50 is 4.9. (5) The drug is CN(C)CCN1C(=O)[C@@H](Cc2c[nH]c3ccccc23)N=C(c2ccccc2F)c2ccccc21. The target protein (Q63931) has sequence MDVVDSLFVNGSNITSACELGFENETLFCLDRPRPSKEWQPAVQILLYSLIFLLSVLGNTLVITVLIRNKRMRTVTNIFLLSLAVSDLMLCLFCMPFNLIPSLLKDFIFGSAVCKTTTYFMGTSVSVSTFNLVAISLERYGAICKPLQSRVWQTKSHALKVIAATWCLSFTIMTPYPIYSNLVPFTKNNNQTGNMCRFLLPNDVMQQTWHTFLLLILFLIPGIVMMVAYGLISLELYQGIKFDAIQKKSAKERKTSTGSSGPMEDSDGCYLQKSRHPRKLELRQLSPSSSGSNRINRIRSSSSTANLMAKKRVIRMLIVIVVLFFLCWMPIFSANAWRAYDTVSAERHLSGTPISFILLLSYTSSCVNPIIYCFMNKRFRLGFMATFPCCPNPGTPGVRGEMGEEEEGRTTGASLSRYSYSHMSTSAPPP. The pIC50 is 4.0. (6) The drug is COC(=O)N1CCN([C@@H]2Cc3ccc(NC(=O)c4cccc(C)c4-c4ccc(C(F)(F)F)cc4)cc3C2)CC1. The target protein (P55157) has sequence MILLAVLFLCFISSYSASVKGHTTGLSLNNDRLYKLTYSTEVLLDRGKGKLQDSVGYRISSNVDVALLWRNPDGDDDQLIQITMKDVNVENVNQQRGEKSIFKGKSPSKIMGKENLEALQRPTLLHLIHGKVKEFYSYQNEAVAIENIKRGLASLFQTQLSSGTTNEVDISGNCKVTYQAHQDKVIKIKALDSCKIARSGFTTPNQVLGVSSKATSVTTYKIEDSFVIAVLAEETHNFGLNFLQTIKGKIVSKQKLELKTTEAGPRLMSGKQAAAIIKAVDSKYTAIPIVGQVFQSHCKGCPSLSELWRSTRKYLQPDNLSKAEAVRNFLAFIQHLRTAKKEEILQILKMENKEVLPQLVDAVTSAQTSDSLEAILDFLDFKSDSSIILQERFLYACGFASHPNEELLRALISKFKGSIGSSDIRETVMIITGTLVRKLCQNEGCKLKAVVEAKKLILGGLEKAEKKEDTRMYLLALKNALLPEGIPSLLKYAEAGEGPI.... The pIC50 is 6.0. (7) The small molecule is CN1CCc2cc(O)c(O)cc2C1Cc1ccc(O)c(O)c1. 
The target protein sequence is MAHHHHHHSRAWRHPQFGGHHHHHHALEVLFQGPLGSMEDFVRQCFNPMIVELAEKAMKEYGEDPKIETNKFAAICTHLEVCFMYSDFHFIDERGESIIVESGDPNALLKHRFEIIEGRDRIMAWTVVNSICNTTGVEKPKFLPDLYDYKENRFIEIGVTRREVHIYYLEKANKIKSEKTHIHIFSFTGEEMATKADYTLDEESRARIKTRLFTIRQEMASRSLWDSFRQSERGEETVEER. The pIC50 is 5.7. (8) The small molecule is O=C(NCc1ccc(Cl)cc1)c1cn2c(CCN(CCO)CCO)cc3cc(CN4CCOCC4)cc(c1=O)c32. The target protein (P09884) has sequence MAPVHGDDSLSDSGSFVSSRARREKKSKKGRQEALERLKKAKAGEKYKYEVEDFTGVYEEVDEEQYSKLVQARQDDDWIVDDDGIGYVEDGREIFDDDLEDDALDADEKGKDGKARNKDKRNVKKLAVTKPNNIKSMFIACAGKKTADKAVDLSKDGLLGDILQDLNTETPQITPPPVMILKKKRSIGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAGDDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWDKESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFSVQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGVVFLFGKVWIESAETHVSCCVMVKNIERTLYFLPREMKIDLNTGKETGTPISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKSEYLEVKYSAEMPQLPQDLKGETFSHVFGTNTSSLELFLMNRKIKGPCWLE.... The pIC50 is 4.7. (9) The small molecule is CC(=N)[PH](O)(O)CC(C)C(=O)O. The target protein sequence is MINVTLEQIKNWIDCEIDEKHLKKTINGVSIDSRKINEGALFIPFKGENVDGHRFITQALNDGAGAVFSEKENKHSEGNQGPIIWVEDTLIALQQLAKAYLNHVNPKVIAVTGSNGKTTTKDMIESVLSTEFKVKKTQGNYNNEIGMPLTLLELDEDTEISILEMGMSGFHQIELLSHIAQPDIAVITNIGESHMQDLGSREGIAKAKFEITTGLKTNGIFIYDGDEPLLKPHVNQVKNAKLISIGLNSDSTYTCHMNDVKNEGIHFTINQKEHYHLPILGTHNMKNAAIAIAIGHELGLNETIIQNNIHNVQLTAMRMERHESSNNVTVINDAYNASPTSMKAAIDTLSVMKGRKILILADVLELGPNSQLMHKQVGEYLKDKNIDVLYTFGKEASYIYDSGKVFVKEAKYFDNKDQLIQTLISQVKPEDKVLVKGSRGMKLEEVVDALL. The pIC50 is 3.9. (10) The drug is O=C(O)CCC(=O)c1cccc([N+](=O)[O-])c1. 
The target protein (O88867) has sequence MASSDTEGKRVVVIGGGLVGALNACFLAKRNFQVDVYEAREDIRVANFMRGRSINLALSYRGRQALKAVGLEDQIVSKGVPMKARMIHSLSGKKSAIPYGNKSQYILSISREKLNKDLLTAVESYPNAKVHFGHKLSKCCPEEGILTMLGPNKVPRDITCDLIVGCDGAYSTVRAHLMKKPRFDYSQQYIPHGYMELTIPPKNGEYAMEPNCLHIWPRNAFMMIALPNMDKSFTCTLFMSFEEFEKLPTHSDVLDFFQKNFPDAIPLMGEQALMRDFFLLPAQPMISVKCSPFHLKSRCVLMGDAAHAIVPFFGQGMNAGFEDCLVFDELMDKFNNDLSVCLPEFSRFRIPDDHAISDLSMYNYIEMRAHVNSRWFLFQRLLDKFLHALMPSTFIPLYTMVAFTRIRYHEAVLRWHWQKKVINRGLFVLGSLVAIGSAYILVHHLSPRPLELLRSAWTGTSGHWNRSADISPRVPWSH. The pIC50 is 5.2.
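
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's own preprocessing code. The function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50` are hypothetical, and the nanomolar helper assumes raw IC50 values are reported in nM, which this record does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1), (3) and (9) of this record.
    for label in (6.0, 8.5, 3.9):
        molar = pic50_to_ic50(label)
        print(f"pIC50 {label} -> IC50 {molar:.3e} M ({molar * 1e9:.3g} nM)")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50. A label of 6.0 corresponds to an IC50 of 1 µM, and a higher label means a more potent compound.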