From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C(CCl)c1c(Br)sc(Cl)c1Cl. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 5.7. (2) The drug is CN1CCC[C@H]1C(=O)N1CCC(CCOc2ncc(-c3nc(C#N)nc4c3ncn4C)cc2C(F)(F)F)CC1. The target protein (O70370) has sequence MRAPGHAAIRWLFWMPLVCSVAMEQLQRDPTLDYHWDLWKKTHEKEYKDKNEEEVRRLIWEKNLKFIMIHNLEYSMGMHTYQVGMNDMGDMTNEEILCRMGALRIPRQSPKTVTFRSYSNRTLPDTVDWREKGCVTEVKYQGSCGACWAFSAVGALEGQLKLKTGKLISLSAQNLVDCSNEEKYGNKGCGGGYMTEAFQYIIDNGGIEADASYPYKATDEKCHYNSKNRAATCSRYIQLPFGDEDALKEAVATKGPVSVGIDASHSSFFFYKSGVYDDPSCTGNVNHGVLVVGYGTLDGKDYWLVKNSWGLNFGDQGYIRMARNNKNHCGIASYCSYPEI. The pIC50 is 9.0. (3) The drug is O=c1c(O)c(-c2cc(O)c(O)c(O)c2)oc2cc(O)cc(O)c12. The target protein (Q91WR5) has sequence MNSKCHCVILNDGNFIPVLGFGTALPLECPKSKAKELTKIAIDAGFHHFDSASVYNTEDHVGEAIRSKIADGTVRREDIFYTSKVWCTSLHPELVRASLERSLQKLQFDYVDLYLIHYPMALKPGEENFPVDEHGKLIFDRVDLCATWEAMEKCKDAGLTKSIGVSNFNYRQLEMILNKPGLKYKPVCNQVECHPYLNQMKLLDFCKSKDIVLVAYGVLGTQRYGGWVDQNSPVLLDEPVLGSMAKKYNRTPALIALRYQLQRGIVVLNTSLKEERIKENMQVFEFQLSSEDMKVLDGLNRNMRYIPAAIFKGHPNWPFLDEY. The pIC50 is 4.9. (4) The drug is O=c1[nH]c2c(Br)cccc2cc1O. The target protein sequence is MKAILVVLLYTFATANADTLCIGYHANNSTDTVDTVLEKNVTVTHSVNLLEDKHNGKLCKLRGVAPLHLGKCNIAGWILGNPECESLSTASSWSYIVETPSSDNGTCYPGDFIDYEELREQLSSVSSFERFEIFPKTSSWPNHDSNKGVTAACPHAGAKSFYKNLIWLVKKGNSYPKLSKSYINDKGKEVLVLWGIHHPSTSADQQSLYQNADTYVFVGSSRYSKKFKPEIAIRPKVRDQEGRMNYYWTLVEPGDKITFEATGNLVVPRYAFAMERNAGSGIIISDTPVHDCNTTCQTPKGAINTSLPFQNIHPITIGKCPKYVKSTKLRLATGLRNIPSIQSRGLFGAIAGFIEGGWTGMVDGWYGYHHQNEQGSGYAADLKSTQNAIDEITNKVNSVIEKMNTQFTAVGKEFNHLEKRIENLNKKVDDGFLDIWTYNAELLVLLENERTLDYHDSNVKNLYEKVRSQLKNNAKEIGNGCFEFYHKCDNTCMESVKNGT.... The pIC50 is 5.0. (5) The compound is O=S(=O)(Nc1ccc(O)cc1)c1cccc2cccnc12. 
The target protein (P14927) has sequence MAGKQAVSASGKWLDGIRKWYYNAAGFNKLGLMRDDTIYEDEDVKEAIRRLPENLYNDRMFRIKRALDLNLKHQILPKEQWTKYEEENFYLEPYLKEVIRERKEREEWAKK. The pIC50 is 4.7. (6) The compound is COCC(=O)NC(Cn1cncn1)CP(=O)(O)O. The target protein (P0CO23) has sequence MSERIASVERTTSETHISCTIDLDHIPGVTEQKINVSTGIGFLDHMFTALAKHGGMSLQLQCKGDLHIDDHHTAEDCALALGEAFKKALGERKGIKRYGYAYAPLDESLSRAVIDISSRPYFMCHLPFTREKVGDLSTEMVSHLLQSFAFAAGVTLHIDSIRGENNHHIAESAFKALALAIRMAISRTGGDDVPSTKGVLAL. The pIC50 is 5.0. (7) The target protein (Q9UUZ4) has sequence MIGSSHAVVALGLFTLYGHSAAAPAIGASNSQTIVTNGTSFALNGDNVSYRFHVNSSTGDLISDHFGGVVSGTIPSPVEPAVNGWVGMPGRIRREFPDQGRGDFRIPAVRIRESAGYTVSDLQYVSHEVIEGKYALPGLPATFGDAQDATTLVVHLYDNYSSVAADLSYSIFPKYDAIVRSVNVTNQGPGNITIEALASISIDFPYEDLDMVSLRGDWAREANVQRSKVQYGVQGFGSSTGYSSHLHNPFLAIVDPATTESQGEAWGFNLVYTGSFSAQVEKGSQGFTRALLGFNPDQLSWNLGPGETLTSPECVAVYSDKGLGSVSRKFHRLYRNHLMKSKFATSDRPVLLNSWEGVYFDYNQSSIETLAEESAALGVHLFVMDDGWFGDKYPRVSDNAGLGDWMPNPARFPDGLTPVVQDITNLTVNGTESTKLRFGIWVEPEMVNPNSTLYHEHPEWALHAGPYPRTERRNQLVLNLALPAVQDFIIDFMTNLLQDT.... The pIC50 is 5.7. The compound is OC[C@H]1NC[C@H](O)[C@@H](O)[C@H]1O. (8) The small molecule is CNc1nc(-c2ccc3c(c2)CCN3C(=O)c2ccccc2OCc2ccc(Cl)cc2)cs1. The target protein sequence is MLASPATETTVLMSQTEADLALRPPPPLGTAGQPRLGPPPRRARRFSGKAEPRPRSSRLSRRSSVDLGLLSSWSLPASPAPDPPDPPDSAGPGPARSPPPSSKEPPEGTWTEGAPVKAAEDSARPELPDSAVGPGSREPLRVPEAVALERRREQEEKEDMETQAVATSPDGRYLKFDIEIGRGSFKTVYRGLDTDTTVEVAWCELQTRKLSRAERQRFSEEVEMLKGLQHPNIVRFYDSWKSVLRGQVCIVLVTELMTSGTLKTYLRRFREMKPRVLQRWSRQILRGLHFLHSRVPPILHRDLKCDNVFITGPTGSVKIGDLGLATLKRASFAKSVIGTPEFMAPEMYEEKYDEAVDVYAFGMCMLEMATSEYPYSECQNAAQIYRKVTSGRKPNSFHKVKIPEVKEIIEGCIRTDKNERFTIQDLLAHAFFREERGVHVELAE. The pIC50 is 6.7. (9) The small molecule is Cn1cnc(S(=O)(=O)N(CCN(Cc2cncn2C)c2ccc(C#N)cc2)Cc2ccc(-c3ccccc3)cc2)c1. 
The target protein sequence is MGFTSLGLSAPILKAVEEQGYSTPSPIQLQAIPAVIEGKDVMAAAQTGTGKTAGFTLPLLERLSNGPKRKFNQVRALVLTPTRELAAQVHESVEKYSKNLPLTSDVVFGGVKVNPQMQRLRRGVDVLVATPGRLLDLANQNAIKFDQLEILVLDEADRMLDMGFIHDIKKILAKLPKNRQNLLFSATFSDEIRQLAKGLVKDPVEISVAKRNTTAETVEQSVYVMDKGRKPKVLTKLIKDNDWKQVLVFSKTKHGANRLAKTLEEKGVSAAAIHGNKSQGARTKALANFKSGQVRVLVATDIAARGLDIEQLPQVINVDLPKVPEDYVHRIGRTGRAGATGKAISFVSEDEAKELFAIERLIQKVLPRHVLEGFEPVNKVPESKLDTRPIKPKKPKKPKAPRVEHKDGQRSGENRNGNKQGAKQGQKPATKRTPTNNPSGKKEGTDSDKKKRPFSGKPKTKGTGENRGNGSNFGKSKSTPKSDVKPRRQGPRPARKPKAN.... The pIC50 is 6.6.
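The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names `ic50_to_pic50` and `pic50_to_ic50` and the unit table are hypothetical helpers written for illustration; they are not part of the bindingdb_ic50 dataset or of any BindingDB tooling.

```python
import math

# Conversion factors from common concentration units to molar (M).
# This unit table is an illustrative assumption, not part of the dataset.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    molar = ic50 * UNIT_TO_MOLAR[unit]
    return -math.log10(molar)


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the definition: IC50 (in `unit`) = 10 ** (-pIC50) M."""
    molar = 10.0 ** (-pic50)
    return molar / UNIT_TO_MOLAR[unit]


if __name__ == "__main__":
    # A lower IC50 (a more potent compound) gives a higher pIC50.
    for value_nm in (1.0, 100.0, 10000.0):
        print(f"IC50 = {value_nm:g} nM, pIC50 = {ic50_to_pic50(value_nm):.2f}")
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, so the pIC50 of 9.0 in example (2) corresponds to an IC50 of about 1 nM.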