From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1cc(C)c(Oc2ccc(F)cc2C(=O)Nc2cc[nH]c(=O)c2)cc1C. The target protein (Q9Y5Y9) has sequence MEFPIGSLETNNFRRFTPESLVEIEKQIAAKQGTKKAREKHREQKDQEEKPRPQLDLKACNQLPKFYGELPAELIGEPLEDLDPFYSTHRTFMVLNKGRTISRFSATRALWLFSPFNLIRRTAIKVSVHSWFSLFITVTILVNCVCMTRTDLPEKIEYVFTVIYTFEALIKILARGFCLNEFTYLRDPWNWLDFSVITLAYVGTAIDLRGISGLRTFRVLRALKTVSVIPGLKVIVGALIHSVKKLADVTILTIFCLSVFALVGLQLFKGNLKNKCVKNDMAVNETTNYSSHRKPDIYINKRGTSDPLLCGNGSDSGHCPDGYICLKTSDNPDFNYTSFDSFAWAFLSLFRLMTQDSWERLYQQTLRTSGKIYMIFFVLVIFLGSFYLVNLILAVVTMAYEEQNQATTDEIEAKEKKFQEALEMLRKEQEVLAALGIDTTSLHSHNGSPLTSKNASERRHRIKPRVSEGSTEDNKSPRSDPYNQRRMSFLGLASGKRRAS.... The pIC50 is 5.5. (2) The compound is C[C@H](CCC(N)=O)[C@H]1CC[C@H]2[C@@H]3CC[C@@H]4C[C@H](O)CC[C@]4(C)[C@H]3CC[C@@]21C. The target protein sequence is MNSKSAQGLAGLRNLGNTCFMNSILQCLSNTRELRDYCLQRLYMRDLHHGSNAHTALVEEFAKLIQTIWTSSPNDVVSPSEFKTQIQRYAPRFVGYNQQDAQEFLRFLLDGLHNEVNRVTLRPKSNPENLDHLPDDEKGRQMWRKYLEREDSRIGDLFVGQLKSSLTCTDCGYCSTVFDPFWDLSLPIAKRGYPEVTLMDCMRLFTKEDVLDGDEKPTCCRCRGRKRCIKKFSIQRFPKILVLHLKRFSESRIRTSKLTTFVNFPLRDLDLREFASENTNHAVYNLYAVSNHSGTTMGGHYTAYCRSPGTGEWHTFNDSSVTPMSSSQVRTSDAYLLFYELASPPSRM. The pIC50 is 4.8. (3) The compound is CCCC(CCC)C(=O)OC1C[C@@H]2CC[C@H](C1)[N+]2(C)C. 
The target protein (P06935) has sequence MSKKPGGPGKNRAVNMLKRGMPRGLSLIGLKRAMLSLIDGKGPIRFVLALLAFFRFTAIAPTRAVLDRWRGVNKQTAMKHLLSFKKELGTLTSAINRRSTKQKKRGGTAGFTILLGLIACAGAVTLSNFQGKVMMTVNATDVTDVITIPTAAGKNLCIVRAMDVGYLCEDTITYECPVLAAGNDPEDIDCWCTKSSVYVRYGRCTKTRHSRRSRRSLTVQTHGESTLANKKGAWLDSTKATRYLVKTESWILRNPGYALVAAVIGWMLGSNTMQRVVFAILLLLVAPAYSFNCLGMSNRDFLEGVSGATWVDLVLEGDSCVTIMSKDKPTIDVKMMNMEAANLADVRSYCYLASVSDLSTRAACPTMGEAHNEKRADPAFVCKQGVVDRGWGNGCGLFGKGSIDTCAKFACTTKATGWIIQKENIKYEVAIFVHGPTTVESHGKIGATQAGRFSITPSAPSYTLKLGEYGEVTVDCEPRSGIDTSAYYVMSVGEKSFLVH.... The pIC50 is 4.3. (4) The drug is Nc1nccc(-c2c[nH]c3ccccc23)n1. The target protein (P00516) has sequence MSELEEDFAKILMLKEERIKELEKRLSEKEEEIQELKRKLHKCQSVLPVPSTHIGPRTTRAQGISAEPQTYRSFHDLRQAFRKFTKSERSKDLIKEAILDNDFMKNLELSQIQEIVDCMYPVEYGKDSCIIKEGDVGSLVYVMEDGKVEVTKEGVKLCTMGPGKVFGELAILYNCTRTATVKTLVNVKLWAIDRQCFQTIMMRTGLIKHTEYMEFLKSVPTFQSLPEEILSKLADVLEETHYENGEYIIRQGARGDTFFIISKGKVNVTREDSPNEDPVFLRTLGKGDWFGEKALQGEDVRTANVIAAEAVTCLVIDRDSFKHLIGGLDDVSNKAYEDAEAKAKYEAEAAFFANLKLSDFNIIDTLGVGGFGRVELVQLKSEESKTFAMKILKKRHIVDTRQQEHIRSEKQIMQGAHSDFIVRLYRTFKDSKYLYMLMEACLGGELWTILRDRGSFEDSTTRFYTACVVEAFAYLHSKGIIYRDLKPENLILDHRGYAKL.... The pIC50 is 3.4. (5) The drug is C[C@@H](N)[C@H]1CC[C@H](C(=O)Nc2ccncc2)CC1. The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTLGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIKVADFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 5.0. (6) The pIC50 is 6.4. The target protein sequence is MRFLIIHIAVIVLPFVLMIDVKRENSFFLRHSPKRLYKKADYNNMYDKIIKKQQNRIYDVSSQINQDNINGQNISFNLTFPNYDTSIDIEDIKKILPHRYPFLLVDKVIYMQPNKTIIGLKQVSTNEPFFNGHFPQKQIMPGVLQIEALAQLAGILCLKSDDSQKNNLFLFAGVDGVRWKKPVLPGDTLTMQANLISFKSSLGIAKLSGVGYVNGKVVINISEMTFALSK. 
The small molecule is O=C(O[C@@H]1Cc2c(O)cc(O)cc2O[C@@H]1c1ccc(O)c(O)c1)c1cc(O)c(O)c(O)c1. (7) The drug is O=C1Nc2ccc(-c3cc[nH]n3)cc2[C@](CNC(=O)c2ccc(F)cc2)(C(F)(F)F)O1. The target protein (Q920L5) has sequence MNMSVLTLQEYEFEKQFNENEAIQWMQENWKKSFLFSALYAAFIFGGRHLMNKRAKFELRKPLVLWSLTLAVFSIFGALRTGAYMLYILMTKGLKQSVCDQSFYNGPVSKFWAYAFVLSKAPELGDTIFIILRKQKLIFLHWYHHITVLLYSWYSYKDMVAGGGWFMTMNYGVHAVMYSYYALRAAGFRVSRKFAMFITLSQITQMLMGCVINYLVFNWMQHDNDQCYSHFQNIFWSSLMYLSYLVLFCHFFFEAYIGKVKKATKAE. The pIC50 is 7.8. (8) The compound is O=C(NCC(F)(F)F)[C@@H]1CN(Cc2cc3ccsc3s2)CCN1C[C@@H](O)C[C@@H](Cc1cccnc1)C(=O)N[C@H]1c2ccccc2OC[C@H]1O. The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEIDLPGRWKPKIIGGIGGFIKVKQYDQIPIEICGHKVISTVLVGPTPVNVIGRNLMTQLGCTLNF. The pIC50 is 9.3. (9) The small molecule is C[C@H](CCC(=O)NCCS(=O)(=O)O)[C@H]1CC[C@H]2[C@H]3[C@H](C[C@H](O)[C@@]21C)[C@@]1(C)CC[C@@H](O)C[C@H]1C[C@H]3O. The target protein (P26435) has sequence MEVHNVSAPFNFSLPPGFGHRATDKALSIILVLMLLLIMLSLGCTMEFSKIKAHLWKPKGVIVALVAQFGIMPLAAFLLGKIFHLSNIEALAILICGCSPGGNLSNLFTLAMKGDMNLSIVMTTCSSFSALGMMPLLLYVYSKGIYDGDLKDKVPYKGIMISLVIVLIPCTIGIVLKSKRPHYVPYILKGGMIITFLLSVAVTALSVINVGNSIMFVMTPHLLATSSLMPFSGFLMGYILSALFQLNPSCRRTISMETGFQNIQLCSTILNVTFPPEVIGPLFFFPLLYMIFQLAEGLLIIIIFRCYEKIKPPKDQTKITYKAAATEDATPAALEKGTHNGNIPPLQPGPSPNGLNSGQMAN. The pIC50 is 5.2. (10) The small molecule is O=C1NCc2c1c1c3ccccc3n3c1c1c2c2ccccc2n1C[C@@H](O)[C@@H](O)C3. The target protein (P42229) has sequence MAGWIQAQQLQGDALRQMQVLYGQHFPIEVRHYLAQWIESQPWDAIDLDNPQDRAQATQLLEGLVQELQKKAEHQVGEDGFLLKIKLGHYATQLQKTYDRCPLELVRCIRHILYNEQRLVREANNCSSPAGILVDAMSQKHLQINQTFEELRLVTQDTENELKKLQQTQEYFIIQYQESLRIQAQFAQLAQLSPQERLSRETALQQKQVSLEAWLQREAQTLQQYRVELAEKHQKTLQLLRKQQTIILDDELIQWKRRQQLAGNGGPPEGSLDVLQSWCEKLAEIIWQNRQQIRRAEHLCQQLPIPGPVEEMLAEVNATITDIISALVTSTFIIEKQPPQVLKTQTKFAATVRLLVGGKLNVHMNPPQVKATIISEQQAKSLLKNENTRNECSGEILNNCCVMEYHQATGTLSAHFRNMSLKRIKRADRRGAESVTEEKFTVLFESQFSVGSNELVFQVKTLSLPVVVIVHGSQDHNATATVLWDNAFAEPGRVPFAVPD.... The pIC50 is 6.2.
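
The pIC50 definition given in the task description, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is a minimal illustration and not part of the dataset's tooling: the function names `ic50_to_pic50` and `pic50_to_ic50` and the unit table `UNIT_TO_MOLAR` are hypothetical, and the sketch assumes the IC50 value is supplied with a known concentration unit.

```python
import math

# Hypothetical helper table: conversion factors from common units to molar (M).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50: float, unit: str = "nM") -> float:
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    if unit not in UNIT_TO_MOLAR:
        raise ValueError(f"unknown unit: {unit!r}")
    ic50_molar = ic50 * UNIT_TO_MOLAR[unit]
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float, unit: str = "nM") -> float:
    """Invert the transform: return IC50 in `unit` for a given pIC50."""
    if unit not in UNIT_TO_MOLAR:
        raise ValueError(f"unknown unit: {unit!r}")
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar / UNIT_TO_MOLAR[unit]


if __name__ == "__main__":
    # An IC50 of 1 uM is 1e-6 M, so its pIC50 is about 6.
    print(round(ic50_to_pic50(1.0, "uM"), 3))
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_to_ic50(9.3, "nM") < pic50_to_ic50(5.5, "nM"))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a higher pIC50 means a more potent compound.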