From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C(c1cccs1)N1CCCC1c1ccc(-c2cccc(Cl)c2)nc1. The target protein (P41594) has sequence MVLLLILSVLLLKEDVRGSAQSSERRVVAHMPGDIIIGALFSVHHQPTVDKVHERKCGAVREQYGIQRVEAMLHTLERINSDPTLLPNITLGCEIRDSCWHSAVALEQSIEFIRDSLISSEEEEGLVRCVDGSSSSFRSKKPIVGVIGPGSSSVAIQVQNLLQLFNIPQIAYSATSMDLSDKTLFKYFMRVVPSDAQQARAMVDIVKRYNWTYVSAVHTEGNYGESGMEAFKDMSAKEGICIAHSYKIYSNAGEQSFDKLLKKLTSHLPKARVVACFCEGMTVRGLLMAMRRLGLAGEFLLLGSDGWADRYDVTDGYQREAVGGITIKLQSPDVKWFDDYYLKLRPETNHRNPWFQEFWQHRFQCRLEGFPQENSKYNKTCNSSLTLKTHHVQDSKMGFVINAIYSMAYGLHNMQMSLCPGYAGLCDAMKPIDGRKLLESLMKTNFTGVSGDTILFDENGDSPGRYEIMNFKEMGKDYFDYINVGSWDNGELKMDDDEVW.... The pIC50 is 8.1. (2) The compound is CC1=CC2Cc3nc4cc(Cl)ccc4c(N)c3C(C1)C2. The target protein (P23795) has sequence MRPPWCPLHTPSLTPPLLLLLFLIGGGAEAEGPEDPELLVMVRGGRLRGLRLMAPRGPVSAFLGIPFAEPPVGPRRFLPPEPKRPWPGVLNATAFQSVCYQYVDTLYPGFEGTEMWNPNRELSEDCLYLNVWTPYPRPSSPTPVLVWIYGGGFYSGASSLDVYDGRFLTQAEGTVLVSMNYRVGAFGFLALPGSREAPGNVGLLDQRLALQWVQENVAAFGGDPTSVTLFGESAGAASVGMHLLSPPSRGLFHRAVLQSGAPNGPWATVGVGEARRRATLLARLVGCPPGGAGGNDTELVACLRARPAQDLVDHEWRVLPQESVFRFSFVPVVDGDFLSDTPEALINAGDFHGLQVLVGVVKDEGSYFLVYGAPGFSKDNESLISRAQFLAGVRVGVPQASDLAAEAVVLHYTDWLHPEDPARLREALSDVVGDHNVVCPVAQLAGRLAAQGARVYAYIFEHRASTLSWPLWMGVPHGYEIEFIFGLPLEPSLNYTIEER.... The pIC50 is 9.1. (3) The compound is CC(C)CCc1nc2c([nH]1)C(=O)N(C)C1=N[C@@H]3CCC[C@@H]3N12. 
The target protein (P14100) has sequence MGSTATETEELENTTFKYLIGEQTEKMWQRLKGILRCLVKQLEKGDVNVIDLKKNIEYAASVLEAVYIDETRRLLDTDDELSDIQSDSVPSEVRDWLASTFTRKMGMMKKKSEEKPRFRSIVHVVQAGIFVERMYRKSYHMVGLAYPEAVIVTLKDVDKWSFDVFALNEASGEHSLKFMIYELFTRYDLINRFKIPVSCLIAFAEALEVGYSKYKNPYHNLIHAADVTQTVHYIMLHTGIMHWLTELEILAMVFAAAIHDYEHTGTTNNFHIQTRSDVAILYNDRSVLENHHVSAAYRLMQEEEMNVLINLSKDDWRDLRNLVIEMVLSTDMSGHFQQIKNIRNSLQQPEGLDKAKTMSLILHAADISHPAKSWKLHHRWTMALMEEFFLQGDKEAELGLPFSPLCDRKSTMVAQSQIGFIDFIVEPTFSLLTDSTEKIIIPLIEEDSKTKTPSYGASRRSNMKGTTNDGTYSPDYSLASVDLKSFKNSLVDIIQQNKER.... The pIC50 is 7.8. (4) The compound is CN(C)C(=O)n1nnnc1Cc1ccc(-c2ccccc2)cc1. The target protein (Q8BLF1) has sequence MRSSCVLLAALLALAAYYVYIPLPSAVSDPWKLMLLDATFRGAQQVSNLIHSLGLNHHLIALNFIITSFGKQSARSSPKVKVTDTDFDGVEVRVFEGSPKPEEPLRRSVIYIHGGGWALASAKISYYDQLCTTMAEELNAVIVSIEYRLVPQVYFPEQIHDVIRATKYFLQPEVLDKYKVDPGRVGISGDSAGGNLAAALGQQFTYVASLKNKLKLQALVYPVLQALDFNTPSYQQSMNTPILPRHVMVRYWLDYFKGNYDFVEAMIVNNHTSLDVERAAALRARLDWTSLLPSSIKKNYKPIMQTTGNARIVQEIPQLLDAAASPLIAEQEVLEALPKTYILTCEHDVLRDDGIMYAKRLESAGVNVTLDHFEDGFHGCMIFTSWPTNFSVGIRTRNSYIKWLDQNL. The pIC50 is 8.1. (5) The drug is CO[C@H]1CN(c2nc(-c3nccn3C)c(C(=O)O)s2)CC[C@H]1NC(=O)c1[nH]c(C)c(Cl)c1Cl. The target protein (Q79EC5) has sequence MTAYILTAEAEADLRGIIRYTRREWGAAQVRRYIAKLEQGIARLAAGEGPFKDMSELFPALRMARCEHHYVFCLPRAGEPALVVAILHERMDLMTRLADRLKG. The pIC50 is 6.9. (6) The pIC50 is 6.7. The target protein (Q63120) has sequence MDKFCNSTFWDLSLLESPEADLPLCFEQTVLVWIPLGFLWLLAPWQLYSVYRSRTKRSSITKFYLAKQVFVVFLLILAAIDLSLALTEDTGQATVPPVRYTNPILYLCTWLLVLAVQHSRQWCVRKNSWFLSLFWILSVLCGVFQFQTLIRALLKDSKSNMAYSYLFFVSYGFQIVLLILTAFSGPSDSTQTPSVTASFLSSITFSWYDRTVLKGYKHPLTLEDVWDIDEGFKTRSVTSKFEAAMTKDLQKARQAFQRRLQKSQRKPEATLHGLNKKQSQSQDVLVLEEAKKKSEKTTKDYPKSWLIKSLFKTFHVVILKSFILKLIHDLLVFLNPQLLKLLIGFVKSSNSYVWFGYICAILMFAVTLIQSFCLQSYFQHCFVLGMCVRTTVMSSIYKKALTLSNLARKQYTIGETVNLMSVDSQKLMDATNYMQLVWSSVIQITLSIFFLWRELGPSILAGVGVMVLLIPVNGVLATKIRNIQVQNMKNKDKRLKIMNE.... 
The compound is CCCCC/C=C\C/C=C\C=C\C=C\[C@H](SC[C@H](NC(=O)CCC(N)C(=O)O)C(=O)NCC(=O)O)C(O)CCCC(=O)O.
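
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion utility. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the nanomolar output unit is an assumption chosen for readability.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (assumed unit)."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9


if __name__ == "__main__":
    # An IC50 of 10 nM (1e-8 M) corresponds to a pIC50 of 8.
    print(round(ic50_to_pic50(1e-8), 2))
    # A pIC50 of 8.1, as in examples (1) and (4), is an IC50 of roughly 7.9 nM.
    print(round(pic50_to_ic50_nm(8.1), 2))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent binding.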