From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCCCCCCCCCCCCCCCCC(=O)C=C1[C@H](O)[C@@H](C(=O)OC)[C@H](c2ccccc2)N1CCc1ccccc1. The target protein (P04053) has sequence MDPPRASHLSPRKKRPRQTGALMASSPQDIKFQDLVVFILEKKMGTTRRAFLMELARRKGFRVENELSDSVTHIVAENNSGSDVLEWLQAQKVQVSSQPELLDVSWLIECIRAGKPVEMTGKHQLVVRRDYSDSTNPGPPKTPPIAVQKISQYACQRRTTLNNCNQIFTDAFDILAENCEFRENEDSCVTFMRAASVLKSLPFTIISMKDTEGIPCLGSKVKGIIEEIIEDGESSEVKAVLNDERYQSFKLFTSVFGVGLKTSEKWFRMGFRTLSKVRSDKSLKFTRMQKAGFLYYEDLVSCVTRAEAEAVSVLVKEAVWAFLPDAFVTMTGGFRRGKKMGHDVDFLITSPGSTEDEEQLLQKVMNLWEKKGLLLYYDLVESTFEKLRLPSRKVDALDHFQKCFLIFKLPRQRVDSDQSSWQEGKTWKAIRVDLVLCPYERRAFALLGWTGSRQFERDLRRYATHERKMILDNHALYDKTKRIFLKAESEEEIFAHLGLD.... The pIC50 is 4.3. (2) The compound is COc1cc2[nH]ncc2cc1Nc1ncnc2[nH]nc(-c3ccc(F)cc3)c12. The target protein sequence is VSSQKLEKPIEMGSSEPLPIADGDRRRKKKRRGRATDSLPGKFEDMYKLTSELLGEGAYAKVQGAVSLQNGKEYAVKIIEKQAGHSRSRVFREVETLYQCQGNKNILELIEFFEDDTRFYLVFEKLQGGSILAHIQKQKHFNEREASRVVRDVAAALDFLHTKDKVSLCHLGWSAMAPSGLTAAPTSLGSSDPPTSASQVAGTTGIAHRDLKPENILCESPEKVSPVKICDFDLGSGMKLNNSCTPITTPELTTPCGSAEYMAPEVVEVFTDQATFYDKRCDLWSLGVVLYIMLSGYPPFVGHCGADCGWDRGEVCRVCQNKLFESIQEGKYEFPDKDWAHISSEAKDLISKLLVRDAKQRLSAAQVLQHPWVQGQAPEKGLPTPQVLQRNSSTMDLTLFAAEAIALNRQLSQHEENELAEEP. The pIC50 is 8.7. (3) The small molecule is N=C(N)NCCC[C@H](NC(=O)[C@@H]1CCC(=O)NCCCC[C@H](N)C(=O)N2CCC[C@H]2C(=O)N1)C(=O)O. 
The target protein (Q9QWJ9) has sequence MERGLPLLCATLALALALAGAFRSDKCGGTIKIENPGYLTSPGYPHSYHPSEKCEWLIQAPEPYQRIMINFNPHFDLEDRDCKYDYVEVIDGENEGGRLWGKFCGKIAPSPVVSSGPFLFIKFVSDYETHGAGFSIRYEIFKRGPECSQNYTAPTGVIKSPGFPEKYPNSLECTYIIFAPKMSEIILEFESFDLEQDSNPPGGVFCRYDRLEIWDGFPEVGPHIGRYCGQKTPGRIRSSSGILSMVFYTDSAIAKEGFSANYSVLQSSISEDFKCMEALGMESGEIHSDQITASSQYGTNWSVERSRLNYPENGWTPGEDSYREWIQVDLGLLRFVTAVGTQGAISKETKKKYYVKTYRVDISSNGEDWITLKEGNKAIIFQGNTNPTDVVFGVFPKPLITRFVRIKPASWETGISMRFEVYGCKITDYPCSGMLGMVSGLISDSQITASNQGDRNWMPENIRLVTSRTGWALPPSPHPYINEWLQVDLGDEKIVRGVII.... The pIC50 is 6.7. (4) The compound is CC(C)SCC[C@@H](N)[C@H](O)C(=O)N[C@@H](CO)c1ccccc1. The target protein (P53582) has sequence MAAVETRVCETDGCSSEAKLQCPTCIKLGIQGSYFCSQECFKGSWATHKLLHKKAKDEKAKREVSSWTVEGDINTDPWAGYRYTGKLRPHYPLMPTRPVPSYIQRPDYADHPLGMSESEQALKGTSQIKLLSSEDIEGMRLVCRLAREVLDVAAGMIKPGVTTEEIDHAVHLACIARNCYPSPLNYYNFPKSCCTSVNEVICHGIPDRRPLQEGDIVNVDITLYRNGYHGDLNETFFVGEVDDGARKLVQTTYECLMQAIDAVKPGVRYRELGNIIQKHAQANGFSVVRSYCGHGIHKLFHTAPNVPHYAKNKAVGVMKSGHVFTIEPMICEGGWQDETWPDGWTAVTRDGKRSAQFEHTLLVTDTGCEILTRRLDSARPHFMSQF. The pIC50 is 4.5. (5) The pIC50 is 3.8. The drug is CCCCCCOc1ccc2cc(S(=O)(=O)N[C@H](CCC(=O)O)C(=O)O)ccc2c1. The target protein (P14900) has sequence MADYQGKNVVIIGLGLTGLSCVDFFLARGVTPRVMDTRMTPPGLDKLPEAVERHTGSLNDEWLMAADLIVASPGIALAHPSLSAAADAGIEIVGDIELFCREAQAPIVAITGSNGKSTVTTLVGEMAKAAGVNVGVGGNIGLPALMLLDDECELYVLELSSFQLETTSSLQAVAATILNVTEDHMDRYPFGLQQYRAAKLRIYENAKVCVVNADDALTMPIRGADERCVSFGVNMGDYHLNHQQGETWLRVKGEKVLNVKEMKLSGQHNYTNALAALALADAAGLPRASSLKALTTFTGLPHRFEVVLEHNGVRWINDSKATNVGSTEAALNGLHVDGTLHLLLGGDGKSADFSPLARYLNGDNVRLYCFGRDGAQLAALRPEVAEQTETMEQAMRLLAPRVQPGDMVLLSPACASLDQFKNFEQRGNEFARLAKELG. (6) The drug is O=c1[nH]nc(COc2ccccc2)cc1O. 
The target protein (P18894) has sequence MRVAVIGAGVIGLSTALCIHERYHPTQPLHMKIYADRFTPFTTSDVAAGLWQPYLSDPSNPQEAEWSQQTFDYLLSCLHSPNAEKMGLALISGYNLFRDEVPDPFWKNAVLGFRKLTPSEMDLFPDYGYGWFNTSLLLEGKSYLPWLTERLTERGVKLIHRKVESLEEVARGVDVIINCTGVWAGALQADASLQPGRGQIIQVEAPWIKHFILTHDPSLGIYNSPYIIPGSKTVTLGGIFQLGNWSGLNSVRDHNTIWKSCCKLEPTLKNARIVGELTGFRPVRPQVRLEREWLRHGSSSAEVIHNYGHGGYGLTIHWGCAMEAANLFGKILEEKKLSRLPPSHL. The pIC50 is 8.3.
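The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched as a pair of unit conversions. This is a minimal illustration, not part of the dataset: the function names `ic50_molar_to_pic50` and `pic50_to_ic50_nanomolar` are hypothetical, and the only assumption is that IC50 is expressed in mol/L before taking the logarithm.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and report IC50 in nM (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values quoted in the records above (hypothetical usage).
    for pic50 in (4.3, 8.7, 6.7, 4.5, 3.8, 8.3):
        print(f"pIC50 {pic50:.1f} -> IC50 {pic50_to_ic50_nanomolar(pic50):.3g} nM")
```

Under this definition, each unit of pIC50 is a tenfold change in potency: the pIC50 of 8.7 in record (2) corresponds to an IC50 of roughly 2 nM, while the 4.3 in record (1) corresponds to roughly 50 µM.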