From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C1CCCc2c(ccc3ccccc23)O1. The target protein (P06700) has sequence MTIPHMKYAVSKTSENKVSNTVSPTQDKDAIRKQPDDIINNDEPSHKKIKVAQPDSLRETNTTDPLGHTKAALGEVASMELKPTNDMDPLAVSAASVVSMSNDVLKPETPKGPIIISKNPSNGIFYGPSFTKRESLNARMFLKYYGAHKFLDTYLPEDLNSLYIYYLIKLLGFEVKDQALIGTINSIVHINSQERVQDLGSAISVTNVEDPLAKKQTVRLIKDLQRAINKVLCTRLRLSNFFTIDHFIQKLHTARKILVLTGAGVSTSLGIPDFRSSEGFYSKIKHLGLDDPQDVFNYNIFMHDPSVFYNIANMVLPPEKIYSPLHSFIKMLQMKGKLLRNYTQNIDNLESYAGISTDKLVQCHGSFATATCVTCHWNLPGERIFNKIRNLELPLCPYCYKKRREYFPEGYNNKVGVAASQGSMSERPPYILNSYGVLKPDITFFGEALPNKFHKSIREDILECDLLICIGTSLKVAPVSEIVNMVPSHVPQVLINRDPV.... The pIC50 is 4.9. (2) The drug is CCCCCCCCCC[C@H](C)[C@@H](O)[C@@H](C)C=C(C)C=C(C)C(=O)[C@H](C)C=C(C)C(=O)O[C@H](CO)[C@@H](O)[C@H](O)C(=O)O. The target protein (O13332) has sequence MASSILRSKIIQKPYQLFHYYFLSEKAPGSTVSDLNFDTNIQTSLRKLKHHHWTVGEIFHYGFLVSILFFVFVVFPASFFIKLPIILAFATCFLIPLTSQFFLPALPVFTWLALYFTCAKIPQEWKPAITVKVLPAMETILYGDNLSNVLATITTGVLDILAWLPYGIIHFSFPFVLAAIIFLFGPPTALRSFGFAFGYMNLLGVLIQMAFPAAPPWYKNLHGLEPANYSMHGSPGGLGRIDKLLGVDMYTTGFSNSSIIFGAFPSLHSGCCIMEVLFLCWLFPRFKFVWVTYASWLWWSTMYLTHHYFVDLIGGAMLSLTVFEFTKYKYLPKNKEGLFCRWSYTEIEKIDIQEIDPLSYNYIPINSNDNESRLYTRVYQESQVSPPSRAETPEAFEMSNFSRSRQSSKTQVPLSNLTNNDQVPGINEEDEEEEGDEISSSTPSVFEDEPQGSTYAASSATSVDDLDSKRN. The pIC50 is 9.2. (3) The drug is O=C(NC[C@H]1CC[C@H](Oc2ccnc3ccccc23)CC1)c1cccc(F)c1. 
The target protein sequence is MEVQLGLGRVYPRPPSKTYRGAFQNLFQSVREVIQNPGPRHPEAASAAPPGASLLLLQQQQQQQQQQQQQQQQQQQQQETSPRQQQQQQGEDGSPQAHRRGPTGYLVLDEEQQPSQPQSALECHPERGCVPEPGAAVAASKGLPQQLPAPPDEDDSAAPSTLSLLGPTFPGLSSCSADLKDILSEASTMQLLQQQQQEAVSEGSSSGRAREASGAPTSSKDNYLGGTSTISDNAKELCKAVSVSMGLGVEALEHLSPGEQLRGDCMYAPLLGVPPAVRPTPCAPLAECKGSLLDDSAGKSTEDTAEYSPFKGGYTKGLEGESLGCSGSAAAGSSGTLELPSTLSLYKSGALDEAAAYQSRDYYNFPLALAGPPPPPPPPHPHARIKLENPLDYGSAWAAAAAQCRYGDLASLHGAGAAGPGSGSPSAAASSSWHTLFTAEEGQLYGPCGGGGGGGGGGGGGGGGGGGGGGGGEAGAVAPYGYTRPPQGLAGQESDFTAPD.... The pIC50 is 6.7. (4) The compound is Cn1ccnc1SC[C@@]1(C)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The target protein (P13661) has sequence MKNTIHINFAIFLIIANIIYSSASASTDISTVASPLFEGTEGCFLLYDASTNAEIAQFNKAKCATQMAPDSTFKIALSLMAFDAEIIDQKTIFKWDKTPKGMEIWNSNHTPKTWMQFSVVWVSQEITQKIGLNKIKNYLKDFDYGNQDFSGDKERNNGLTEAWLESSLKISPEEQIQFLRKIINHNLPVKNSAIENTIENMYLQDLDNSTKLYGKTGAGFTANRTLQNGWFEGFIISKSGHKYVFVSALTGNLGSNLTSSIKAKKNAITILNTLNL. The pIC50 is 6.3. (5) The pIC50 is 8.4. The target protein sequence is MALSDLVLLRWLRDSRHSRKLILFIVFLALLLDNMLLTVVVPIIPSYLYSIKHEKNSTEIQTTRPELVVSTSESIFSYYNNSTVLITGNATGTLPGGQSHKATSTQHTVANTTVPSDCPSEDRDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFAGFCIMFISTVMFAFSSSYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGKPMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLKDPYILIAAGSICFANMGIAMLETALPIWMMETMCSRKWQLGVAFLPASISYLIGTNIFGILAHKMGRWLCALLGMVIVGISILCIPFAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDIAFAPLCFFLRSPPAKEEKMAILMDHNCPIKRKMYTQNN.... The small molecule is COC(=O)[C@H]1[C@H]2C[C@@H]3c4[nH]c5cc(OC)ccc5c4CCN3C[C@H]2C[C@@H](OC(=O)c2cc(OC)c(OC)c(OC)c2)[C@@H]1OC. (6) The small molecule is C[C@H](CCC(=O)NCCS(=O)(=O)O)[C@H]1CC[C@H]2[C@H]3[C@H](C[C@H](O)[C@@]21C)[C@@]1(C)CC[C@@H](O)C[C@H]1C[C@H]3O. 
The target protein sequence is MDALCGSGELGSKFWDSNLSVHTENPDLTPCFQNSLLAWVPCIYLWVALPCYLLYLRHHCRGYIILSHLSKLKMVLGVLLWCVSWADLFYSFHGLVHGRAPAPVFFVTPLVVGVTMLLATLLIQYERLQGVQSSGVLIIFWFLCVVCAIVPFRSKILLAKAEGEISDPFRFTTFYIHFALVLSALILACFREKPPFFSAKNVDPNPYPETSAGFLSRLFFWWFTKMAIYGYRHPLEEKDLWSLKEEDRSQMVVQQLLEAWRKQEKQTARHKASAAPGKNASGEDEVLLGARPRPRKPSFLKALLATFGSSFLISACFKLIQDLLSFINPQLLSILIRFISNPMAPSWWGFLVAGLMFLCSMMQSLILQHYYHYIFVTGVKFRTGIMGVIYRKALVITNSVKRASTVGEIVNLMSVDAQRFMDLAPFLNLLWSAPLQIILAIYFLWQNLGPSVLAGVAFMVLLIPLNGAVAVKMRAFQVKQMKLKDSRIKLMSEILNGIKV.... The pIC50 is 4.4.
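The pIC50 definition used in this record (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_molar_to_pic50` and `pic50_to_ic50_nanomolar` are hypothetical, and the only assumption is that IC50 values are expressed in molar units before taking the logarithm.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and report the IC50 in nM (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 values taken from the examples in this record.
    for pic50 in (4.9, 9.2, 6.7, 6.3, 8.4, 4.4):
        print(f"pIC50 {pic50:.1f} -> IC50 {pic50_to_ic50_nanomolar(pic50):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why example (2) at 9.2 is far more potent than example (1) at 4.9.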