From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CN1C(=[NH2+])N(C)[C@]2(Cc3c([nH]c4ccccc34)[C@@]3(C(=O)N(C)C(=[NH2+])N3C)[C@H]2c2c[nH]c3ccccc23)C1=O. The target protein (O75311) has sequence MAHVRHFRTLVSGFYFWEAALLLSLVATKETDSARSRSAPMSPSDFLDKLMGRTSGYDARIRPNFKGPPVNVTCNIFINSFGSIAETTMDYRVNIFLRQKWNDPRLAYSEYPDDSLDLDPSMLDSIWKPDLFFANEKGANFHEVTTDNKLLRIFKNGNVLYSIRLTLTLSCPMDLKNFPMDVQTCIMQLESFGYTMNDLIFEWQDEAPVQVAEGLTLPQFLLKEEKDLRYCTKHYNTGKFTCIEVRFHLERQMGYYLIQMYIPSLLIVILSWVSFWINMDAAPARVALGITTVLTMTTQSSGSRASLPKVSYVKAIDIWMAVCLLFVFSALLEYAAVNFVSRQHKELLRFRRKRKNKTEAFALEKFYRFSDMDDEVRESRFSFTAYGMGPCLQAKDGMTPKGPNHPVQVMPKSPDEMRKVFIDRAKKIDTISRACFPLAFLIFNIFYWVIYKILRHEDIHQQQD. The pIC50 is 3.5. (2) The small molecule is COC1CN(C(=O)[C@@]2(C)CN(C(C)C)C(=O)c3c(O)c(=O)c(-c4ncc(Cc5ccc(F)cc5)s4)cn32)C1. The target protein (P12504) has sequence MENRWQVMIVWQVDRMRINTWKRLVKHHMYISRKAKDWFYRHHYESTNPKISSEVHIPLGDAKLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDPDLADQLIHLHYFDCFSESAIRNTILGRIVSPRCEYQAGHNKVGSLQYLALAALIKPKQIKPPLPSVRKLTEDRWNKPQKTKGHRGSHTMNGH. The pIC50 is 7.9. (3) The drug is C[C@H](CCC(=O)NO)[C@H]1CC[C@H]2[C@@H]3CC[C@@H]4C[C@H](O)CC[C@]4(C)[C@H]3CC[C@@]21C. The target protein sequence is MNSKSAQGLAGLRNLGNTCFMNSILQCLSNTRELRDYCLQRLYMRDLHHGSNAHTALVEEFAKLIQTIWTSSPNDVVSPSEFKTQIQRYAPRFVGYNQQDAQEFLRFLLDGLHNEVNRVTLRPKSNPENLDHLPDDEKGRQMWRKYLEREDSRIGDLFVGQLKSSLTCTDCGYCSTVFDPFWDLSLPIAKRGYPEVTLMDCMRLFTKEDVLDGDEKPTCCRCRGRKRCIKKFSIQRFPKILVLHLKRFSESRIRTSKLTTFVNFPLRDLDLREFASENTNHAVYNLYAVSNHSGTTMGGHYTAYCRSPGTGEWHTFNDSSVTPMSSSQVRTSDAYLLFYELASPPSRM. The pIC50 is 5.0. (4) The small molecule is O=[N+]([O-])c1cc([N+](=O)[O-])c(O)c(O)c1O. 
The target protein (P54300) has sequence MKKALLFSLISMVGFSPASQATQVLNGYWGYQEFLDEFPEQRNLTNALSEAVRAQPVPLSKPTQRPIKISVVYPGQQVSDYWVRNIASFEKRLYKLNINYQLNQVFTRPNADIKQQSLSLMEALKSKSDYLIFTLDTTRHRKFVEHVLDSTNTKLILQNITTPVREWDKHQPFLYVGFDHAEGSRELATEFGKFFPKHTYYSVLYFSEGYISDVRGDTFIHQVNRDNNFELQSAYYTKATKQSGYDAAKASLAKHPDVDFIYACSTDVALGAVDALAELGREDIMINGWGGGSAELDAIQKGDLDITVMRMNDDTGIAMAEAIKWDLEDKPVPTVYSGDFEIVTKADSPERIEALKKRAFRYSDN. The pIC50 is 5.5. (5) The drug is C[C@H]1COCCN1c1nc(N2CCOC[C@@H]2C)c2ccc(-c3ccc(F)c(CN)c3)nc2n1. The target protein (P42345) has sequence MLGTGPAAATTAATTSSNVSVLQQFASGLKSRNEETRAKAAKELQHYVTMELREMSQEESTRFYDQLNHHIFELVSSSDANERKGGILAIASLIGVEGGNATRIGRFANYLRNLLPSNDPVVMEMASKAIGRLAMAGDTFTAEYVEFEVKRALEWLGADRNEGRRHAAVLVLRELAISVPTFFFQQVQPFFDNIFVAVWDPKQAIREGAVAALRACLILTTQREPKEMQKPQWYRHTFEEAEKGFDETLAKEKGMNRDDRIHGALLILNELVRISSMEGERLREEMEEITQQQLVHDKYCKDLMGFGTKPRHITPFTSFQAVQPQQSNALVGLLGYSSHQGLMGFGTSPSPAKSTLVESRCCRDLMEEKFDQVCQWVLKCRNSKNSLIQMTILNLLPRLAAFRPSAFTDTQYLQDTMNHVLSCVKKEKERTAAFQALGLLSVAVRSEFKVYLPRVLDIIRAALPPKDFAHKRQKAMQVDATVFTCISMLARAMGPGIQQD.... The pIC50 is 7.6. (6) The compound is COc1ccc(-n2cc(-c3ccc(OC(F)F)cc3)[n+]3c2CCCCC3)cc1. The target protein sequence is MFNPMTPPQVNSYSEPCCLRPLHSQGVPSMGTEGLSGLPFCHQANFMSGSQGYGAARETSSCTEGSLFPPPPPPRSSVKLTKKRALSISPLSDASLDLQTVIRTSPSSLVAFINSRCTSPGGSYGHLSIGTMSPSLGFPPQMSHQKGTSPPYGVQPCVPHDSTRGSMMLHPQARGPRATCQLKSELDMMVGKCPEDPLEGDMSSPNSTGIQDHLLGMLDGREDLEREEKPEPESVYETDCRWDGCSQEFDSQEQLVHHINSEHIHGERKEFVCHWGGCSRELRPFKAQYMLVVHMRRHTGEKPHKCTFEGCRKSYSRLENLKTHLRSHTGEKPYMCEQEGCSKAFSNASDRAKHQNRTHSNEKPYVCKLPGCTKRYTDPSSLRKHVKTVHGPDAHVTKRHRGDGPLPRAQPLSTVEPKREREGGSGREESRLTVPESAMPQQSPGAQSSCSSDHSPAGSAANTDSGVEMAGNAGGSTEDLSSLDEGPCVSATGLSTLRRL.... The pIC50 is 5.6.
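The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling: the function names ic50_to_pic50 and pic50_to_ic50_nm are hypothetical, and the nanomolar output unit is an assumption chosen for readability. The loop uses the six pIC50 labels listed in the examples above.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


# pIC50 labels of examples (1) through (6) above.
for label in (3.5, 7.9, 5.0, 5.5, 7.6, 5.6):
    print(f"pIC50 {label:.1f} -> IC50 ~ {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50, so a label of 7.9 indicates a far more potent binder than a label of 3.5.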