From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)(CCn1cnc2c(=O)[nH]c(N)nc21)CCP(=O)(O)O. The target protein (P85973) has sequence MENEFTYEDYQRTAEWLRSHTKHRPQVAVICGSGLGGLTAKLTQPQAFDYNEIPNFPQSTVQGHAGRLVFGFLNGRSCVMMQGRFHMYEGYSLSKVTFPVRVFHLLGVDTLVVTNAAGGLNPKFEVGDIMLIRDHINLPGFCGQNPLRGPNDERFGVRFPAMSDAYDRDMRQKAFNAWKQMGEQRELQEGTYIMSAGPTFETVAESCLLRMLGADAVGMSTVPEVIVARHCGLRVFGFSLITNKVVMDYNNLEKASHQEVLEAGKAAAQKLEQFVSILMESIPPRERAN. The pIC50 is 5.7. (2) The small molecule is O=C1NCCCc2c1oc1ccc(O)cc21. The target protein (Q13563) has sequence MVNSSRVQPQQPGDAKRPPAPRAPDPGRLMAGCAAVGASLAAPGGLCEQRGLEIEMQRIRQAAARDPPAGAAASPSPPLSSCSRQAWSRDNPGFEAEEEEEEVEGEEGGMVVEMDVEWRPGSRRSAASSAVSSVGARSRGLGGYHGAGHPSGRRRRREDQGPPCPSPVGGGDPLHRHLPLEGQPPRVAWAERLVRGLRGLWGTRLMEESSTNREKYLKSVLRELVTYLLFLIVLCILTYGMMSSNVYYYTRMMSQLFLDTPVSKTEKTNFKTLSSMEDFWKFTEGSLLDGLYWKMQPSNQTEADNRSFIFYENLLLGVPRIRQLRVRNGSCSIPQDLRDEIKECYDVYSVSSEDRAPFGPRNGTAWIYTSEKDLNGSSHWGIIATYSGAGYYLDLSRTREETAAQVASLKKNVWLDRGTRATFIDFSVYNANINLFCVVRLLVEFPATGGVIPSWQFQPLKLIRYVTTFDFFLAACEIIFCFFIFYYVVEEILEIRIHKL.... The pIC50 is 6.5. (3) The compound is NC(=O)n1c(-c2cccc(NC(=O)Nc3ccc(F)c(Cl)c3)c2)cc2ccccc21. The target protein (P01127) has sequence MNRCWALFLSLCCYLRLVSAEGDPIPEELYEMLSDHSIRSFDDLQRLLHGDPGEEDGAELDLNMTRSHSGGELESLARGRRSLGSLTIAEPAMIAECKTRTEVFEISRRLIDRTNANFLVWPPCVEVQRCSGCCNNRNVQCRPTQVQLRPVQVRKIEIVRKKPIFKKATVTLEDHLACKCETVAAARPVTRSPGGSQEQRAKTPQTRVTIRTVRVRRPPKGKHRKFKHTHDKTALKETLGA. The pIC50 is 5.0. (4) The pIC50 is 8.5. 
The target protein (Q9Z1B7) has sequence MSLTRKRGFYKQDINKTAWELPKTYLAPAHVGSGAYGAVCSAIDKRTGEKVAIKKLSRPFQSEIFAKRAYRELLLLKHMHHENVIGLLDVFTPASSLRSFHDFYLVMPFMQTDLQKIMGMEFSEDKVQYLVYQMLKGLKYIHSAGIVHRDLKPGNLAVNEDCELKILDFGLARHTDTEMTGYVVTRWYRAPEVILSWMHYNQTVDIWSVGCIMAEMLTGKTLFKGKDYLDQLTQILKVTGVPGAEFVQKLKDKAAKSYIQSLPQSPKKDFTQLFPRASPQAADLLDKMLELDVDKRLTAAQALAHPFFEPFRDPEEETEAQQPFDDALEHEKLSVDEWKQHIYKEISNFSPIARKDSRRRSGMKLQ. The small molecule is O=C1NCc2nc(Sc3ccc(F)cc3F)c(N3CC4CC3CN4C3CC3)cc2N1c1c(Cl)cccc1Cl. (5) The drug is C[C@@H](O)[C@H](NC(=O)[C@@H]1CSSC[C@H](NC(=O)[C@H](Cc2ccccc2)NC(N)=O)C(=O)N[C@@H](Cc2ccc(NC(N)=O)cc2)C(=O)N[C@H](Cc2c[nH]c3ccccc23)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1)C(N)=O. The target protein (P30874) has sequence MDMADEPLNGSHTWLSIPFDLNGSVVSTNTSNQTEPYYDLTSNAVLTFIYFVVCIIGLCGNTLVIYVILRYAKMKTITNIYILNLAIADELFMLGLPFLAMQVALVHWPFGKAICRVVMTVDGINQFTSIFCLTVMSIDRYLAVVHPIKSAKWRRPRTAKMITMAVWGVSLLVILPIMIYAGLRSNQWGRSSCTINWPGESGAWYTGFIIYTFILGFLVPLTIICLCYLFIIIKVKSSGIRVGSSKRKKSEKKVTRMVSIVVAVFIFCWLPFYIFNVSSVSMAISPTPALKGMFDFVVVLTYANSCANPILYAFLSDNFKKSFQNVLCLVKVSGTDDGERSDSKQDKSRLNETTETQRTLLNGDLQTSI. The pIC50 is 8.1. (6) The target protein (Q9R0C9) has sequence MPWAVGRRWAWITLFLTIVAVLIQAVWLWLGTQSFVFQREEIAQLARQYAGLDHELAFSRLIVELRRLHPGHVLPDEELQWVFVNAGGWMGAMCLLHASLSEYVLLFGTALGSHGHSGRYWAEISDTIISGTFHQWREGTTKSEVYYPGETVVHGPGEATAVEWGPNTWMVEYGRGVIPSTLAFALSDTIFSTQDFLTLFYTLRAYARGLRLELTTYLFGQDP. The pIC50 is 9.1. The compound is C(=C\C1CCCN(CCc2ccccc2)C1)\c1ccccc1. (7) The compound is CCCCCCCCCC[C@@H](O)[C@@H]1CC[C@@H]([C@@H]2CC[C@@H]([C@H](O)CCCCCCCCCC(O)CC3=CC(C)OC3=O)O2)O1. The target protein (P03887) has sequence MFMINILMLIIPILLAVAFLTLVERKVLGYMQLRKGPNVVGPYGLLQPIADAIKLFIKEPLRPATSSASMFILAPIMALGLALTMWIPLPMPYPLINMNLGVLFMLAMSSLAVYSILWSGWASNSKYALIGALRAVAQTISYEVTLAIILLSVLLMSGSFTLSTLITTQEQMWLILPAWPLAMMWFISTLAETNRAPFDLTEGESELVSGFNVEYAAGPFALFFMAEYANIIMMNIFTAILFLGTSHNPHMPELYTINFTIKSLLLTMSFLWIRASYPRFRYDQLMHLLWKNFLPLTLALCMWHVSLPILTSGIPPQT. The pIC50 is 9.1. 
(8) The drug is O=C(Nc1ccc([N+](=O)[O-])cc1)OC(CN1CCCc2ccccc21)c1ccc(Cl)cc1. The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICAELEEEGKISRIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSIPLDEDFRKYTAFTIPSTNNETPGTRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYVDDLYVGSDLEIGQHRTKVEELRQHLWRWGFYTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQK. The pIC50 is 5.3. (9) The drug is CC1(C2CCC3C4CC=C5CC(O)CCC5(C)C4CCC32C)NN1. The target protein (P11715) has sequence MWELVGLLLLILAYFFWVKSKTPGAKLPRSLPSLPLVGSLPFLPRRGHMHVNFFKLQEKYGPIYSLRLGTTTTVIIGHYQLAREVLIKKGKEFSGRPQMVTQSLLSDQGKGVAFADAGSSWHLHRKLVFSTFSLFKDGQKLEKLICQEAKSLCDMMLAHDKESIDLSTPIFMSVTNIICAICFNISYEKNDPKLTAIKTFTEGIVDATGDRNLVDIFPWLTIFPNKGLEVIKGYAKVRNEVLTGIFEKCREKFDSQSISSLTDILIQAKMNSDNNNSCEGRDPDVFSDRHILATVGDIFGAGIETTTTVLKWILAFLVHNPEVKKKIQKEIDQYVGFSRTPTFNDRSHLLMLEATIREVLRIRPVAPMLIPHKANVDSSIGEFTVPKDTHVVVNLWALHHDENEWDQPDQFMPERFLDPTGSHLITPTQSYLPFGAGPRSCIGEALARQELFVFTALLLQRFDLDVSDDKQLPRLEGDPKVVFLIDPFKVKITVRQAWMD.... The pIC50 is 4.8. (10) The target protein (Q9QYJ6) has sequence MEDGPSNNASCFRRLTECFLSPSLTDEKVKAYLSLHPQVLDEFVSESVSAETVEKWLKRKNNKAEDEPSPKEVSRYQDTNMQGVVYELNSYIEQRLDTGGDNHLLLYELSSIIRIATKADGFALYFLGECNNSLCVFTPPGMKEGQPRLIPAGPITQGTTISAYVAKSRKTLLVEDILGDERFPRGTGLESGTRIQSVLCLPIVTAIGDLIGILELYRHWGKEAFCLSHQEVATANLAWASVAIHQVQVCRGLAKQTELNDFLLDVSKTYFDNIVAIDSLLEHIMIYAKNLVNADRCALFQVDHKNKELYSDLFDIGEEKEGKPVFKKTKEIRFSIEKGIAGQVARTGEVLNIPDAYADPRFNREVDLYTGYTTRNILCMPIVSRGSVIGVVQMVNKISGSAFSKTDENNFKMFAVFCALALHCANMYHRIRHSECIYRVTMEKLSYHSICTSEEWQGLMHFNLPARICRDIELFHFDIGPFENMWPGIFVYMIHRSCGT.... The drug is NS(=O)(=O)N1CCN(c2ccc(-c3c(CSc4ccc5ccccc5n4)nc4c(N5CCOCC5)ccnn34)cn2)CC1. The pIC50 is 8.7.
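The pIC50 values listed above follow the definition given at the top of this record, pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion (the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers for illustration and are not part of the bindingdb_ic50 dataset or any BindingDB tooling):

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 expressed in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6
    print(round(ic50_to_pic50(1e-6), 2))
    # Example (1) above reports pIC50 5.7, i.e. an IC50 of roughly 2 micromolar
    print(round(pic50_to_ic50_nm(5.7)))
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values indicate a more potent compound.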