This is drug-target binding data from BindingDB based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=S(=O)(CCC(F)(F)F)Nc1ccc(F)c(Nc2ncccc2-c2ncnc3[nH]cnc23)c1F. The target is CKENALLRYLLDKDD. The pIC50 is 7.0. (2) The drug is O=C([O-])COc1cccc(C[C@@H]2CCCC[C@@H]2c2nc(-c3ccccc3)c(-c3ccccc3)o2)c1. The target protein (P43119) has sequence MADSCRNLTYVRGSVGPATSTLMFVAGVVGNGLALGILSARRPARPSAFAVLVTGLAATDLLGTSFLSPAVFVAYARNSSLLGLARGGPALCDAFAFAMTFFGLASMLILFAMAVERCLALSHPYLYAQLDGPRCARLALPAIYAFCVLFCALPLLGLGQHQQYCPGSWCFLRMRWAQPGGAAFSLAYAGLVALLVAAIFLCNGSVTLSLCRMYRQQKRHQGSLGPRPRTGEDEVDHLILLALMTVVMAVCSLPLTIRCFTQAVAPDSSSEMGDLLAFRFYAFNPILDPWVFILFRKAVFQRLKLWVCCLCLGPAHGDSQTPLSQLASGRRDPRAPSAPVGKEGSCVPLSAWGEGQVEPLPPTQQSSGSAVGTSSKAEASVACSLC. The pIC50 is 6.9. (3) The small molecule is COc1ccc(OCCCC(=O)O)cc1Cc1cnc(N)nc1N. The target protein sequence is MTRAEVGLVWAQSTSGVIGRGGDIPWSVPEDLTRFKEVTMGHTVIMGRRTWESLPAKVRPLPGRRNVVVSRRPDFVAEGARVAGSLEAALAYAGSDPAPWVIGGAQIYLLALPHATRCEVTEIEIDLRRDDDDALAPALDDSWVGETGEWLASRSGLRYRFHSYRRDPRSSVRGCSPSRPS. The pIC50 is 8.3. (4) The compound is Cc1ccc(-c2ccccc2C(=O)Nc2ccc(C(=O)N3C[C@@H]4C5CCC(C5)N4Cc4cc(Cl)ccc43)c(Cl)c2)cc1. The target protein (P30518) has sequence MLMASTTSAVPGHPSLPSLPSNSSQERPLDTRDPLLARAELALLSIVFVAVALSNGLVLAALARRGRRGHWAPIHVFIGHLCLADLAVALFQVLPQLAWKATDRFRGPDALCRAVKYLQMVGMYASSYMILAMTLDRHRAICRPMLAYRHGSGAHWNRPVLVAWAFSLLLSLPQLFIFAQRNVEGGSGVTDCWACFAEPWGRRTYVTWIALMVFVAPTLGIAACQVLIFREIHASLVPGPSERPGGRRRGRRTGSPGEGAHVSAAVAKTVRMTLVIVVVYVLCWAPFFLVQLWAAWDPEAPLEGAPFVLLMLLASLNSCTNPWIYASFSSSVSSELRSLLCCARGRTPPSLGPQDESCTTASSSLAKDTSS. The pIC50 is 7.5. (5) The compound is NC(=O)c1ccc(Oc2ccc(C(N)=O)cc2)cc1. 
The target protein sequence is AGQTLKGPWNNLERLAENTGEFQEVVRAFYDTLDAARSSIRVVRVERVSHPLLQQQYELYRERLLQRCERRPVEQVLYHGTTAPAVPDICAHGFNRSFCGRNATVYGKGVYFARRASLSVQDRYSPPNADGHKAVFVARVLTGDYGQGRRGLRAPPLRGPGHVLLRYDSAVDCICQPSIFVIFHDTQALPTHLITCEHVPRASPDDPSG. The pIC50 is 6.5. (6) The drug is CN1CCC(O)(c2nc(-c3ccc(F)cc3)c(-c3ccncc3)o2)CC1. The target protein (P04409) has sequence MADVFPAAEPAAPQDVANRFARKGALRQKNVHEVKNHRFIARFFKQPTFCSHCTDFIWGFGKQGFQCQVCCFVVHKRCHEFVTFSCPGADKGPDTDDPRSKHKFKIHTYGSPTFCDHCGSLLYGLIHQGMKCDTCDMNVHKQCVINVPSLCGMDHTEKRGRIYLKAEVTDEKLHVTVRDAKNLIPMDPNGLSDPYVKLKLIPDPKNESKQKTKTIRSTLNPRWDESFTFKLKPSDKDRRLSEEIWDWDRTTRNDFMGSLSFGVSELMKMPASGWYKLLNQEEGEYYNVPIPEGDEEGNVELRQKFEKAKLGPAGNKVISPSEDRRQPSNNLDRVKLTDFNFLMVLGKGSFGKVMLADRKGTEELYAIKILKKDVVIQDDDVECTMVEKRVLALLDKPPFLTQLHSCFQTVDRLYFVMEYVNGGDLMYHIQQVGKFKEPQAVFYAAEISIGLFFLHKRGIIYRDLKLDNVMLDSEGHIKIADFGMCKEHMMDGVTTRTFCG.... The pIC50 is 4.3. (7) The compound is Cc1cccc(C2=N[C@H](NC(=O)[C@H](CCC(F)(F)F)[C@H](CCCC(F)(F)F)C(N)=O)C(=O)Nc3c(Cl)cccc32)c1. The target protein (Q9UM47) has sequence MGPGARGRRRRRRPMSPPPPPPPVRALPLLLLLAGPGAAAPPCLDGSPCANGGRCTQLPSREAACLCPPGWVGERCQLEDPCHSGPCAGRGVCQSSVVAGTARFSCRCPRGFRGPDCSLPDPCLSSPCAHGARCSVGPDGRFLCSCPPGYQGRSCRSDVDECRVGEPCRHGGTCLNTPGSFRCQCPAGYTGPLCENPAVPCAPSPCRNGGTCRQSGDLTYDCACLPGFEGQNCEVNVDDCPGHRCLNGGTCVDGVNTYNCQCPPEWTGQFCTEDVDECQLQPNACHNGGTCFNTLGGHSCVCVNGWTGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMGKTGLLCHLDDACVSNPCHEDAICDTNPVNGRAICTCPPGFTGGACDQDVDECSIGANPCEHLGRCVNTQGSFLCQCGRGYTGPRCETDVNECLSGPCRNQATCLDRIGQFTCICMAGFTGTYCEVDIDECQSSPCVNGGVCKDRVNGFSCTCPSGFS.... The pIC50 is 8.6. (8) The compound is CCCC(C[PH](O)(O)C(C)=N)C(=O)O. 
The target protein sequence is MINVTLEQIKNWIDCEIDEKHLKKTINGVSIDSRKINEGALFIPFKGENVDGHRFITQALNDGAGAVFSEKENKHSEGNQGPIIWVEDTLIALQQLAKAYLNHVNPKVIAVTGSNGKTTTKDMIESVLSTEFKVKKTQGNYNNEIGMPLTLLELDEDTEISILEMGMSGFHQIELLSHIAQPDIAVITNIGESHMQDLGSREGIAKAKFEITTGLKTNGIFIYDGDEPLLKPHVNQVKNAKLISIGLNSDSTYTCHMNDVKNEGIHFTINQKEHYHLPILGTHNMKNAAIAIAIGHELGLNETIIQNNIHNVQLTAMRMERHESSNNVTVINDAYNASPTSMKAAIDTLSVMKGRKILILADVLELGPNSQLMHKQVGEYLKDKNIDVLYTFGKEASYIYDSGKVFVKEAKYFDNKDQLIQTLISQVKPEDKVLVKGSRGMKLEEVVDALL. The pIC50 is 4.7. (9) The drug is O=P(O)(O)O[C@H]1[C@H](O)[C@@H](OP(=O)(O)O)[C@H](OP(=O)(O)O)[C@@H](O)[C@H]1O. The target protein (Q62688) has sequence MAEGAASREAPAPLDVAGGEDDPRAGADAASGDAAPEASGGRMRDRRSGVALPGNAGVPADSEAGLLEAARATPRRTSIIKDPSNQKCGGRKKTVSFSSMPSEKKISSAHDCISFMQAGCELKKVRPNSRIYNRFFTLDTDLQALRWEPSKKDLEKAKLDISAIKEIRLGKNTETFRNNGLADQICEDCAFSILHGENYESLDLVANSADVANIWVSGLRYLVSRSKQPLDFMEGNQNTPRFMWLKTVFEAADVDGNGIMLEDTSVELIKQLNPTLKESKIRLKFKEIQKSKEKLTTRVTEEEFCEAFCELCTRPEVYFLLVQISKNKEYLDANDLMLFLEVEQGVTHVTEDMCLDIIRRYELSEDGRQKGFLAIDGFTQYLLSPECDIFDPEQKKVAQDMTQPLSHYYINASHNTYLIEDQFRGPADINGYVRALKMGCRSIELDVSDGPDNEPILCNRNNMAMLLSFRSVLEVINKFAFVASEYPLILCLGNHCSLPQ.... The pIC50 is 7.4. (10) The drug is O=c1cc(N2CCOCC2)oc2c(-c3ccccc3)cccc12. The target protein (P32871) has sequence MPPRPSSGELWGIHLMPPRILVECLLPNGMIVTLECLREATLITIKHELFKEARKYPLHQLLQDESSYIFVSVTQEAEREEFFDETRRLCDLRLFQPFLKVIEPVGNREEKILNREIGFAIGMPVCEFDMVKDPEVQDFRRNILNVCKEAVDLRDLNSPHSRAMYVYPPNVESSPELPKHIYNKLDKGQIIVVIWVIVSPNNDKQKYTLKINHDCVPEQVIAEAIRKKTRSMLLSSEQLKLCVLEYQGKYILKVCGCDEYFLEKYPLSQYKYIRSCIMLGRMPNLMLMAKESLYSQLPMDCFTMPSYSRRISTATPYMNGETSTKSLWVINSALRIKILCATYVNVNIRDIDKIYVRTGIYHGGEPLCDNVNTQRVPCSNPRWNEWLNYDIYIPDLPRAARLCLSICSVKGRKGAKEEHCPLAWGNINLFDYTDTLVSGKMALNLWPVPHGLEDLLNPIGVTGSNPNKETPCLELEFDWFSSVVKFPDMSVIEEHANWSV.... The pIC50 is 5.0.
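
The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the only assumption is that the input IC50 is expressed in mol/L. Under that definition, a pIC50 of 7.0 (as in record 1) corresponds to an IC50 of 100 nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration; assumes the input is in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 (M) = 10 ** (-pIC50), and 1 M = 1e9 nM, so IC50 (nM) = 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # 100 nM expressed in mol/L is 1e-7 M.
    print(round(ic50_to_pic50(1e-7), 3))
    # A higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_to_ic50_nm(8.3) < pic50_to_ic50_nm(4.3))
```

Because the scale is logarithmic, each pIC50 unit corresponds to a tenfold change in IC50.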