From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1csc2nc([C@H](C)NC(=O)/C=C/c3cccn3C)oc(=O)c12. The target protein (P03234) has sequence MVQAPSVYVCGFVERPDAPPKDACLHLDPLTVKSQLPLKKPLPLTVEHLPDAPVGSVFGLYQSRAGLFSAASITSGDFLSLLDSIYHDCDIAQSQRLPLPREPKVEALHAWLPSLSLASLHPDIPQTTADGGKLSFFDHVSICALGRRRGTTAVYGTDLAWVLKHFSDLEPSIAAQIENDANAAKRESGCPEDHPLPLTKLIAKAIDAGFLRNRVETLRQDRGVANIPAESYLKASDAPDLQKPDKALQSPPPASTDPATMLSGNAGEGATACGGSAAAGQDLISVPRNTFMTLLQTNLDNKPPRQTPLPYAAPLPPFSHQAIATAPSYGPGAGAVAPAGGYFTSPGGYYAGPAGGDPGAFLAMDAHTYHPHPHPPPAYFGLPGLFGPPPPVPPYYGSHLRADYVPAPSRSNKRKRDPEEDEEGGGLFPGEDATLYRKDIAGLSKSVNELQHTLQALRRETLSYGHTGVGYCPQQGPCYTHSGPYGFQPHQSYEVPRYVP.... The pIC50 is 6.7. (2) The drug is CC(C)C[C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@H](CS)NC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)CN)C(C)C)C(=O)N[C@@H](Cc1ccccc1)C(=O)O. The target protein sequence is MELLNSYNFVLFVLTQMILMFTIPAIISGIKYSKLDYFFIIVISTLSLFLFKMFDSASLIILTSFIIIMYFVKIKWYSILLIMTSQIILYCANYMYIVIYAYITKISDSIFVIFPSFFVVYVTISILFSYIINRVLKKISTPYLILNKGFLIVISTILLLTFSLFFFYSQINSDEAKVIRQYSFIFIGITIFLSILTFVISQFLLKEMKYKRNQEEIETYYEYTLKIEAINNEMRKFRHDYVNILTTLSEYIREDDMPGLRDYFNKNIVPMKDNLQMNAIKLNGIENLKVREIKGLITAKILRAQEMNIPISIEIPDEVSSINLNMIDLSRSIGIILDNAIEASTEIDDPIIRVAFIESENSVTFIVMNKCADDIPRIHELFQESFSTKGEGRGLGLSTLKEIADNADNVLLDTIIENGFFIQKVEIINN. The pIC50 is 8.5. (3) The small molecule is CN(CCC=C1c2ccccc2CCc2ccccc21)CCC(=O)O. 
The target protein (P31650) has sequence MTAEQALPLGNGKAAEEARGSETLGGGGGGAAGTREARDKAVHERGHWNNKVEFVLSVAGEIIGLGNVWRFPYLCYKNGGGAFLIPYVVFFICCGIPVFFLETALGQFTSEGGITCWRRVCPLFEGIGYATQVIEAHLNVYYIIILAWAIFYLSNCFTTELPWATCGHEWNTEKCVEFQKLNFSNYSHVSLQNATSPVMEFWERRVLAISDGIEHIGNLRWELALCLLAAWTICYFCIWKGTKSTGKVVYVTATFPYIMLLILLIRGVTLPGASEGIKFYLYPDLSRLSDPQVWVDAGTQIFFSYAICLGCLTALGSYNNYNNNCYRDCIMLCCLNSGTSFVAGFAIFSVLGFMAYEQGVPIAEVAESGPGLAFIAYPKAVTMMPLSPLWATLFFMMLIFLGLDSQFVCVESLVTAVVDMYPKVFRRGYRRELLILALSIISYFLGLVMLTEGGMYIFQLFDSYAASGMCLLFVAIFECVCIGWVYGSNRFYDNIEDMIG.... The pIC50 is 2.8. (4) The compound is O=C1c2cccc3c(NCCO)ccc(c23)C(=O)N1c1cccc(Br)c1. The target protein sequence is MSTLPKPQRKTKRNTNRRPMDVKFPGGGQIVGGVYLLPRKGPRLGVRATRKTSERSQPRGRRQPIPKARQPQGRHWAQPGYPWPLYGSEGCGWAGWLLSPRGSRPHWGPNDPRRRSRNLGKVIDTLTCGFADLMWYIPVVGAPLGGVAAALAHGVRAIEDGINYATGNLPGCSFSIFLLALLSCLTTPASALTYGNSSGLYHLTNDCSNSSIVLEADAMILHLPGCLPCVRVGNQSTCWHAVSPTLATPNASTPATGFRRHVDLLAGAAVVCSSLYIGDLCGSLFLAGQLFAFQPRRHWTVQDCNCSIYTGHVTGHKMAWDMMMNWSPTTTLVLSSILRVPEICASVIFGGHWGILLAVAYFGMAGNWLKVLAVLFLFAGVEAQTMIAHGVSQTTSGFASLLTPGAKQNIQLINTNGSWHINRTALNCNDSLQTGFLASLFYTHKFNSSGCPERMAACKPLAEFRQGWGQITHKNVSGPSDDRPYCWHYAPRPCEVVPAR.... The pIC50 is 6.3. (5) The drug is CCCCC[C@H]1NC[C@@H](O)[C@@H](O)[C@H]1CO. The target protein (P06865) has sequence MTSSRLWFSLLLAAAFAGRATALWPWPQNFQTSDQRYVLYPNNFQFQYDVSSAAQPGCSVLDEAFQRYRDLLFGSGSWPRPYLTGKRHTLEKNVLVVSVVTPGCNQLPTLESVENYTLTINDDQCLLLSETVWGALRGLETFSQLVWKSAEGTFFINKTEIEDFPRFPHRGLLLDTSRHYLPLSSILDTLDVMAYNKLNVFHWHLVDDPSFPYESFTFPELMRKGSYNPVTHIYTAQDVKEVIEYARLRGIRVLAEFDTPGHTLSWGPGIPGLLTPCYSGSEPSGTFGPVNPSLNNTYEFMSTFFLEVSSVFPDFYLHLGGDEVDFTCWKSNPEIQDFMRKKGFGEDFKQLESFYIQTLLDIVSSYGKGYVVWQEVFDNKVKIQPDTIIQVWREDIPVNYMKELELVTKAGFRALLSAPWYLNRISYGPDWKDFYIVEPLAFEGTPEQKALVIGGEACMWGEYVDNTNLVPRLWPRAGAVAERLWSNKLTSDLTFAYERL.... The pIC50 is 3.8. (6) The drug is OCc1nc2ccccc2n1CC(O)Cn1ccc2ccccc21. 
The target protein (P25090) has sequence METNFSTPLNEYEEVSYESAGYTVLRILPLVVLGVTFVLGVLGNGLVIWVAGFRMTRTVTTICYLNLALADFSFTATLPFLIVSMAMGEKWPFGWFLCKLIHIVVDINLFGSVFLIGFIALDRCICVLHPVWAQNHRTVSLAMKVIVGPWILALVLTLPVFLFLTTVTIPNGDTYCTFNFASWGGTPEERLKVAITMLTARGIIRFVIGFSLPMSIVAICYGLIAAKIHKKGMIKSSRPLRVLTAVVASFFICWFPFQLVALLGTVWLKEMLFYGKYKIIDILVNPTSSLAFFNSCLNPMLYVFVGQDFRERLIHSLPTSLERALSEDSAPTNDTAANSASPPAETELQAM. The pIC50 is 4.2. (7) The small molecule is CC(C)(C)Sc1c(CC(C)(C)C(=O)[O-])n(Cc2ccc(Cl)cc2)c2ccc(OCc3ccc4ccccc4n3)cc12. The target protein (P20291) has sequence MDQEAVGNVVLLAIVTLISVVQNAFFAHKVELESKAQSGRSFQRTGTLAFERVYTANQNCVDAYPTFLVVLWTAGLLCSQVPAAFAGLMYLFVRQKYFVGYLGERTQSTPGYIFGKRIILFLFLMSLAGILNHYLIFFFGSDFENYIRTITTTISPLLLIP. The pIC50 is 8.0. (8) The drug is CN(C)C(=O)/C(C#N)=C/c1cccc(-n2cc(-c3ccc(Cl)cc3)c3c(N)ncnc32)c1. The target protein sequence is MGSNKSKPKDASQRRRSLEPAENVHGAGGGAFPASQTPSKPASADGHRGPSAAFAPAAAEPKLFGGFNSSDTVTSPQRAGPLAGGVTTFVALYDYESRTETDLSFKKGERLQIVNNTEGDWWLAHSLSTGQTGYIPSNYVAPSDSIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFNSLQQLVAYYSKHADGLCHRLTTVCPTSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMCKGSLLDFLKGETGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANILVGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKSDVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHDLMCQ.... The pIC50 is 7.1. (9) The pIC50 is 5.2. The target protein (P0A9P4) has sequence MGTTKHSKLLILGSGPAGYTAAVYAARANLQPVLITGMEKGGQLTTTTEVENWPGDPNDLTGPLLMERMHEHATKFETEIIFDHINKVDLQNRPFRLNGDNGEYTCDALIIATGASARYLGLPSEEAFKGRGVSACATCDGFFYRNQKVAVIGGGNTAVEEALYLSNIASEVHLIHRRDGFRAEKILIKRLMDKVENGNIILHTNRTLEEVTGDQMGVTGVRLRDTQNSDNIESLDVAGLFVAIGHSPNTAIFEGQLELENGYIKVQSGIHGNATQTSIPGVFAAGDVMDHIYRQAITSAGTGCMAALDAERYLDGLADAK. The drug is O=c1c2ccccc2[se]n1-c1ccccc1.
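
The task description defines pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion, the helpers below map between an IC50 concentration and the pIC50 scale. The function names are hypothetical and not part of the dataset. The nanomolar helper assumes the raw measurement is reported in nM, which this record does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nM (assumed unit) to pIC50."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (2) in the record above.
    for label in (6.7, 8.5):
        ic50_nm = pic50_to_ic50(label) * 1e9
        print(f"pIC50 {label} corresponds to an IC50 of about {ic50_nm:.3g} nM")
```

Under this definition, each unit of pIC50 is a tenfold change in IC50. The label 6.7 in example (1) corresponds to an IC50 of roughly 200 nM, and the label 8.5 in example (2) to roughly 3 nM.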