From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Nc1nnc(SCc2ccccc2)[nH]1. The target protein (P0A079) has sequence MIVKTEEELQALKEIGYICAKVRNTMQAATKPGITTKELDNIAKELFEEYGAISAPIHDENFPGQTCISVNEEVAHGIPSKRVIREGDLVNIDVSALKNGYYADTGISFVVGESDDPMKQKVCDVATMAFENAIAKVKPGTKLSNIGKAVHNTARQNDLKVIKNLTGHGVGLSLHEAPAHVLNYFDPKDKTLLTEGMVLAIEPFISSNASFVTEGKNEWAFETSDKSFVAQIEHTVIVTKDGPILTTKIEEE. The pIC50 is 6.2. (2) The drug is O=C(O)c1ccncc1Nc1nn(CCN2CCC(F)(F)CC2)c2ccc(F)cc12. The target protein sequence is MAGVGPGGYAAEFVPPPECPVFEPSWEEFTDPLSFIGRIRPLAEKTGICKIRPPKDWQPPFACEVKSFRFTPRVQRLNELEAMTRVRLDFLDQLAKFWELQGSTLKIPVVERKILDLYALSKIVASKGGFEMVTKEKKWSKVGSRLGYLPGKGTGSLLKSHYERILYPYELFQSGVSLMGVQMPNLDLKEKVEPEVLSTDTQTSPEPGTRMNILPKRTRRVKTQSESGDVSRNTELKKLQIFGAGPKVVGLAMGTKDKEDEVTRRRKVTNRSDAFNMQMRQRKGTLSVNFVDLYVCMFCGRGNNEDKLLLCDGCDDSYHTFCLIPPLPDVPKGDWRCPKCVAEECSKPREAFGFEQAVREYTLQSFGEMADNFKSDYFNMPVHMVPTELVEKEFWRLVSSIEEDVIVEYGADISSKDFGSGFPVKDGRRKILPEEEEYALSGWNLNNMPVLEQSVLAHINVDISGMKVPWLYVGMCFSSFCWHIEDHWSYSINYLHWGEP.... The pIC50 is 6.0. (3) The compound is CC1=C(CC(=O)O)c2cc(F)ccc2/C1=C\c1ccc(S(C)=O)cc1. The target protein sequence is MATFVELSTKAKMPIVGLGTWKSPLGKVKEAVKVAIDAGYRHIDCAYVYQNEHEVGEAIQEKIQEKAVKREDLFIVSKLWPTFFERPLVRKAFEKTLKDLKLSYLDVYLIHWPQGFKSGDDLFPRDDKGNAIGGKATFLDAWEAMEELVDEGLVKALGVSNFSHFQIEKLLNKPGLKYKPVTNQVECHPYLTQEKLIQYCHSKGITVTAYSPLGSPDRPWAKPEDPSLLEDPKIKEIAAKHKKTAAQVLIRFHIQRNVIVIPKSVTPARIVENIQVFDFKLSDEEMATILSFNRNWRACNLLQSSHLEDYPFNAEY. The pIC50 is 5.6. (4) The compound is O=C1C(c2c(O)[nH]c3ccccc23)=Nc2ccccc21. 
The target protein (P10584) has sequence MKAPVRVAVTGAAGQIGYSLLFRIAAGEMLGKDQPVILQLLEIPQAMKALEGVVMELEDCAFPLLAGLEATDDPKVAFKDADYALLVGAAPRKAGMERRDLLQVNGKIFTEQGRALAEVAKKDVKVLVVGNPANTNALIAYKNAPGLNPRNFTAMTRLDHNRAKAQLAKKTGTGVDRIRRMTVWGNHSSTMFPDLFHAEVDGRPALELVDMEWYEKVFIPTVAQRGAAIIQARGASSAASAANAAIEHIRDWALGTPEGDWVSMAVPSQGEYGIPEGIVYSFPVTAKDGAYRVVEGLEINEFARKRMEITAQELLDEMEQVKALGLI. The pIC50 is 4.5. (5) The compound is O=C1c2ccc(O)cc2CCC1Cc1ccccc1-c1ccccc1. The target protein (Q09128) has sequence MSCPIDKRRTLIAFLRRLRDLGQPPRSVTSKASASRAPKEVPLCPLMTDGETRNVTSLPGPTNWPLLGSLLEIFWKGGLKKQHDTLAEYHKKYGQIFRMKLGSFDSVHLGSPSLLEALYRTESAHPQRLEIKPWKAYRDHRNEAYGLMILEGQEWQRVRSAFQKKLMKPVEIMKLDKKINEVLADFLERMDELCDERGRIPDLYSELNKWSFESICLVLYEKRFGLLQKETEEEALTFITAIKTMMSTFGKMMVTPVELHKRLNTKVWQAHTLAWDTIFKSVKPCIDNRLQRYSQQPGADFLCDIYQQDHLSKKELYAAVTELQLAAVETTANSLMWILYNLSRNPQAQRRLLQEVQSVLPDNQTPRAEDLRNMPYLKACLKESMRLTPSVPFTTRTLDKPTVLGEYALPKGTVLTLNTQVLGSSEDNFEDSHKFRPERWLQKEKKINPFAHLPFGIGKRMCIGRRLAELQLHLALCWIIQKYDIVATDNEPVEMLHLGI.... The pIC50 is 5.7. (6) The small molecule is O=C1Nc2c(c(=O)nc3sc([N+](=O)[O-])cn23)C(c2ccccc2C(F)(F)F)N1. The target protein (P9WP55) has sequence MSIAEDITQLIGRTPLVRLRRVTDGAVADIVAKLEFFNPANSVKDRIGVAMLQAAEQAGLIKPDTIILEPTSGNTGIALAMVCAARGYRCVLTMPETMSLERRMLLRAYGAELILTPGADGMSGAIAKAEELAKTDQRYFVPQQFENPANPAIHRVTTAEEVWRDTDGKVDIVVAGVGTGGTITGVAQVIKERKPSARFVAVEPAASPVLSGGQKGPHPIQGIGAGFVPPVLDQDLVDEIITVGNEDALNVARRLAREEGLLVGISSGAATVAALQVARRPENAGKLIVVVLPDFGERYLSTPLFADVAD. The pIC50 is 4.8. (7) The compound is O=C(Nc1cccc(O)c1C(=O)O)c1ccc(Br)c(Oc2ccccc2)c1. The target protein (P0DB00) has sequence MIFSKISQVAHYVPQQLVTNNDLASIMDTSHEWIFSRTGIAERHISRDEMTSDLAIQVADQLLTQSGLKADAIDFIIVATISPDATMPSTAAKVQAAIAATSAFAFDMTAACSGFVFALAMADKLIASGAYQNGMVIGAETLSKLVNWQDRATAVLFGDGAGGVLLEASKDKHVLAETLHTDGARCQSLISGETSLSSPYSIGKKAIATIQMDGRAIFDFAIRDVSKSILTLMAQSDITKDDIDYCLLHQANRRILDKIARKIDVPREKFLENMMRYGNTSAASIPILLSEAVQKGQIRLDGTQKILLSGFGGGLTWGSLIVKI. The pIC50 is 7.5. (8) The drug is OB(O)c1cc2ccccc2s1. 
The target protein (P07478) has sequence MNLLLILTFVAAAVAAPFDDDDKIVGGYICEENSVPYQVSLNSGYHFCGGSLISEQWVVSAGHCYKSRIQVRLGEHNIEVLEGNEQFINAAKIIRHPKYNSRTLDNDILLIKLSSPAVINSRVSAISLPTAPPAAGTESLISGWGNTLSSGADYPDELQCLDAPVLSQAECEASYPGKITNNMFCVGFLEGGKDSCQGDSGGPVVSNGELQGIVSWGYGCAQKNRPGVYTKVYNYVDWIKDTIAANS. The pIC50 is 3.7. (9) The small molecule is O=c1cc(OCC2COc3ncccc3O2)cc2n1CCc1cc(-c3cncnc3)ccc1-2. The target protein (Q9NQS5) has sequence MWNSSDANFSCYHESVLGYRYVAVSWGVVVAVTGTVGNVLTLLALAIQPKLRTRFNLLIANLTLADLLYCTLLQPFSVDTYLHLHWRTGATFCRVFGLLLFASNSVSILTLCLIALGRYLLIAHPKLFPQVFSAKGIVLALVSTWVVGVASFAPLWPIYILVPVVCTCSFDRIRGRPYTTILMGIYFVLGLSSVGIFYCLIHRQVKRAAQALDQYKLRQASIHSNHVARTDEAMPGRFQELDSRLASGGPSEGISSEPVSAATTQTLEGDSSEVGDQINSKRAKQMAEKSPPEASAKAQPIKGARRAPDSSSEFGKVTRMCFAVFLCFALSYIPFLLLNILDARVQAPRVVHMLAANLTWLNGCINPVLYAAMNRQFRQAYGSILKRGPRSFHRLH. The pIC50 is 8.0. (10) The drug is O=C(Cc1ccc(I)cc1)N[C@H]1CCOC1=O. The target protein (P25084) has sequence MALVDGFLELERSSGKLEWSAILQKMASDLGFSKILFGLLPKDSQDYENAFIVGNYPAAWREHYDRAGYARVDPTVSHCTQSVLPIFWEPSIYQTRKQHEFFEEASAAGLVYGLTMPLHGARGELGALSLSVEAENRAEANRFMESVLPTLWMLKDYALQSGAGLAFEHPVSKPVVLTSREKEVLQWCAIGKTSWEISVICNCSEANVNFHMGNIRRKFGVTSRRVAAIMAVNLGLITL. The pIC50 is 5.8.
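The pIC50 values quoted in the examples above follow the stated definition, pIC50 = -log10(IC50 in M). A minimal Python sketch of that conversion is given below. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical helpers introduced here for illustration and are not part of the dataset. The sketch also assumes the IC50 has already been converted to mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50).

    Hypothetical helper for illustration; not part of the dataset.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of about 6.
    print(ic50_to_pic50(1e-6))
    # Back-convert the pIC50 reported for example (10) to a molar IC50.
    print(pic50_to_ic50(5.8))
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50. For instance, the pIC50 of 5.8 in example (10) corresponds to an IC50 of roughly 1.6 micromolar.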