From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CNC(=O)C(CCCCCCNC(=O)c1cc(-c2cccc(NC(=O)OC(C)(C)C)c2)on1)N=O. The target protein (Q9BY41) has sequence MEEPEEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAYALHKQMRIVKPKVASMEEMATFHTDAYLQHLQKVSQEGDDDHPDSIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAINWSGGWHHAKKDEASGFCYLNDAVLGILRLRRKFERILYVDLDLHHGDGVEDAFSFTSKVMTVSLHKFSPGFFPGTGDVSDVGLGKGRYYSVNVPIQDGIQDEKYYQICESVLKEVYQAFNPKAVVLQLGADTIAGDPMCSFNMTPVGIGKCLKYILQWQLATLILGGGGYNLANTARCWTYLTGVILGKTLSSEIPDHEFFTAYGPDYVLEITPSCRPDRNEPHRIQQILNYIKGNLKHVV. The pIC50 is 4.0. (2) The compound is CCN(c1cc(Cl)cc(C(=O)NCc2c(OC)nn(C)c2OC)c1C)[C@H]1CC[C@H](N(C)C)CC1. The target protein sequence is MGQTGKKSEKGPVCWRKRVKSEYMRLRQLKRFRRADEVKSMFSSNRQKILERTEILNQEWKQRRIQPVHILTSVSSLRGTRECSVTSDLDFPTQVIPLKTLNAVASVPIMYSWSPLQQNFMVEDETVLHNIPYMGDEVLDQDGTFIEELIKNYDGKVHGDRECGFINDEIFVELVNALGQYNDDDDDDDGDDPEEREEKQKDLEDHRDDKESRPPRKFPSDKIFEAISSMFPDKGTAEELKEKYKELTEQQLPGALPPECTPNIDGPNAKSVQREQSLHSFHTLFCRRCFKYDCFLHPFHATPNTYKRKNTETALDNKPCGPQCYQHLEGAKEFAAALTAERIKTPPKRPGGRRRGRLPNNSSRPSTPTINVLESKDTDSDREAGTETGGENNDKEEEEKKDETSSSSEANSRCQTPIKMKPNIEPPENVEWSGAEASMFRVLIGTYYDNFCAIARLIGTKTCRQVYEFRVKESSIIAPAPAEDVDTPPRKKKRKHRLWA.... The pIC50 is 4.8. (3) The drug is O=C(NCc1ccc(S(=O)(=O)c2cccc(C(F)(F)F)c2)cc1)c1cnc2[nH]ncc2c1. 
The target protein (Q80Z29) has sequence MNAAAEAEFNILLATDSYKVTHYKQYPPNTSKVYSYFECREKKTENSKVRKVKYEETVFYGLQYILNKYLKGKVVTKEKIQEAKEVYREHFQDDVFNERGWNYILEKYDGHLPIEVKAVPEGSVIPRGNVLFTVENTDPECYWLTNWIETILVQSWYPITVATNSREQKKILAKYLLETSGNLDGLEYKLHDFGYRGVSSQETAGIGASAHLVNFKGTDTVAGIALIKKYYGTKDPVPGYSVPAAEHSTITAWGKDHEKDAFEHIVTQFSSVPVSVVSDSYDIYNACEKIWGEDLRHLIVSRSTEAPLIIRPDSGNPLDTVLKVLDILGKKFPVSENSKGYKLLPPYLRVIQGDGVDINTLQEIVEGMKQKKWSIENVSFGSGGALLQKLTRDLLNCSFKCSYVVTNGLGVNVFKDPVADPNKRSKKGRLSLHRTPAGTFVTLEEGKGDLEEYGHDLLHTVFKNGKVTKSYSFDEVRKNAQLNMEQDVAPH. The pIC50 is 7.5. (4) The drug is Nc1nc2c(ncn2Cc2ccccc2CCP(=O)(O)O)c(=O)[nH]1. The target protein (P23492) has sequence MENEFTYEDYETTAKWLLQHTEYRPQVAVICGSGLGGLTAHLKEAQIFDYNEIPNFPQSTVQGHAGRLVFGLLNGRCCVMMQGRFHMYEGYSLSKVTFPVRVFHLLGVETLVVTNAAGGLNPNFEVGDIMLIRDHINLPGFCGQNPLRGPNDERFGVRFPAMSDAYDRDMRQKAFTAWKQMGEQRKLQEGTYVMLAGPNFETVAESRLLKMLGADAVGMSTVPEVIVARHCGLRVFGFSLITNKVVMDYENLEKANHMEVLDAGKAAAQTLERFVSILMESIPLPDRGS. The pIC50 is 6.3. (5) The compound is CO[C@H]1CC[C@]2(C=C(c3cc(-c4ccc(Cl)cc4)ccc3C)C(=O)N2)CC1. The target protein sequence is LDLLEEKEGSLSPASVGSDTLSDLGISSLQDGLALHIRSSMSGLHLVKQGRDRKKIDSQRDFTVASPAEFVTRFGGNKVIEKVLIANNGIAAVKCMRSIRRWSYEMFRNERAIRFVVMVTPEDLKANAEYIKMADHYVPVPGGPNNNNYANVELILDIAKRIPVQAVWAGWGHASENPKLPELLLKNGIAFMGPPSQAMWALGDKIASSIVAQTAGIPTLPWSGSGLRVDWQENDFSKRILNVPQELYEKGYVKDVDDGLQAAEEVGYPVMIKASEGGGGKGIRKVNNADDFPNLFRQVQAEVPGSPIFVMRLAKQSRHLEVQILADQYGNAISLFGRDCSVQRRHQKIIEEAPATIATPAVFEHMEQCAVKLAKMVGYVSAGTVEYLYSQDGSFYFLELNPRLQVEHPCTEMVADVNLPAAQLQIAMGIPLYRIKDIRMMYGVSPWGDSPIDFEDSAHVPCPRGHVIAARITSENPDEGFKPSSGTVQELNFRSNKNVW.... The pIC50 is 6.5. (6) The drug is Nc1c[n+](Cc2[nH]c(=O)[nH]c(=O)c2Cl)c[nH]1. 
The target protein (P07650) has sequence MFLAQEIIRKKRDGHALSDEEIRFFINGIRDNTISEGQIAALAMTIFFHDMTMPERVSLTMAMRDSGTVLDWKSLHLNGPIVDKHSTGGVGDVTSLMLGPMVAACGGYIPMISGRGLGHTGGTLDKLESIPGFDIFPDDNRFREIIKDVGVAIIGQTSSLAPADKRFYATRDITATVDSIPLITASILAKKLAEGLDALVMDVKVGSGAFMPTYELSEALAEAIVGVANGAGVRTTALLTDMNQVLASSAGNAVEVREAVQFLTGEYRNPRLFDVTMALCVEMLISGKLAKDDAEARAKLQAVLDNGKAAEVFGRMVAAQKGPTDFVENYAKYLPTAMLTKAVYADTEGFVSEMDTRALGMAVVAMGGGRRQASDTIDYSVGFTDMARLGDQVDGQRPLAVIHAKDENNWQEAAKAVKAAIKLADKAPESTPTVYRRISE. The pIC50 is 5.1. (7) The compound is Fc1cc(-c2c[nH]nn2)ccn1. The target protein (P28776) has sequence MALSKISPTEGSRRILEDHHIDEDVGFALPHPLVELPDAYSPWVLVARNLPVLIENGQLREEVEKLPTLSTDGLRGHRLQRLAHLALGYITMAYVWNRGDDDVRKVLPRNIAVPYCELSEKLGLPPILSYADCVLANWKKKDPNGPMTYENMDILFSFPGGDCDKGFFLVSLLVEIAASPAIKAIPTVSSAVERQDLKALEKALHDIATSLEKAKEIFKRMRDFVDPDTFFHVLRIYLSGWKCSSKLPEGLLYEGVWDTPKMFSGGSAGQSSIFQSLDVLLGIKHEAGKESPAEFLQEMREYMPPAHRNFLFFLESAPPVREFVISRHNEDLTKAYNECVNGLVSVRKFHLAIVDTYIMKPSKKKPTDGDKSEEPSNVESRGTGGTNPMTFLRSVKDTTEKALLSWP. The pIC50 is 6.9. (8) The compound is CNc1cc(C(=O)N2CCCC(c3ccccc3OC)C2)cnn1. The target protein sequence is METTMGFMDDNATNTSTSFLSVLNPHGAHATSFPFNFSYSDYDMPLDEDEDVTNSRTFFAAKIVIGMALVGIMLVCGIGNFIFIAALVRYKKLRNLTNLLIANLAISDFLVAIVCCPFEMDYYVVRQLSWEHGHVLCTSVNYLRTVSLYVSTNALLAIAIDRYLAIVHPLRPRMKCQTATGLIALVWTVSILIAIPSAYFTTETVLVIVKSQEKIFCGQIWPVDQQLYYKSYFLFIFGIEFVGPVVTMTLCYARISRELWFKAVPGFQTEQIRKRLRCRRKTVLVLMCILTAYVLCWAPFYGFTIVRDFFPTVFVKEKHYLTAFYIVECIAMSNSMINTLCFVTVKNDTVKYFKKIMLLHWKASYNGGKSSADLDLKTIGMPATEEVDCIRLK. The pIC50 is 5.9. (9) The drug is Cc1nn(C)c(C)c1N[S+](=O)([O-])c1ccc(-c2cccc(CN3CCN(C)CC3)c2)cc1. 
The target protein (Q9UVX3) has sequence MSDSKDRKGKAPEGQSSEKKDGAVNITPQMAESLLENNPALRNETAGMDKDKAAEAMRKMNIAELLTGLSVSGKNQKDMASYKFWQTQPVPRFDETSTDTGGPIKIIDPEKVSKEPDALLEGFEWATLDLTNETELQELWDLLTYHYVEDDNAMFRFRYSQSFLHWALMSPGWKKEWHVGVRATKSRKLVASICGVPTEINVRNQKLKVVEINFLCIHKKLRSKRLTPVLIKEITRRCYLNGIYQAIYTAGVVLPTPVSSCRYYHRPLDWLKLYEVGFSPLPAGSTKARQITKNHLPSTTSTPGLRPMEPKDIDTVHDLLQRYLSRFALNQAFTREEVDHWLVHKPETVKEQVVWAYVVEDPETHKITDFFSFYNLESTVIQNPKHDNVRAAYLYYYATETAFTNNMKALKERLLMLMNDALILAKKAHFDVFNALTLHDNPLFLEQLKFGAGDGQLHFYLYNYRTAPVPGGVNEKNLPDEKRMGGVGIVML. The pIC50 is 7.8. (10) The small molecule is Cc1nc2ccc(Br)cc2c(-c2ccc(Cl)cc2)c1C(OC(C)(C)C)C(=O)O. The target protein sequence is MGARASVLSGGELDKWEKIRLRPGGKKQYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTIAVLYCVHQRIDVKDTKEALDKIEEEQNKSKKKAQQAAADTGNNSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRLHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTHNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPGATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPATIMIQKGNFRNQRKTVKCFNCGKEGHIAKNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFPQGKAREFSSEQTRANSPTRRELQVWGRDNNSLSEAGADRQGTVSFSFPQITLWQRPLVT.... The pIC50 is 3.7.
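
For reference, the pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be checked numerically. The following is a minimal sketch, not part of the dataset or any BindingDB tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and chosen only for illustration.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and express the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # A 100 micromolar IC50 (1e-4 M) corresponds to the pIC50 of 4.0 in example (1).
    print(round(ic50_to_pic50(1e-4), 3))
    # The pIC50 of 7.5 in example (3) corresponds to an IC50 of roughly 32 nM.
    print(round(pic50_to_ic50_nm(7.5), 1))
```

Because the scale is logarithmic, each one-unit increase in pIC50 means a tenfold lower IC50, which is why higher values indicate more potent binding.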