This dataset contains drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Nc1cccc(-c2nnc(-c3ccccc3)o2)c1. The target protein (P51608) has sequence MVAGMLGLREEKSEDQDLQGLKDKPLKFKKVKKDKKEEKEGKHEPVQPSAHHSAEPAEAGKAETSEGSGSAPAVPEASASPKQRRSIIRDRGPMYDDPTLPEGWTRKLKQRKSGRSAGKYDVYLINPQGKAFRSKVELIAYFEKVGDTSLDPNDFDFTVTGRGSPSRREQKPPKKPKSPKAPGTGRGRGRPKGSGTTRPKAATSEGVQVKRVLEKSPGKLLVKMPFQTSPGGKAEGGGATTSTQVMVIKRPGRKRKAEADPQAIPKKRGRKPGSVVAAAAAEAKKKAVKESSIRSVQETVLPIKKRKTRETVSIEVKEVVKPLLVSTLGEKSGKGLKTCKSPGRKSKESSPKGRSSSASSPPKKEHHHHHHHSESPKAPVPLLPPLPPPPPEPESSEDPTSPPEPQDLSSSVCKEEKMPRGGSLESDGCPKEPAKTQPAVATAATAAEKYKHRGEGERKDIVSSSMPRPNREEPVDSRTPVTERVS. The pIC50 is 5.1. (2) The compound is COc1cc(-c2cn[nH]c2)ccc1NC(=O)C1COc2ccccc2O1. The target protein sequence is MSRPPPTGKMPGAPETAPGDGAGASRQRKLEALIRDPRSPINVESLLDGLNSLVLDLDFPALRKNKNIDNFLNRYEKIVKKIRGLQMKAEDYDVVKVIGRGAFGEVQLVRHKASQKVYAMKLLSKFEMIKRSDSAFFWEERDIMAFANSPWVVQLFYAFQDDRYLYMVMEYMPGGDLVNLMSNYDVPEKWAKFYTAEVVLALDAIHSMGLIHRDVKPDNMLLDKHGHLKLADFGTCMKMDETGMVHCDTAVGTPDYISPEVLKSQGGDGFYGRECDWWSVGVFLYEMLVGDTPFYADSLVGTYSKIMDHKNSLCFPEDAEISKHAKNLICAFLTDREVRLGRNGVEEIRQHPFFKNDQWHWDNIRETAAPVVPELSSDIDSSNFDDIEDDKGDVETFPIPKAFVGNQLPFIGFTYYRENLLLSDSPSCRENDSIQSRKNEESQEIQKKLYTLEEHLSNEMQAKEELEQKCKSVNTRLEKTAKELEEEITLRKSVESALRQ.... The pIC50 is 8.5. (3) The small molecule is CN(C)CCNc1cc(F)cc(-c2cccc3[nH]c(-c4n[nH]c5cnc(-c6cncc(NC(=O)C(C)(C)C)c6)cc45)nc23)c1. 
The target protein (P04628) has sequence MGLWALLPGWVSATLLLALAALPAALAANSSGRWWGIVNVASSTNLLTDSKSLQLVLEPSLQLLSRKQRRLIRQNPGILHSVSGGLQSAVRECKWQFRNRRWNCPTAPGPHLFGKIVNRGCRETAFIFAITSAGVTHSVARSCSEGSIESCTCDYRRRGPGGPDWHWGGCSDNIDFGRLFGREFVDSGEKGRDLRFLMNLHNNEAGRTTVFSEMRQECKCHGMSGSCTVRTCWMRLPTLRAVGDVLRDRFDGASRVLYGNRGSNRASRAELLRLEPEDPAHKPPSPHDLVYFEKSPNFCTYSGRLGTAGTAGRACNSSSPALDGCELLCCGRGHRTRTQRVTERCNCTFHWCCHVSCRNCTHTRVLHECL. The pIC50 is 6.8. (4) The drug is CC(C)c1cccc(C(C)C)c1NC(=O)NCCc1ccccc1. The target protein sequence is PLFLKEVGSHFDDFVTNLIEKSASLDNGGCALTTFSILKEMKNNHRAKDLRAPPEQGKIFVARRSLLDELFEVDHIRTIYHMFIALLILFILSTLVVDYIDEGRLVLEFNLLSYAFGKLPTVVWTWWTMFLSTLSIPYFLFQHWANGYSKSSHPLMYSLFHGLLFMVFQLGILGFGPTYIVLAYTLPPASRFIVILEQIRLIMKAHSFVRENVPRVLNSAKEKSSTVPIPTVNQYLYFLFAPTLIYRDSYPRTPTVRWGYVAMQFAQVFGCLFYVYYIFERLCAPLFRNIKQEPFSARVLVLCIF. The pIC50 is 7.1. (5) The pIC50 is 6.3. The small molecule is Brc1ccc(-c2nsc3cc(OCCCCN4CCOCC4)ccc23)cc1. The target protein (P38605) has sequence MWKLKIAEGGSPWLRTTNNHVGRQFWEFDPNLGTPEDLAAVEEARKSFSDNRFVQKHSADLLMRLQFSRENLISPVLPQVKIEDTDDVTEEMVETTLKRGLDFYSTIQAHDGHWPGDYGGPMFLLPGLIITLSITGALNTVLSEQHKQEMRRYLYNHQNEDGGWGLHIEGPSTMFGSVLNYVTLRLLGEGPNDGDGDMEKGRDWILNHGGATNITSWGKMWLSVLGAFEWSGNNPLPPEIWLLPYFLPIHPGRMWCHCRMVYLPMSYLYGKRFVGPITSTVLSLRKELFTVPYHEVNWNEARNLCAKEDLYYPHPLVQDILWASLHKIVEPVLMRWPGANLREKAIRTAIEHIHYEDENTRYICIGPVNKVLNMLCCWVEDPNSEAFKLHLPRIHDFLWLAEDGMKMQGYNGSQLWDTGFAIQAILATNLVEEYGPVLEKAHSFVKNSQVLEDCPGDLNYWYRHISKGAWPFSTADHGWPISDCTAEGLKAALLLSKVPK.... (6) The drug is CCn1cnc(-c2cc3nccc(Oc4ccc(NC(=O)CC(=O)NC5CCCCC5)cc4F)c3s2)c1. 
The target protein (P08581) has sequence MKAPAVLAPGILVLLFTLVQRSNGECKEALAKSEMNVNMKYQLPNFTAETPIQNVILHEHHIFLGATNYIYVLNEEDLQKVAEYKTGPVLEHPDCFPCQDCSSKANLSGGVWKDNINMALVVDTYYDDQLISCGSVNRGTCQRHVFPHNHTADIQSEVHCIFSPQIEEPSQCPDCVVSALGAKVLSSVKDRFINFFVGNTINSSYFPDHPLHSISVRRLKETKDGFMFLTDQSYIDVLPEFRDSYPIKYVHAFESNNFIYFLTVQRETLDAQTFHTRIIRFCSINSGLHSYMEMPLECILTEKRKKRSTKKEVFNILQAAYVSKPGAQLARQIGASLNDDILFGVFAQSKPDSAEPMDRSAMCAFPIKYVNDFFNKIVNKNNVRCLQHFYGPNHEHCFNRTLLRNSSGCEARRDEYRTEFTTALQRVDLFMGQFSEVLLTSISTFIKGDLTIANLGTSEGRFMQVVVSRSGPSTPHVNFLLDSHPVSPEVIVEHTLNQNG.... The pIC50 is 6.2. (7) The compound is C/C=C1\NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@H](NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@@H](CC(C)C)NC(=O)[C@@H](Cc2ccccc2)NC(=O)CC)[C@@H](C)OC1=O. The target protein (Q704Y3) has sequence MEKWASLDSDESEPPAQENSCPDPPDRDPNSKPPPAKPHIFATRSRTRLFGKGDSEEASPMDCPYEEGGLASCPIITVSSVVTLQRSVDGPTCLRQTSQDSVSTGVETPPRLYDRRSIFDAVAQSNCQELESLLSFLQKSKKRLTDSEFKDPETGKTCLLKAMLNLHNGQNDTIALLLDIARKTDSLKQFVNASYTDSYYKGQTALHIAIERRNMALVTLLVENGADVQAAANGDFFKKTKGRPGFYFGELPLSLAACTNQLAIVKFLLQNSWQPADISARDSVGNTVLHALVEVADNTADNTKFVTNMYNEILILGAKLHPTLKLEELTNKKGLTPLALAASSGKIGVLAYILQREIHEPECRHLSRKFTEWAYGPVHSSLYDLSCIDTCEKNSVLEVIAYSSSETPNRHDMLLVEPLNRLLQDKWDRFVKRIFYFNFFVYCLYMIIFTTAAYYRPVEGLPPYKLNNTVGDYFRVTGEILSVSGGVYFFFRGIQYFLQR.... The pIC50 is 3.6. (8) The compound is COC(=O)[C@@H]1C[C@H](CC(C)C)N(C(=O)c2ccccn2)[C@@]12C(=O)Nc1cc(F)ccc12. The target protein (O75643) has sequence MADVTARSLQYEYKANSNLVLQADRSLIDRTRRDEPTGEVLSLVGKLEGTRMGDKAQRTKPQMQEERRAKRRKRDEDRHDINKMKGYTLLSEGIDEMVGIIYKPKTKETRETYEVLLSFIQAALGDQPRDILCGAADEVLAVLKNEKLRDKERRKEIDLLLGQTDDTRYHVLVNLGKKITDYGGDKEIQNMDDNIDETYGVNVQFESDEEEGDEDVYGEVREEASDDDMEGDEAVVRCTLSANLVASGELMSSKKKDLHPRDIDAFWLQRQLSRFYDDAIVSQKKADEVLEILKTASDDRECENQLVLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSEAEKERIMGKMEADPELSKFLYQLHETEKEDLIREERSRRERVRQSRMDTDLETMDLDQGGEALAPRQVLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRKGYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQAGFEGFKTLNRIQSKLYRAALETDENLL.... The pIC50 is 4.1.
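
The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical and are not part of BindingDB or of any dataset tooling, and the nanomolar helper assumes the input IC50 is given in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nanomolar: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    return ic50_to_pic50(ic50_nanomolar * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (2) of the record above.
    for label in (5.1, 8.5):
        print(f"pIC50 {label} corresponds to IC50 = {pic50_to_ic50(label):.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.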