From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is N#Cc1c(-c2cccc(Br)c2)no[n+]1[O-]. The target protein sequence is MPPADGTSQWLRKTVDSAAVILFSKTTCPYCKKVKDVLAEAKIKHATIELDQLSNGSAIQKCLASFSKIETVPQMFVRGKFIGDSQTVLKYYSNDELAGIVNESKYDYDLIVIGGGSGGLAAGKEAAKYGAKTAVLDYVEPTPIGTTWGLGGTCVNVGCIPKKLMHQAGLLSHALEDAEHFGWSLDRSKISHNWSTMVEGVQSHIGSLNWGYKVALRDNQVTYLNAKGRLISPHEVQITDKNQKVSTITGNKIILATGERPKYPEIPGAVEYGITSDDLFSLPYFPGKTLVIGASYVALECAGFLASLGGDVTVMVRSILLRGFDQQMAEKVGDYMENHGVKFAKLCVPDEIKQLKVVDTENNKPGLLLVKGHYTDGKKFEEEFETVIFAVGREPQLSKVLCETVGVKLDKNGRVVCTDDEQTTVSNVYAIGDINAGKPQLTPVAIQAGRYLARRLFAGATELTDYSNVATTVFTPLEYGACGLSEEDAIEKYGDKDIEV.... The pIC50 is 5.5. (2) The drug is Fc1ccc(Nc2nc(NCCN3CCOCC3)nc(Nc3ccc(F)cc3)n2)cc1. The target protein sequence is MNEITSPEQYDLETTALKVPPHSIEAEQAVLGGLMLDNNAWERVSDSVSDGDFYRHDHRLIFRAIYKLAEANQPIDVVTLSEQLEKEGQLAQVGGLAYLGELAKNIPSVANIKAYAQIIRERATLRQLIGISNEIADSAFHPEGRGANEILDEAERKIFEVAEARPKTGGPVGISDILVKTIDRIDYLFNTTEALTGVSTGFTDLDEKTSGLQPADLIIVAGRPSMGKTTFAMNLVENAVMRTDKAVLVYSLEMPSDSIVMRMLSSLGRIDQTKVRSGKLDDEDWPRLTSAINLLNDKKLFIDDTAGISPSEMRARTRRLVREHGDLALIMIDYLQLMQIPGSSGDNRTNEISEISRSLKALAKEFNCPVIALSQLNRSLEQRPNKRPVNSDLRESGAIEQDADVIMFVYRDEVYHPETEFKGVAEIIIGKQRNGPIGTVRLAFIGKYTRFENLAAGMYNFEDE. The pIC50 is 4.0. (3) The drug is Cc1cccc(COc2ccccc2-c2nc3cccnc3o2)c1. The target protein (Q8NHU3) has sequence MDIIETAKLEEHLENQPSDPTNTYARPAEPVEEENKNGNGKPKSLSSGLRKGTKKYPDYIQIAMPTESRNKFPLEWWKTGIAFIYAVFNLVLTTVMITVVHERVPPKELSPPLPDKFFDYIDRVKWAFSVSEINGIILVGLWITQWLFLRYKSIVGRRFCFIIGTLYLYRCITMYVTTLPVPGMHFQCAPKLNGDSQAKVQRILRLISGGGLSITGSHILCGDFLFSGHTVTLTLTYLFIKEYSPRHFWWYHLICWLLSAAGIICILVAHEHYTIDVIIAYYITTRLFWWYHSMANEKNLKVSSQTNFLSRAWWFPIFYFFEKNVQGSIPCCFSWPLSWPPGCFKSSCKKYSRVQKIGEDNEKST. The pIC50 is 7.7. 
(4) The drug is N#Cc1c(N)nc2sc(C(=O)c3ccc(Cl)cc3)c(N)c2c1-c1ccccc1I. The target protein (P51136) has sequence MSSKDQILEKDKKETDDNGNKKTTTTTSSSSSSSSSSKPRSNKFDKVIIKSNGVCYITEGVIGNGSFGVVTQAIVADTKEVVAIKKVLQDQRYKNRELQIMKMLNHINIVSLKNSFYTSDNDEVYLNLVLEYVPDTVYRVSRHYSMSKQPVPNIFVKLYIYQLCRSINYIHSLGICHRDIKPQNLLLDTSTSTLKLCDFGSAKILIKGETNVSYICSRHYRAPELIFGSTNYTTTIDVWSLGCVLAELLLGQPLFPGENGIDQLVEIIKVLGTPTKEQIHAMNPYYTSFKFPEIKANPWPRVFKAKDVPAESIDLISKILLYDPSSRLKPVEICAHPFFDELRDPKTCLPDGKPLPPLFNFTIAEQTSIGPKLAKTLIPSHAMNQIELPSPLFPNLAISSSNQSSSSNSNANVSSNLNSHSASPSTTSSSSSTPNSIPVQSPSTTNTTSSTTNNTTTTTTTTTTSNH. The pIC50 is 4.0. (5) The drug is CN(C)c1ccc(C(C2C(=O)CC(C)(C)CC2=O)C2C(=O)CC(C)(C)CC2=O)cc1. The target protein (O75342) has sequence MATYKVRVATGTDLLSGTRDSISLTIVGTQGESHKQLLNHFGRDFATGAVGQYTVQCPQDLGELIIIRLHKERYAFFPKDPWYCNYVQICAPNGRIYHFPAYQWMDGYETLALREATGKTTADDSLPVLLEHRKEEIRAKQDFYHWRVFLPGLPSYVHIPSYRPPVRRHRNPNRPEWNGYIPGFPILINFKATKFLNLNLRYSFLKTASFFVRLGPMALAFKVRGLLDCKHSWKRLKDIRKIFPGKKSVVSEYVAEHWAEDTFFGYQYLNGVNPGLIRRCTRIPDKFPVTDDMVAPFLGEGTCLQAELEKGNIYLADYRIMEGIPTVELSGRKQHHCAPLCLLHFGPEGKMMPIAIQLSQTPGPDCPIFLPSDSEWDWLLAKTWVRYAEFYSHEAIAHLLETHLIAEAFCLALLRNLPMCHPLYKLLIPHTRYTVQINSIGRAVLLNEGGLSAKGMSLGVEGFAGVMVRALSELTYDSLYLPNDFVERGVQDLPGYYYRD.... The pIC50 is 4.8. (6) The compound is CC(C)(C)C(=O)ON[C@@H](CCc1ccccc1)C(=O)N[C@@H]1[C@H](SC2CCCCC2)O[C@@H](CO)[C@H](O)[C@H]1O. The target protein (P9WJN3) has sequence MSETPRLLFVHAHPDDESLSNGATIAHYTSRGAQVHVVTCTLGEEGEVIGDRWAQLTADHADQLGGYRIGELTAALRALGVSAPIYLGGAGRWRDSGMAGTDQRSQRRFVDADPRQTVGALVAIIRELRPHVVVTYDPNGGYGHPDHVHTHTVTTAAVAAAGVGSGTADHPGDPWTVPKFYWTVLGLSALISGARALVPDDLRPEWVLPRADEIAFGYSDDGIDAVVEADEQARAAKVAALAAHATQVVVGPTGRAAALSNNLALPILADEHYVLAGGSAGARDERGWETDLLAGLGFTASGT. The pIC50 is 3.7. (7) The drug is Nc1c(C(=O)Nc2cccc(F)c2)sc2nc(-c3cccs3)ccc12. 
The target protein (P0A752) has sequence MKSLQALFGGTFDPVHYGHLKPVETLANLIGLTRVTIIPNNVPPHRPQPEANSVQRKHMLELAIADKPLFTLDERELKRNAPSYTAQTLKEWRQEQGPDVPLAFIIGQDSLLTFPTWYEYETILDNAHLIVCRRPGYPLEMAQPQYQQWLEDHLTHNPEDLHLQPAGKIYLAETPWFNISATIIRERLQNGESCEDLLPEPVLTYINQQGLYR. The pIC50 is 4.2. (8) The small molecule is CC(C)(C)NCCCNS(=O)(=O)c1cccc(Br)c1. The target protein (Q5T6S3) has sequence MENRALDPGTRDSYGATSHLPNKGALAKVKNNFKDLMSKLTEGQYVLCRWTDGLYYLGKIKRVSSSKQSCLVTFEDNSKYWVLWKDIQHAGVPGEEPKCNICLGKTSGPLNEILICGKCGLGYHQQCHIPIAGSADQPLLTPWFCRRCIFALAVRKGGALKKGAIARTLQAVKMVLSYQPEELEWDSPHRTNQQQCYCYCGGPGEWYLRMLQCYRCRQWFHEACTQCLNEPMMFGDRFYLFFCSVCNQGPEYIERLPLRWVDVVHLALYNLGVQSKKKYFDFEEILAFVNHHWELLQLGKLTSTPVTDRGPHLLNALNSYKSRFLCGKEIKKKKCIFRLRIRVPPNPPGKLLPDKGLLPNENSASSELRKRGKSKPGLLPHEFQQQKRRVYRRKRSKFLLEDAIPSSDFTSAWSTNHHLASIFDFTLDEIQSLKSASSGQTFFSDVDSTDAASTSGSASTSLSYDSRWTVGSRKRKLAAKAYMPLRAKRWAAELDGRCPS.... The pIC50 is 4.0. (9) The compound is O=C(COc1ccc(C2=NCCN2)cc1)Nc1ccccc1. The target protein sequence is MVISKPINARPLPAGLTASQQWTLLEWIHMAGHIETENELKAFLDQVLSQAPSERLLLALGRLNNQNQIQRLERVLNVSYPSDWLDQYMKENYAQHDPILRIHLGQGPVMWEERFNRAKGAEEKRFIAEATQNGMGSGITFSAASERNNIGSILSIAGREPGRNAALVAMLNCLTPHLHQAAIRVANLPPASPSNMPLSQREYDIFHWMSRGKTNWEIATILDISERTVKFHVANVIRKLNANNRTHAIVLGMHLAMPPSTVANE. The pIC50 is 4.2.
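
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the bindingdb_ic50 dataset tooling; the function names are hypothetical, and the nanomolar helper assumes raw IC50 values are reported in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Assumption: the input is in nanomolar, so scale by 1e-9 to get mol/L."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Labels taken from the examples above: 4.0, 5.5, and 7.7.
    for label in (4.0, 5.5, 7.7):
        micromolar = pic50_to_ic50_molar(label) * 1e6
        print(f"pIC50 {label} corresponds to an IC50 of about {micromolar:.3g} uM")
```

Under this definition a label of 4.0 corresponds to an IC50 of 100 µM, while 7.7 corresponds to roughly 20 nM, which is why higher values indicate more potent binding.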