This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (O19175) has sequence GEEVAVKLESQKARHPQLLYESKLYKILQGGVGIPHIRWYGQEKDYNVLVMDLLGPSLEDLFNFCSRRFTMKTVLMLADQMISRIEYVHTKNFIHRDIKPDNFLMGIGRHCNKLFLIDFGLAKKY. The pIC50 is 4.3. The drug is Nc1nc2c(-c3cc(Br)c(Br)[nH]3)nccc2[nH]1. (2) The pIC50 is 4.0. The drug is Cc1ccccc1C(=O)Nc1cccc(NC(=O)c2cccs2)c1. The target protein (P18054) has sequence MGRYRIRVATGAWLFSGSYNRVQLWLVGTRGEAELELQLRPARGEEEEFDHDVAEDLGLLQFVRLRKHHWLVDDAWFCDRITVQGPGACAEVAFPCYRWVQGEDILSLPEGTARLPGDNALDMFQKHREKELKDRQQIYCWATWKEGLPLTIAADRKDDLPPNMRFHEEKRLDFEWTLKAGALEMALKRVYTLLSSWNCLEDFDQIFWGQKSALAEKVRQCWQDDELFSYQFLNGANPMLLRRSTSLPSRLVLPSGMEELQAQLEKELQNGSLFEADFILLDGIPANVIRGEKQYLAAPLVMLKMEPNGKLQPMVIQIQPPNPSSPTPTLFLPSDPPLAWLLAKSWVRNSDFQLHEIQYHLLNTHLVAEVIAVATMRCLPGLHPIFKFLIPHIRYTMEINTRARTQLISDGGIFDKAVSTGGGGHVQLLRRAAAQLTYCSLCPPDDLADRGLLGLPGALYAHDALRLWEIIARYVEGIVHLFYQRDDIVKGDPELQAWCR.... (3) The drug is Fc1ccc2sc([C@]34CNC[C@H]3C4)cc2c1. The target protein sequence is MLLARMKPQVQPELGGADQLPEQPLRPCKTADLLVVKERNGVQCLLASQDGDAQPRETWGKEIDFLLSVVGFAVDLANVWRFPYLCYKNGGGAFLIPYTLFLIIAGMPLFYMELALGQFNREGAATVWKICPFFKGVGYAVILIALYVGFYYNVIIAWSLYYLFASFTLNLPWTNCGHAWNSPNCTDPKLLNASVLGDHTKYSKYKFTPAAEFYERGVLHLHESSGIHDIGLPQWQLLLCLMVVIVVLYFSLWKGVKTSGKVVWITATLPYFVLFVLLVHGVTLPGASNGINAYLHIDFYRLKEATVWIDAATQIFFSLGAGFGVLIAFASYNKFDNNCYRDALLTSTINCVTSFISGFAIFSILGYMAHEHKVKIEDVATEGAGLVFVLYPEAISTLSGSTFWAVLFFLMLLALGLDSSMGGMEAVITGLADDFQVLKRHRKLFTCAVTLGTFLLAMFCITKGGIYVLTLLDTFAAGTSILFAVLMEAIGVSWFYGVDR.... The pIC50 is 7.2. (4) The drug is O=C(NCC(F)(F)F)[C@@H]1CN(Cc2ccc(-c3ccncc3)o2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2OC[C@H]1O. 
The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEMSLPGKWKPKIIGGIGGFVKVRQYDQVPIEICGHKVIGTVLIGPTPANIIGRNLMTQLGCTLNF. The pIC50 is 9.1. (5) The small molecule is Br.Cn1ccncc1=N. The target protein (O97972) has sequence MEGGFTGGDEYQKHFLPRDYLNTYYSFQSGPSPEAEMLKFNLECLHKTFGPGGLQGDTLIDIGSGPTIYQVLAACESFKDITLSDFTDRNREELAKWLKKEPGAYDWTPALKFACELEGNSGRWQEKAEKLRATVKRVLKCDANLSNPLTPVVLPPADCVLTLLAMECACCSLDAYRAALRNLASLLKPGGHLVTTVTLQLSSYMVGEREFSCVALEKEEVEQAVLDAGFDIEQLLYSPQSYSASTAPNRGVCFLVARKKPGS. The pIC50 is 2.3. (6) The small molecule is Cc1ccn2c(-c3ccc4cccc(OC5CCNCCC5F)c4n3)nnc2c1. The target protein (O70444) has sequence MLLSKFGSLAHLCGPGGVDHLPVKILQPAKADKESFEKVYQVGAVLGSGGFGTVYAGSRIADGLPVAVKHVVKERVTEWGSLGGMAVPLEVVLLRKVGAAGGARGVIRLLDWFERPDGFLLVLERPEPAQDLFDFITERGALDEPLARRFFAQVLAAVRHCHNCGVVHRDIKDENLLVDLRSGELKLIDFGSGAVLKDTVYTDFDGTRVYSPPEWIRYHRYHGRSATVWSLGVLLYDMVCGDIPFEQDEEILRGRLFFRRRVSPECQQLIEWCLSLRPSERPSLDQIAAHPWMLGTEGSVPENCDLRLCALDTDDGASTTSSSESL. The pIC50 is 8.5. (7) The drug is N#CC1=C(SCc2ccccc2)SC(c2ccccc2O)NC1=O. The target protein (Q8ZEY1) has sequence MTTANQPICPSPAKWPSPAKLNLFLYITGQRADGYHQLQTLFQFLDYGDQLTIEPRDDNQIRLLTPIAGVENEQNLIVRAAKMLQKHPGNTPVPRGADISIDKCLPMGGGLGGGSSNAATVLVALNLLWQCGLTDEQLADLGLTLGADVPVFVRGHAAFAEGIGEKLQPAEPVEKWYLVIHPGVNIPTPIIFSDPELKRNTPIRPLAALLSTPYANDCEPIARKRFREVEQALSWLLEYAPSRLTGTGACVFAEFDTESSARQVLSIAPEWLHGFVARGVNVSPLHRVRSGKIESSERR. The pIC50 is 5.0. (8) The pIC50 is 3.8. The small molecule is CCc1ccc(S(=O)(=O)Nc2ccc(NC(=O)Nc3ccccc3[N+](=O)[O-])cc2)cc1. The target protein (C3L5T6) has sequence MRKIGIIGGTFDPPHYGHLLIANEVYHALNLEEVWFLPNQIPPHKQGRNITSVESRLQMLELATEAEEHFSICLEELSRKGPSYTYDTMLQLTKKYPDVQFHFIIGGDMVEYLPKWYNIEALLDLVTFVGVARPGYKLRTPYPITTVEIPEFAVSSSLLRERYKEKKTCKYLLPEKVQVYIERNGLYES.
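The pIC50 definition used in these records (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical, and the only assumption is that IC50 is expressed in molar units, as stated above.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in molar units = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 nM inhibitor (1e-9 M) has pIC50 of about 9; higher means more potent.
    print(ic50_to_pic50(1e-9))
    # pIC50 values taken from examples (1), (4) and (5) above, shown as IC50 in M.
    for p in (4.3, 9.1, 2.3):
        print(p, pic50_to_ic50(p))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50. The pIC50 of 9.1 in example (4) is a sub-nanomolar IC50, while the 2.3 in example (5) is in the millimolar range.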