This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C(O)Cc1cccc(Oc2ccccc2)c1. The target protein sequence is MRAAAAGLGPGRLHAWAARRGLGRFPARVPRAAGGRSPCPASISNSRTLRLAAAGNTFCLASTLSSGCWEPCSWPSASGPGVRRAFFPTSQGGQIQGGLDPVWLFVVIGGIMSVLGFAGCIGALRENTFLLKFFSVFLGLIFFLELAAGILAFVFKDWIRDQLNLFINNNVKAYRDDIDLQNLIDFAQEYWSCCGARGPNDWNLNIRTSTALTSNPSRERCGVPFFCWVRTLRKTSQYPCGYTSAQTELEHKIHLHQSWWPFEKWLKKPDGWPGLGAIASFKMGIAGPNPRSTSRQERPTGKLPGVMATCLAHRAVVGASKDALPHSQWQGLWDVCYDLSPRLCEHGTQEATEAGLGLNWGCTGAGLGLFTTELCVWGVHMCVCVCVCVCVRVCLCLCVRVRGMHVCALVSTPGVSTPLLVRWWPFQSFKGDGARRVGHVPASPSLLWDVSLCGLGACCLRPLHIHHDLEPAWSSPWPQCHSLEMGPRILSVSLSRLPLR.... The pIC50 is 4.8. (2) The small molecule is Cn1ccnc1Sc1cc(NCCCO)c([N+](=O)[O-])c2nonc12. The target protein sequence is MSIQHFRVALIPFFAAFCLPVFAHPETLVKVKDAEDKLGARVGYIELDLNSGKILESFRPEERFPMMSTFKVLLCGAVLSRVDAGQEQLGRRIHYSQNDLVEYSPVTEKHLTDGMTVRELCSAAITMSDNTAANLLLTTIGGPKELTAFLHNMGDHVTRLDRWEPELNEAIPNDERDTTMPAAMATTLRKLLTGELLTLASRQQLIDWMEADKVAGPLLRSALPAGWFIADKSGAGERGSRGIIAALGPDGKPSRIVVIYTTGSQATMDERNRQIAEIGASLIKHW. The pIC50 is 4.7. (3) The compound is Cc1cc(Oc2cncc(F)c2)cc(C(=O)Nc2cccc(F)n2)n1. The target protein (P05177) has sequence MALSQSVPFSATELLLASAIFCLVFWVLKGLRPRVPKGLKSPPEPWGWPLLGHVLTLGKNPHLALSRMSQRYGDVLQIRIGSTPVLVLSRLDTIRQALVRQGDDFKGRPDLYTSTLITDGQSLTFSTDSGPVWAARRRLAQNALNTFSIASDPASSSSCYLEEHVSKEAKALISRLQELMAGPGHFDPYNQVVVSVANVIGAMCFGQHFPESSDEMLSLVKNTHEFVETASSGNPLDFFPILRYLPNPALQRFKAFNQRFLWFLQKTVQEHYQDFDKNSVRDITGALFKHSKKGPRASGNLIPQEKIVNLVNDIFGAGFDTVTTAISWSLMYLVTKPEIQRKIQKELDTVIGRERRPRLSDRPQLPYLEAFILETFRHSSFLPFTIPHSTTRDTTLNGFYIPKKCCVFVNQWQVNHDPELWEDPSEFRPERFLTADGTAINKPLSEKMMLFGMGKRRCIGEVLAKWEIFLFLAILLQQLEFSVPPGVKVDLTPIYGLTMK.... The pIC50 is 7.0.
(4) The compound is CCCCCN(CCCCC)C(=O)[C@@H](Cc1c[nH]c2ccccc12)NC(=O)c1cc2ccccc2[nH]1. The target protein (Q63931) has sequence MDVVDSLFVNGSNITSACELGFENETLFCLDRPRPSKEWQPAVQILLYSLIFLLSVLGNTLVITVLIRNKRMRTVTNIFLLSLAVSDLMLCLFCMPFNLIPSLLKDFIFGSAVCKTTTYFMGTSVSVSTFNLVAISLERYGAICKPLQSRVWQTKSHALKVIAATWCLSFTIMTPYPIYSNLVPFTKNNNQTGNMCRFLLPNDVMQQTWHTFLLLILFLIPGIVMMVAYGLISLELYQGIKFDAIQKKSAKERKTSTGSSGPMEDSDGCYLQKSRHPRKLELRQLSPSSSGSNRINRIRSSSSTANLMAKKRVIRMLIVIVVLFFLCWMPIFSANAWRAYDTVSAERHLSGTPISFILLLSYTSSCVNPIIYCFMNKRFRLGFMATFPCCPNPGTPGVRGEMGEEEEGRTTGASLSRYSYSHMSTSAPPP. The pIC50 is 7.3. (5) The drug is CC(C)(C)OC(=O)NCCCOc1ccc2c(c1)c(C#N)cn2-c1ccc(C(=O)O)cc1. The target protein (P80457) has sequence MTADELVFFVNGKKVVEKNADPETTLLAYLRRKLGLRGTKLGCGEGGCGACTVMLSKYDRLQDKIIHFSANACLAPICTLHHVAVTTVEGIGSTKTRLHPVQERIAKSHGSQCGFCTPGIVMSMYTLLRNQPEPTVEEIEDAFQGNLCRCTGYRPILQGFRTFAKNGGCCGGNGNNPNCCMNQKKDHTVTLSPSLFNPEEFMPLDPTQEPIFPPELLRLKDVPPKQLRFEGERVTWIQASTLKELLDLKAQHPEAKLVVGNTEIGIEMKFKNQLFPMIICPAWIPELNAVEHGPEGISFGAACALSSVEKTLLEAVAKLPTQKTEVFRGVLEQLRWFAGKQVKSVASLGGNIITASPISDLNPVFMASGTKLTIVSRGTRRTVPMDHTFFPSYRKTLLGPEEILLSIEIPYSREDEFFSAFKQASRREDDIAKVTCGMRVLFQPGSMQVKELALCYGGMADRTISALKTTQKQLSKFWNEKLLQDVCAGLAEELSLSPDA.... The pIC50 is 7.5. (6) The drug is Cn1nc(-c2ccccn2)nc2c(=O)n(C)c(=O)nc1-2. The target protein (O14672) has sequence MVLLRVLILLLSWAAGMGGQYGNPLNKYIRHYEGLSYNVDSLHQKHQRAKRAVSHEDQFLRLDFHAHGRHFNLRMKRDTSLFSDEFKVETSNKVLDYDTSHIYTGHIYGEEGSFSHGSVIDGRFEGFIQTRGGTFYVEPAERYIKDRTLPFHSVIYHEDDINYPHKYGPQGGCADHSVFERMRKYQMTGVEEVTQIPQEEHAANGPELLRKKRTTSAEKNTCQLYIQTDHLFFKYYGTREAVIAQISSHVKAIDTIYQTTDFSGIRNISFMVKRIRINTTADEKDPTNPFRFPNIGVEKFLELNSEQNHDDYCLAYVFTDRDFDDGVLGLAWVGAPSGSSGGICEKSKLYSDGKKKSLNTGIITVQNYGSHVPPKVSHITFAHEVGHNFGSPHDSGTECTPGESKNLGQKENGNYIMYARATSGDKLNNNKFSLCSIRNISQVLEKKRNNCFVESGQPICGNGMVEQGEECDCGYSDQCKDECCFDANQPEGRKCKLKPG.... The pIC50 is 4.2.
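The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset itself; the function names `ic50_molar_to_pic50`, `pic50_to_ic50_molar`, and `ic50_nanomolar_to_pic50` are hypothetical, and the nanomolar helper assumes raw IC50 values are reported in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Assumption: the raw value is in nM, so scale by 1e-9 to get mol/L first."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Print the molar IC50 implied by each pIC50 label in the examples above.
    for label in (4.8, 4.7, 7.0, 7.3, 7.5, 4.2):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50_molar(label):.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why a higher pIC50 means a more potent compound.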