This is drug-target binding data from BindingDB, measured as IC50. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Nc1ncnc2c1nc(NCc1ccc(-c3cccnc3)cc1)n2[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O. The target protein (Q62773) has sequence MAKSEGRKSASQDTSENGMENPGLELMEVGNLEQGKTLEEVTQGHSLKDGLGHSSLWRRILQPFTKARSFYQRHAGLFKKILLGLLCLAYAAYLLAACILNFRRALALFVITCLVIFILACHFLKKFFAKKSIRCLKPLKNTRLRLWLKRVFMGAAVVGLILWLALDTAQRPEQLISFAGICMFILILFACSKHHSAVSWRTVFWGLGLQFVFGILVIRTEPGFNAFQWLGDQIQIFLAYTVEGSSFVFGDTLVQSVFAFQSLPIIIFFGCVMSILYYLGLVQWVIQKIAWFLQITMGTTAAETLAVAGNIFVGMTEAPLLIRPYLADMTLSEIHAVMTGGFATIAGTVLGAFISFGIDASSLISASVMAAPCALALSKLVYPEVEESKFKSKEGVKLPRGEERNILEAASNGATDAIALVANVAANLIAFLAVLAFINSTLSWLGEMVDIHGLTFQVICSYVLRPMVFMMGVQWADCPLVAEIVGVKFFINEFVAYQQL.... The pIC50 is 4.0. (2) The compound is O=C(O)[C@H]1/C(=C/CO)O[C@@H]2CC(=O)N21. The target protein (P25910) has sequence MKTVFILISMLFPVAVMAQKSVKISDDISITQLSDKVYTYVSLAEIEGWGMVPSNGMIVINNHQAALLDTPINDAQTEMLVNWVTDSLHAKVTTFIPNHWHGDCIGGLGYLQRKGVQSYANQMTIDLAKEKGLPVPEHGFTDSLTVSLDGMPLQCYYLGGGHATDNIVVWLPTENILFGGCMLKDNQATSIGNISDADVTAWPKTLDKVKAKFPSARYVVPGHGDYGGTELIEHTKQIVNQYIESTSKP. The pIC50 is 6.6. (3) The drug is CC1=C(C)Cc2c(-c3ccccc3)[nH]c(-c3ccccc3)c2C1. The target protein (P22437) has sequence MSRRSLSLWFPLLLLLLLPPTPSVLLADPGVPSPVNPCCYYPCQNQGVCVRFGLDNYQCDCTRTGYSGPNCTIPEIWTWLRNSLRPSPSFTHFLLTHGYWLWEFVNATFIREVLMRLVLTVRSNLIPSPPTYNSAHDYISWESFSNVSYYTRILPSVPKDCPTPMGTKGKKQLPDVQLLAQQLLLRREFIPAPQGTNILFAFFAQHFTHQFFKTSGKMGPGFTKALGHGVDLGHIYGDNLERQYHLRLFKDGKLKYQVLDGEVYPPSVEQASVLMRYPPGVPPERQMAVGQEVFGLLPGLMLFSTIWLREHNRVCDLLKEEHPTWDDEQLFQTTRLILIGETIKIVIEEYVQHLSGYFLQLKFDPELLFRAQFQYRNRIAMEFNHLYHWHPLMPNSFQVGSQEYSYEQFLFNTSMLVDYGVEALVDAFSRQRAGRIGGGRNFDYHVLHVAVDVIKESREMRLQPFNEYRKRFGLKPYTSFQELTGEKEMAAELEELYGDI.... The pIC50 is 7.0. 
(4) The small molecule is Cc1cc(O)ccc1Cc1c(C(C)C)n[nH]c1O[C@@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@H]1O. The pIC50 is 6.7. The target protein (P53790) has sequence MDSSTLSPAVTATDAPIQSYERIRNAADISVIVIYFVVVMAVGLWAMFSTNRGTVGGFFLAGRSMVWWPIGASLFASNIGSGHFVGLAGTGAAAGIAMGGFEWNALVFVVVLGWLFVPIYIKAGVVTMPEYLRKRFGGKRIQIYLSVLSLLLYIFTKISADIFSGAIFINLALGLDIYLAIFILLAITALYTITGGLAAVIYTDTLQTAIMLVGSFILTGFAFREVGGYEAFMDKYMKAIPTLVSDGNITVKEECYTPRADSFHIFRDPITGDMPWPGLIFGLSILALWYWCTDQVIVQRCLSAKNMSHVKAGCTLCGYLKLLPMFLMVMPGMISRILYTDKIACVLPSECKKYCGTPVGCTNIAYPTLVVELMPNGLRGLMLSVMMASLMSSLTSIFNSASTLFTMDIYTKIRKGASEKELMIAGRLFILVLIGISIAWVPIVQSAQSGQLFDYIQSITSYLGPPIAAVFLLAIFCKRVNEPGAFWGLILGFLIGISRM.... (5) The small molecule is CC(C)Oc1ccc(-c2cc3onc(-c4ccccc4)c3c(=O)n2C)cc1. The target protein (P35400) has sequence MVQLGKLLRVLTLMKFPCCVLEVLLCVLAAAARGQEMYAPHSIRIEGDVTLGGLFPVHAKGPSGVPCGDIKRENGIHRLEAMLYALDQINSDPNLLPNVTLGARILDTCSRDTYALEQSLTFVQALIQKDTSDVRCTNGEPPVFVKPEKVVGVIGASGSSVSIMVANILRLFQIPQISYASTAPELSDDRRYDFFSRVVPPDSFQAQAMVDIVKALGWNYVSTLASEGSYGEKGVESFTQISKEAGGLCIAQSVRIPQERKDRTIDFDRIIKQLLDTPNSRAVVIFANDEDIKQILAAAKRADQVGHFLWVGSDSWGSKINPLHQHEDIAEGAITIQPKRATVEGFDAYFTSRTLENNRRNVWFAEYWEENFNCKLTISGSKKEDTDRKCTGQERIGKDSNYEQEGKVQFVIDAVYAMAHALHHMNKDLCADYRGVCPEMEQAGGKKLLKYIRHVNFNGSAGTPVMFNKNGDAPGRYDIFQYQTTNTTNPGYRLIGQWTD.... The pIC50 is 5.0.
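
The pIC50 definition given above (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal Python sketch. The function names here are hypothetical helpers introduced for illustration and are not part of the dataset or of BindingDB; the nanomolar variant assumes the raw IC50 is reported in nM and only rescales to molar before taking the logarithm.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed unit)."""
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L from a pIC50 value."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 4.0 corresponds to an IC50 of 1e-4 M (100 micromolar).
    print(ic50_to_pic50(1e-4))
    # A pIC50 of 7.0 corresponds to an IC50 of 100 nM.
    print(ic50_nm_to_pic50(100.0))
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.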