This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCc1nc(N)nc(N)c1C#CCc1cc(OC)c(OC)c(-c2ccncc2)c1. The target protein (P13955) has sequence MTLSIIVAHDKQRVIGYQNQLPWHLPNDLKHIKQLTTGNTLVMARKTFNSIGKPLPNRRNVVLTNQASFHHEGVDVINSLDEIKELSGHVFIFGGQTLYEAMIDQVDDMYITVIDGKFQGDTFFPPYTFENWEVESSVEGQLDEKNTIPHTFLHLVRRKGK. The pIC50 is 6.4. (2) The compound is Cc1ccc(S(=O)(=O)NC[C@H]2OC(O)[C@H](NC(=O)/C=C/c3ccc(Cl)c(Cl)c3)[C@@H](O)[C@@H]2O)cc1. The pIC50 is 5.7. The target protein (P52789) has sequence MIASHLLAYFFTELNHDQVQKVDQYLYHMRLSDETLLEISKRFRKEMEKGLGATTHPTAAVKMLPTFVRSTPDGTEHGEFLALDLGGTNFRVLWVKVTDNGLQKVEMENQIYAIPEDIMRGSGTQLFDHIAECLANFMDKLQIKDKKLPLGFTFSFPCHQTKLDESFLVSWTKGFKSSGVEGRDVVALIRKAIQRRGDFDIDIVAVVNDTVGTMMTCGYDDHNCEIGLIVGTGSNACYMEEMRHIDMVEGDEGRMCINMEWGAFGDDGSLNDIRTEFDQEIDMGSLNPGKQLFEKMISGMYMGELVRLILVKMAKEELLFGGKLSPELLNTGRFETKDISDIEGEKDGIRKAREVLMRLGLDPTQEDCVATHRICQIVSTRSASLCAATLAAVLQRIKENKGEERLRSTIGVDGSVYKKHPHFAKRLHKTVRRLVPGCDVRFLRSEDGSGKGAAMVTAVAYRLADQHRARQKTLEHLQLSHDQLLEVKRRMKVEMERGLS.... (3) The drug is C=CC(=O)Nc1cncc(-c2cnc3[nH]cc(-c4cn(C)c(=O)c5[nH]ccc45)c3c2)c1. The target protein (P19174) has sequence MAGAASPCANGCGPGAPSDAEVLHLCRSLEVGTVMTLFYSKKSQRPERKTFQVKLETRQITWSRGADKIEGAIDIREIKEIRPGKTSRDFDRYQEDPAFRPDQSHCFVILYGMEFRLKTLSLQATSEDEVNMWIKGLTWLMEDTLQAPTPLQIERWLRKQFYSVDRNREDRISAKDLKNMLSQVNYRVPNMRFLRERLTDLEQRSGDITYGQFAQLYRSLMYSAQKTMDLPFLEASTLRAGERPELCRVSLPEFQQFLLDYQGELWAVDRLQVQEFMLSFLRDPLREIEEPYFFLDEFVTFLFSKENSVWNSQLDAVCPDTMNNPLSHYWISSSHNTYLTGDQFSSESSLEAYARCLRMGCRCIELDCWDGPDGMPVIYHGHTLTTKIKFSDVLHTIKEHAFVASEYPVILSIEDHCSIAQQRNMAQYFKKVLGDTLLTKPVEISADGLPSPNQLKRKILIKHKKLAEGSAYEEVPTSMMYSENDISNSIKNGILYLEDP.... The pIC50 is 5.3. 
(4) The target protein (P43404) has sequence MPDPAAHLPFFYGSISRAEAEEHLKLAGMADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQLNGTYAIAGGKAHCGPAELCQFYSQDPDGLPCNLRKPCNRPPGLEPQPGVFDCLRDAMVRDYVRQTWKLEGDALEQAIISQAPQVEKLIATTAHERMPWYHSSLTREEAERKLYSGQQTDGKFLLRPRKEQGTYALSLVYGKTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYRLKEVCPNSSASAAVAAPTLPAHPSTFTQPQRRVDTLNSDGYTPEPARLASSTDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRENLLVADIELGCGNFGSVRQGVYRMRKKQIDVAIKVLKQGTEKADKDEMMREAQIMHQLDNPYIVRLIGVCQAEALMLVMEMAGGGPLHKFLLGKKEEIPVSNVAELLHQVAMGMKYLEEKNFVHRDLAARNVLLVNRHYAKISDFGLSKALGADDSYYTARSAGKW.... The drug is Cc1ccc(NC(=O)Nc2cc(C(F)(F)F)ccc2F)cc1Nc1ccc2c(c1)NC(=O)/C2=C\c1ccc[nH]1. The pIC50 is 5.3. (5) The drug is CC[C@@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](C)NC(=O)[C@H](Cc1ccc(O)cc1)NC(C)=O)[C@@H](C)O)C(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O. The target protein (P23921) has sequence MHVIKRDGRQERVMFDKITSRIQKLCYGLNMDFVDPAQITMKVIQGLYSGVTTVELDTLAAETAATLTTKHPDYAILAARIAVSNLHKETKKVFSDVMEDLYNYINPHNGKHSPMVAKSTLDIVLANKDRLNSAIIYDRDFSYNYFGFKTLERSYLLKINGKVAERPQHMLMRVSVGIHKEDIDAAIETYNLLSERWFTHASPTLFNAGTNRPQLSSCFLLSMKDDSIEGIYDTLKQCALISKSAGGIGVAVSCIRATGSYIAGTNGNSNGLVPMLRVYNNTARYVDQGGNKRPGAFAIYLEPWHLDIFEFLDLKKNTGKEEQRARDLFFALWIPDLFMKRVETNQDWSLMCPNECPGLDEVWGEEFEKLYASYEKQGRVRKVVKAQQLWYAIIESQTETGTPYMLYKDSCNRKSNQQNLGTIKCSNLCTEIVEYTSKDEVAVCNLASLALNMYVTSEHTYDFKKLAEVTKVVVRNLNKIIDINYYPVPEACLSNKRHRP.... The pIC50 is 5.5. (6) The pIC50 is 5.8. The drug is N#CC(C(=O)Nc1ccc(-c2ccccc2)cc1)C(=O)C1CC1. The target protein (O35435) has sequence MAWRQLRKRALDAAIILGGGGLLFTSYLTATGDDHFYAEYLMPALQRLLDPESAHRLAVRVISLGLLPRATFQDSNMLEVRVLGHKFRNPVGIAAGFDKHGEAVDGLYKLGFGFVEVGSVTPQPQEGNPRPRVFRLPEDQAVINRYGFNSHGLSAVEHRLRARQQKQTQLTTDGLPLGINLGKNKTSVDAAADYVEGVRILGPLADYLVVNVSSPNTAGLRSLQGKTELRRLLSKVLQERDALKGPQKPAVLVKIAPDLTAQDKEDIASVARELGIDGLIITNTTVSRPVGLQGALRSETGGLSGKPLRDLSTQTIREMYALTQGTIPIIGVGGVSSGQDALEKIQAGASLVQLYTALTFLGPPVVARVKRELEALLKERGFNTVTDAIGVDHRR.
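The pIC50 definition used throughout (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal Python sketch. The function names `ic50_molar_to_pic50` and `pic50_to_ic50_nanomolar` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of the bindingdb_ic50 dataset.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition: IC50 in M is 10**(-pIC50); 1 M = 1e9 nM."""
    return 10.0 ** (-pic50) * 1e9


# An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of about 6.
print(ic50_molar_to_pic50(1e-6))

# Going the other way, a pIC50 of 6.0 corresponds to about 1000 nM.
print(pic50_to_ic50_nanomolar(6.0))
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent compounds.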