This data is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is C[C@H](N)C(=O)NCC(=O)N[C@H]1CSSC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](Cc2c[nH]c3ccccc23)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCCCN)NC1=O. The target protein (P49287) has sequence MPDNSSIANCCAASGLAARPSWPGSAEAEPPETPRAPWVAPMLSTVVIVTTAVDFVGNLLVILSVLRNRKLRNAGNLFVVNLALADLVVALYPYPLILVAILHDGWVLGEIHCKASAFVMGLSVIGSVFNITAIAINRYWCICHSATYHRACSQWHAPLYISLIWLLTLVALVPNFFVGSLEYDPRIYSCTFIQTASTQYTMAVVAIHFLLPIAVVSFCYLRIWILVLQARRKAKAERKLRLRPSDLRSFLTMFAVFVVFAICWAPLNCIGLAVAINPEAMALQIPEGLFVTSYFLAYFNSCLNAIVYGLLNQNFRREYKRILSALWSTGRCFHDASKCHLTEDLQGPVPPAAMATIPVQEGAL. The pIC50 is 8.1. (2) The drug is CC(Oc1ccccc1)C(=O)Nc1ccc(Cl)cc1. The target protein (P49058) has sequence MPNKITKEALTFDDVSLIPRKSSVLPSEVSLKTQLTKNISLNIPFLSSAMDTVTESQMAIAIAKEGGIGIIHKNMSIEAQRKEIEKVKTYKFQKTINTNGDTNEQKPEIFTAKQHLEKSDAYKNAEHKEDFPNACKDLNNKLRVGAAVSIDIDTIERVEELVKAHVDILVIDSAHGHSTRIIELIKKIKTKYPNLDLIAGNIVTKEAALDLISVGADCLKVGIGPGSICTTRIVAGVGVPQITAICDVYEACNNTNICIIADGGIRFSGDVVKAIAAGADSVMIGNLFAGTKESPSEEIIYNGKKFKSYVGMGSISAMKRGSKSRYFQLENNEPKKLVPEGIEGMVPYSGKLKDILTQLKGGLMSGMGYLGAATISDLKINSKFVKISHSSLKESHPHDVFSIT. The pIC50 is 5.8. (3) The compound is Cc1ccc(C2Oc3ccc([N+](=O)[O-])cc3-n3c(-c4ccccc4)c4c(=O)n(C)c(=O)n(C)c4c32)o1.
The target protein (P13569) has sequence MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEKLEREWDRELASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLLGRIIASYDPDNKEERSIAIYLGIGLCLLFIVRTLLLHPAIFGLHHIGMQMRIAMFSLIYKKTLKLSSRVLDKISIGQLVSLLSNNLNKFDEGLALAHFVWIAPLQVALLMGLIWELLQASAFCGLGFLIVLALFQAGLGRMMMKYRDQRAGKISERLVITSEMIENIQSVKAYCWEEAMEKMIENLRQTELKLTRKAAYVRYFNSSAFFFSGFFVVFLSVLPYALIKGIILRKIFTTISFCIVLRMAVTRQFPWAVQTWYDSLGAINKIQDFLQKQEYKTLEYNLTTTEVVMENVTAFWEEGFGELFEKAKQNNNNRKTSNGDDSLFFSNFSLLGTPVLKDINFKIERGQLLAVAGSTGAGKTSLLMVIMGELEPSEGKIKHSGRISFCSQFSWIMPG.... The pIC50 is 7.0. (4) The drug is CCC1=C(C(C)C)/C(=C/C(C)=C/C=C/C(C)=C/C(=O)O)CCC1. The target protein (P11416) has sequence MASNSSSCPTPGGGHLNGYPVPPYAFFFPPMLGGLSPPGALTSLQHQLPVSGYSTPSPATIETQSSSSEEIVPSPPSPPPLPRIYKPCFVCQDKSSGYHYGVSACEGCKGFFRRSIQKNMVYTCHRDKNCIINKVTRNRCQYCRLQKCFDVGMSKESVRNDRNKKKKEAPKPECSESYTLTPEVGELIEKVRKAHQETFPALCQLGKYTTNNSSEQRVSLDIDLWDKFSELSTKCIIKTVEFAKQLPGFTTLTIADQITLLKAACLDILILRICTRYTPEQDTMTFSDGLTLNRTQMHNAGFGPLTDLVFAFANQLLPLEMDDAETGLLSAICLICGDRQDLEQPDKVDMLQEPLLEALKVYVRKRRPSRPHMFPKMLMKITDLRSISAKGAERVITLKMEIPGSMPPLIQEMLENSEGLDTLSGQSGGGTRDGGGLAPPPGSCSPSLSPSSHRSSPATQSP. The pIC50 is 7.8. (5) The drug is CC(C)Oc1ccc(C(=O)NO)cc1NC(=O)c1ccccc1. The target protein sequence is MSVGIVYGDQYRQLCCSSPKFGDRYALVMDLINAYKLIPELSRVPPLQWDSPSRMYEAVTAFHSTEYVDALKKLQMLHCEEKELTADDELLMDSFSLNYDCPGFPSVFDYSLAAVQGSLAAASALICRHCEVVINWGGGWHHAKRSEASGFCYLNDIVLAIHRLVSSTPPETSPNRQTRVLYVDLDLHHGDGVEEAFWYSPRVVTFSVHHASPGFFPGTGTWNMVDNDKLPIFLNGAGRGRFSAFNLPLEEGINDLDWSNAIGPILDSLNIVIQPSYVVVQCGADCLATDPHRIFRLTNFYPNLNLDSDCDSECSLSGYLYAIKKILSWKVPTLILGGGGYNFPDTARLWTRVTALTIEEVKGKKMTISPEIPEHSYFSRYGPDFELDIDYFPHESHNKTLDSIQKHHRRILEQLRNYADLNKLIYDYDQVYQLYNLTGM. The pIC50 is 6.7. (6) The small molecule is NCCCNCCCCCC(CCN)SC[C@H]1O[C@@H](n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1O. 
The target protein sequence is MEPGPDGPAAPGPAAIREGWFRETCSLWPGQALSLQVEQLLHHRRSRYQDILVFRSKTYGNVLVLDGVIQCTERDEFSYQEMIANLPLCSHPNPRKVLIIGGGDGGVLREVVKHPSVESVVQCEIDEDVIEVSKKFLPGMAVGYSSSKLTLHVGDGFEFMKQNQDAFDVIITDSSDPMGPAESLFKESYYQLMKTALKEDGILCCQGECQWLHLDLIKEMRHFCKSLFPVVSYAYCTIPTYPSGQIGFMLCSKNPSTNFREPVQQLTQAQVEQMQLKYYNSDMHRAAFVLPEFTRKALNDIS. The pIC50 is 5.0. (7) The small molecule is Fc1cccc(CN2CCn3nnnc3C2c2cccc(F)c2F)c1. The target protein sequence is HPGLGELGQGPDSYGSPSFRSTPEAPYASLTEIEHLVQSVCKSYRETCQLRLEDLLRQRSNIFSREEVTGYQRKSMWEMWERCAHHLTEAIQYVVEFAKRLSGFMELCQNDQIVLLKAGAMEVVLVRMCRAYNADNRTVFFEGKYGGMELFRALGCSELISSIFDFSHSLSALHFSEDEIALYTALVLINAHRPGLQEKRKVEQLQYNLELAFHHHLCKTHRQSILAKLPPKGKLRSLCSQHVERLQIFQHLHPIVVQAA. The pIC50 is 5.3.
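The pIC50 definition used above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, not part of BindingDB or the bindingdb_ic50 dataset, and the nanomolar conversion is added as an assumption because IC50 values are commonly reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration; not part of the dataset tooling.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M is 10 ** (-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A pIC50 of 8.1, as in record (1), corresponds to roughly 7.9 nM.
    print(pic50_to_ic50_nm(8.1))
    # A 10 nM IC50 (1e-8 M) corresponds to a pIC50 of about 8.
    print(ic50_to_pic50(1e-8))
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50, so the record with pIC50 8.1 is about a thousand times more potent than the record with pIC50 5.0.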