This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(Oc1cccc([N+](=O)[O-])c1Cl)C(=O)Nc1ccc2oc(-c3ccncc3)nc2c1. The target protein sequence is MWESKFVKEGLTFDDVLLVPAKSDVLPREVSVKTVLSESLQLNIPLISAGMDTVTEADMAIAMARQGGLGIIHKNMSIEQQAEQVDKVKRSESGVISDPFFLTPEHQVYDAEHLMGKYRISGVPVVNNLDERKLVGIITNRDMRFIQDYSIKISDVMTKEQLITAPVGTTLSEAEKILQKYKIEKLPLVDNNGVLQGLITIKDIEKVIEFPNSAKDKQGRLLVGAAVGVTADAMTRIDALVKASVDAIVLDTAHGHSQGVIDKVKEVRAKYPSLNIIAGNVATAEATKALIEAGANVVKVGIGPGSICTTRVVAGVGVPQLTAVYDCATEARKHGIPVIADGGIKYSGDMVKALAAGAHVVMLGSMFAGVAESPGETEIYQGRQFKVYRGMGSVGAMEKGSKDRYFQEGNKKLVPEGIEGRVPYKGPLADTVHQLVGGLRAGMGYCGAQDLEFLRENAQFIRMSGAGLLESHPHHVQITKEAPNYSL. The pIC50 is 7.8. (2) The compound is O=c1ccc2ccccc2o1. The target protein (P22310) has sequence MARGLQVPLPRLATGLLLLLSVQPWAESGKVLVVPTDGSPWLSMREALRELHARGHQAVVLTPEVNMHIKEEKFFTLTAYAVPWTQKEFDRVTLGYTQGFFETEHLLKRYSRSMAIMNNVSLALHRCCVELLHNEALIRHLNATSFDVVLTDPVNLCGAVLAKYLSIPAVFFWRYIPCDLDFKGTQCPNPSSYIPKLLTTNSDHMTFLQRVKNMLYPLALSYICHTFSAPYASLASELFQREVSVVDLVSYASVWLFRGDFVMDYPRPIMPNMVFIGGINCANGKPLSQEFEAYINASGEHGIVVFSLGSMVSEIPEKKAMAIADALGKIPQTVLWRYTGTRPSNLANNTILVKWLPQNDLLGHPMTRAFITHAGSHGVYESICNGVPMVMMPLFGDQMDNAKRMETKGAGVTLNVLEMTSEDLENALKAVINDKSYKENIMRLSSLHKDRPVEPLDLAVFWVEFVMRHKGAPHLRPAAHDLTWYQYHSLDVIGFLLAVV.... The pIC50 is 3.5. (3) The small molecule is CCc1nc(N)nc(N)c1C#C[C@@H](C)c1cc(-c2ccc(C(=O)O)cc2)ccc1OC. The target protein sequence is MKVSLIAAMDKNRVIGKENDIPWRIPEDWEYVKNTTKGYPIILGRKNLESIGRALPGRRNIILTRDKGFSFNGCEIVHSIEDVFELCNSEEEIFIFGGEQIYNLFLPYVEKMYITKIHYEFEGDTFFPEVNYEEWNEVSVTQGITNEKNPYTYYFHIYERKAS. The pIC50 is 7.1. (4) The drug is CCOc1ccc(CC(=O)N[C@H]2CCOC2=O)cc1. 
The target protein sequence is MHDEREGYLEILSRITTEEEFFSLVLEICGNYGFEFFSFGARAPFPLTAPKYHFLSNYPGEWKSRYISEDYTSIDPIVRHGLLEYTPLIWNGEDFQENRFFWEEALHHGIRHGWSIPVRGKYGLISMLSLVRSSESIAATEILEKESFLLWITSMLQATFGDLLAPRIVPESNVRLTARETEMLKWTAVGKTYGEIGLILSIDQRTVKFHIVNAMRKLNSSNKAEATMKAYAIGLLN. The pIC50 is 6.8. (5) The drug is N=C(N)NCCC[C@H](NC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@@H]1C[C@@H](O)CN1C(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@@H]1C[C@@H](O)CN1C(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@@H]1C[C@@H](O)CN1C(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@@H]1C[C@@H](O)CN1C(=O)[C@@H]1CCCN1)C(=O)NCC(=O)N1CCC[C@H]1C(=O)N1C[C@H](O)C[C@H]1C(=O)NCC(=O)N1CCC[C@H]1C(=O)N1C[C@H](O)C[C@H]1C(=O)NCC(=O)N1CCC[C@H]1C(=O)N1C[C@H](O)C[C@H]1C(=O)NCC(=O)N1CCC[C@H]1C(=O)N1C[C@H](O)C[C@H]1C(=O)NCC(N)=O. The target protein (P19324) has sequence MRSLLLGTLCLLAVALAAEVKKPLEAAAPGTAEKLSSKATTLAERSTGLAFSLYQAMAKDQAVENILLSPLVVASSLGLVSLGGKATTASQAKAVLSAEKLRDEEVHTGLGELLRSLSNSTARNVTWKLGSRLYGPSSVSFADDFVRSSKQHYNCEHSKINFRDKRSALQSINEWASQTTDGKLPEVTKDVERTDGALLVNAMFFKPHWDEKFHHKMVDNRGFMVTRSYTVGVTMMHRTGLYNYYDDEKEKLQMVEMPLAHKLSSLIILMPHHVEPLERLEKLLTKEQLKAWMGKMQKKAVAISLPKGVVEVTHDLQKHLAGLGLTEAIDKNKADLSRMSGKKDLYLASVFHATAFEWDTEGNPFDQDIYGREELRSPKLFYADHPFIFLVRDNQSGSLLFIGRLVRPKGDKMRDEL. The pIC50 is 5.3. (6) The compound is CC1(C)O[C@@H]2CO[C@@]3(COS(N)(=O)=O)OC(C)(C)O[C@H]3[C@@H]2O1. The target protein (P27139) has sequence MSHHWGYSKSNGPENWHKEFPIANGDRQSPVDIDTGTAQHDPSLQPLLICYDKVASKSIVNNGHSFNVEFDDSQDFAVLKEGPLSGSYRLIQFHFHWGSSDGQGSEHTVNKKKYAAELHLVHWNTKYGDFGKAVQHPDGLAVLGIFLKIGPASQGLQKITEALHSIKTKGKRAAFANFDPCSLLPGNLDYWTYPGSLTTPPLLECVTWIVLKEPITVSSEQMSHFRKLNFNSEGEAEELMVDNWRPAQPLKNRKIKASFK. The pIC50 is 5.8. (7) The compound is C[C@@H]1O[C@@H](O[C@H]2C[C@@H](O)[C@]3(CO)[C@H]4[C@H](O)C[C@]5(C)[C@@H](C6=CC(=O)OC6)CC[C@]5(O)[C@@H]4CC[C@]3(O)C2)[C@H](O)[C@H](O)[C@H]1O. 
The target protein (P54707) has sequence MHQKTPEIYSVELSGTKDIVKTDKGDGKEKYRGLKNNCLELKKKNHKEEFQKELHLDDHKLSNRELEEKYGTDIIMGLSSTRAAELLARDGPNSLTPPKQTPEIVKFLKQMVGGFSILLWVGAFLCWIAYGIQYSSDKSASLNNVYLGCVLGLVVILTGIFAYYQEAKSTNIMSSFNKMIPQQALVIRDSEKKTIPSEQLVVGDIVEVKGGDQIPADIRVLSSQGCRVDNSSLTGESEPQPRSSEFTHENPLETKNICFYSTTCLEGTVTGMVINTGDRTIIGHIASLASGVGNEKTPIAIEIEHFVHIVAGVAVSIGILFFIIAVSLKYQVLDSIIFLIGIIVANVPEGLLATVTVTLSLTAKRMAKKNCLVKNLEAVETLGSTSIICSDKTGTLTQNRMTVAHLWFDNQIFVADTSEDHSNQVFDQSSRTWASLSKIITLCNRAEFKPGQENVPIMKKAVIGDASETALLKFSEVILGDVMEIRKRNRKVAEIPFNST.... The pIC50 is 3.7. (8) The small molecule is CCC[C@H](NC(=O)[C@@H]1[C@H]2[C@H]3C=C[C@H](C3)[C@H]2CN1C(=O)[C@@H](NC(=O)[C@@H](NC(=O)c1cnccn1)C1CCCCC1)C(C)(C)C)C(=O)C(=O)NC1CC1. The target protein sequence is APITAYAQQTRGLLGCIITSLTGRDKNQVEGEVQIVSTAAQTFLATCINGVCWTVYHGAGTRTIASPKGPVIQMYTNVDQDLVGWPAPQGSRSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPISYLKGSSGGPLLCPAGHAVGIFRAAVCTRGVAKAVDFIPVENLETTMRSPVFTDNSSPPVVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRTGVRTITTGSPITYSTYGKFLADGGCSGGAYDIIICDECHSTDATSILGIGTVLDQAETAGARLVVLATATPPGSVTVPHPNIEEVALSTTGEIPFYGKAIPLEVIKGGRHLIFCHSKKKCDELAAKLVALGINAVAYYRGLDVSVIPTSGDVVVVATDALMTGYTGDFDSVIDCNTCVTQTVDFSLDPTFTIETITLPQDAVSRTQRRGRTGRGKPGIYRFVAPGERPSGMFDSSVLCECYDAGCA.... The pIC50 is 6.7.
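Note on the label: the header defines pIC50 = -log10(IC50 in M). The sketch below shows that conversion in both directions. It assumes IC50 is supplied in nanomolar, which is an assumption about the input units rather than something stated in this record, and the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers, not part of any BindingDB tooling.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in M) and 1 nM = 1e-9 M,
    so pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the conversion: return the IC50 in nanomolar for a given pIC50."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 labels taken from records (1), (2) and (4) of this section.
    for label in (7.8, 3.5, 6.8):
        print(f"pIC50 {label} corresponds to about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, one pIC50 unit is a tenfold change in IC50, so an error of 1.0 in a regression prediction means the predicted concentration is off by a factor of ten.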