Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C(O)Cc1csc(=NC(=O)CNc2ccccc2)n1O. The target protein (Q9GZT9) has sequence MANDSGGPGGPSPSERDRQYCELCGKMENLLRCSRCRSSFYCCKEHQRQDWKKHKLVCQGSEGALGHGVGPHQHSGPAPPAAVPPPRAGAREPRKAAARRDNASGDAAKGKVKAKPPADPAAAASPCRAAAGGQGSAVAAEAEPGKEEPPARSSLFQEKANLYPPSNTPGDALSPGGGLRPNGQTKPLPALKLALEYIVPCMNKHGICVVDDFLGKETGQQIGDEVRALHDTGKFTDGQLVSQKSDSSKDIRGDKITWIEGKEPGCETIGLLMSSMDDLIRHCNGKLGSYKINGRTKAMVACYPGNGTGYVRHVDNPNGDGRCVTCIYYLNKDWDAKVSGGILRIFPEGKAQFADIEPKFDRLLFFWSDRRNPHEVQPAYATRYAITVWYFDADERARAKVKYLTGEKGVRVELNKPSDSVGKDVF. The pIC50 is 5.8. (2) The target protein (Q15172) has sequence MSSSSPPAGAASAAISASEKVDGFTRKSVRKAQRQKRSQGSSQFRSQGSQAELHPLPQLKDATSNEQQELFCQKLQQCCILFDFMDSVSDLKSKEIKRATLNELVEYVSTNRGVIVESAYSDIVKMISANIFRTLPPSDNPDFDPEEDEPTLEASWPHIQLVYEFFLRFLESPDFQPSIAKRYIDQKFVQQLLELFDSEDPRERDFLKTVLHRIYGKFLGLRAFIRKQINNIFLRFIYETEHFNGVAELLEILGSIINGFALPLKAEHKQFLMKVLIPMHTAKGLALFHAQLAYCVVQFLEKDTTLTEPVIRGLLKFWPKTCSQKEVMFLGEIEEILDVIEPTQFKKIEEPLFKQISKCVSSSHFQVAERALYFWNNEYILSLIEENIDKILPIMFASLYKISKEHWNPTIVALVYNVLKTLMEMNGKLFDDLTSSYKAERQREKKKELEREELWKKLEELKLKKALEKQNSAYNMHSILSNTSAE. The pIC50 is 9.8. The small molecule is C=C1[C@@H]([C@@H](O)C[C@H](C)[C@H]2O[C@@]3(CCCCO3)CC[C@H]2C)O[C@@H]2CC[C@@]3(CC[C@H](/C=C/[C@@H](C)[C@@H]4CC(C)=C[C@@]5(O[C@H](C[C@@](C)(O)C(=O)O)CC[C@H]5O)O4)O3)O[C@H]2[C@@H]1O. (3) The small molecule is O=C(Nc1ccc(Oc2ccccc2)cc1)Nc1ccc2c(cnn2CCCN2CCC(O)CC2)c1. 
The target protein sequence is MSVGAGKEGVGRAVGLRGSRGCQAVEEDPFLDCGAQAPGQGGGGRWRLPQPAWVDGRALHSREQATCTGCMDLQASLLSTGPNASNISDGQDNFTLAGPPPRTRSVSYINIIMPSVFGTICLLGIVGNSTVIFAVVKKSKLHWCSNVPDIFIINLSVVDLLFLLGMPFMIHQLMGNGVWHFGETMCTLITAMDANSQFTSTYILTAMAIDRYLATVHPISSTKFRKPSMATLVICLLWALSFISITPVWLYARLIPFPGGAVGCGIRLPNPDTDLYWFTLYQFFLAFALPFVVITAAYVKILQRMTSSVAPASQRSIRLRTKRVTRTAIAICLVFFVCWAPYYVLQLTQLSISRPTLTFVYLYNAAISLGYANSCLNPFVYIVLCETFRKRLVLSVKPAAQGQLRTVSNAQTADEERTESKGT. The pIC50 is 7.0. (4) The drug is CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)C(C)C. The target protein (P09610) has sequence MLSRLFRMHGLFVASHPWEVIVGTVTLTICMMSMNMFTGNNKICGWNYECPKFEEDVLSSDIIILTITRCIAILYIYFQFQNLRQLGSKYILGIAGLFTIFSSFVFSTVVIHFLDKELTGLNEALPFFLLLIDLSRASALAKFALSSNSQDEVRENIARGMAILGPTFTLDALVECLVIGVGTMSGVRQLEIMCCFGCMSVLANYFVFMTFFPACVSLVLELSRESREGRPIWQLSHFARVLEEEENKPNPVTQRVKMIMSLGLVLVHAHSRWIADPSPQNSTTEHSKVSLGLDEDVSKRIEPSVSLWQFYLSKMISMDIEQVVTLSLAFLLAVKYIFFEQAETESTLSLKNPITSPVATPKKAPDNCCRREPVLSRRNEKLSSVEEEPGVNQDRKVEVIKPLVAETESTSRATFVLGASGGCSPVALGTQEPEIELPSEPRPNEECLQILESAEKGAKFLSDAEIIQLVNAKHIPAYKLETLMETHERGVSIRRQLLST.... The pIC50 is 4.1. (5) The compound is CC[C@@H](CO)Nc1ccc2ncc(-c3ccnc(S(C)(=O)=O)n3)n2n1. The target protein (O35492) has sequence MPVLSARRKRLASTAGPRRGSGPSLAVRWVPPLGPEPSSDRGRAPMRPRGPTCSTTRRGAGRGPRLLPGPPGRDLHRCRPDPGGAGQSPRVCEFGARAVRPLGRVEPGPPTAASREGAVLPRAEARAGSGRGARSGEWGLAAAGAWETMHHCKRYRSPEPDPYLSYRWKRRRSYSREHEGRLRYPSRREPPPRRSRSRSHDRIPYQRRYREHRDSDTYRCEERSPSFGEDCYGSSRSRHRRRSRERAPYRTRKHAHHCHKRRTRSCSSASSRSQQSSKRSSRSVEDDKEGHLVCRIGDWLQERYEIVGNLGEGTFGKVVECLDHARGKSQVALKIIRNVGKYREAARLEINVLKKIKEKDKENKFLCVLMSDWFNFHGHMCIAFELLGKNTFEFLKENNFQPYPLPHVRHMAYQLCHALRFLHENQLTHTDLKPENILFVNSEFETLYNEHKSCEEKSVKNTSIRVADFGSATFDHEHHTTIVATRHYRPPEVILELGWA.... The pIC50 is 5.2.
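As a reading aid for the pIC50 values quoted above, the conversion stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python. This is a minimal sketch: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers for illustration and are not part of BindingDB or any library.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar.

    IC50 in M is 10**(-pIC50); multiplying by 1e9 gives nM,
    which is the same as 10**(9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 values taken from the five examples above.
    for pic50 in (5.8, 9.8, 7.0, 4.1, 5.2):
        print(f"pIC50 {pic50:>4} -> IC50 ~ {pic50_to_ic50_nm(pic50):.3g} nM")
```

Under this definition, each unit of pIC50 is a tenfold change in IC50: a pIC50 of 5.8 corresponds to roughly 1585 nM (about 1.6 µM), while 9.8 corresponds to roughly 0.16 nM, which is why higher values mean more potent compounds.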