Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=c1[nH]c2ccccc2n1C1CCN(CCCC(c2ccc(F)cc2)c2ccc(F)cc2)CC1. The target protein (Q01118) has sequence MLASPEPKGLVPFTKESFELIKQHIAKTHNEDHEEEDLKPTPDLEVGKKLPFIYGNLSQGMVSEPLEDVDPYYYKKKNTFIVLNKNRTIFRFNAASILCTLSPFNCIRRTTIKVLVHPFFQLFILISVLIDCVFMSLTNLPKWRPVLENTLLGIYTFEILVKLFARGVWAGSFSFLGDPWNWLDFSVTVFEVIIRYSPLDFIPTLQTARTLRILKIIPLNQGLKSLVGVLIHCLKQLIGVIILTLFFLSIFSLIGMGLFMGNLKHKCFRWPQENENETLHNRTGNPYYIRETENFYYLEGERYALLCGNRTDAGQCPEGYVCVKAGINPDQGFTNFDSFGWALFALFRLMAQDYPEVLYHQILYASGKVYMIFFVVVSFLFSFYMASLFLGILAMAYEEEKQRVGEISKKIEPKFQQTGKELQEGNETDEAKTIQIEMKKRSPISTDTSLDVLEDATLRHKEELEKSKKICPLYWYKFAKTFLIWNCSPCWLKLKEFVHR.... The pIC50 is 7.3. (2) The compound is C=CC(=O)Nc1ccc(S(=O)(=O)N2CCN(C(=O)C34CC5CC(CC(C5)C3)C4)CC2)cc1. The target protein (P21981) has sequence MAEELLLERCDLEIQANGRDHHTADLCQEKLVLRRGQRFRLTLYFEGRGYEASVDSLTFGAVTGPDPSEEAGTKARFSLSDNVEEGSWSASVLDQQDNVLSLQLCTPANAPIGLYRLSLEASTGYQGSSFVLGHFILLYNAWCPADDVYLDSEEERREYVLTQQGFIYQGSVKFIKSVPWNFGQFEDGILDTCLMLLDMNPKFLKNRSRDCSRRSSPIYVGRVVSAMVNCNDDQGVLLGRWDNNYGDGISPMAWIGSVDILRRWKEHGCQQVKYGQCWVFAAVACTVLRCLGIPTRVVTNYNSAHDQNSNLLIEYFRNEFGELESNKSEMIWNFHCWVESWMTRPDLQPGYEGWQAIDPTPQEKSEGTYCCGPVSVRAIKEGDLSTKYDAPFVFAEVNADVVDWIRQEDGSVLKSINRSLVVGQKISTKSVGRDDREDITHTYKYPEGSPEEREVFTKANHLNKLAEKEETGVAMRIRVGDSMSMGNDFDVFAHIGNDTS.... The pIC50 is 7.8. (3) The small molecule is Cc1c(-c2[nH]c3ccc(C4CCN(C(=O)CCN5CCCCCC5)CC4)cc3c2C(C)C)cn2ncnc2c1C. 
The target protein (Q9NR97) has sequence MENMFLQSSMLTCIFLLISGSCELCAEENFSRSYPCDEKKQNDSVIAECSNRRLQEVPQTVGKYVTELDLSDNFITHITNESFQGLQNLTKINLNHNPNVQHQNGNPGIQSNGLNITDGAFLNLKNLRELLLEDNQLPQIPSGLPESLTELSLIQNNIYNITKEGISRLINLKNLYLAWNCYFNKVCEKTNIEDGVFETLTNLELLSLSFNSLSHVPPKLPSSLRKLFLSNTQIKYISEEDFKGLINLTLLDLSGNCPRCFNAPFPCVPCDGGASINIDRFAFQNLTQLRYLNLSSTSLRKINAAWFKNMPHLKVLDLEFNYLVGEIASGAFLTMLPRLEILDLSFNYIKGSYPQHINISRNFSKLLSLRALHLRGYVFQELREDDFQPLMQLPNLSTINLGINFIKQIDFKLFQNFSNLEIIYLSENRISPLVKDTRQSYANSSSFQRHIRKRRSTDFEFDPHSNFYHFTRPLIKPQCAAYGKALDLSLNSIFFIGPNQ.... The pIC50 is 8.1. (4) The small molecule is O=c1[nH]c2cccc(Cl)c2cc1O. The target protein (P31228) has sequence MDTVRIAVVGAGVMGLSTAVCISKMVPGCSITVISDKFTPETTSDVAAGMLIPPTYPDTPIQKQKQWFKETFDHLFAIVNSAEAEDAGVILVSGWQIFQSIPTEEVPYWADVVLGFRKMTKDELKKFPQHVFGHAFTTLKCEGPAYLPWLQKRVKGNGGLILTRRIEDLWELHPSFDIVVNCSGLGSRQLAGDSKIFPVRGQVLKVQAPWVKHFIRDSSGLTYIYPGVSNVTLGGTRQKGDWNLSPDAEISKEILSRCCALEPSLRGAYDLREKVGLRPTRPSVRLEKELLAQDSRRLPVVHHYGHGSGGIAMHWGTALEATRLVNECVQVLRTPAPKSKL. The pIC50 is 5.0. (5) The small molecule is Cc1c(C(=O)C(N)=O)c2c(OCC(=O)O)cccc2n1Cc1cccc2ccccc12. The target protein (P04054) has sequence MKLLVLAVLLTVAAADSGISPRAVWQFRKMIKCVIPGSDPFLEYNNYGCYCGLGGSGTPVDELDKCCQTHDNCYDQAKKLDSCKFLLDNPYTHTYSYSCSGSAITCSSKNKECEAFICNCDRNAAICFSKAPYNKAHKNLDTKKYCQS. The pIC50 is 5.9. (6) The drug is O=C(NCCN1CCOCC1)c1ccc2c(c1)C(=O)c1ccc(Nc3ccc(F)c(NC(=O)c4ccccc4)c3)cc1OC2. The target protein sequence is MSQERPTFYRQELNKTIWEVPERYQNLSPVGSGAYGSVCAAFDTKTGLRVAVKKLSRPFQSIIHAKRTYRELRLLKHMKHENVIGLLDVFTPARSLEEFNDVYLVTHLMGADLNNIVKCQKLTDDHVQFLIYQILRGLKYIHSADIIHRDLKPSNLAVNEDCELKILDFGLARHTDDEMTGYVATRWYRAPEIMLNWMHYNQTVDIWSVGCIMAELLTGRTLFPGTDHIDQLKLILRLVGTPGAELLKKISSESARNYIQSLTQMPKMNFANVFIGANPLAVDLLEKMLVLDSDKRITAAQALAHAYFAQYHDPDDEPVADPYDQSFESRDLLIDEWKSLTYDEVISFVPPPLDQEEMES. The pIC50 is 9.0.
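The target definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset, and the sketch assumes IC50 values are supplied in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration; not part of the dataset tooling.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar.

    IC50 (M) = 10 ** (-pIC50), and 1 M = 1e9 nM, so IC50 (nM) = 10 ** (9 - pIC50).
    """
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # pIC50 labels quoted in the examples above, converted back to nM.
    for label in (7.3, 7.8, 8.1, 5.0, 5.9, 9.0):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50: the label 9.0 in example (6) corresponds to an IC50 of 1 nM, while the label 5.0 in example (4) corresponds to 10 µM.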