Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1cc(O)c2c(c1)O[C@@]13C(=O)N(C)C(CC=C1C2=O)[C@H]3O. The target protein (P15348) has sequence MENGNKALSIEQMYQKKSQLEHILLRPDSYIGSVEFTKELMWVYDNSQNRMVQKEISFVPGLYKIFDEILVNAADNKQRDKSMNTIKIDIDPERNMVSVWNNGQGIPVTMHKEQKMYVPTMIFGHLLTSSNYNDDEKKVTGGRNGYGAKLCNIFSTSFTVETATREYKKSFKQTWGNNMGKASDVQIKDFNGTDYTRITFSPDLAKFKMDRLDEDIVALMSRRAYDVAASSKGVSVFLNGNKLGVRNFKDYIDLHIKNTDDDSGPPIKIVHEVANERWEVACCPSDRGFQQVSFVNSIATYKGGRHVDHVVDNLIKQLLEVLKKKNKGGINIKPFQVRNHLWVFVNCLIENPTFDSQTKENMTLQQKGFGSKCTLSEKFINNMSKSGIVESVLAWAKFKAQNDIAKTGGRKSSKIKGIPKLEDANEAGGKNSIKCTLILTEGDSAKSLAVSGLGVIGRDLYGVFPLRGKLLNVREANFKQLSENAEINNLCKIIGLQYKK.... The pIC50 is 3.5. (2) The small molecule is N#Cc1ccc(C2NC(=O)N(c3cccc(C(F)(F)F)c3)C3=C2C(=O)CC3)cn1. The target protein (P08246) has sequence MTLGRRLACLFLACVLPALLLGGTALASEIVGGRRARPHAWPFMVSLQLRGGHFCGATLIAPNFVMSAAHCVANVNVRAVRVVLGAHNLSRREPTRQVFAVQRIFENGYDPVNLLNDIVILQLNGSATINANVQVAQLPAQGRRLGNGVQCLAMGWGLLGRNRGIASVLQELNVTVVTSLCRRSNVCTLVRGRQAGVCFGDSGSPLVCNGLIHGIASFVRGGCASGLYPDAFAPVAQFVNWIDSIIQRSEDNPCPHPRDPDPASRTH. The pIC50 is 6.7. (3) The compound is CN(C)CCCn1nc(C2=C(c3cn(-c4cccnc4)c4ccccc34)C(=O)NC2=O)c2ccccc21. The target protein (P11440) has sequence MEDYIKIEKIGEGTYGVVYKGRHRVTGQIVAMKKIRLESEEEGVPSTAIREISLLKELRHPNIVSLQDVLMQDSRLYLIFEFLSMDLKKYLDSIPPGQFMDSSLVKSYLHQILQGIVFCHSRRVLHRDLKPQNLLIDDKGTIKLADFGLARAFGIPIRVYTHEVVTLWYRSPEVLLGSARYSTPVDIWSIGTIFAELATKKPLFHGDSEIDQLFRIFRALGTPNNEVWPEVESLQDYKNTFPKWKPGSLASHVKNLDENGLDLLSKMLVYDPAKRISGKMALKHPYFDDLDNQIKKM. The pIC50 is 6.5. (4) The compound is Cc1ccc2nc(O)c(O)nc2n1. 
The target protein (P00371) has sequence MRVVVIGAGVIGLSTALCIHERYHSVLQPLDVKVYADRFTPFTTTDVAAGLWQPYTSEPSNPQEANWNQQTFNYLLSHIGSPNAANMGLTPVSGYNLFREAVPDPYWKDMVLGFRKLTPRELDMFPDYRYGWFNTSLILEGRKYLQWLTERLTERGVKFFLRKVESFEEVARGGADVIINCTGVWAGVLQPDPLLQPGRGQIIKVDAPWLKNFIITHDLERGIYNSPYIIPGLQAVTLGGTFQVGNWNEINNIQDHNTIWEGCCRLEPTLKDAKIVGEYTGFRPVRPQVRLEREQLRFGSSNTEVIHNYGHGGYGLTIHWGCALEVAKLFGKVLEERNLLTMPPSHL. The pIC50 is 4.8. (5) The compound is CC(=O)Nc1cc(Oc2ccc3c(C(=O)Nc4ccc(CN5CCN(C)CC5)c(C(F)(F)F)c4)cccc3c2)ncn1. The target protein sequence is MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRXDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFIHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pIC50 is 5.0. (6) The drug is Cc1cc(=O)n2nc(COc3ccccc3Cl)sc2n1. The target protein (P22985) has sequence MTADELVFFVNGKKVVEKNADPETTLLVYLRRKLGLCGTKLGCGEGGCGACTVMISKYDRLQNKIVHFSVNACLAPICSLHHVAVTTVEGIGNTQKLHPVQERIARSHGSQCGFCTPGIVMSMYTLLRNQPEPTVEEIENAFQGNLCRCTGYRPILQGFRTFAKDGGCCGGSGNNPNCCMNQTKDQTVSLSPSLFNPEDFKPLDPTQEPIFPPELLRLKDTPQKKLRFEGERVTWIQASTMEELLDLKAQHPDAKLVVGNTEIGIEMKFKNMLFPLIVCPAWIPELNSVVHGPEGISFGASCPLSLVESVLAEEIAKLPEQKTEVFRGVMEQLRWFAGKQVKSVASIGGNIITASPISDLNPVFMASGAKLTLVSRGTRRTVRMDHTFFPGYRKTLLRPEEILLSIEIPYSKEGEFFSAFKQASRREDDIAKVTSGMRVLFKPGTIEVQELSLCFGGMADRTISALKTTPKQLSKSWNEELLQSVCAGLAEELQLAPDAP.... The pIC50 is 5.7. (7) The pIC50 is 4.3. 
The target protein (P38417) has sequence MFPFGQKGQKIKGTMVVMQKNVLDINSITSVGGIVDQGLGFIGSAVDALTFAATKISIQLISATKADGGKGKIGKSTNLRGKITLPTLGAGEQAYDVNFEWDSDFGIPGAFYIKNFMQNEFYLKSLILEDIPNHGTIHFVCNSWVYNSKNYKTDRIFFANNTYLPSETPAPLLKYREEELKNVRGDGTGERKEWDRIYDYDVYNDLGNPDSGDKYARPVLGGSALPYPRRERTGRGKTRKDPNSEKPSDFVYLPRDEAFGHLKSSDFLAYGIKSVSQDVLPVLTDAFDGNILSLEFDNFAEVHKLYEGGVTLPTNFLSKIAPIPVIKEIFRTDGEQFLKYPPPKVMQVDKSAWMTDEEFARETIAGLNPNVIKIIEEFPLSSKLDTQAYGDHTCIIAKEHLEPNLGGLTVEQAIQNKKLFILDHHDYLIPYLRKINANTTKTYATRTIFFLKDDGTLTPLAIELSKPHPQGEEYGPVSEVYVPASEGVEAYIWLLAKAYV.... The small molecule is O=C(O)CC(=O)OCC1OC(Oc2c[nH]c3ccccc23)C(O)C(=O)C1O.
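The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration only: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset, and the assumption that raw IC50 values arrive in nanomolar is ours, not stated in the record.

```python
import math

NM_PER_M = 1e9  # nanomolar units per molar unit


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed input unit) to pIC50.

    pIC50 = -log10(IC50 in M), so a lower IC50 gives a higher pIC50.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm / NM_PER_M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: recover IC50 in nanomolar from a pIC50."""
    return (10.0 ** (-pic50)) * NM_PER_M


if __name__ == "__main__":
    # An IC50 of 1000 nM (1 micromolar) is 1e-6 M.
    print(round(ic50_nm_to_pic50(1000.0), 3))
    # A pIC50 of 6.5, as in example (3), converted back to nanomolar.
    print(round(pic50_to_ic50_nm(6.5), 1))
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding.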