Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc(-c2cnc3nc(N)nc(N4CCCCC4)c3n2)cc1. The target protein (Q28969) has sequence MGNLKSVGQEPGPPCGLGLGLGLGLCGKQGPATPAPEPSRAPAPATPHAPEHSPAPNSPTLTRPPEGPKFPRVKNWEVGSITYDTLCAQSQQDGPCTPRRCLGSLVLPRKLQSRPSPGPPPAEQLLSQARDFINQYYSSIKRSGSQAHEERLQEVEAEVATTGTYHLGESELVFGAKQAWRNAPRCVGRIQWGKLQVFDARDCSSAQEMFTYICNHIKYATNRGNLRSAITVFPQRTPGRGDFRIWNSQLVRYAGYRQQDGSVRGDPANVEITELCIQHGWTPGNGRFDVLPLLLQAPDEPPELFALPPELVLEVPLEHPTLEWFAALGLRWYALPAVSNMLLEIGGLEFPAAPFSGWYMSTEIGTRNLCDPHRYNILEDVAVCMDLDTRTTSSLWKDKAAVEINLAVLHSYQLAKVTIVDHHAATASFMKHLENEQKARGGCPADWAWIVPPISGSLTPVFHQEMVNYVLSPAFRYQPDPWKGSAAKGTGIARKKTFKE.... The pIC50 is 4.3. (2) The drug is OC[C@H]1NC[C@H](O)[C@@H]1O. The target protein (Q4FZV0) has sequence MHLHLLFLLALCGAGCMAAGPSYSLRGSWRVSNGNSSLELPATVPGYVHSALQQHGLIQDPYYRFNDLNYRWISLDNWTYSTEFKIPFNRSEWQKVKLIFDGVDTVAEILFNNVTIGKTDNMFTRYSFDVTNVVKDVNSLKLRFQSAVQYAECQSKAHTQYRVPPECPPVEQKGECHVNFIRKEQCSFSWDWGPSFPSQGIWKDVRIEAYNIAHLDHLTFLPLYDNTSQAWTIEIEASFDVVSTKPVGGQVTIAIPELKTQQANHIELQHGQRIVKLLVKIRKDVTVETWWPHGHGNQTGYNTTILFALDGGLKIEKAAKVYFRTVQLIEEPITGSPGLSFYFKINGLPIFLKGSNWIPADSFQDKVTSELLQLLLQSAVDANMNTLRVWGGGIYEQDEFYALCDELGIMVWQDFMFASALYPTEPGFLESVRKEVTYQVRRLKSHPSVIIWSGNNENEVALRVNWFHVNPRDLGTYINDYVTLYVKTIREIVLSEDRSR.... The pIC50 is 4.7. (3) The compound is Cc1ccc(N2CCN(CC(=O)Nc3nc4c(s3)CCCC4)CC2)cc1. 
The target protein (Q6PCB7) has sequence MRAPGAGAASVVSLALLWLLGLPWTWSAAAALGVYVGSGGWRFLRIVCKTARRDLFGLSVLIRVRLELRRHQRAGHTIPRIFQAVVQRQPERLALVDAGTGECWTFAQLDAYSNAVANLFRQLGFAPGDVVAIFLEGRPEFVGLWLGLAKAGMEAALLNVNLRREPLAFCLGTSGAKALIFGGEMVAAVAEVSGHLGKSLIKFCSGDLGPEGILPDTHLLDPLLKEASTAPLAQIPSKGMDDRLFYIYTSGTTGLPKAAIVVHSRYYRMAAFGHHAYRMQAADVLYDCLPLYHSAGNIIGVGQCLIYGLTVVLRKKFSASRFWDDCIKYNCTVVQYIGEICRYLLKQPVREAERRHRVRLAVGNGLRPAIWEEFTERFGVRQIGEFYGATECNCSIANMDGKVGSCGFNSRILPHVYPIRLVKVNEDTMELLRDAQGLCIPCQAGEPGLLVGQINQQDPLRRFDGYVSESATSKKIAHSVFSKGDSAYLSGDVLVMDELG.... The pIC50 is 6.6. (4) The target protein sequence is MATTATCTRFTDDYQLFEELGKGAFSVVRRCVKKTSTQEYAAKIINTKKLSARDHQKLEREARICRLLKHPNIVRLHDSISEEGFHYLVFDLVTGGELFEDIVAREYYSEADASHCIHQILESVNHIHQHDIVHRDLKPENLLLASKCKGAAVKLADFGLAIEVQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVDIWACGVILYILLVGYPPFWDEDQHKLYQQIKAGAYDFPSPEWDTVTPEAKNLINQMLTINPAKRITADQALKHPWVCQRSTVASMMHRQETVECLRKFNARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGGVKKRKSSSSVHLMPQSNNKNSLVSPAQEPAPLQTAMEPQTTVVHNATDGIKGSTESCNTTTEDEDLKGRVPEGRSSRDRTAPSAGMQPQPSLCSSAMRKQEIIKITEQLIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEGMDFHKFYFENLLSKNSKPIHTTILNPHVH.... The drug is CCN(CC)CCCC(C)Nc1nc(/C=C/c2c(Cl)cc(Cl)cc2Cl)nc2ccc(N)cc12. The pIC50 is 5.0. (5) The compound is CCc1nnc(-c2ccc(-c3ccccc3)cc2)n1-c1cccc(C)c1C. The target protein (P28572) has sequence MAVAHGPVATSSPEQNGAVPSEATKKDQNLTRGNWGNQIEFVLTSVGYAVGLGNVWRFPYLCYRNGGGAFMFPYFIMLVFCGIPLFFMELSFGQFASQGCLGVWRISPMFKGVGYGMMVVSTYIGIYYNVVICIAFYYFFSSMTHVLPWAYCNNPWNTPDCAGVLDASNLTNGSRPTALSGNLSHLFNYTLQRTSPSEEYWRLYVLKLSDDIGDFGEVRLPLLGCLGVSWVVVFLCLIRGVKSSGKVVYFTATFPYVVLTILFVRGVTLEGAFTGIMYYLTPKWDKILEAKVWGDAASQIFYSLGCAWGGLITMASYNKFHNNCYRDSVIISITNCATSVYAGFVIFSILGFMANHLGVDVSRVADHGPGLAFVAYPEALTLLPISPLWSLLFFFMLILLGLGTQFCLLETLVTAIVDEVGNEWILQKKTYVTLGVAVAGFLLGIPLTSQAGIYWLLLMDNYAASFSLVVISCIMCVSIMYIYGHRNYFQDIQMMLGFPP.... The pIC50 is 6.0. (6) The compound is O=C(CCl)c1ccc(Br)s1. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 5.0.
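
Note on the label scale: the definition given in the task description, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is only an illustration of the stated formula. The function names are hypothetical, and the nanomolar helper assumes raw IC50 values are stored in nM, which this record does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Hypothetical convenience wrapper, assuming the raw value is in nM."""
    return ic50_to_pic50(ic50_nm * 1e-9)
```

On this scale a pIC50 of 6.0 corresponds to an IC50 of 1 µM and a pIC50 of 5.0 to 10 µM, so each additional unit means a tenfold lower IC50.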