Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)Cc1ccc(C(C)C(=O)O)cc1. The target protein (P55926) has sequence MELKTEEEEVGGVQPVSIQAFASSSTLHGLAHIFSYERLSLKRALWALCFLGSLAVLLCVCTERVQYYFCYHHVTKLDEVAASQLTFPAVTLCNLNEFRFSQVSKNDLYHAGELLALLNNRYEIPDTQMADEKQLEILQDKANFRSFKPKPFNMREFYDRAGHDIRDMLLSCHFRGEACSAEDFKVVFTRYGKCYTFNSGQDGRPRLKTMKGGTGNGLEIMLDIQQDEYLPVWGETDETSFEAGIKVQIHSQDEPPFIDQLGFGVAPGFQTFVSCQEQRLIYLPSPWGTCNAVTMDSDFFDSYSITACRIDCETRYLVENCNCRMVHMPGDAPYCTPEQYKECADPALDFLVEKDQEYCVCEMPCNLTRYGKELSMVKIPSKASAKYLAKKFNKSEQYIGENILVLDIFFEVLNYETIEQKKAYEIAGLLGDIGGQMGLFIGASILTVLELFDYAYEVIKHRLCRRGKCQKEAKRSSADKGVALSLDDVKRHNPCESLRG.... The pIC50 is 3.3. (2) The pIC50 is 7.9. The target protein (P30549) has sequence MGAHASVTDTNILSGLESNATGVTAFSMPGWQLALWATAYLALVLVAVTGNATVIWIILAHERMRTVTNYFIINLALADLCMAAFNATFNFIYASHNIWYFGSTFCYFQNLFPVTAMFVSIYSMTAIAADRYMAIVHPFQPRLSAPSTKAVIAVIWLVALALASPQCFYSTITVDQGATKCVVAWPNDNGGKMLLLYHLVVFVLIYFLPLVVMFAAYSVIGLTLWKRAVPRHQAHGANLRHLQAKKKFVKAMVLVVVTFAICWLPYHLYFILGTFQEDIYYRKFIQQVYLALFWLAMSSTMYNPIIYCCLNHRFRSGFRLAFRCCPWGTPTEEDRLELTHTPSISRRVNRCHTKETLFMTGDMTHSEATNGQVGGPQDGEPAGP. The compound is COc1cc(C(=O)N2CCC(CCN3CCC(C(N)=O)(c4ccc(F)cc4)CC3)(c3ccc(Cl)c(Cl)c3)C2)cc(OC)c1OC. (3) The compound is COc1ccccc1C(=O)N(CCC1CCCN1C)c1nc2ccccc2s1. 
The target protein (Q15788) has sequence MSGLGDSSSDPANPDSHKRKGSPCDTLASSTEKRRREQENKYLEELAELLSANISDIDSLSVKPDKCKILKKTVDQIQLMKRMEQEKSTTDDDVQKSDISSSSQGVIEKESLGPLLLEALDGFFFVVNCEGRIVFVSENVTSYLGYNQEELMNTSVYSILHVGDHAEFVKNLLPKSLVNGVPWPQEATRRNSHTFNCRMLIHPPDEPGTENQEACQRYEVMQCFTVSQPKSIQEDGEDFQSCLICIARRLPRPPAITGVESFMTKQDTTGKIISIDTSSLRAAGRTGWEDLVRKCIYAFFQPQGREPSYARQLFQEVMTRGTASSPSYRFILNDGTMLSAHTKCKLCYPQSPDMQPFIMGIHIIDREHSGLSPQDDTNSGMSIPRVNPSVNPSISPAHGVARSSTLPPSNSNMVSTRINRQQSSDLHSSSHSNSSNSQGSFGCSPGSQIVANVALNQGQASSQSSNPSLNLNNSPMEGTGISLAQFMSPRRQVTSGLATR.... The pIC50 is 4.5. (4) The small molecule is CN1CC[C@H](N(C)C(=O)N2CC(c3cc(F)ccc3F)=C[C@@]2(CO)c2ccccc2)[C@H](F)C1. The target protein (O60333) has sequence MSGASVKVAVRVRPFNSRETSKESKCIIQMQGNSTSIINPKNPKEAPKSFSFDYSYWSHTSPEDPCFASQNRVYNDIGKEMLLHAFEGYNVCIFAYGQTGAGKSYTMMGKQEESQAGIIPQLCEELFEKINDNCNEEMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVREHPLLGPYVEDLSKLAVTSYTDIADLMDAGNKARTVAATNMNETSSRSHAVFTIVFTQKKHDNETNLSTEKVSKISLVDLAGSERADSTGAKGTRLKEGANINKSLTTLGKVISALAEVDNCTSKSKKKKKTDFIPYRDSVLTWLLRENLGGNSRTAMVAALSPADINYDETLSTLRYADRAKQIKCNAVINEDPNAKLVRELKEEVTRLKDLLRAQGLGDIIDIDPLIDDYSGSGSKYLKDFQNNKHRYLLASENQRPGHFSTASMGSLTSSPSSCSLSSQVGLTSVTSIQERIMSTPGGEEAIERLKESEKIIAELNETWEEKLRKTEA.... The pIC50 is 4.3. (5) The compound is CC(C)N1C[C@H](CS(=O)(=O)C(C)(C)C)n2cc(-c3nnc(Cc4ccc(F)cc4)s3)c(=O)c(O)c2C1=O. The target protein (P12504) has sequence MENRWQVMIVWQVDRMRINTWKRLVKHHMYISRKAKDWFYRHHYESTNPKISSEVHIPLGDAKLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDPDLADQLIHLHYFDCFSESAIRNTILGRIVSPRCEYQAGHNKVGSLQYLALAALIKPKQIKPPLPSVRKLTEDRWNKPQKTKGHRGSHTMNGH. The pIC50 is 8.0.
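The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset itself: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar as the input unit is an assumption about how the raw IC50 values are stored.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 = -log10(IC50 in M).

    Assumption: the raw measurement is in nM; adjust the 1e-9 factor otherwise.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover IC50 in nanomolar from a pIC50 value."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # Labels taken from the examples above: 3.3, 7.9, 4.5, 4.3, 8.0
    for label in (3.3, 7.9, 4.5, 4.3, 8.0):
        print(f"pIC50 {label:.1f} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the label 8.0 in example (5) indicates far stronger inhibition than the label 3.3 in example (1).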