Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is FC(F)(F)c1c(Cc2ccccc2)n2c3c(cccc13)CCC2. The target protein (Q9WVG5) has sequence MRNTVFLLGFWSVYCYFPAGSITTLRPQGSLRDEHHKPTGVPATARPSVAFNIRTSKDPEQEGCNLSLGDSKLLENCGFNMTAKTFFIIHGWTMSGMFESWLHKLVSALQMREKDANVVVVDWLPLAHQLYTDAVNNTRVVGQRVAGMLDWLQEKEEFSLGNVHLIGYSLGAHVAGYAGNFVKGTVGRITGLDPAGPMFEGVDINRRLSPDDADFVDVLHTYTLSFGLSIGIRMPVGHIDIYPNGGDFQPGCGFNDVIGSFAYGTISEMVKCEHERAVHLFVDSLVNQDKPSFAFQCTDSSRFKRGICLSCRKNRCNNIGYNAKKMRKKRNSKMYLKTRAGMPFKVYHYQLKVHMFSYNNSGDTQPTLYITLYGSNADSQNLPLEIVEKIELNATNTFLVYTEEDLGDLLKMRLTWEGVAHSWYNLWNEFRNYLSQPSNPSRELYIRRIRVKSGETQRKVTFCTQDPTKSSISPGQELWFHKCQDGWKMKNKTSPFVNLA.... The pIC50 is 7.0. (2) The compound is Cc1cc2oc(=O)c(Cc3cccc4ccccc34)c(O)c2cc1C. The target protein sequence is MVGRRALIVLAHSERTSFNYAMKEAAAAALKKKGWEVVESDLYAMNFNPIISRKDITGKLKDPANFQYPAESVLAYKEGHLSPDIVAEQKKLEAADLVIFQFPLQWFGVPAILKGWFERVFIGEFAYTYAAMYDKGPFRSKKAVLSITTGGSGSMYSLQGIHGDMNVILWPIQSGILHFCGFQVLEPQLTYSIGHTPADARIQILEGWKKRLENIWDETPLYFAPSSLFDLNFQAGFLMKKEVQDEEKNKKFGLSVGHHLGKSIPTDNQIKARK. The pIC50 is 8.1. (3) The small molecule is N#Cc1cc(CCOc2cc(O)c3c(=O)cc(C(=O)O)oc3c2)ccc1NC(=O)C(=O)O. The target protein (P11708) has sequence MSEPIRVLVTGAAGQIAYSLLYSIGNGSVFGKDQPIILVLLDITPMMGVLDGVLMELQDCALPLLKDVIATDKEEIAFKDLDVAILVGSMPRRDGMERKDLLKANVKIFKCQGAALDKYAKKSVKVIVVGNPANTNCLTASKSAPSIPKENFSCLTRLDHNRAKAQIALKLGVTSDDVKNVIIWGNHSSTQYPDVNHAKVKLQAKEVGVYEAVKDDSWLKGEFITTVQQRGAAVIKARKLSSAMSAAKAICDHVRDIWFGTPEGEFVSMGIISDGNSYGVPDDLLYSFPVTIKDKTWKIVEGLPINDFSREKMDLTAKELAEEKETAFEFLSSA. The pIC50 is 4.9. (4) The compound is COc1cc(Cc2cnc(N)nc2N)cc2c1NCC(CO)C2. 
The target protein (P00382) has sequence MKLSLMVAISKNGVIGNGPDIPWSAKGEQLLFKAITYNQWLLVGRKTFESMGALPNRKYAVVTRSSFTSDNENVLIFPSIKDALTNLKKITDHVIVSGGGEIYKSLIDQVDTLHISTIDIEPEGDVYFPEIPSNFRPVFTQDFASNINYSYQIWQKG. The pIC50 is 7.7. (5) The compound is Cc1c2oc3c(C)ccc(C(=O)N[C@@H]4C(=O)N[C@H](C(C)C)C(=O)N5CCC[C@H]5C(=O)N(C)CC(=O)N(C)[C@@H](C(C)C)C(=O)O[C@@H]4C)c3nc-2c(C(=O)N[C@@H]2C(=O)N[C@H](C(C)C)C(=O)N3CCC[C@H]3C(=O)N(C)CC(=O)N(C)[C@@H](C(C)C)C(=O)O[C@@H]2C)c(NCCNC2CC(C)(C)[N+]([O-])C(C)(C)C2)c1=O. The target protein (P0A8T7) has sequence MKDLLKFLKAQTKTEEFDAIKIALASPDMIRSWSFGEVKKPETINYRTFKPERDGLFCARIFGPVKDYECLCGKYKRLKHRGVICEKCGVEVTQTKVRRERMGHIELASPTAHIWFLKSLPSRIGLLLDMPLRDIERVLYFESYVVIEGGMTNLERQQILTEEQYLDALEEFGDEFDAKMGAEAIQALLKSMDLEQECEQLREELNETNSETKRKKLTKRIKLLEAFVQSGNKPEWMILTVLPVLPPDLRPLVPLDGGRFATSDLNDLYRRVINRNNRLKRLLDLAAPDIIVRNEKRMLQEAVDALLDNGRRGRAITGSNKRPLKSLADMIKGKQGRFRQNLLGKRVDYSGRSVITVGPYLRLHQCGLPKKMALELFKPFIYGKLELRGLATTIKAAKKMVEREEAVVWDILDEVIREHPVLLNRAPTLHRLGIQAFEPVLIEGKAIQLHPLVCAAYNADFDGDQMAVHVPLTLEAQLEARALMMSTNNILSPANGEPII.... The pIC50 is 5.5. (6) The pIC50 is 6.8. The drug is O=C1O/C(=C/Br)CCC1c1cccc2ccccc12. The target protein (O60733) has sequence MQFFGRLVNTFSGVTNLFSNPFRVKEVAVADYTSSDRVREEGQLILFQNTPNRTWDCVLVNPRNSQSGFRLFQLELEADALVNFHQYSSQLLPFYESSPQVLHTEVLQHLTDLIRNHPSWSVAHLAVELGIRECFHHSRIISCANCAENEEGCTPLHLACRKGDGEILVELVQYCHTQMDVTDYKGETVFHYAVQGDNSQVLQLLGRNAVAGLNQVNNQGLTPLHLACQLGKQEMVRVLLLCNARCNIMGPNGYPIHSAMKFSQKGCAEMIISMDSSQIHSKDPRYGASPLHWAKNAEMARMLLKRGCNVNSTSSAGNTALHVAVMRNRFDCAIVLLTHGANADARGEHGNTPLHLAMSKDNVEMIKALIVFGAEVDTPNDFGETPTFLASKIGRLVTRKAILTLLRTVGAEYCFPPIHGVPAEQGSAAPHHPFSLERAQPPPISLNNLELQDLMHISRARKPAFILGSMRDEKRTHDHLLCLDGGGVKGLIIIQLLIAI.... (7) The compound is CC(/C=C1\SC(=S)N(CC(=O)O)C1=O)=C\c1ccccc1. 
The target protein (P51635) has sequence MTASSVLLHTGQKMPLIGLGTWKSEPGQVKAAIKYALSVGYRHIDCASVYGNETEIGEALKESVGAGKAVPREELFVTSKLWNTKHHPEDVEPAVRKTLADLQLEYLDLYLMHWPYAFERGDNPFPKNADGTVKYDSTHYKETWKALEALVAKGLVKALGLSNFSSRQIDDVLSVASVRPAVLQVECHPYLAQNELIAHCQARGLEVTAYSPLGSSDRAWRHPDEPVLLEEPVVLALAEKHGRSPAQILLRWQVQRKVICIPKSITPSRILQNIQVFDFTFSPEEMKQLDALNKNWRYIVPMITVDGKRVPRDAGHPLYPFNDPY. The pIC50 is 4.8. (8) The drug is COc1ccc(CNC(=O)c2cnc3c(SSc4cccc5cc(C(=O)NCc6ccc(OC)cc6)cnc45)cccc3c2)cc1. The target protein (Q92905) has sequence MAASGSGMAQKTWELANNMQEAQSIDEIYKYDKKQQQEILAAKPWTKDHHYFKYCKISALALLKMVMHARSGGNLEVMGLMLGKVDGETMIIMDSFALPVEGTETRVNAQAAAYEYMAAYIENAKQVGRLENAIGWYHSHPGYGCWLSGIDVSTQMLNQQFQEPFVAVVIDPTRTISAGKVNLGAFRTYPKGYKPPDEGPSEYQTIPLNKIEDFGVHCKQYYALEVSYFKSSLDRKLLELLWNKYWVNTLSSSSLLTNADYTTGQVFDLSEKLEQSEAQLGRGSFMLGLETHDRKSEDKLAKATRDSCKTTIEAIHGLMSQVIKDKLFNQINIS. The pIC50 is 6.3.
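The header defines the label as pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion only, the hypothetical helpers below (`ic50_to_pic50` and `pic50_to_ic50_nm` are illustrative names, not part of BindingDB or any dataset loader) convert between a molar IC50 and pIC50:

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for non-positive concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 100 nM inhibitor, expressed in mol/L
    print(ic50_to_pic50(100e-9))
    # The pIC50 of 7.0 reported for record (1), expressed in nM
    print(pic50_to_ic50_nm(7.0))
```

Under this definition, a pIC50 of 7.0 corresponds to an IC50 of 100 nM, and each unit increase in pIC50 is a tenfold gain in potency.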