Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is N[C@H]1C[C@@H](C(=O)O)CC1=C(C(F)(F)F)C(F)(F)F. The target protein (P80147) has sequence MASVLLTRRLACSFRHNHRLLVPGWRHISQAAAKVDVEFDYDGPLMKTEVPGPRSRELMKQLNIIQNAEAVHFFCNYEESRGNYLVDVDGNRMLDLYSQISSIPIGYSHPALVKLVQQPQNVSTFINRPALGILPPENFVEKLRESLLSVAPKGMSQLITMACGSCSNENAFKTIFMWYRSKERGQSAFSKEELETCMINQAPGCPDYSILSFMGAFHGRTMGCLATTHSKAIHKIDIPSFDWPIAPFPRLKYPLEEFVKENQQEEARCLEEVEDLIVKYRKKKKTVAGIIVEPIQSEGGDNHASDDFFRKLRDISRKHGCAFLVDEVQTGGGSTGKFWAHEHWGLDDPADVMTFSKKMMTGGFFHKEEFRPNAPYRIFNTWLGDPSKNLLLAEVINIIKREDLLSNAAHAGKVLLTGLLDLQARYPQFISRVRGRGTFCSFDTPDESIRNKLISIARNKGVMLGGCGDKSIRFRPTLVFRDHHAHLFLNIFSDILADFK.... The pKi is 2.4. (2) The small molecule is OCCNc1nc(N2CCCCCCC2)c2nc(NCCO)nc(N3CCCCCCC3)c2n1. The target protein (Q99808) has sequence MTTSHQPQDRYKAVWLIFFMLGLGTLLPWNFFMTATQYFTNRLDMSQNVSLVTAELSKDAQASAAPAAPLPERNSLSAIFNNVMTLCAMLPLLLFTYLNSFLHQRIPQSVRILGSLVAILLVFLITAILVKVQLDALPFFVITMIKIVLINSFGAILQGSLFGLAGLLPASYTAPIMSGQGLAGFFASVAMICAIASGSELSESAFGYFITACAVIILTIICYLGLPRLEFYRYYQQLKLEGPGEQETKLDLISKGEEPRAGKEESGVSVSNSQPTNESHSIKAILKNISVLAFSVCFIFTITIGMFPAVTVEVKSSIAGSSTWERYFIPVSCFLTFNIFDWLGRSLTAVFMWPGKDSRWLPSLVLARLVFVPLLLLCNIKPRRYLTVVFEHDAWFIFFMAAFAFSNGYLASLCMCFGPKKVKPAEAETAGAIMAFFLCLGLALGAVFSFLFRAIV. The pKi is 7.7. (3) The drug is Oc1ccc2c(c1)O[C@@H](CNCc1ccccc1I)CC2. 
The target protein (P14416) has sequence MDPLNLSWYDDDLERQNWSRPFNGSDGKADRPHYNYYATLLTLLIAVIVFGNVLVCMAVSREKALQTTTNYLIVSLAVADLLVATLVMPWVVYLEVVGEWKFSRIHCDIFVTLDVMMCTASILNLCAISIDRYTAVAMPMLYNTRYSSKRRVTVMISIVWVLSFTISCPLLFGLNNADQNECIIANPAFVVYSSIVSFYVPFIVTLLVYIKIYIVLRRRRKRVNTKRSSRAFRAHLRAPLKGNCTHPEDMKLCTVIMKSNGSFPVNRRRVEAARRAQELEMEMLSSTSPPERTRYSPIPPSHHQLTLPDPSHHGLHSTPDSPAKPEKNGHAKDHPKIAKIFEIQTMPNGKTRTSLKTMSRRKLSQQKEKKATQMLAIVLGVFIICWLPFFITHILNIHCDCNIPPVLYSAFTWLGYVNSAVNPIIYTTFNIEFRKAFLKILHC. The pKi is 6.8. (4) The compound is O=S(=O)(NC1CCN(CCOc2ccccc2-c2ccccc2)C1)c1cccs1. The target protein (P34969) has sequence MMDVNSSGRPDLYGHLRSFLLPEVGRGLPDLSPDGGADPVAGSWAPHLLSEVTASPAPTWDAPPDNASGCGEQINYGRVEKVVIGSILTLITLLTIAGNCLVVISVCFVKKLRQPSNYLIVSLALADLSVAVAVMPFVSVTDLIGGKWIFGHFFCNVFIAMDVMCCTASIMTLCVISIDRYLGITRPLTYPVRQNGKCMAKMILSVWLLSASITLPPLFGWAQNVNDDKVCLISQDFGYTIYSTAVAFYIPMSVMLFMYYQIYKAARKSAAKHKFPGFPRVEPDSVIALNGIVKLQKEVEECANLSRLLKHERKNISIFKREQKAATTLGIIVGAFTVCWLPFFLLSTARPFICGTSCSCIPLWVERTFLWLGYANSLINPFIYAFFNRDLRTTYRSLLQCQYRNINRKLSAAGMHEALKLAERPERPEFVLRACTRRVLLRPEKRPPVSVWVLQSPDHHNWLADKMLTTVEKKVMIHD. The pKi is 8.0. (5) The target protein (P22273) has sequence MLAVGCTLLVALLAAPAVALVLGSCRALEVANGTVTSLPGATVTLICPGKEAAGNATIHWVYSGSQSREWTTTGNTLVLRAVQVNDTGHYLCFLDDHLVGTVPLLVDVPPEEPKLSCFRKNPLVNAFCEWHPSSTPSPTTKAVMFAKKINTTNGKSDFQVPCQYSQQLKSFSCEVEILEGDKVYHIVSLCVANSVGSRSSHNVVFQSLKMVQPDPPANLVVSAIPGPRWLKVSWQDPESWDPSYYLLQFELRYRPVWSKFTVWPLQVAQHQCVIHDALRGVKHVVQVRGKEEFDIGQWSKWSPEVTGTPWLAEPRTTPAGIPGNPTQVSVEDYDNHEDQYGSSTEATSVLAPVQGSSPIPLPTFLVAGGSLAFGLLLCVFIILRLKKKWKSQAEKESKTTSPPPYPLGPLKPTFLLVPLLTPSGSHNSSGTDNTGSHSCLGVRDPQCPNDNSNRDYLFPR. The pKi is 5.0. The compound is Cc1nccn1C[C@H]1CCc2c(c3cccc4c3n2CCC4)C1=O.
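The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names `ki_to_pki`, `ki_nm_to_pki`, and `pki_to_ki` are hypothetical, and the nanomolar helper assumes the raw Ki values are given in nM, which the text above does not state.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nanomolar: float) -> float:
    """Convert a Ki given in nM to pKi (assumes nM input; 1 nM = 1e-9 M)."""
    return ki_to_pki(ki_nanomolar * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: Ki in mol/L from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # pKi values taken from the five examples above
    for pki in (2.4, 7.7, 6.8, 8.0, 5.0):
        print(f"pKi {pki} -> Ki {pki_to_ki(pki):.3e} M")
```

Under this definition, the pKi of 8.0 in example (4) corresponds to a Ki of 10 nM, while the pKi of 2.4 in example (1) corresponds to a Ki of roughly 4 mM, which is consistent with higher pKi meaning stronger inhibition.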