Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is Nc1ccn([C@@H]2O[C@H](CNC(=O)c3cn4ccsc4n3)[C@@H](O)[C@H]2O)c(=O)n1. The target protein (Q63T71) has sequence MDFRIGQGYDVHQLVPGRPLIIGGVTIPYERGLLGHSDADVLLHAITDALFGAAALGDIGRHFSDTDPRFKGADSRALLRECASRVAQAGFAIRNVDSTIIAQAPKLAPHIDAMRANIAADLDLPLDRVNVKAKTNEKLGYLGRGEGIEAQAAALVVREAAA. The pKd is 4.2. (2) The small molecule is CNC(=O)c1ccccc1Sc1ccc2c(/C=C/c3ccccn3)n[nH]c2c1. The target protein (P41743) has sequence MPTQRDSSTMSHTVAGGGSGDHSHQVRVKAYYRGDIMITHFEPSISFEGLCNEVRDMCSFDNEQLFTMKWIDEEGDPCTVSSQLELEEAFRLYELNKDSELLIHVFPCVPERPGMPCPGEDKSIYRRGARRWRKLYCANGHTFQAKRFNRRAHCAICTDRIWGLGRQGYKCINCKLLVHKKCHKLVTIECGRHSLPQEPVMPMDQSSMHSDHAQTVIPYNPSSHESLDQVGEEKEAMNTRESGKASSSLGLQDFDLLRVIGRGSYAKVLLVRLKKTDRIYAMKVVKKELVNDDEDIDWVQTEKHVFEQASNHPFLVGLHSCFQTESRLFFVIEYVNGGDLMFHMQRQRKLPEEHARFYSAEISLALNYLHERGIIYRDLKLDNVLLDSEGHIKLTDYGMCKEGLRPGDTTSTFCGTPNYIAPEILRGEDYGFSVDWWALGVLMFEMMAGRSPFDIVGSSDNPDQNTEDYLFQVILEKQIRIPRSLSVKAASVLKSFLNKD.... The pKd is 5.0. (3) The drug is C[C@]12O[C@H](C[C@]1(O)CO)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)CNC4=O. The pKd is 6.3. The target protein sequence is MEKYEKIGKIGEGSYGVVFKCRNRDTGQIVAIKKFLESEDDPVIKKIALREIRMLKQLKHPNLVNLLEVFRRKRRLHLVFEYCDHTVLHELDRYQRGVPEHLVKSITWQTLQAVNFCHKHNCIHRDVKPENILITKHSVIKLCDFGFARLLAGPSDYYTDYVATRWYRSPELLVGDTQYGPPVDVWAIGCVFAELLSGVPLWPGKSDVDQLYLIRKTLGDLIPRHQQVFSTNQYFSGVKIPDPEDMEPLELKFPNISYPALGLLKGCLHMDPTQRLTCEQLLHHPYFENIREIEDLAKEHNKPTRKTLRKSRKHHCFTETSKLQYLPQLTGSSILPALDNKKYYCDTKKLNYRFPNI. (4) The small molecule is C=CC1=C(C)c2cc3[n-]c(cc4nc(cc5[n-]c(cc1n2)c(C)c5CCC(=O)O)C(CCC(=O)O)=C4C)c(C)c3C=C. 
The target protein sequence is MDQQVKQERLQGRLEPEIKEFRQERKTLQLATVDAQGRPNVSYAPFVQNQEGYFVLISHIARHARNLEVNPQVSIMMIEDETEAKQLFARKRLTFDAVASMVERDSELWCQVIAQMGERFGEIIDGLSQLQEFMLFRLQPEHGLFVKGFGQAYQVSGDDLVDFVHLEEGHRKISNG. The pKd is 7.6. (5) The compound is CC(C)(C)c1cnc(CSc2cnc(NC(=O)C3CCNCC3)s2)o1. The target protein (Q9H422) has sequence MASQVLVYPPYVYQTQSSAFCSVKKLKVEPSSCVFQERNYPRTYVNGRNFGNSHPPTKGSAFQTKIPFNRPRGHNFSLQTSAVVLKNTAGATKVIAAQAQQAHVQAPQIGAWRNRLHFLEGPQRCGLKRKSEELDNHSSAMQIVDELSILPAMLQTNMGNPVTVVTATTGSKQNCTTGEGDYQLVQHEVLCSMKNTYEVLDFLGRGTFGQVVKCWKRGTNEIVAIKILKNHPSYARQGQIEVSILARLSTENADEYNFVRAYECFQHRNHTCLVFEMLEQNLYDFLKQNKFSPLPLKVIRPILQQVATALKKLKSLGLIHADLKPENIMLVDPVRQPYRVKVIDFGSASHVSKTVCSTYLQSRYYRAPEIILGLPFCEAIDMWSLGCVIAELFLGWPLYPGALEYDQIRYISQTQGLPGEQLLNVGTKSTRFFCKETDMSHSGWRLKTLEEHEAETGMKSKEARKYIFNSLDDVAHVNTVMDLEGSDLLAEKADRREFVS.... The pKd is 5.7. (6) The drug is C[C@]12O[C@H](C[C@]1(O)CO)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)CNC4=O. The target protein (Q9NY57) has sequence MGGNHSHKPPVFDENEEVNFDHFQILRAIGKGSFGKVCIVQKRDTKKMYAMKYMNKQKCIERDEVRNVFRELQIMQGLEHPFLVNLWYSFQDEEDMFMVVDLLLGGDLRYHLQQNVHFTEGTVKLYICELALALEYLQRYHIIHRDIKPDNILLDEHGHVHITDFNIATVVKGAERASSMAGTKPYMAPEVFQVYMDRGPGYSYPVDWWSLGITAYELLRGWRPYEIHSVTPIDEILNMFKVERVHYSSTWCKGMVALLRKLLTKDPESRVSSLHDIQSVPYLADMNWDAVFKKALMPGFVPNKGRLNCDPTFELEEMILESKPLHKKKKRLAKNRSRDGTKDSCPLNGHLQHCLETVREEFIIFNREKLRRQQGQGSQLLDTDSRGGGQAQSKLQDGCNNNLLTHTCTRGCSS. The pKd is 5.2. (7) The drug is CC(C)n1nc(-c2cc3cc(O)ccc3[nH]2)c2c(N)ncnc21. The target protein (Q96Q40) has sequence MGQELCAKTVQPGCSCYHCSEGGEAHSCRRSQPETTEAAFKLTDLKEASCSMTSFHPRGLQAARAQKFKSKRPRSNSDCFQEEDLRQGFQWRKSLPFGAASSYLNLEKLGEGSYATVYKGISRINGQLVALKVISMNAEEGVPFTAIREASLLKGLKHANIVLLHDIIHTKETLTFVFEYMHTDLAQYMSQHPGGLHPHNVRLFMFQLLRGLAYIHHQHVLHRDLKPQNLLISHLGELKLADFGLARAKSIPSQTYSSEVVTLWYRPPDALLGATEYSSELDIWGAGCIFIEMFQGQPLFPGVSNILEQLEKIWEVLGVPTEDTWPGVSKLPNYNPEWFPLPTPRSLHVVWNRLGRVPEAEDLASQMLKGFPRDRVSAQEALVHDYFSALPSQLYQLPDEESLFTVSGVRLKPEMCDLLASYQKGHHPAQFSKCW. The pKd is 5.0.
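The pKd definition given in the task description (pKd = -log10(Kd in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function names `kd_to_pkd` and `pkd_to_kd` are hypothetical, and the sketch assumes Kd has already been converted to mol/L.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


# Illustrative input only (not taken from the dataset): a Kd reported in nM
# must be scaled to mol/L before taking the logarithm.
kd_nanomolar = 10.0
pkd = kd_to_pkd(kd_nanomolar * 1e-9)
```

Under this definition a pKd of 5.0, as in several of the examples above, corresponds to a Kd of 1e-5 M (10 µM), and each additional pKd unit means a tenfold lower Kd, that is, stronger binding.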