Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is CC1(C)N=C(N)N=C(N)N1c1cccc(Cl)c1. The target protein (P00382) has sequence MKLSLMVAISKNGVIGNGPDIPWSAKGEQLLFKAITYNQWLLVGRKTFESMGALPNRKYAVVTRSSFTSDNENVLIFPSIKDALTNLKKITDHVIVSGGGEIYKSLIDQVDTLHISTIDIEPEGDVYFPEIPSNFRPVFTQDFASNINYSYQIWQKG. The pKi is 5.9. (2) The drug is CCCCCCCC[N+]1(CCC#Cc2cc(OC)c(OC)c(OC)c2)CCCCC1. The target protein (Q8BGY9) has sequence MSFHVEGLVAIILFYLLIFLVGIWAAWKTKNSGNPEERSEAIIVGGRDIGLLVGGFTMTATWVGGGYINGTAEAVYGPGCGLAWAQAPIGYSLSLILGGLFFAKPMRSKGYVTMLDPFQQIYGKRMGGLLFIPALMGEMFWAAAIFSALGATISVIIDVDVNISVIVSALIAILYTLVGGLYSVAYTDVVQLFCIFIGLWISVPFALSHPAVTDIGFTAVHAKYQSPWLGTIESVEVYTWLDNFLLLMLGGIPWQAYFQRVLSSSSATYAQVLSFLAAFGCLVMALPAICIGAIGASTDWNQTAYGYPDPKTKEEADMILPIVLQYLCPVYISFFGLGAVSAAVMSSADSSILSASSMFARNIYQLSFRQNASDKEIVWVMRITVLVFGASATAMALLTKTVYGLWYLSSDLVYIIIFPQLLCVLFIKGTNTYGAVAGYIFGLFLRITGGEPYLYLQPLIFYPGYYSDKNGIYNQRFPFKTLSMVTSFFTNICVSYLAKY.... The pKi is 6.6. (3) The target protein (Q9Z2Z8) has sequence MASKSQHNASKAKNHNVKAESQGQWGRAWEVDWFSLVSVIFLLLFAPFIVYYFIMACDQYSCSLTAPILDVATGRASLADIWAKTPPVTAKAAQLYALWVSFQVLLYSWLPDFCHRFLPGYVGGVQEGAITPAGIVNKYEVNGLQAWLITHFLWFVNAYLLSWFSPTIIFDNWIPLLWCANILGYAVSTFAMIKGYLFPTSAEDCKFTGNFFYNYMMGIEFNPRIGKWFDFKLFFNGRPGIVAWTLINLSFAAKQQELYGHVTNSMILVNVLQAIYVLDFFWNETWYLKTIDICHDHFGWYLGWGDCVWLPYLYTLQGLYLVYHPVQLSTPNALGVLLLGLVGYYIFRMTNHQKDLFRRTDGHCLIWGKKPKAIECSYTSADGLKHRSKLLVSGFWGVARHFNYTGDLMGSLAYCLACGGGHLLPYFYIIYMTILLTHRCLRDEHRCANKYGRDWERYVAAVPYRLLPGIF. The small molecule is CC(C)[S+](C)CC[C@@H](C)C1CCC2C3CC=C4C[C@@H](O)CC[C@]4(C)C3CC[C@@]21C. The pKi is 8.4. (4) The compound is NC(=[NH2+])c1cccc(NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)c1. 
The target protein (Q8IU80) has sequence MLLLFHSKRMPVAEAPQVAGGQGDGGDGEEAEPEGMFKACEDSKRKARGYLRLVPLFVLLALLVLASAGVLLWYFLGYKAEVMVSQVYSGSLRVLNRHFSQDLTRRESSAFRSETAKAQKMLKELITSTRLGTYYNSSSVYSFGEGPLTCFFWFILQIPEHRRLMLSPEVVQALLVEELLSTVNSSAAVPYRAEYEVDPEGLVILEASVKDIAALNSTLGCYRYSYVGQGQVLRLKGPDHLASSCLWHLQGPKDLMLKLRLEWTLAECRDRLAMYDVAGPLEKRLITSVYGCSRQEPVVEVLASGAIMAVVWKKGLHSYYDPFVLSVQPVVFQACEVNLTLDNRLDSQGVLSTPYFPSYYSPQTHCSWHLTVPSLDYGLALWFDAYALRRQKYDLPCTQGQWTIQNRRLCGLRILQPYAERIPVVATAGITINFTSQISLTGPGVRVHYGLYNQSDPCPGEFLCSVNGLCVPACDGVKDCPNGLDERNCVCRATFQCKED.... The pKi is 3.8. (5) The drug is CC[C@H](C)[C@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](C)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(N)=O)[C@@H](C)O)C(C)C. The target protein sequence is MGPLGAEADENQTVEVKVELYGSGPTTPRGELPPDPEPELIDSTKLVEVQVVLILAYCSIILLGVVGNSLVIHVVIKFKSMRTVTNFFIANLAVADLLVNTLCLPFTLTYTLMGEWKMGPVLCHLVPYAQGLAVQVSTITLTVIALDRHRCIVYHLESKISKQISFLIIGLAWGVSALLASPLAIFREYSLIEIIPDFEIVACTEKWPGEEKSVYGTVYSLSTLLILYVLPLGIISFSYTRIWSKLKNHVSPGAASDHYHQRRHKTTKMLVCVVVVFAVSWLPLHAFQLAVDIDSHVLDLKEYKLIFTVFHIIAMCSTFANPLLYGWMNSNYRKAFLSAFRCEQRLDAIHSEVSMTFKAKKNLEVKKNNGLTDSFSEATNV. The pKi is 6.4. (6) The compound is CNc1nc2c(Cc3cccnc3)c(C)c(O[C@@H]3O[C@H](C(=O)O)[C@@H](O)[C@H](O)[C@H]3O)c(C)c2s1. 
The target protein sequence is MDALCGSGELGSKFWDSNLSVHTENPDLTPCFQNSLLAWVPCIYLWVALPCYLLYLRHHCRGYIILSHLSKLKMVLGVLLWCVSWADLFYSFHGLVHGRAPAPVFFVTPLVVGVTMLLATLLIQYERLQGVQSSGVLIIFWFLCVVCAIVPFRSKILLAKAEGEISDPFRFTTFYIHFALVLSALILACFREKPPFFSAKNVDPNPYPETSAGFLSRLFFWWFTKMAIYGYRHPLEEKDLWSLKEEDRSQMVVQQLLEAWRKQEKQTARHKASAAPGKNASGEDEVLLGARPRPRKPSFLKALLATFGSSFLISACFKLIQDLLSFINPQLLSILIRFISNPMAPSWWGFLVAGLMFLCSMMQSLILQHYYHYIFVTGVKFRTGIMGVIYRKALVITNSVKRASTVGEIVNLMSVDAQRFMDLAPFLNLLWSAPLQIILAIYFLWQNLGPSVLAGVAFMVLLIPLNGAVAVKMRAFQVKQMKLKDSRIKLMSEILNGIKV.... The pKi is 5.2. (7) The small molecule is O=C(CCc1ccccc1)NS(=O)(=O)c1ccccc1-c1ccc(CN2C(=O)c3ccccc3CCc3ccccc32)cc1. The target protein (P34995) has sequence MSPCGPLNLSLAGEATTCAAPWVPNTSAVPPSGASPALPIFSMTLGAVSNLLALALLAQAAGRLRRRRSAATFLLFVASLLATDLAGHVIPGALVLRLYTAGRAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTRPLLHAARVSVARARLALAAVAAVALAVALLPLARVGRYELQYPGTWCFIGLGPPGGWRQALLAGLFASLGLVALLAALVCNTLSGLALLRARWRRRSRRPPPASGPDSRRRWGAHGPRSASASSASSIASASTFFGGSRSSGSARRARAHDVEMVGQLVGIMVVSCICWSPMLVLVALAVGGWSSTSLQRPLFLAVRLASWNQILDPWVYILLRQAVLRQLLRLLPPRAGAKGGPAGLGLTPSAWEASSLRSSRHSGLSHF. The pKi is 7.3. (8) The compound is Cc1nsc(-c2nnc3n2CCN(C(=O)c2ccc(-c4cccs4)cc2)[C@@H]3C)n1. The target protein (P16177) has sequence MASVPRGENWTDGTVEVGTHTGNLSSALGVTEWLALQAGNFSSALGLPATTQAPSQVRANLTNQFVQPSWRIALWSLAYGLVVAVAVFGNLIVIWIILAHKRMRTVTNYFLVNLAFSDASVAAFNTLINFIYGLHSEWYFGANYCRFQNFFPITAVFASIYSMTAIAVDRYMAIIDPLKPRLSATATKIVIGSIWILAFLLAFPQCLYSKIKVMPGRTLCYVQWPEGPKQHFTYHIIVIILVYCFPLLIMGVTYTIVGITLWGGEIPGDTCDKYHEQLKAKRKVVKMMIIVVVTFAICWLPYHVYFILTAIYQQLNRWKYIQQVYLASFWLAMSSTMYNPIIYCCLNKRFRAGFKRAFRWCPFIQVSSYDELELKTTRFHPTRQSSLYTVSRMESVTVLFDPNDGDPTKSSRKKRAVPRDPSANGCSHRGSKSASTTSSFISSPYTSVDEYS. The pKi is 7.7.
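As a small illustration of the pKi definition stated at the top of this record (pKi = -log10(Ki in M)), the sketch below converts between Ki and pKi. The function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of the bindingdb_ki dataset tooling. It assumes Ki is given in mol/L for the forward conversion and reports nanomolar for the reverse one.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pki)


if __name__ == "__main__":
    # pKi values quoted in the examples above, converted back to Ki in nM.
    for pki in (5.9, 6.6, 8.4, 3.8):
        print(f"pKi {pki:.1f} -> Ki of about {pki_to_ki_nm(pki):.3g} nM")
```

Because the scale is logarithmic, a one-unit increase in pKi corresponds to a tenfold decrease in Ki; for instance, a pKi of 6 corresponds to a Ki of 1 µM and a pKi of 9 to a Ki of 1 nM.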