This is drug-target binding data from BindingDB, using Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is CN1CC(=O)N2[C@H](Cc3c([nH]c4ccccc34)[C@H]2c2ccc3c(c2)OCO3)C1=O. The target protein (Q80VJ4) has sequence MTPSQVTFEIRGTLLPGEVFAMCGNCDALGNWSPQNAVPLTESETGESVWKAVIVLSRGMSVKYRYFRGCFLEPKTIGGPCQVIVHKWETHLQPRSITPLENEIIIDDGQFGIHNGVETLDSGWLTCQTEIRLRLHFSEKPPVSITKKKFKKSRFRVKLTLEGLEEDDDDDDKASPTVLHKMSNSLEISLISDNEFKCRHSQPECGYGLQPDRWTEYSIQTMEPDNLELIFDFFEEDLSEHVVQGDVLPGHVGTACLLSSTIAESERSAGILTLPIMSRSSRKTIGKVRVDFIIIKPLPGYSCSMQSSFSKYWKPRIPLDVGHRGAGNSTTTAKLAKVQENTIASLRNAASHGAAFVEFDVHLSKDLVPVVYHDLTCCLTMKRKYEADPVELFEIPVKELTFDQLQLLKLSHVTALKTKDQKQCMAEEENSFSENQPFPSLKMVLESLPENVGFNIEIKWICQHRDGVWDGNLSTYFDMNAFLDIILKTVLENSGKRRIV.... The pKi is 8.0. (2) The compound is CC(C)(C)NC(=O)[C@@H]1CN(Cc2cccnc2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2C[C@H]1O. The target protein sequence is PQITLWQRPIVTVKIGGQLKEALLDTGADDTVIEDINLPGKWKPKMIGGIGGFVKVRQYDQIHIEICGKKAIGTVLVGPTPVNIIGRNMLTQIGCTLNF. The pKi is 7.4. (3) The compound is C[C@@H]1NC[C@@H](O)[C@H](O)C1(F)F. The target protein sequence is LLLLGFALANTNAARTDPPVVCATLNRTNFDTLFPGFTFGTATASYQLEGAANIDGRGPSIWDAFTHNHPEKITDGSNGDVAIDQYHRYKEDVAIMKDMGLDAYRFSISWSRLLPNGTLSGGINKKGIEYYNNLTNELIRNGIEPLVTLFHWDVPQALEEEYGGVLSPRIVYDFKAYAELCYKEFGDRVKHWTTLNEPYTISNHGYTIGIHAPGRCSSWYDPTCLGGDSGTEPYLVTHNLLLAHAAAVKLYREKYQASQEGVIGITVVSHWFEPASESQKDINASVRALDFMYGWFMDPLTRGDYPQSMRSLVKERLPNFTEEQSKSLIGSYDYIGVNYYSARYASAYPEDYSIPTPPSYLTDAYVNVTTELNGVPIGPQAASDWLYVYPKGLYDLVLYTKNKYNDPIMYITENGMDEFNNPKISLEQALNDSNRIDYCYRHLCYLQEAIIEGANVQGYFAWSLLDNFEWSEGYTVRFGINYVDYDNGLKRHSKLSTHWF.... The pKi is 3.0. 
(4) The target protein (P41144) has sequence MGRRRQGPAQPASELPARNACLLPNGSAWLPGWAEPDGNGSAGPQDEQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIILGGTKVREDVDIIECSLQFPDDDYSWWDLFMKICVFVFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFIICWTPIHIFILVEALGSTSHSTAALSSYYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPIKMRMERQSTSRVRNTVQDPAYMRNVDGVNKPV. The small molecule is COc1ccc2c3c1O[C@H]1c4ncc(-c5ccc(Cl)cc5)cc4C[C@@]4(O)C(C2)N(C)CC[C@]314. The pKi is 5.2. (5) The drug is O=C1N[C@@H]2[C@H](CCCCCc3cn(CCCCOc4ccc5ccccc5c4)nn3)SC[C@@H]2N1. The target protein (P06709) has sequence MKDNTVPLKLIALLANGEFHSGEQLGETLGMSRAAINKHIQTLRDWGVDVFTVPGKGYSLPEPIQLLNAKQILGQLDGGSVAVLPVIDSTNQYLLDRIGELKSGDACIAEYQQAGRGRRGRKWFSPFGANLYLSMFWRLEQGPAAAIGLSLVIGIVMAEVLRKLGADKVRVKWPNDLYLQDRKLAGILVELTGKTGDAAQIVIGAGINMAMRRVEESVVNQGWITLQEAGINLDRNTLAAMLIRELRAALELFEQEGLAPYLSRWEKLDNFINRPVKLIIGDKEIFGISRGIDKQGALLLEQDGIIKPWMGGEISLRSAEK. The pKi is 4.5. (6) The drug is CC(C)CN(C[C@@H](O)[C@H](Cc1ccccc1)NC(=O)O[C@H]1CO[C@H]2OCC[C@@H]12)S(=O)(=O)c1ccc(N)cc1. The target protein sequence is PQITLWKRPLVTIKIGGQLKEALLDTGADDTVIEEMSLPGRWKPKMIGGIGGFIKVRQYDQIIIEIAGHKAIGTVLVGPTPVNVIGRNLLTQIGATLNF. The pKi is 8.9. (7) The small molecule is CCOc1ccc(C[C@H]2NC(=O)CC3(CCCCC3)SCSC[C@H](C(=O)N3CCCC3C(=O)NCCCCCN)NC(=O)[C@@H](CC(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](Cc3ccccc3)NC2=O)cc1. The target protein (P32307) has sequence MLRATTSAVPRALSWPAAPGNGSEREPLDDRDPLLARVELALLSTVFVAVALSNGLVLGALVRRGRRGRWAPMHVFIGHLCLADLAVALFQVLPQLAWDATYRFRGPDALCRAVKYLQMVGMYASSYMILAMTLDRHRAICRPMLAYRHGGGARWNRPVLVAWAFSLLLSLPQLFIFAQRDVGDGSGVLDCWASFAEPWGLRAYVTWIALMVFVAPALGIAACQVLIFREIHTSLVPGPAERAGGHRGGRRAGSPREGARVSAAMAKTARMTLVIVAVYVLCWAPFFLVQLWSVWDPKAPREGPPFVLLMLLASLNSCTNPWIYASFSSSISSELRSLLCCPRRRTPPSLRPQEESCATASSFSARDTSS. The pKi is 8.1. (8) The drug is Cc1ccccc1CNC(=O)[C@H]1N(C(=O)[C@@H](O)[C@H](Cc2ccccc2)NC(=O)c2cccc(O)c2C)CSC1(C)C. The pKi is 8.0. 
The target protein sequence is PQVTLWQRPLVTIKIGGQLREALLDTGADDTIFEEISLPGRWKPKMIGGIGGFVKVRQYDQIPIEICGHKVIGTVLVGPTPANVIGRNLMTQIGCTLNF. (9) The drug is C[N+](C)(C)CCOC(N)=O. The target protein (P20420) has sequence MVQLLAGRWRPTGARRGTRGGLPELSSAAKHEDSLFRDLFEDYERWVRPVEHLSDKIKIKFGLAISQLVDVDEKNQLMTTNVWLKQEWIDVKLRWNPDDYGGIKIIRVPSDSLWIPDIVLFDNADGRFEGASTKTVVRYNGTVTWTQPANYKSSCTIDVTFFPFDLQNCSMKFGSWTYDGSQVDIILEDQDVDRTDFFDNGEWEIMSAMGSKGNRTDSCCWYPYITYSFVIKRLPLFYTLFLIIPCIGLSFLTVVVFYLPSNEGEKISLCTSVLVSLTVFLLVIEEIIPSSSKVIPLIGEYLVFTMIFVTLSIMVTVFAINIHHRSSSTHNAMAPWVRKIFLHKLPKLLCMRSHADRYFTQREEAESGAGPKSRNTLEAALDCIRYITRHVVKENDVREVVEDWKFIAQVLDRMFLWTFLLVSIIGTLGLFVPVIYKWANIIVPVHIGNTIK. The pKi is 5.5. (10) The compound is CCn1cnc(C[C@H](NC(=O)[C@@H]2CCCC(=O)N2)C(=O)N2CCC[C@H]2C(N)=O)c1. The target protein (P34981) has sequence MENETVSELNQTQLQPRAVVALEYQVVTILLVLIICGLGIVGNIMVVLVVMRTKHMRTPTNCYLVSLAVADLMVLVAAGLPNITDSIYGSWVYGYVGCLCITYLQYLGINASSCSITAFTIERYIAICHPIKAQFLCTFSRAKKIIIFVWAFTSLYCMLWFFLLDLNISTYKDAIVISCGYKISRNYYSPIYLMDFGVFYVVPMILATVLYGFIARILFLNPIPSDPKENSKTWKNDSTHQNTNLNVNTSNRCFNSTVSSRKQVTKMLAVVVILFALLWMPYRTLVVVNSFLSSPFQENWFLLFCRICIYLNSAINPVIYNLMSQKFRAAFRKLCNCKQKPTEKPANYSVALNYSVIKESDHFSTELDDITVTDTYLSATKVSFDDTCLASEVSFSQS. The pKi is 8.0.
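The pKi definition given in the task description (pKi = -log10(Ki in M)) can be sketched in a few lines of Python. This is only an illustrative helper, not part of the dataset or of any BindingDB tooling; the function names are hypothetical, and the nanomolar variant assumes Ki is supplied in nM.

```python
import math


def pki_from_ki_molar(ki_molar: float) -> float:
    """Return pKi = -log10(Ki), with Ki given in mol/L (M)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def pki_from_ki_nanomolar(ki_nm: float) -> float:
    """Same conversion, assuming Ki is given in nM (1 nM = 1e-9 M)."""
    return pki_from_ki_molar(ki_nm * 1e-9)


def ki_molar_from_pki(pki: float) -> float:
    """Invert the definition: Ki in M = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # Convert a few of the pKi labels listed above back to Ki in nM.
    for pki in (8.0, 5.2, 3.0):
        ki_nm = ki_molar_from_pki(pki) * 1e9
        print(f"pKi {pki:.1f} corresponds to Ki of about {ki_nm:.3g} nM")
```

Because the scale is logarithmic, each one-unit increase in pKi corresponds to a tenfold decrease in Ki.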