This is drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. The label is pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is NC(=O)C1CCCN1C(=O)C(Cc1cnc[nH]1)NC(=O)C1CCC(=O)N1. The target protein sequence is MMFLWWLLLLGTAISHKVHSQEQPLLEEDTAPADNLDVLEKAKGILIRSFLEGFQEGQQINRDLPDAMEMIYKRQHPGKRFQEEIEKRQHPGKRDLEDLQLSKRQHPGRRYLEDMEKRQHPGKREEGDWSRGYLTDDSGYLDLFSDVSKRQHPGKRVPDPFFIKRQHPGKRGIEEEDDTEFENSKEVGKRQHPGKRYDPCEGPNAYNCNSGNLQLDSVEEGWAA. The pKi is 6.7. (2) The drug is CCNC(=O)[C@H]1OC(n2cnc3c(NCCCCCCCCNS(=O)(=O)c4cccc5c(N(C)C)cccc45)ncnc32)[C@H](O)[C@@H]1O. The target protein (P28647) has sequence MKANNTTTSALWLQITYITMEAAIGLCAVVGNMLVIWVVKLNRTLRTTTFYFIVSLALADIAVGVLVIPLAIAVSLEVQMHFYACLFMSCVLLVFTHASIMSLLAIAVDRYLRVKLTVRYRTVTTQRRIWLFLGLCWLVSFLVGLTPMFGWNRKVTLELSQNSSTLSCHFRSVVGLDYMVFFSFITWILIPLVVMCIIYLDIFYIIRNKLSQNLTGFRETRAFYGREFKTAKSLFLVLFLFALCWLPLSIINFVSYFNVKIPEIAMCLGILLSHANSMMNPIVYACKIKKFKETYFVILRACRLCQTSDSLDSNLEQTTE. The pKi is 6.6. (3) The pKi is 5.0. The target is MLLARMKPQVQPELGGADQ. The compound is COc1ccc2c3c1OC1C(O)C=CC4C(C2)N(C)CCC341. (4) The compound is O=C([O-])CS[C@H](C(=O)[O-])[C@@H](O)C(=O)[O-]. The target protein (P40495) has sequence MFRSVATRLSACRGLASNAARKSLTIGLIPGDGIGKEVIPAGKQVLENLNSKHGLSFNFIDLYAGFQTFQETGKALPDETVKVLKEQCQGALFGAVQSPTTKVEGYSSPIVALRREMGLFANVRPVKSVEGEKGKPIDMVIVRENTEDLYIKIEKTYIDKATGTRVADATKRISEIATRRIATIALDIALKRLQTRGQATLTVTHKSNVLSQSDGLFREICKEVYESNKDKYGQIKYNEQIVDSMVYRLFREPQCFDVIVAPNLYGDILSDGAAALVGSLGVVPSANVGPEIVIGEPCHGSAPDIAGKGIANPIATIRSTALMLEFLGHNEAAQDIYKAVDANLREGSIKTPDLGGKASTQQVVDDVLSRL. The pKi is 7.0. (5) The small molecule is Cn1ccc2cc3c(cc21)CCN3C(=O)Nc1cccnc1. 
The target protein (Q60484) has sequence MSPPNQSEEGLPQEASNRSLNATETPGDWDPGLLQALKVSLVVVLSIITLATVLSNAFVLTTILLTRKLHTPANYLIGSLATTDLLVSILVMPISIAYTTTRTWNFGQILCDIWVSSDITCCTASILHLCVIALDRYWAITDALEYSKRRTAGHAGAMIAAVWVISICISIPPLFWRQAQAQEEMSDCLVNTSQISYTIYSTCGAFYIPSVLLIILYSRIYRAARSRILNPPSLSGKRFTTAHLITGSAGSSLCSLNPSLHEGHMHPGSPLFFNHVRIKLADSVLERKRISAARERKATKTLGIILGAFIVCWLPFFVVSLVLPICRDSCWIHPALFDFFTWLGYLNSLINPIIYTVFNEDFRQAFQKVVHFRKAS. The pKi is 6.0. (6) The compound is CC(C)C[C@@H]1NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)CCSSC[C@@H](C(=O)N2CCC[C@H]2C(=O)N[C@@H](CN)C(=O)NCC(N)=O)NC(=O)[C@H](CC(N)=O)NC1=O. The target protein (P48974) has sequence MNSEPSWTATPSPGGTLPVPNATTPWLGRDEELAKVEIGILATVLVLATGGNLAVLLTLGRHGHKRSRMHLFVLHLALTDLGVALFQVLPQLLWDITYRFQGSDLLCRAVKYLQVLSMFASTYMLLAMTLDRYLAVCHPLRSLRQPSQSTYPLIAAPWLLAAILSLPQVFIFSLREVIQGSGVLDCWADFYFSWGPRAYITWTTMAIFVLPVAVLSACYGLICHEIYKNLKVKTQAGREERRGWRTWDKSSSSAVATAATRGLPSRVSSISTISRAKIRTVKMTFVIVLAYIACWAPFFSVQMWSVWDENAPNEDSTNVAFTISMLLGNLSSCCNPWIYMGFNSRLLPRSLSHHACCTGSKPQVHRQLSTSSLTSRRTTLLTHACGSPTLRLSLNLSLRAKPRPAGSLKDLEQVDGEATMETSIF. The pKi is 9.3. (7) The compound is O=P(O)(O)C(O)(Cc1cccc(-c2cccc3c2oc2ccccc23)c1)P(=O)(O)O. The target protein (Q12051) has sequence MEAKIDELINNDPVWSSQNESLISKPYNHILLKPGKNFRLNLIVQINRVMNLPKDQLAIVSQIVELLHNSSLLIDDIEDNAPLRRGQTTSHLIFGVPSTINTANYMYFRAMQLVSQLTTKEPLYHNLITIFNEELINLHRGQGLDIYWRDFLPEIIPTQEMYLNMVMNKTGGLFRLTLRLMEALSPSSHHGHSLVPFINLLGIIYQIRDDYLNLKDFQMSSEKGFAEDITEGKLSFPIVHALNFTKTKGQTEQHNEILRILLLRTSDKDIKLKLIQILEFDTNSLAYTKNFINQLVNMIKNDNENKYLPDLASHSDTATNLHDELLYIIDHLSEL. The pKi is 7.0. 
(8) The compound is CC(=O)NCSCC(NC(=O)C(C)N)C(=O)NC(CC(=O)O)C(=O)NC(C(=O)NC(C)C(=O)NC(C(=O)NC(CSCNC(C)=O)C(=O)NC(C(=O)NC(C(=O)NC(CC1CNCN1)C(=O)NC(CCCN=C(N)N)C(=O)NC(CC(C)C)C(=O)NC(C)C(=O)NCC(=O)NC(CC(C)C)C(=O)NC(CC(C)C)C(=O)NC(CO)C(=O)NC(CCCN=C(N)N)C(=O)NC(CO)C(=O)NCC(=O)NCC(=O)NC(C(=O)NC(C(=O)NC(CCCCN)C(=O)NC(CC(N)=O)C(=O)NC(CC(N)=O)C(=O)NC(Cc1ccccc1)C(=O)NC(C(=O)N1CCCC1C(=O)NC(C(=O)NC(CC(N)=O)C(=O)NC(C(=O)NCC(=O)NC(CO)C(=O)NC(CCCCN)C(=O)NC(C)C(=O)NC(Cc1ccccc1)C(N)=O)C(C)C)C(C)O)C(C)C)C(C)C)C(C)C)C(C)O)C(C)C)C(C)O)C(C)O. The target protein (Q8WN93) has sequence MEKKYILYFLFLLPFFMILVIAETEEENPDDLIQLTVTRNKIMTAQYECYQKIMQDPIQQTEGIYCNRTWDGWLCWNDVAAGTESMQHCPDYFQDFDPSEKVTKICDQDGNWFRHPESNRTWTNYTQCNINTHEKVQTALNLFYLTIIGHGLSIASLLISLGIFFYFKSLSCQRITLHKNLFFSFVCNSIVTIIHLTAVANNQALVATNPVSCKVFQFIHLYLMGCNYFWMLCEGIYLHTLIVVAVFAEKQHLMWYYFLGWGFPLIPACIHAVARRLYYNDNCWISSDTHLLYIIHGPICAALLVNLFFLLNIVRVLITKLKVTHQAESNLYMKAVRATLILVPLLGIEFVLIPWRPEGKIAEEVYDYIMHILVHYQGLLVSTIYCFFNGEVQAILRRNWNQYKIQFGNSFSHSDALRSASYTVSTISDGAGYSHDYPSEHLNGKSIHDMENIVIKPEKLYD. The pKi is 7.4. (9) The compound is CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)O. The target protein sequence is MQTHAARMRTFMYWPSSVPVQPEQLAAAGFYYVGRNDDVKCFSCDGGLRCWESGDDPWVEHAKWFPGCEFLIRMKGQEYINNIHLTHSL. The pKi is 7.5.
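
The pKi label used throughout the records above follows the stated definition, pKi = -log10(Ki in M). As a minimal sketch of that conversion (the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers introduced here for illustration, not part of BindingDB or of this dataset's tooling), the relation between a Ki in molar units and the reported label might look like this:

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        # log10 is undefined for non-positive concentrations
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert a pKi label back to Ki expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # pKi labels taken from the example records above
    for label in (5.0, 6.7, 9.3):
        print(f"pKi {label} corresponds to Ki of about {pki_to_ki_nm(label):.3g} nM")
```

Under this definition each unit of pKi is a tenfold change in Ki, so the pKi of 9.3 in record (6) corresponds to a Ki of roughly 0.5 nM, while the pKi of 5.0 in record (3) corresponds to 10 µM.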