# Separating Precision and Mean in Dirichlet-enhanced High-order Markov Models

Lecture slides:

- Agenda - Necessity of robustly estimating high-order Markov process models
- Necessity of robustly estimating high-order Markov process models - Natural language and Markov models
- Necessity of robustly estimating high-order Markov process models - Problem caused by data sparseness
- Necessity of robustly estimating high-order Markov process models - Introducing smoothing methods
- Agenda - Prior work: estimating Markov process models by hierarchical Bayesian approaches
- Prior work - Two major smoothing criteria
- Prior work - Smoothing methods = Hierarchical Bayesian estimation
- Prior work - Known performances of existing methods
- Prior work - Frequency modification by an indicator function
- Agenda - Our proposition: Separating precision and mean in Dirichlet prior
- Our proposition - Our direction
- Our proposition - Discounting factor should depend on current states.
- Our proposition - Separating precision and mean in Dirichlet prior
- Our proposition - New formulation : context-dependent Dirichlet prior
- Our proposition - Effective frequency for more precise lower-order distribution
- Our proposition - New Dirichlet prior will outperform when # of states is small.
- Agenda - Experimental result
- Experimental result - Checking the performances depending on the # of states.
- Experimental result : evaluating test-set perplexity - Natural language modeling : slightly worse than Kneser-Ney smoothing
- Experimental result : evaluating test-set perplexity - Protein sequence modeling : outperformed Kneser-Ney smoothing (1)
- Experimental result : evaluating test-set perplexity - Protein sequence modeling : outperformed Kneser-Ney smoothing (2)
- Agenda - Conclusion
- Conclusion

*Author: Rikiya Takahashi, IBM Research*