Posterior and MAP Derivation for Categorical Distribution in TensorFlow Probability
This note provides a complete derivation of the posterior distribution and the Maximum A Posteriori (MAP) estimate for a Categorical distribution with a Dirichlet prior, including an example implementation in TensorFlow Probability.

Machine Learning & Simulation
1.4K views • Apr 21, 2021

About this video
We put a Dirichlet prior on the Categorical's parameter vector. Now let's derive the Posterior and the Maximum A Posteriori Estimate (MAP). Here are the notes: https://raw.githubusercontent.com/Ceyron/machine-learning-and-simulation/main/english/essential_pmf_pdf/categorical_posterior_and_map.pdf
The Dirichlet distribution is the conjugate prior to the Categorical. We use this fact to intuitively derive the posterior and its mode, the Maximum A Posteriori (MAP) estimate.
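For readers who want the result at a glance, the derivation outlined in this description can be sketched as follows. The notation is an assumption and may differ from the handwritten notes: K states, n_k observations of state k, N total observations, and prior concentration alpha.

```latex
% Assumed notation: K states, counts n_k, N = \sum_k n_k, prior concentration \alpha
\begin{align}
p(\theta \mid \mathcal{D})
  &\propto p(\mathcal{D} \mid \theta)\, p(\theta)
   \propto \prod_{k=1}^{K} \theta_k^{n_k} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1}
   = \prod_{k=1}^{K} \theta_k^{\alpha_k + n_k - 1} \\
\Rightarrow\quad \theta \mid \mathcal{D} &\sim \mathrm{Dir}(\alpha + n) \\
\mathcal{L}(\theta, \lambda)
  &= \sum_{k=1}^{K} (\alpha_k + n_k - 1) \log \theta_k
     + \lambda \Bigl(1 - \sum_{k=1}^{K} \theta_k\Bigr) \\
\frac{\partial \mathcal{L}}{\partial \theta_k} = 0
  &\;\Rightarrow\; \theta_k = \frac{\alpha_k + n_k - 1}{\lambda},
  \qquad \lambda = \sum_{j=1}^{K} \alpha_j + N - K \\
\hat{\theta}_k^{\mathrm{MAP}}
  &= \frac{\alpha_k + n_k - 1}{\sum_{j=1}^{K} \alpha_j + N - K}
\end{align}
```

The closed-form mode holds when every alpha_k + n_k is greater than 1. With a uniform prior (all alpha_k = 1) it reduces to the maximum likelihood estimate n_k / N.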
-------
📝 : Check out the GitHub Repository of the channel, where I upload all the handwritten notes and source-code files (contributions are very welcome): https://github.com/Ceyron/machine-learning-and-simulation
📢 : Follow me on LinkedIn or Twitter for updates on the channel and other cool Machine Learning & Simulation stuff: https://www.linkedin.com/in/felix-koehler and https://twitter.com/felix_m_koehler
💸 : If you want to support my work on the channel, you can become a patron on Patreon here: https://www.patreon.com/MLsim
-------
Timestamps:
00:00 Introduction
00:50 Motivation
01:17 Repetition: The Categorical
01:56 Directed Graphical Model
03:32 The joint distribution
05:51 Bayes' Rule
06:35 Proportional Posterior
08:02 Plugging in Dirichlet & Categorical
09:09 Simplifying Proportional Posterior
13:09 Why Dirichlet is conjugate prior
13:49 "Posterior Likelihood"
14:06 Two Paths
14:52 Deriving the Posterior
17:39 MAP: Setup
18:12 MAP: Log-Posterior Likelihood
19:15 MAP: Lagrange Multiplier
20:51 MAP: Maximization
28:12 Discussing the MAP
29:19 MAP for the One-Hot Categorical
30:18 TFP: Create a dataset
32:00 TFP: n observations per state
32:33 TFP: Calculating the MLE
32:52 TFP: Calculating the MAP
34:39 TFP: MLE/MAP for corrupt dataset
37:10 TFP: Posterior Distributions
38:47 Outro
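The TFP chapters listed above (dataset, counts per state, MLE, MAP, corrupt dataset) can be approximated with a minimal sketch. It uses only the Python standard library as a stand-in and is not the code from the video, which works with TensorFlow Probability. The true parameters, the prior `alpha`, the seed, and all function names are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def counts_per_state(data, num_states):
    """Count how often each state 0..num_states-1 occurs in the data."""
    counter = Counter(data)
    return [counter.get(k, 0) for k in range(num_states)]


def mle(counts):
    """Maximum likelihood estimate: relative frequency of each state."""
    total = sum(counts)
    return [n / total for n in counts]


def posterior_concentration(alpha, counts):
    """Conjugacy: a Dirichlet(alpha) prior yields a Dirichlet(alpha + n) posterior."""
    return [a + n for a, n in zip(alpha, counts)]


def map_estimate(alpha, counts):
    """Mode of the Dirichlet posterior; requires every alpha_k + n_k > 1."""
    post = posterior_concentration(alpha, counts)
    if any(a <= 1 for a in post):
        raise ValueError("closed-form mode needs all alpha_k + n_k > 1")
    denom = sum(post) - len(post)
    return [(a - 1) / denom for a in post]


# Hypothetical setup: three states, an assumed true parameter vector and prior.
rng = random.Random(42)
true_theta = [0.2, 0.5, 0.3]
alpha = [2.0, 2.0, 2.0]
data = rng.choices(range(3), weights=true_theta, k=1000)

counts = counts_per_state(data, 3)
theta_mle = mle(counts)
theta_map = map_estimate(alpha, counts)

# "Corrupt" dataset: every observation of state 2 is missing.
corrupt_counts = counts_per_state([x for x in data if x != 2], 3)
mle_corrupt = mle(corrupt_counts)               # assigns probability 0 to state 2
map_corrupt = map_estimate(alpha, corrupt_counts)  # prior keeps state 2 above 0

print("MLE:", theta_mle)
print("MAP:", theta_map)
print("MLE (corrupt):", mle_corrupt)
print("MAP (corrupt):", map_corrupt)
```

In the corrupt case the MLE rules out the unobserved state entirely, while the MAP estimate keeps a small positive probability for it because of the prior. In TFP, `tfp.distributions.Categorical` and `tfp.distributions.Dirichlet` would take the place of the sampling and posterior helpers here.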
Video Information
Views: 1.4K
Likes: 35
Duration: 39:20
Published: Apr 21, 2021