%matplotlib inline
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import scipy.stats
%autosave 30
Statistical distribution#
Let’s look at how different statistical distributions behave, to get a better idea of which ones to use as priors in our Bayesian inference exploration.
All the distributions available in scipy can be found on the docs here: http://docs.scipy.org/doc/scipy/reference/stats.html#module-scipy.stats
Let’s start with discrete distributions.
Discrete Distributions#
bernoulli: A Bernoulli discrete random variable.
binom: A binomial discrete random variable.
poisson: A Poisson discrete random variable.
…
from scipy.stats import bernoulli, poisson, binom
Bernoulli distribution#
With probability \(p\), a Bernoulli random variable takes the value \(k=1\); with probability \(1-p\), it takes the value \(k=0\).
In other words:
\[
f(k; p) = \begin{cases} p & \text{if } k = 1 \\ 1 - p & \text{if } k = 0 \end{cases}
\]
bernoulli.rvs(0.6, size=100)
array([1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0,
0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1,
0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0,
0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0])
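As a quick sanity check on the sample above, the fraction of ones in a large Bernoulli sample should approach \(p\). The sketch below is only an illustration: the sample size of 10,000 and the fixed seed (`random_state=0`) are assumptions chosen to make the result reproducible, not part of the original notebook.

```python
import numpy as np
from scipy.stats import bernoulli

p = 0.6
# random_state is fixed only so that this illustration is reproducible
sample = bernoulli.rvs(p, size=10_000, random_state=0)

# The sample mean is the fraction of ones, an estimate of p
empirical_p = sample.mean()
print(f"empirical p = {empirical_p:.3f}, theoretical p = {p}")

# The pmf only has mass at k=0 (probability 1-p) and k=1 (probability p)
pmf_values = bernoulli.pmf([0, 1], p)
print("pmf at k=0 and k=1:", pmf_values)
```

The estimate gets closer to \(p\) as the sample size grows, which is why the 100-draw sample above only roughly matches 0.6.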
a = np.arange(2)
# 'axes.color_cycle' no longer exists in matplotlib; read the colors from 'axes.prop_cycle'
colors = matplotlib.rcParams['axes.prop_cycle'].by_key()['color']
plt.figure(figsize=(12,8))
for i, p in enumerate([0.1, 0.2, 0.6, 0.7]):
    ax = plt.subplot(1, 4, i+1)
    plt.bar(a, bernoulli.pmf(a, p), label=p, color=colors[i], alpha=0.5)
    ax.xaxis.set_ticks(a)
    plt.legend(loc=0)
    if i == 0:
        plt.ylabel("PMF at $k$")
plt.suptitle("Bernoulli probability")
Poisson Distribution#
Another discrete distribution, the Poisson distribution, is defined for all non-negative integers \(k\) as
\[
P(k; \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots
\]
k = np.arange(20)
# 'axes.color_cycle' no longer exists in matplotlib; read the colors from 'axes.prop_cycle'
colors = matplotlib.rcParams['axes.prop_cycle'].by_key()['color']
plt.figure(figsize=(12,8))
for i, lambda_ in enumerate([1, 4, 6, 12]):
    plt.bar(k, poisson.pmf(k, lambda_), label=lambda_, color=colors[i], alpha=0.4, edgecolor=colors[i], lw=3)
plt.legend()
plt.title("Poisson distribution")
plt.xlabel("$k$")
plt.ylabel("PMF at $k$")
k = np.arange(15)
plt.figure(figsize=(12,8))
for i, lambda_ in enumerate([1, 2, 4, 6]):
    plt.plot(k, poisson.pmf(k, lambda_), '-o', label=lambda_, color=colors[i])
    plt.fill_between(k, poisson.pmf(k, lambda_), color=colors[i], alpha=0.5)
plt.legend()
plt.title("Poisson distribution")
plt.ylabel("PMF at $k$")
plt.xlabel("$k$")
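To connect the plots to the formula, the Poisson pmf \(\lambda^k e^{-\lambda} / k!\) can be evaluated by hand and compared with `poisson.pmf`. This is a minimal sketch; the choice \(\lambda = 4\) and the names `manual_pmf`, `mean`, and `var` are illustrative assumptions.

```python
import math
import numpy as np
from scipy.stats import poisson

lambda_ = 4          # illustrative rate, matching one of the curves above
k = np.arange(15)

# pmf written out from the formula: lambda^k * exp(-lambda) / k!
manual_pmf = np.array(
    [lambda_ ** int(i) * math.exp(-lambda_) / math.factorial(int(i)) for i in k]
)
print("matches scipy:", np.allclose(manual_pmf, poisson.pmf(k, lambda_)))

# For a Poisson distribution, the mean and the variance both equal lambda
mean, var = poisson.stats(lambda_, moments="mv")
print("mean:", float(mean), "variance:", float(var))
```

Because the mean equals \(\lambda\), the peak of each curve shifts to the right as \(\lambda\) grows.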
Binomial distribution#
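Last but not least, the binomial distribution gives the probability of \(k\) successes in \(n\) independent trials, each succeeding with probability \(p\). It is defined as:
\[
P(k; n, p) = \binom{n}{k} p^k (1-p)^{n-k}
\]
where
\[
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}
\]
with \(k = 0, 1, 2, \ldots, n\)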
plt.figure(figsize=(12,6))
k = np.arange(0, 22)
for p, color in zip([0.1, 0.3, 0.6, 0.8], colors):
    # Freeze a binomial with n=20 trials and success probability p
    rv = binom(20, p)
    plt.plot(k, rv.pmf(k), lw=2, color=color, label=p)
    plt.fill_between(k, rv.pmf(k), color=color, alpha=0.5)
plt.legend()
plt.title("Binomial distribution")
plt.ylabel("PMF at $k$")
plt.xlabel("$k$")
plt.tight_layout()
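The binomial pmf \(\binom{n}{k} p^k (1-p)^{n-k}\) for \(k = 0, \ldots, n\) can be checked against `binom.pmf` in a few lines. The sketch below assumes \(n = 20\) and \(p = 0.3\) (one of the curves plotted above); `manual_pmf` is a hypothetical helper name.

```python
import math
import numpy as np
from scipy.stats import binom

n, p = 20, 0.3
k = np.arange(0, n + 1)   # the support is k = 0, 1, ..., n

# pmf written out from the formula: C(n, k) * p^k * (1 - p)^(n - k)
manual_pmf = np.array(
    [math.comb(n, int(i)) * p ** int(i) * (1 - p) ** (n - int(i)) for i in k]
)
print("matches scipy:", np.allclose(manual_pmf, binom.pmf(k, n, p)))

# The probabilities over the whole support sum to 1
print("total probability:", manual_pmf.sum())

# The mean of a binomial distribution is n * p
print("mean:", binom.mean(n, p), "n * p:", n * p)
```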
Continuous Probability Distributions#
Continuous distributions are defined over a continuous range of values; the ones explored here have support on non-negative \(x\). Many distributions are available in scipy.stats, so I will explore only some:
alpha An alpha continuous random variable.
beta A beta continuous random variable.
gamma A gamma continuous random variable.
expon An exponential continuous random variable.
…
Alpha#
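The Alpha distribution, as parameterized in scipy.stats.alpha, has the probability density function
\[
f(x; a) = \frac{1}{x^2 \, \Phi(a) \sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(a - \frac{1}{x}\right)^2\right), \quad x > 0, \; a > 0
\]
where \(\Phi\) is the cumulative distribution function of the standard normal distribution.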
x = np.linspace(0.1, 2, 100)
alpha = scipy.stats.alpha
alphas = [0.5, 1, 2, 4]
plt.figure(figsize=(12,6))
for a, c in zip(alphas, colors):
    label = r"$\alpha$ = {0:.1f}".format(a)
    plt.plot(x, alpha.pdf(x, a), lw=2,
             color=c, label=label)
    plt.fill_between(x, alpha.pdf(x, a), color=c, alpha=.33)
plt.ylabel("PDF at $x$")
plt.xlabel("$x$")
plt.title("Alpha distribution")
plt.legend()
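For reference, scipy.stats.alpha documents its density as \(f(x; a) = \frac{1}{x^2 \Phi(a) \sqrt{2\pi}} \exp\left(-\frac{1}{2}(a - 1/x)^2\right)\), with \(\Phi\) the standard normal CDF. The sketch below evaluates that expression directly and compares it with `alpha.pdf`; the shape value \(a = 2\) and the name `manual_pdf` are illustrative assumptions.

```python
import numpy as np
from scipy.stats import alpha, norm

a = 2.0                           # illustrative shape parameter
x = np.linspace(0.1, 2, 100)      # same grid as the plot above

# Density written out by hand; norm.cdf(a) is the normalizing term Phi(a)
manual_pdf = np.exp(-0.5 * (a - 1.0 / x) ** 2) / (
    x ** 2 * norm.cdf(a) * np.sqrt(2 * np.pi)
)
print("matches scipy:", np.allclose(manual_pdf, alpha.pdf(x, a)))
```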
Beta distribution#
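The Beta distribution is defined for a variable ranging between 0 and 1.
The pdf is defined as:
\[
f(x; \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \, x^{\alpha - 1} (1 - x)^{\beta - 1}, \quad 0 \le x \le 1
\]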
beta = scipy.stats.beta
x = np.linspace(0, 1, num=200)
fig = plt.figure(figsize=(12,6))
for a, b, c in zip([0.5, 0.5, 1, 2, 3], [0.5, 1, 3, 2, 5], colors):
    plt.plot(x, beta.pdf(x, a, b), lw=2,
             c=c, label=r"$\alpha = {0:.1f}, \beta={1:.1f}$".format(a, b))
    plt.fill_between(x, beta.pdf(x, a, b), color=c, alpha=.1)
plt.legend(loc=0)
plt.ylabel("PDF at $x$")
plt.xlabel("$x$")
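The Beta density \(\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1}\) can be reproduced with `scipy.special.gamma` and compared with `beta.pdf`. This is a small sketch under assumed values (\(\alpha = 2\), \(\beta = 5\)); it also shows that \(\alpha = \beta = 1\) gives the uniform distribution on \([0, 1]\), which is one reason the Beta is a common prior for a probability.

```python
import numpy as np
from scipy.special import gamma as gamma_fn
from scipy.stats import beta

a, b = 2.0, 5.0                       # illustrative shape parameters
x = np.linspace(0.01, 0.99, 99)       # stay away from the endpoints 0 and 1

# Normalizing constant built from the Gamma function
norm_const = gamma_fn(a + b) / (gamma_fn(a) * gamma_fn(b))
manual_pdf = norm_const * x ** (a - 1) * (1 - x) ** (b - 1)
print("matches scipy:", np.allclose(manual_pdf, beta.pdf(x, a, b)))

# With alpha = beta = 1 the density is flat: the uniform distribution on [0, 1]
uniform_pdf = beta.pdf(x, 1, 1)
print("Beta(1, 1) is uniform:", np.allclose(uniform_pdf, 1.0))
```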
Gamma distribution#
The gamma distribution uses the Gamma function (http://en.wikipedia.org/wiki/Gamma_function) and has two parameters: a shape parameter \(\alpha\) and a rate parameter \(\lambda\).
In scipy.stats.gamma, the scale parameter is equal to \(1.0/\lambda\).
gamma = scipy.stats.gamma
plt.figure(figsize=(12, 6))
x = np.linspace(0, 10, num=200)
for a, c in zip([0.5, 1, 2, 3, 10], colors):
    plt.plot(x, gamma.pdf(x, a), lw=2,
             c=c, label=r"$\alpha = {0:.1f}$".format(a))
    plt.fill_between(x, gamma.pdf(x, a), color=c, alpha=.1)
plt.legend(loc=0)
plt.title("Gamma distribution with scale=1")
plt.ylabel("PDF at $x$")
plt.xlabel("$x$")
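The plot above keeps the default `scale=1`. To use a rate \(\lambda\) instead, pass `scale=1/lambda` to scipy. The sketch below checks this against the density \(\frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda x}\); the values \(\alpha = 2\) and \(\lambda = 0.5\) and the variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.special import gamma as gamma_fn
from scipy.stats import gamma

a, lambda_ = 2.0, 0.5                 # illustrative shape and rate
x = np.linspace(0.1, 10, 100)

# Density written in terms of the rate lambda
manual_pdf = lambda_ ** a * x ** (a - 1) * np.exp(-lambda_ * x) / gamma_fn(a)

# scipy expects the scale, which is the inverse of the rate
scipy_pdf = gamma.pdf(x, a, scale=1.0 / lambda_)
print("matches scipy:", np.allclose(manual_pdf, scipy_pdf))

# The mean of a gamma distribution is shape / rate
gamma_mean = gamma.mean(a, scale=1.0 / lambda_)
print("mean:", gamma_mean, "a / lambda:", a / lambda_)
```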
Exponential#
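The exponential probability density function is
\[
f_X(x \mid \lambda) = \lambda e^{-\lambda x}, \quad x \ge 0
\]
When a random variable \(X\) has an exponential distribution with parameter \(\lambda\), we say \(X\) is exponential and write
\[
X \sim \text{Exp}(\lambda)
\]
Given a specific \(\lambda\), the expected value of an exponential random variable is equal to the inverse of \(\lambda\), that is:
\[
E[X \mid \lambda] = \frac{1}{\lambda}
\]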
x = np.linspace(0, 4, 100)
expo = scipy.stats.expon
lambda_ = [0.5, 1, 2, 4]
plt.figure(figsize=(12,4))
for l, c in zip(lambda_, colors):
    # scipy parameterizes the exponential by scale = 1/lambda
    plt.plot(x, expo.pdf(x, scale=1./l), lw=2,
             color=c, label=r"$\lambda = %.1f$" % l)
    plt.fill_between(x, expo.pdf(x, scale=1./l), color=c, alpha=.33)
plt.legend()
plt.ylabel("PDF at $x$")
plt.xlabel("$x$")
# Raw strings avoid invalid escape sequences such as "\l"
plt.title("Probability density function of an Exponential random variable; "
          r"differing $\lambda$");
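The statement that the expected value equals \(1/\lambda\) can be checked numerically. The sketch below compares `expon.mean` with \(1/\lambda\) for the rates used in the plot, and then estimates the mean from a random sample. The sample size and the fixed seed (`random_state=0`) are assumptions chosen for reproducibility.

```python
import numpy as np
from scipy.stats import expon

rates = [0.5, 1, 2, 4]

# scipy uses scale = 1/lambda, so the theoretical mean is simply the scale
theoretical_means = [expon.mean(scale=1.0 / lambda_) for lambda_ in rates]
for lambda_, mean in zip(rates, theoretical_means):
    print(f"lambda = {lambda_}: mean = {mean}, 1/lambda = {1.0 / lambda_}")

# Empirical check for lambda = 2: the sample mean should be close to 0.5
sample = expon.rvs(scale=1.0 / 2, size=100_000, random_state=0)
sample_mean = sample.mean()
print(f"sample mean for lambda = 2: {sample_mean:.3f}")
```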