Outliers

import numpy as np
import matplotlib.pyplot as pl
%matplotlib inline

import matplotlib as mpl
mpl.rcParams['lines.linewidth']=2
mpl.rcParams['lines.color']='r'
mpl.rcParams['figure.figsize']=(10,8)
mpl.rcParams['font.size']=14
mpl.rcParams['axes.labelsize']=20

Single variable, $x$

x = np.array([49.3,50.2,49.2,49.8,50.5,49.3,48.9,49.9,50.1,49.2])
pl.figure()
pl.plot(x,'o')
pl.xlim([-1,10])  # 10 samples, indices 0..9: keep the last points visible
pl.ylim([48,52])
pl.ylabel('$x$')
pl.xlabel('$n$')
<Figure size 1000x800 with 1 Axes>

We use the modified Thompson tau test, which is based on Student’s t-distribution.
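For reference, the rejection criterion of the test in its usual textbook form is written out below. The symbols are assumptions not defined elsewhere on this page: $\alpha$ is the chosen significance level (often 0.05), $n$ the number of samples, $S$ the sample standard deviation, $t_{\alpha/2,\,n-2}$ the two-sided critical value of Student's t-distribution with $n-2$ degrees of freedom, and $\delta_{\max}$ the largest absolute deviation from the sample mean.

```latex
\tau = \frac{t_{\alpha/2,\,n-2}\,(n-1)}{\sqrt{n}\,\sqrt{n-2+t_{\alpha/2,\,n-2}^{2}}},
\qquad
\delta_{\max} > \tau S \;\Rightarrow\; \text{the point is rejected as an outlier}
```

Only one point, the most extreme one, is tested at a time. If it is rejected, the mean, $S$ and $\tau$ are recomputed for the remaining $n-1$ points before testing again.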

Sort the values

x.sort()
x
array([48.9, 49.2, 49.2, 49.3, 49.3, 49.8, 49.9, 50.1, 50.2, 50.5])
pl.plot(x,'o')
pl.xlim([-1,10])  # 10 samples, indices 0..9: keep the last points visible
pl.ylim([48,52])
pl.ylabel('$x$')
pl.xlabel('$n$')
<Figure size 1000x800 with 1 Axes>

Note: in the sorted list, only the first and the last values are candidate outliers.

Compute the sample mean and the sample standard deviation, then the absolute deviations from the mean.

x_mean = np.mean(x)
x_std = np.std(x,ddof=1)
print(x_mean)
print(x_std)
49.64
0.5295700562196137

$\delta_i = | \bar{x} - x_i |$

delta = abs(x - x_mean)
pl.plot(delta,'o')
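pl.xlim([-1,10])  # 10 samples, indices 0..9: keep the last points visible
pl.ylim([-.5,1])
pl.ylabel(r'$\delta$')  # raw string avoids the invalid '\d' escape
pl.xlabel('$n$')
print(delta[0], delta[-1])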
0.740000000000002 0.8599999999999994
<Figure size 1000x800 with 1 Axes>
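With the deviations in hand, the comparison against the test threshold could be sketched as follows. This is a minimal illustration, not necessarily how the notebook proceeds: the helper name `thompson_tau`, the significance level `alpha=0.05` and the use of `scipy.stats` are assumptions introduced here.

```python
import numpy as np
from scipy import stats


def thompson_tau(n, alpha=0.05):
    """Critical value of the modified Thompson tau test (hypothetical helper).

    alpha is an assumed significance level; n is the number of samples.
    """
    # Two-sided critical value of Student's t with n - 2 degrees of freedom
    t = stats.t.ppf(1 - alpha / 2, n - 2)
    return t * (n - 1) / (np.sqrt(n) * np.sqrt(n - 2 + t**2))


x = np.array([48.9, 49.2, 49.2, 49.3, 49.3, 49.8, 49.9, 50.1, 50.2, 50.5])

delta = np.abs(x - x.mean())           # absolute deviations from the mean
i_suspect = int(np.argmax(delta))      # only the most extreme point is tested
threshold = thompson_tau(x.size) * x.std(ddof=1)

is_outlier = bool(delta[i_suspect] > threshold)
print(x[i_suspect], delta[i_suspect], threshold, is_outlier)
```

Under these assumptions the suspect is the last value, 50.5, with a deviation of 0.86. The threshold is about 0.95, so the point is not rejected at the 5% level.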