Standard deviation
In stochastics, the standard deviation is a measure of the dispersion of the values of a random variable around its mean. For a random variable <math>X</math> it is defined as the positive square root of its variance and is denoted <math>\sigma_X</math>: <math>\sigma_X = \sqrt{\operatorname{Var}(X)}</math>. The variance of a random variable is the second central moment of the associated distribution; the expected value is the first moment.
If an observation series <math>(x_1, x_2, \dots, x_N)</math> of length <math>N</math> is given, then the empirical mean and the empirical standard deviation are the two most important statistics for describing the properties of the observation series.
The standard deviation is also called the mean error or RMS error (from English root mean square). As abbreviations one finds, besides <math>\sigma</math>, often s, m.F. or rms in applications. In applied statistics one frequently encounters the shorthand notation „Ø 21 ± 4“, which is to be read as „mean 21 and standard deviation 4“.
An example (with range)
Mean age (for example in a dance class) = (17.5 ± 1.2) years.
Both values together yield the mean range, mean ± s = 16.3 to 18.7 years.
For normally distributed quantities (see bell curve), this range contains approx. 68% of the values (the 2σ range contains approx. 95%). The above range therefore suggests that
- 16% of the dance pupils are younger than 16.3 years (and 2-3% are under 15.1 years), and
- 16% are older than 18.7 years (and 2-3% are over 19.9 years).
However, this example hardly follows a normal distribution, because probably more than 2.5% of the class participants are older than 20 years.
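The 68% and 95% figures quoted above follow from the cumulative distribution function of the normal distribution. As an illustrative sketch (not part of the original example), they can be checked with Python's standard library:

```python
import math

def prob_within(k):
    """Probability that a normally distributed value lies within
    k standard deviations of the mean: P(|X - mu| < k*sigma)."""
    return math.erf(k / math.sqrt(2))

print(round(prob_within(1), 4))  # 0.6827 (the "68%" rule)
print(round(prob_within(2), 4))  # 0.9545 (the "95%" rule)
```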
- Rules of thumb for practice: Values outside two to three standard deviations are called outliers. Outliers can indicate gross errors in data acquisition. However, the data may also follow a strongly skewed distribution. On the other hand, on average about every 20th measured value should lie outside the double standard deviation.
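This outlier rule of thumb can be sketched in code. The data set below is invented for the illustration; the check simply flags values more than two sample standard deviations from the mean:

```python
import statistics

def outliers(data, k=2.0):
    """Return values lying more than k sample standard deviations
    from the mean (rule-of-thumb outlier check)."""
    m = statistics.mean(data)
    s = statistics.stdev(data)  # uses the N-1 denominator
    return [x for x in data if abs(x - m) > k * s]

ages = [17, 18, 16, 17, 19, 18, 17, 40]  # hypothetical sample
print(outliers(ages))  # [40]
```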
Mean error, dispersion and variance
The standard deviation (m.F.) is the square root of another dispersion measure, the variance. Compared to the variance, the standard deviation has the advantage that it has the same unit as the original measured values.
- If the number of children per household is examined, the unit of the variance is a squared child, whereas the unit of the standard deviation is a child.
Mathematical definition of the standard deviation
- <math>
\sigma_x := \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2}
</math>
where
- <math>\sigma_x</math> the standard deviation of the individual measurements
- <math>N</math> the size of the population (number of values resp. number of degrees of freedom)
- <math>x_i</math> the characteristic value of the <math>i</math>-th element of the population (the <math>i</math>-th element in the set of values)
- <math>\bar{x} = \frac{1}{N} \sum_{i=1}^N x_i</math> the arithmetic mean (empirical mean)
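The definition above translates directly into code. A minimal sketch (the function name is our own) computing the empirical standard deviation with the N − 1 denominator:

```python
import math

def empirical_std(values):
    """Empirical standard deviation with the N-1 denominator,
    exactly as in the definition above."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

data = [3, 4, 5, 6, 7]
print(round(empirical_std(data), 4))  # 1.5811
```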
The standard deviation of the mean <math>\sigma_{\bar{x}}</math> is given by:
- <math>
\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}} = \sqrt{\frac{1}{N(N-1)} \sum_{i=1}^N (x_i - \bar{x})^2}
</math>
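The standard deviation of the mean shrinks with √N. A quick numeric sketch (the sample values are chosen arbitrarily):

```python
import math
import statistics

data = [3, 4, 5, 6, 7]
s = statistics.stdev(data)         # standard deviation of the individual values
s_mean = s / math.sqrt(len(data))  # standard deviation of the mean
print(round(s_mean, 4))  # 0.7071
```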
If one wants to compute the standard deviation of the sample without first computing the mean, the following formula is useful:
- <math>
\sigma_x = \sqrt{\frac{N \sum_{i=1}^N x_i^2 - \left(\sum_{i=1}^N x_i\right)^2}{N(N-1)}}
</math>
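One can check that this shortcut formula, which needs only the sum and the sum of squares, agrees with the two-pass definition. A small sketch:

```python
import math
import statistics

def std_from_sums(values):
    """Standard deviation from the sum and the sum of squares only,
    following the shortcut formula (no prior mean computation)."""
    n = len(values)
    s1 = sum(values)
    s2 = sum(x * x for x in values)
    return math.sqrt((n * s2 - s1 * s1) / (n * (n - 1)))

data = [3, 4, 5, 6, 7]
print(math.isclose(std_from_sums(data), statistics.stdev(data)))  # True
```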
Guide
For a quick estimation of <math>\sigma</math>, one looks at the sixth of the values that are smallest and the sixth that are largest. The standard deviation is then half the difference between these two limit values. For unclear distributions, or if one can only calculate „in the head“, the following estimate also works: (maximum value − minimum value) / 3. Surprisingly, this estimate yields good rough values both for normal distributions and for uniform distributions or distributions with a high coefficient of variation.
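As an illustration of the (maximum − minimum) / 3 rule (the data set is invented for the demonstration):

```python
import statistics

data = [14, 15, 15, 16, 17, 17, 18, 18, 19, 20, 20, 21]  # hypothetical ages
rough = (max(data) - min(data)) / 3  # quick "in the head" estimate
exact = statistics.stdev(data)       # exact sample standard deviation
print(round(rough, 2), round(exact, 2))  # 2.33 2.24
```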
Unbiased estimation of the standard deviation from a sample
- <math>
\hat{\sigma} = \sqrt{\frac{n-1}{2}} \cdot \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)} \cdot s
</math>
where
- <math>\hat{\sigma}</math> is the unbiased estimate of the standard deviation and
- <math>n</math> the sample size (since only a sample, not the total population, is examined, the mean must be determined from the values of the sample; the number of degrees of freedom thus reduces to (n − 1))
- <math>\Gamma(x)</math> the gamma function
- <math>s</math> the estimate for the standard deviation
Example
Suppose the five values 3, 4, 5, 6, 7 were measured in a sample. One is now to calculate the estimate for the standard deviation.
The correction factor in this case is
- <math>
\sqrt{2} \cdot \frac{\Gamma(2)}{\Gamma(2.5)} \approx 1.063846
</math>
and the unbiased estimate for the standard deviation is thus approximately 1.064 · s (here s ≈ 1.581, so <math>\hat{\sigma} \approx 1.682</math>).
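The correction factor can be computed with the gamma function from Python's standard library. A sketch reproducing the value for n = 5 from the example:

```python
import math

def correction_factor(n):
    """Correction factor sqrt((n-1)/2) * Gamma((n-1)/2) / Gamma(n/2)
    for the unbiased estimation of the standard deviation."""
    return math.sqrt((n - 1) / 2) * math.gamma((n - 1) / 2) / math.gamma(n / 2)

print(round(correction_factor(5), 6))  # 1.063846
print(round(correction_factor(2), 6))  # 1.253314
```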
Correction factors for various sample sizes n:
- n = 2: 1.253314
- n = 5: 1.063846
- n = 10: 1.028109
- n = 15: 1.018002
The standard deviation of a normal distribution
The normal distribution can be represented, among other ways, in such a way that the standard deviation is a parameter of the distribution. For this estimation one can use the property of maximum likelihood estimation that a monotonic transformation of a maximum likelihood estimate is a maximum likelihood estimate for the monotonic transformation of the estimated parameter. That means that the square root of a maximum likelihood estimate of a parameter that can only be positive is a maximum likelihood estimate for the square root of this parameter.
- <math>
\hat{\sigma}_{\rm ML} = \sqrt{\frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2}
</math>
This estimator is a maximum likelihood estimator for a parameter of the normal distribution, or for a transformation of this parameter. It cannot be transferred to the estimation of the standard deviation of an arbitrary distribution.
The maximum likelihood estimator for the standard deviation of a Poisson distribution, for example, is the square root of the arithmetic mean.
As the maximum likelihood estimate for the standard deviation from the sample {3, 4, 5, 6, 7} one thus obtains
- <math>
\hat{\sigma}_{\rm ML} = \sqrt{\frac{1}{5} \cdot 10} = \sqrt{2} \approx 1.414
</math>
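The maximum likelihood estimate above (with the 1/n denominator) can be reproduced in a few lines:

```python
import math

def std_ml(values):
    """Maximum likelihood estimate of the standard deviation of a
    normal distribution: 1/n denominator instead of 1/(n-1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

print(round(std_ml([3, 4, 5, 6, 7]), 3))  # 1.414
```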
Examples
Here, the dice example already known from the variance, for the standard deviation:
The standard deviation for 500 rolls of a die, with the random variable X: number of ones, is
- <math>
\sqrt{500 \cdot \frac{1}{6} \cdot \frac{5}{6}} = \frac{50}{6} \approx 8.33
</math>
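This dice value is an instance of the binomial standard deviation √(n·p·(1−p)). A quick check:

```python
import math

n, p = 500, 1 / 6  # 500 rolls, probability of rolling a one
sigma = math.sqrt(n * p * (1 - p))
print(round(sigma, 2))  # 8.33
```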
Computation for accumulating measured values
In systems that continuously acquire large quantities of measured values, it is often impractical to buffer all measured values in order to compute the standard deviation.
In this case it is more favorable to use a modified formula that avoids the critical term <math>\sum_{i=1}^N (x_i - \bar{x})^2</math>. This term cannot be recomputed immediately for each new measured value, since the mean <math>\bar{x}</math> is not constant.
By applying the second binomial formula and the definition of the mean <math>\bar{x} = \sum_{i=1}^N \frac{x_i}{N}</math>, one arrives at the representation
- <math>
\sigma_x = \sqrt{\frac{N \cdot \sum_{i=1}^N x_i^2 - \left(\sum_{i=1}^N x_i\right)^2}{N \cdot (N-1)}},
</math>
which can be updated immediately for each arriving measured value, if the sum of the measured values <math>\sum_{i=1}^N x_i</math> as well as the sum of their squares <math>\sum_{i=1}^N x_i^2</math> are carried along and sequentially updated.
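The sequential update described above can be sketched as a small accumulator class (the class name is our own); only the count, the sum, and the sum of squares are stored:

```python
import math

class RunningStd:
    """Accumulates count, sum and sum of squares so the standard
    deviation can be updated for each arriving measured value."""
    def __init__(self):
        self.n = 0
        self.s1 = 0.0  # sum of the values
        self.s2 = 0.0  # sum of the squared values

    def add(self, x):
        self.n += 1
        self.s1 += x
        self.s2 += x * x

    def std(self):
        n, s1, s2 = self.n, self.s1, self.s2
        return math.sqrt((n * s2 - s1 * s1) / (n * (n - 1)))

acc = RunningStd()
for x in [3, 4, 5, 6, 7]:
    acc.add(x)
print(round(acc.std(), 4))  # 1.5811
```

Note that for very large N with nearly equal values this formula can suffer from numerical cancellation; Welford's online algorithm is a more stable alternative in such cases.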
Web links
Wiktionary: Standard deviation - word origin, synonyms and translations