Probability Distributions

Menu location: Analysis_Distributions.

This section covers common statistical probability distributions. Robust, reliable algorithms have been employed to provide a high level of accuracy. For practical purposes, however, the P values given with hypothesis tests throughout StatsDirect are displayed to four decimal places (or the number you specify in the Options section of the Analysis menu).

PROBABILITY DISTRIBUTIONS

Probability is a concept that helps us predict the chance of something happening (an outcome) based upon knowledge of how this type of outcome behaves mathematically. In mathematical language, an outcome is described in terms of a random variable.

A random variable can take on different values that represent different outcomes, e.g. blood pressure readings. Blood pressure can be thought of in ever smaller units of measurement, where the steps between the units are so small that the scale becomes continuous; this is an example of a continuous random variable.

If a variable cannot take on an infinite number of sub-divisions of values then it is discrete. Discrete random variables take on distinct outcomes such as the number of times an asthmatic patient has been admitted to hospital.

Consider the pattern of values of an outcome measured many times in a population. If you plot all of the values of this outcome as a histogram then you are likely to see that the histogram takes on a similar shape each time you plot observations from a large random sample from the population. With a continuous random variable you can draw a curve around the histogram because it is possible to have values in between any that are measured. With a discrete variable, however, there is a pre-defined set of values that can be measured. If there are only a few discrete values in the population then your histogram will have wide bars with definite steps between them.

Now comes the all-important linking concept: probability distribution. The peaks in histogram plots show that some values occur more frequently than others. The most commonly occurring values are those that have the highest probability of being observed when you take a random sample from the population of interest.

def. A probability distribution of a random variable is a table, graph or mathematical expression giving the probabilities with which the random variable takes different values.
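
As an illustration of the "table" form of this definition, the sketch below builds a probability distribution for a hypothetical discrete variable: the total shown by two fair dice. The dice example and the names in the code are assumptions chosen for illustration and are not part of StatsDirect.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each total.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# The probability distribution as a table: value of x -> P(x).
table = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

# The most commonly occurring value has the highest probability.
most_likely = max(table, key=table.get)

for total, prob in table.items():
    print(f"{total:>2}  {prob}")
```

Exact fractions are used here so that the probabilities in the table can be checked by hand.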

Description of this concept in numbers involves more thought about populations and samples. Consider a graph of probability (P) plotted against the value of outcome (x):

A probability distribution would include all possible values for x.
The sum of P for all possible values of x is defined as 1.
For discrete variables this is literally a simple summation. For continuous variables the number of possible values of x is infinite, so we use integration to calculate the area under the curve; this area is 1 for the whole curve.
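
The two statements above can be checked numerically. The following is a minimal sketch, assuming a binomial variable (10 trials, success probability 0.3) for the discrete case and a standard normal variable for the continuous case; the parameter values and the simple trapezoid rule are illustrative choices, not the algorithms used by StatsDirect.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial variable with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal variable at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def trapezoid_area(f, a, b, steps=20000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Discrete case: a simple summation over every possible value of x.
discrete_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))

# Continuous case: numerical integration over a range wide enough
# that the area left out in the tails is negligible.
continuous_total = trapezoid_area(normal_pdf, -10.0, 10.0)

print(discrete_total, continuous_total)
```

Both totals should agree with 1 to within floating-point and integration error.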

Now consider one value of x:

You can use the probability distribution for x to estimate the chance of observing that x at random in the population.
For discrete distributions we calculate P for that value directly.
For continuous distributions the probability of any single exact value is vanishingly small, so we instead use a partial area under the curve of the probability density function, which represents the probability that x lies between two specified values.
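
A minimal sketch of the difference between the two cases follows. It assumes a Poisson variable with a mean of 1.5 hospital admissions for the discrete case, and a normal variable with mean 120 and standard deviation 15 for blood pressure in the continuous case; these figures are hypothetical and serve only to illustrate the calculation.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal density curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Discrete: the probability of exactly 2 admissions is calculated directly.
p_two_admissions = poisson_pmf(2, 1.5)

# Continuous: the probability that x lies between two specified values
# is the partial area under the curve between them.
p_between = normal_cdf(140, 120, 15) - normal_cdf(110, 120, 15)

# The area over an interval of zero width is zero.
p_exact = normal_cdf(120, 120, 15) - normal_cdf(120, 120, 15)

print(p_two_admissions, p_between, p_exact)
```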

Calculated probability (P) values are associated with statistical tests. A test statistic calculated in a statistical test is compared with its probability distribution. The P value derived from this comparison is then used to support the researcher's decision to accept or refute the test hypothesis with an accepted level of certainty.
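
As a sketch of how a test statistic is compared with its probability distribution, the example below converts a z statistic into a two-sided P value using the standard normal distribution. The choice of a z test, the value 1.96 and the function name are assumptions made for illustration; this is not necessarily how StatsDirect computes its P values.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided P value for a z statistic under the standard normal distribution.

    Equivalent to 2 * (1 - Phi(|z|)), written with the complementary
    error function to keep accuracy in the tails.
    """
    return math.erfc(abs(z) / math.sqrt(2.0))

p_value = two_sided_p_from_z(1.96)

# Display to four decimal places, matching the default described earlier.
print(f"P = {p_value:.4f}")
```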

P values and confidence intervals can give a false sense of security. P values say nothing about the assumptions of your test. Confidence intervals give a more realistic representation of a test result but they do NOT compensate for a test used with invalid assumptions. Please read the help text regarding assumptions when you are using any of the hypothesis tests in StatsDirect.

Discrete distributions: e.g. Binomial, Poisson

Continuous distributions: e.g. Normal, Chi-square, Student's t, F

If you need more information about probability and sampling theory then please consult one of the general texts listed in the reference section.