
Mann-Whitney U test

 

Menu location: Analysis_Non-parametric_Mann-Whitney.

 

This is a method for the comparison of two independent random samples (x and y):

The Mann Whitney U statistic is defined as:

[Formula image: image\STAT0166_wmf.gif]

- where the two samples, of sizes n1 and n2, are pooled and Ri are the ranks of the observations in the pooled sample.
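The formula image is not available in this text. As a hedged reconstruction only, one standard textbook form that matches the description (pooled ranks R_i, sample sizes n_1 and n_2) is shown below. It is an assumption that the ranks summed are those of the first sample; the original image may sum the ranks of the other sample instead, which gives the complementary statistic n_1 n_2 - U.

```latex
U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - \sum_{i=1}^{n_1} R_i
```

Here R_1, ..., R_{n_1} are taken to be the ranks of the first sample's observations within the pooled ranking.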

 

U can be resolved as the number of times observations in one sample precede observations in the other sample in the ranking.
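The counting interpretation can be checked against a rank-sum formula with a short sketch. This is illustrative Python, not StatsDirect code. The function names are hypothetical, ties are assumed to count as one half, and the rank-sum form used is one common textbook convention that may differ from the formula on this page.

```python
def u_by_counting(x, y):
    """Count pairs (xi, yj) where xi precedes yj; a tied pair counts one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def u_by_ranks(x, y):
    """U from the rank sum of x in the pooled sample (assumed convention)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    return n1 * n2 + n1 * (n1 + 1) / 2 - rank_sum_x


x = [1, 2]
y = [2, 3]
print(u_by_counting(x, y), u_by_ranks(x, y))  # both give 3.5
```

Both routes give the same value, which is why U can be read either as a rank-sum statistic or as a count of precedences.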

 

Wilcoxon rank sum, Kendall's S and the Mann-Whitney U test are exactly equivalent tests. In the presence of ties the Mann-Whitney test is also equivalent to a chi-square test for trend.

 

In most circumstances a two sided test is required; here the alternative hypothesis is that x values tend to be distributed differently to y values. For a lower side test the alternative hypothesis is that x values tend to be smaller than y values. For an upper side test the alternative hypothesis is that x values tend to be larger than y values.

 

Assumptions of the Mann-Whitney test:

 

A confidence interval for the difference between two measures of location is provided with the sample medians. The assumptions of this method are slightly different from the assumptions of the Mann-Whitney test:

 

Technical Validation

StatsDirect uses the sampling distribution of U to give exact probabilities. These calculations may take an appreciable time to complete when many data are tied.
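To give a feel for what an exact sampling distribution of U involves, the sketch below enumerates it with a standard counting recurrence. It is not StatsDirect's algorithm. It assumes no ties, takes U as the number of pairs in which an x value precedes a y value, and uses hypothetical function names.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_arrangements(n1, n2, u):
    """Number of orderings of n1 x's and n2 y's (no ties) that give statistic u."""
    if u < 0 or u > n1 * n2:
        return 0
    if n1 == 0 or n2 == 0:
        return 1  # only u == 0 can reach this point
    # The largest pooled value is either an x (adds no pairs with x < y)
    # or a y (adds n1 pairs, one for each x below it).
    return (count_arrangements(n1 - 1, n2, u)
            + count_arrangements(n1, n2 - 1, u - n1))


def exact_tail_probabilities(u_obs, n1, n2):
    """Return (P(U <= u_obs), P(U >= u_obs), two-sided P) under the null hypothesis."""
    total = comb(n1 + n2, n1)  # all equally likely orderings
    lower = sum(count_arrangements(n1, n2, u) for u in range(0, u_obs + 1)) / total
    upper = sum(count_arrangements(n1, n2, u) for u in range(u_obs, n1 * n2 + 1)) / total
    two_sided = min(1.0, 2.0 * min(lower, upper))
    return lower, upper, two_sided


print(exact_tail_probabilities(0, 2, 2))
```

Which tail corresponds to "x tends to be smaller than y" depends on how U is defined; with the counting convention assumed here, large U means x values tend to precede y values. Handling ties exactly requires enumerating the permutation distribution of the midranks, which is more expensive and is not attempted in this sketch.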

 

Confidence intervals are constructed for the difference between the means or medians (any measure of location in fact). The level of confidence used will be as close as is theoretically possible to the one you specify. StatsDirect approaches the selected confidence level from the conservative side.
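The idea of approaching the requested confidence level from the conservative side can be sketched as follows. The code assumes the usual shift-model construction: the point estimate is the median of all pairwise differences x - y, and the limits are order statistics of those differences chosen from the exact null distribution of U. Whether StatsDirect does exactly this is an assumption. The function names are hypothetical and ties are ignored when computing the distribution.

```python
from functools import lru_cache
from math import comb
from statistics import median


@lru_cache(maxsize=None)
def _count(n1, n2, u):
    """Orderings of n1 x's and n2 y's (no ties) that give U == u."""
    if u < 0 or u > n1 * n2:
        return 0
    if n1 == 0 or n2 == 0:
        return 1
    return _count(n1 - 1, n2, u) + _count(n1, n2 - 1, u - n1)


def shift_estimate_and_interval(x, y, confidence=0.95):
    """Return (estimate, lower, upper, achieved confidence) for the shift x - y."""
    n1, n2 = len(x), len(y)
    diffs = sorted(xi - yj for xi in x for yj in y)
    m = n1 * n2
    total = comb(n1 + n2, n1)
    limit = (1.0 - confidence) / 2.0 * total  # allowed count in each tail
    k, tail, cumulative = 0, 0, 0
    for u in range(m + 1):
        cumulative += _count(n1, n2, u)
        if cumulative > limit:
            break
        # P(U <= u) is still within alpha/2, so the k-th difference is usable.
        k, tail = u + 1, cumulative
    if k == 0:
        raise ValueError("samples too small for this confidence level")
    achieved = 1.0 - 2.0 * tail / total  # never below the requested level
    return median(diffs), diffs[k - 1], diffs[m - k], achieved


print(shift_estimate_and_interval([10, 11, 12, 13], [1, 2, 3, 4]))
```

Because U takes only discrete values, the achieved level is the smallest attainable level that is at least the requested one, which is why a requested 95% can be reported as a slightly higher figure.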

 

When samples are large (either sample > 80 or both samples > 30), a normal approximation is used for the hypothesis test and for the confidence interval. Note that StatsDirect uses more accurate P value calculations than some other statistical software, so you may notice a difference in results (Conover, 1999; Dineen and Blakesley, 1973; Harding, 1983; Neumann, 1988).
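A minimal sketch of a large-sample normal approximation is given below. It assumes the standard tie-corrected variance of U and applies no continuity correction; StatsDirect's own calculation may differ in these details. The function name is hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def mann_whitney_normal_approx(x, y):
    """Return (U, z, two-sided P) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U counts pairs with xi < yj; a tied pair contributes one half.
    u = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    # Each group of t tied values in the pooled sample reduces the variance.
    tie_term = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2.0) / sqrt(variance)
    p_two_sided = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, z, p_two_sided


print(mann_whitney_normal_approx([1, 3, 5], [2, 4, 6]))
```

The sketch does not guard against the degenerate case where every pooled value is tied, in which the variance is zero.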

 

Example

From Conover (1999, p. 218).

Test workbook (Nonparametric worksheet: Farm Boys, Town Boys).

 

The following data represent fitness scores from two groups of boys of the same age, those from homes in the town and those from farm homes.

 

Farm Boys    Town Boys
14.8         12.7
7.3          14.2
5.6          12.6
6.3          2.1
9.0          17.7
4.2          11.8
10.6         16.9
12.5         7.9
12.9         16.0
16.1         10.6
11.4         5.6
2.7          5.6
             7.6
             11.3
             8.3
             6.7
             3.6
             1.0
             2.4
             6.4
             9.1
             6.7
             18.6
             3.2
             6.2
             6.1
             15.3
             10.6
             1.8
             5.9
             9.9
             10.6
             14.8
             5.0
             2.6
             4.0

 

To analyse these data in StatsDirect you must first enter them in two separate workbook columns. Alternatively, open the test workbook using the file open function of the file menu. Then select Mann-Whitney from the Non-parametric section of the analysis menu. Select the columns marked "Farm Boys" and "Town Boys" when prompted for data.
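For readers who want to check the data outside StatsDirect, the two columns can be entered as plain Python lists. This sketch assumes the values listed above split into 12 farm scores and 36 town scores. The helper name is hypothetical, and the code only counts precedences; it does not reproduce StatsDirect's P value or confidence interval.

```python
farm_boys = [14.8, 7.3, 5.6, 6.3, 9.0, 4.2, 10.6, 12.5, 12.9, 16.1, 11.4, 2.7]
town_boys = [12.7, 14.2, 12.6, 2.1, 17.7, 11.8, 16.9, 7.9, 16.0, 10.6, 5.6, 5.6,
             7.6, 11.3, 8.3, 6.7, 3.6, 1.0, 2.4, 6.4, 9.1, 6.7, 18.6, 3.2,
             6.2, 6.1, 15.3, 10.6, 1.8, 5.9, 9.9, 10.6, 14.8, 5.0, 2.6, 4.0]


def count_precedences(a, b):
    """Number of pairs where a value from a precedes a value from b (ties count 0.5)."""
    return sum(1.0 if ai < bj else 0.5 if ai == bj else 0.0
               for ai in a for bj in b)


u_farm_first = count_precedences(farm_boys, town_boys)
u_town_first = count_precedences(town_boys, farm_boys)

print(len(farm_boys), len(town_boys))
# The two counts always add up to n1 * n2, here 12 * 36 = 432.
print(u_farm_first, u_town_first, u_farm_first + u_town_first)
```

The smaller of the two counts is the value usually compared against tables of U.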

 

For this example:

 

estimated median difference = 0.8

two sided P = 0.529

95.1% confidence interval for difference between population means or medians = -2.3 to 4.4

 

Here we have assumed that these groups are independent and that they represent at least hypothetical random samples of the sub-populations they represent. In this analysis, we are clearly unable to reject the null hypothesis that one group does NOT tend to yield different fitness scores to the other. This lack of statistical evidence of a difference is reflected in the confidence interval for the difference between population means, in that the interval spans zero. Note that the quoted 95.1% confidence interval is as close as you can get to 95% because of the very nature of the mathematics involved in non-parametric methods like this.

 
