Menu location: Analysis_Nonparametric_MannWhitney.
This is a method for the comparison of two independent random samples (x and y).
The Mann-Whitney U statistic is calculated from the ranks Ri of the pooled observations, where the two samples have sizes n1 and n2.
U can be resolved as the number of times observations in one sample precede observations in the other sample in the ranking.
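The counting interpretation of U can be sketched in Python. This is a minimal illustration, not StatsDirect's implementation. The function names (midranks, u_from_pairs, u_from_ranks) are hypothetical, tied pairs are assumed to count one half, and the rank-sum form U = R1 - n1(n1+1)/2 (R1 being the sum of the pooled ranks of sample x) is a common textbook convention. Other conventions report n1*n2 - U instead.

```python
from itertools import product


def midranks(values):
    """Rank the values from 1 upwards, giving tied values the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def u_from_pairs(x, y):
    """Count the (x, y) pairs in which x exceeds y; a tied pair counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a, b in product(x, y))


def u_from_ranks(x, y):
    """Same quantity from the rank sum of x in the pooled sample (assumed form)."""
    n1 = len(x)
    ranks = midranks(list(x) + list(y))
    return sum(ranks[:n1]) - n1 * (n1 + 1) / 2


print(u_from_pairs([1, 3, 5], [2, 4]), u_from_ranks([1, 3, 5], [2, 4]))
```

Both functions return the same value, which is why U can be read either as a rank-sum statistic or as a count of precedences between the two samples.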
Wilcoxon rank sum, Kendall's S and the Mann-Whitney U test are exactly equivalent tests. In the presence of ties the Mann-Whitney test is also equivalent to a chi-square test for trend.
In most circumstances a two sided test is required; here the alternative hypothesis is that x values tend to be distributed differently to y values. For a lower side test the alternative hypothesis is that x values tend to be smaller than y values. For an upper side test the alternative hypothesis is that x values tend to be larger than y values.
Assumptions of the Mann-Whitney test:
random samples from populations
independence within samples and mutual independence between samples
measurement scale is at least ordinal
A confidence interval for the difference between two measures of location is provided with the sample medians. The assumptions of this method are slightly different from the assumptions of the Mann-Whitney test:
random samples from populations
independence within samples and mutual independence between samples
two population distribution functions are identical apart from a possible difference in location parameters
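Under the location-shift assumption, a common nonparametric estimate of the difference in location is the median of all pairwise differences x - y (the Hodges-Lehmann estimate), with interval limits read from the ordered differences. The sketch below assumes that approach; it is not confirmed to be StatsDirect's exact procedure. The function names are hypothetical, and so is the parameter k, which in an exact method would be taken from the lower tail of the null distribution of U.

```python
from statistics import median


def pairwise_differences(x, y):
    """All n1*n2 differences x_i - y_j, sorted ascending."""
    return sorted(a - b for a in x for b in y)


def location_shift_estimate(x, y):
    """Median of the pairwise differences (Hodges-Lehmann estimate)."""
    return median(pairwise_differences(x, y))


def location_shift_interval(x, y, k):
    """Interval from the k-th smallest to the k-th largest difference.

    k is a hypothetical input here: a smaller k gives a wider interval
    and therefore a higher confidence level.
    """
    d = pairwise_differences(x, y)
    if not 1 <= k <= (len(d) + 1) // 2:
        raise ValueError("k must lie between 1 and half the number of differences")
    return d[k - 1], d[len(d) - k]


x = [1, 3, 5]
y = [2, 4]
print(location_shift_estimate(x, y))      # 0.0
print(location_shift_interval(x, y, 1))   # (-3, 3)
print(location_shift_interval(x, y, 2))   # (-1, 1)
```

Because k can only take whole-number values, only a discrete set of confidence levels is attainable, which is consistent with reported levels such as 95.1% rather than exactly 95%.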
Technical Validation
StatsDirect uses the sampling distribution of U to give exact probabilities. These calculations may take an appreciable time to complete when many data are tied.
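As an illustration of what an exact sampling distribution of U involves, the sketch below counts the arrangements of the pooled ranks that give each value of U, using a standard recurrence. It assumes there are no ties (so U is a whole number) and it is not StatsDirect's algorithm, which also handles tied data. The function names are hypothetical, and doubling the smaller tail is only one of several conventions for a two sided P value.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_arrangements(n1, n2, u):
    """Number of orderings of n1 x's and n2 y's (no ties) with U equal to u.

    U is the number of (x, y) pairs in which x exceeds y. The largest pooled
    value is either an x (adding n2 to U) or a y (adding nothing).
    """
    if u < 0:
        return 0
    if n1 == 0 or n2 == 0:
        return 1 if u == 0 else 0
    return (count_arrangements(n1 - 1, n2, u - n2)
            + count_arrangements(n1, n2 - 1, u))


def exact_p_values(u, n1, n2):
    """Lower side, upper side and two sided exact P values for an integer U."""
    total = comb(n1 + n2, n1)
    lower = sum(count_arrangements(n1, n2, k) for k in range(0, u + 1)) / total
    upper = sum(count_arrangements(n1, n2, k)
                for k in range(u, n1 * n2 + 1)) / total
    two_sided = min(1.0, 2 * min(lower, upper))
    return lower, upper, two_sided


# With two observations per sample there are 6 equally likely orderings.
print(exact_p_values(0, 2, 2))
```

With U defined as the number of pairs in which x exceeds y, the lower tail corresponds to the alternative that x values tend to be smaller than y values, and the upper tail to the alternative that they tend to be larger.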
Confidence intervals are constructed for the difference between the means or medians (any measure of location in fact). The level of confidence used will be as close as is theoretically possible to the one you specify. StatsDirect approaches the selected confidence level from the conservative side.
When samples are large (either sample > 80 or both samples > 30) a normal approximation is used for the hypothesis test and for the confidence interval. StatsDirect uses more accurate P value calculations than some other statistical software, so you may notice a difference in results (Conover, 1999; Dineen and Blakesley, 1973; Harding, 1983; Neumann, 1988).
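A normal approximation of the kind mentioned above can be sketched as follows. This uses the usual textbook mean and tie-corrected variance of U, with an optional continuity correction. Whether StatsDirect applies the same corrections is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def normal_approx_two_sided_p(x, y, continuity=True):
    """Two sided P value for U from a normal approximation (textbook form)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U: number of (x, y) pairs with x greater than y; a tied pair counts 0.5.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2
    # Tie correction: sum of t^3 - t over groups of t tied pooled values.
    tie_term = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0  # every pooled value is identical, so there is no evidence
    deviation = abs(u - mean_u)
    if continuity:
        deviation = max(0.0, deviation - 0.5)
    z = deviation / sqrt(var_u)
    # 2 * (1 - Phi(z)) written with the complementary error function.
    return erfc(z / sqrt(2))


print(normal_approx_two_sided_p([1, 2, 3], [1, 2, 3]))  # 1.0
```

The approximation is usually considered adequate only for larger samples, which is consistent with the thresholds stated above; for small samples the exact distribution is preferable.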
Example
From Conover (1999, p. 218).
Test workbook (Nonparametric worksheet: Farm Boys, Town Boys).
The following data represent fitness scores from two groups of boys of the same age, those from homes in the town and those from farm homes.
Farm Boys    Town Boys
14.8         12.7
7.3          14.2
5.6          12.6
6.3          2.1
9.0          17.7
4.2          11.8
10.6         16.9
12.5         7.9
12.9         16.0
16.1         10.6
11.4         5.6
2.7          5.6
             7.6
             11.3
             8.3
             6.7
             3.6
             1.0
             2.4
             6.4
             9.1
             6.7
             18.6
             3.2
             6.2
             6.1
             15.3
             10.6
             1.8
             5.9
             9.9
             10.6
             14.8
             5.0
             2.6
             4.0
To analyse these data in StatsDirect you must first enter them in two separate workbook columns. Alternatively, open the test workbook using the file open function of the file menu. Then select Mann-Whitney from the Nonparametric section of the analysis menu. Select the columns marked "Farm Boys" and "Town Boys" when prompted for data.
For this example:
estimated median difference = 0.8
two sided P = 0.529
95.1% confidence interval for difference between population means or medians = -2.3 to 4.4
Here we have assumed that these groups are independent and that they represent at least hypothetical random samples of the subpopulations they represent. In this analysis, we are clearly unable to reject the null hypothesis that one group does NOT tend to yield different fitness scores to the other. This lack of statistical evidence of a difference is reflected in the confidence interval for the difference between population means, in that the interval spans zero. Note that the quoted 95.1% confidence interval is as close as you can get to 95% because of the very nature of the mathematics involved in nonparametric methods like this.
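The point estimate for this example can be cross-checked outside StatsDirect with a short Python sketch. It assumes that the estimated median difference is the median of all pairwise differences (farm minus town) and that U counts the pairs in which a farm score exceeds a town score, with tied pairs counting one half. Both are assumptions about conventions, and the variable names are hypothetical.

```python
from statistics import median

# Fitness scores from the example (12 farm boys, 36 town boys).
farm = [14.8, 7.3, 5.6, 6.3, 9.0, 4.2, 10.6, 12.5, 12.9, 16.1, 11.4, 2.7]
town = [12.7, 14.2, 12.6, 2.1, 17.7, 11.8, 16.9, 7.9, 16.0, 10.6, 5.6, 5.6,
        7.6, 11.3, 8.3, 6.7, 3.6, 1.0, 2.4, 6.4, 9.1, 6.7, 18.6, 3.2,
        6.2, 6.1, 15.3, 10.6, 1.8, 5.9, 9.9, 10.6, 14.8, 5.0, 2.6, 4.0]

# U for the farm sample: pairs where farm > town, ties counted as 0.5.
u_farm = sum(1.0 if f > t else 0.5 if f == t else 0.0
             for f in farm for t in town)

# Median of the 12 * 36 = 432 pairwise differences.
median_difference = median(f - t for f in farm for t in town)

print(u_farm)                       # 243.0
print(round(median_difference, 1))  # 0.8
```

U = 243 lies close to its null expectation of n1*n2/2 = 216, which agrees with the large two sided P value reported above.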