Universal Agreement R
Menu location: Analysis_Agreement_Universal R.
The function calculates the Berry-Mielke Universal R coefficient of agreement and/or effect size (Mielke and Berry, 2007). It is a generalisation of Cohen's kappa to interval and ordinal measurement scales, and it can handle more than two raters. With categorical data R is equivalent to a linearly weighted kappa statistic.
R is chance-corrected and appropriate for the measurement of reliability. It is based on Euclidean distances in a multivariate framework, and its significance is tested using a Pearson type III distribution (Berry and Mielke, 1988).
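As a rough illustration of the chance correction, R can be written as one minus the ratio of the observed delta (the average distance between observers) to the delta expected by chance, and the significance test works on the standardised delta. The sketch below is illustrative only and is not StatsDirect's own code: the function names are hypothetical, and the inputs are the values reported in the example further down this page. It does not evaluate the Pearson type III probability itself.

```python
import math

def agreement_r(observed_delta, expected_delta):
    """Chance-corrected agreement: 1 minus observed/expected delta."""
    return 1.0 - observed_delta / expected_delta

def standardised_delta(observed_delta, expected_delta, variance_delta):
    """Standardised delta; large negative values suggest agreement beyond chance."""
    return (observed_delta - expected_delta) / math.sqrt(variance_delta)

# Values taken from the example output on this page (no reference standard).
r = agreement_r(1.607036, 8.257518)
t = standardised_delta(1.607036, 8.257518, 1.166045)
print(f"R = {r:.6f}, standardised delta = {t:.3f}")
```

R equals 1 when the observers agree exactly (observed delta of zero) and is close to 0 when the observed delta is no smaller than chance would give.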
In addition to multiple observers, this function can handle multiple aspects or dimensions of observation per observer.
If one of the observers represents a gold standard or reference, you can specify that observer; a slightly different calculation is then performed (Berry and Mielke, 1997b).
A calculator is also provided for testing the significance of the difference between two R values from two independent sets of raters (Berry and Mielke, 1997a).
Example
From Berry and Mielke (2007).
Test workbook (Agreement worksheet: Measurement, Rater, Dimension).
Five objects are measured by three observers, each assessing the height, weight and depth of the object:
Measurement | Object | Observer | Dimension |
8.0 | 1 | 1 | h |
10.5 | 2 | 1 | h |
17.6 | 3 | 1 | h |
9.0 | 4 | 1 | h |
14.6 | 5 | 1 | h |
9.2 | 1 | 1 | w |
2.5 | 2 | 1 | w |
4.5 | 3 | 1 | w |
12.0 | 4 | 1 | w |
6.0 | 5 | 1 | w |
6.0 | 1 | 1 | d |
11.0 | 2 | 1 | d |
13.0 | 3 | 1 | d |
14.2 | 4 | 1 | d |
7.5 | 5 | 1 | d |
8.2 | 1 | 2 | h |
11.2 | 2 | 2 | h |
20.0 | 3 | 2 | h |
9.0 | 4 | 2 | h |
14.2 | 5 | 2 | h |
9.0 | 1 | 2 | w |
3.0 | 2 | 2 | w |
4.5 | 3 | 2 | w |
12.5 | 4 | 2 | w |
6.0 | 5 | 2 | w |
6.5 | 1 | 2 | d |
11.5 | 2 | 2 | d |
15.0 | 3 | 2 | d |
14.0 | 4 | 2 | d |
8.0 | 5 | 2 | d |
8.2 | 1 | 3 | h |
9.5 | 2 | 3 | h |
21.4 | 3 | 3 | h |
9.5 | 4 | 3 | h |
14.5 | 5 | 3 | h |
9.0 | 1 | 3 | w |
2.8 | 2 | 3 | w |
4.5 | 3 | 3 | w |
13.5 | 4 | 3 | w |
5.5 | 5 | 3 | w |
6.5 | 1 | 3 | d |
12.5 | 2 | 3 | d |
17.0 | 3 | 3 | d |
14.4 | 4 | 3 | d |
9.2 | 5 | 3 | d |
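If you want to work with data in this long layout outside StatsDirect, each row can be grouped into one vector per observer and object. The following is a minimal sketch using only the first object for observers 1 and 2; the function name `to_vectors` and the nested-dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict

# A few long-format rows: (measurement, object, observer, dimension).
rows = [
    (8.0, 1, 1, "h"), (9.2, 1, 1, "w"), (6.0, 1, 1, "d"),
    (8.2, 1, 2, "h"), (9.0, 1, 2, "w"), (6.5, 1, 2, "d"),
]

def to_vectors(rows, dimensions=("h", "w", "d")):
    """Group long-format rows into vectors[observer][object] = (h, w, d)."""
    cells = defaultdict(dict)
    for value, obj, observer, dim in rows:
        cells[(observer, obj)][dim] = value
    vectors = defaultdict(dict)
    for (observer, obj), by_dim in cells.items():
        # Order the values by the requested dimension order.
        vectors[observer][obj] = tuple(by_dim[d] for d in dimensions)
    return {observer: dict(objs) for observer, objs in vectors.items()}

vectors = to_vectors(rows)
print(vectors[1][1])  # (8.0, 9.2, 6.0)
```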
To analyse these data using StatsDirect you must first enter them into a workbook or open the test workbook. Then select Universal R from the Agreement section of the Analysis menu.
Universal agreement (Berry-Mielke) R
Measurement: | Measurement (height, weight, depth) | Measurement (height, weight, depth) |
Number of measurements: | 45 | 45 |
Number of observers: | 3 | 3 |
Number of objects: | 5 | 5 |
Number of dimensions: | 3 | 3 |
Reference standard: | None | Observer 1 |
Observed (realised) delta: | 1.607036 | 3.432041 |
Expected (mean) delta: | 8.257518 | 16.218413 |
Variance of delta: | 1.166045 | 6.363892 |
Skewness of delta: | -0.777166 | -0.492177 |
Agreement coefficient R: | 0.805385 | 0.788386 |
Significance: | P < 0.0001 | P < 0.0001 |
The results indicate 81% agreement between observers (R = 0.81), which is beyond what would be expected by chance (P < 0.0001). If the first observer is treated as the gold standard, the agreement is 79% (R = 0.79).
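The observed delta, expected delta and R in the output above can be checked by hand from the Euclidean distances between observers' (height, weight, depth) vectors. The sketch below is one way to do this and is not StatsDirect's own code: the function name `universal_r` and the data layout are hypothetical, and the scaling (mean over observer pairs without a reference, sum over the non-reference observers with one) is an assumption chosen to match the delta values shown above. The scaling cancels in R. The variance, skewness and P value are not computed.

```python
from itertools import combinations
from math import dist  # Euclidean distance (Python 3.8+)

# data[observer] = one (height, weight, depth) vector per object, objects 1..5
data = {
    1: [(8.0, 9.2, 6.0), (10.5, 2.5, 11.0), (17.6, 4.5, 13.0), (9.0, 12.0, 14.2), (14.6, 6.0, 7.5)],
    2: [(8.2, 9.0, 6.5), (11.2, 3.0, 11.5), (20.0, 4.5, 15.0), (9.0, 12.5, 14.0), (14.2, 6.0, 8.0)],
    3: [(8.2, 9.0, 6.5), (9.5, 2.8, 12.5), (21.4, 4.5, 17.0), (9.5, 13.5, 14.4), (14.5, 5.5, 9.2)],
}

def universal_r(data, reference=None):
    """Return (observed delta, expected delta, R).

    Without a reference, distances are averaged over all pairs of observers.
    With a reference, distances from the reference to each other observer are summed.
    """
    observers = sorted(data)
    if reference is None:
        pairs = list(combinations(observers, 2))
        weight = 1.0 / len(pairs)
    else:
        pairs = [(reference, o) for o in observers if o != reference]
        weight = 1.0
    n = len(data[observers[0]])  # number of objects
    observed = expected = 0.0
    for a, b in pairs:
        # Observed: the two observers' vectors for the same object.
        observed += weight * sum(dist(x, y) for x, y in zip(data[a], data[b])) / n
        # Chance: every object from one observer against every object from the other.
        expected += weight * sum(dist(x, y) for x in data[a] for y in data[b]) / n**2
    return observed, expected, 1.0 - observed / expected

results = {ref: universal_r(data, reference=ref) for ref in (None, 1)}
for ref, (observed, expected, r) in results.items():
    print(f"reference={ref}: delta={observed:.6f}, expected={expected:.6f}, R={r:.6f}")
```

The expected delta here is the average of delta over all possible re-pairings of objects between observers, which is why every object of one observer is compared with every object of the other.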
Comparison example
Suppose two independent groups of observers were assessed using the R coefficient as above, giving the following results: R(1) = 0.11578; R(2) = 0.19780; mean delta(1) = 1.27050; mean delta(2) = 1.60240; variance delta(1) = 0.4678E-03; variance delta(2) = 0.1010E-02; skewness delta(1) = -0.34145; skewness delta(2) = -0.28425.
Select menu item Analysis_Agreement_Compare two universal R values...
Comparison of two universal agreement R statistics
| Group 1 | Group 2 | Difference |
R: | 0.11578 | 0.1978 | -0.08202 |
Mean delta: | 1.2705 | 1.6024 | -0.3319 |
Variance delta: | 0.000468 | 0.00101 | 0.000683 |
Skewness delta: | -0.34145 | -0.28425 | -0.029847 |
Significance: | P < 0.0001 | P < 0.0001 | P = 0.002 |
The results indicate a statistically significant difference between the two groups of observers' ratings of the same set of objects.
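The Difference column can be reproduced from the two groups' delta moments. Judging from the arithmetic (this is an inference, not something stated in the source), the variance and skewness shown in that column are those of R(1) - R(2), not differences between the delta variances and skewnesses. The sketch below assumes independent groups and uses a hypothetical function name; it stops at the standardised difference and does not evaluate the P value, which the program presumably obtains from a Pearson type III distribution as for a single R.

```python
import math

def compare_r(r1, mean1, var1, skew1, r2, mean2, var2, skew2):
    """Moments of R1 - R2 for two independent groups, from each group's delta moments."""
    # Because R = 1 - delta / mean_delta:
    #   var(R) = var(delta) / mean_delta**2 and skew(R) = -skew(delta).
    var_r1, var_r2 = var1 / mean1**2, var2 / mean2**2
    diff = r1 - r2
    var_diff = var_r1 + var_r2  # variances add for independent groups
    # Third central moments of independent terms subtract for a difference.
    third = (-skew1) * var_r1**1.5 - (-skew2) * var_r2**1.5
    skew_diff = third / var_diff**1.5
    return diff, var_diff, skew_diff, diff / math.sqrt(var_diff)

diff, var_diff, skew_diff, z = compare_r(
    0.11578, 1.27050, 0.4678e-03, -0.34145,
    0.19780, 1.60240, 0.1010e-02, -0.28425,
)
print(f"difference={diff:.5f}, variance={var_diff:.6f}, "
      f"skewness={skew_diff:.6f}, standardised={z:.3f}")
```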