# Universal Agreement R

Menu location: Analysis_Agreement_Universal R.

This function calculates the Berry-Mielke universal R coefficient of agreement and/or effect size (Mielke and Berry, 2007). It is a generalisation of Cohen's kappa to interval and ordinal measurement scales, and it can handle more than two raters. With categorical data, R is equivalent to a linearly weighted kappa statistic.

R is chance-corrected and appropriate for the measurement of reliability. It is based on Euclidean distances in a multivariate framework, and its significance is tested using a Pearson Type III distribution (Berry and Mielke, 1988).
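As a sketch of how the chance correction works, the coefficient can be written in terms of the observed and expected values of the distance statistic delta. The symbols below are notation assumed here for illustration, not StatsDirect's own: $\delta$ is the observed (realised) delta, and $\mu_\delta$, $\sigma^2_\delta$ and $\gamma_\delta$ are the mean, variance and skewness of delta under chance.

$$
R = 1 - \frac{\delta}{\mu_\delta}, \qquad T = \frac{\delta - \mu_\delta}{\sigma_\delta}
$$

R equals 1 when all observers agree perfectly ($\delta = 0$) and is close to 0 when the observed distances are no smaller than chance would produce. In the usual Berry-Mielke approach, the standardised statistic $T$ is referred to a Pearson Type III distribution with skewness $\gamma_\delta$ to obtain the P value; this is presumably why the function reports the mean, variance and skewness of delta.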

In addition to multiple observers, this function can handle multiple aspects or dimensions of observation per observer.

If one of the observers represents a gold standard or reference then you can specify that observer - in this case a slightly different calculation is performed (Berry and Mielke, 1997b).

A calculator is also provided for testing the significance of the difference between two R values from two independent sets of raters (Berry and Mielke, 1997a).

Example

Test workbook (Agreement worksheet: Measurement, Rater, Dimension).

Five objects are measured by three observers, each assessing the height, weight and depth of the object:

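| Measurement | Object | Observer | Dimension |
|---|---|---|---|
| 8.0 | 1 | 1 | h |
| 10.5 | 2 | 1 | h |
| 17.6 | 3 | 1 | h |
| 9.0 | 4 | 1 | h |
| 14.6 | 5 | 1 | h |
| 9.2 | 1 | 1 | w |
| 2.5 | 2 | 1 | w |
| 4.5 | 3 | 1 | w |
| 12.0 | 4 | 1 | w |
| 6.0 | 5 | 1 | w |
| 6.0 | 1 | 1 | d |
| 11.0 | 2 | 1 | d |
| 13.0 | 3 | 1 | d |
| 14.2 | 4 | 1 | d |
| 7.5 | 5 | 1 | d |
| 8.2 | 1 | 2 | h |
| 11.2 | 2 | 2 | h |
| 20.0 | 3 | 2 | h |
| 9.0 | 4 | 2 | h |
| 14.2 | 5 | 2 | h |
| 9.0 | 1 | 2 | w |
| 3.0 | 2 | 2 | w |
| 4.5 | 3 | 2 | w |
| 12.5 | 4 | 2 | w |
| 6.0 | 5 | 2 | w |
| 6.5 | 1 | 2 | d |
| 11.5 | 2 | 2 | d |
| 15.0 | 3 | 2 | d |
| 14.0 | 4 | 2 | d |
| 8.0 | 5 | 2 | d |
| 8.2 | 1 | 3 | h |
| 9.5 | 2 | 3 | h |
| 21.4 | 3 | 3 | h |
| 9.5 | 4 | 3 | h |
| 14.5 | 5 | 3 | h |
| 9.0 | 1 | 3 | w |
| 2.8 | 2 | 3 | w |
| 4.5 | 3 | 3 | w |
| 13.5 | 4 | 3 | w |
| 5.5 | 5 | 3 | w |
| 6.5 | 1 | 3 | d |
| 12.5 | 2 | 3 | d |
| 17.0 | 3 | 3 | d |
| 14.4 | 4 | 3 | d |
| 9.2 | 5 | 3 | d |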

To analyse these data using StatsDirect you must first enter them into a workbook or open the test workbook. Then select Universal R from the Agreement section of the Analysis menu.

Universal agreement (Berry-Mielke) R

| Result | No reference standard | Observer 1 as reference |
|---|---|---|
| Measurement | Measurement (height, weight, depth) | Measurement (height, weight, depth) |
| Number of measurements | 45 | 45 |
| Number of observers | 3 | 3 |
| Number of objects | 5 | 5 |
| Number of dimensions | 3 | 3 |
| Reference standard | None | Observer 1 |
| Observed (realised) delta | 1.607036 | 3.432041 |
| Expected (mean) delta | 8.257518 | 16.218413 |
| Variance of delta | 1.166045 | 6.363892 |
| Skewness of delta | -0.777166 | -0.492177 |
| Agreement coefficient R | 0.805385 | 0.788386 |
| Significance | P < 0.0001 | P < 0.0001 |

The results indicate a chance-corrected agreement of R = 0.81 (81%) between observers, which is significantly greater than expected by chance (P < 0.0001). If the first observer is treated as the gold standard, the agreement is R = 0.79 (79%).
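The observed and expected delta values in this example can be reproduced outside StatsDirect. The following Python sketch is inferred from the output above, not taken from StatsDirect's implementation. It assumes that delta is the mean Euclidean distance between pairs of observers on the same object, and that expected delta is the same quantity averaged over every pairing of objects between the two observers. It also assumes that, with a reference standard, only reference-to-observer pairs are used and the distances are summed rather than averaged over observers. The names `DATA`, `observer_pairs` and `universal_r` are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Example data: observer -> one (height, weight, depth) tuple per object 1..5.
DATA = {
    1: [(8.0, 9.2, 6.0), (10.5, 2.5, 11.0), (17.6, 4.5, 13.0),
        (9.0, 12.0, 14.2), (14.6, 6.0, 7.5)],
    2: [(8.2, 9.0, 6.5), (11.2, 3.0, 11.5), (20.0, 4.5, 15.0),
        (9.0, 12.5, 14.0), (14.2, 6.0, 8.0)],
    3: [(8.2, 9.0, 6.5), (9.5, 2.8, 12.5), (21.4, 4.5, 17.0),
        (9.5, 13.5, 14.4), (14.5, 5.5, 9.2)],
}


def observer_pairs(data, reference=None):
    """All observer pairs, or only (reference, other) pairs if a reference is given."""
    observers = sorted(data)
    if reference is None:
        return list(combinations(observers, 2))
    return [(reference, o) for o in observers if o != reference]


def universal_r(data, reference=None):
    """Return (observed delta, expected delta, R) for the given observers."""
    pairs = observer_pairs(data, reference)
    n = len(next(iter(data.values())))  # number of objects
    # Assumption inferred from the reported output: distances are averaged over
    # observer pairs without a reference, and summed over them with a reference.
    # This scaling cancels in R, so it only affects the two delta values.
    scale = len(pairs) if reference is None else 1
    # Observed delta: distances between observers on the SAME object.
    observed = sum(
        dist(data[j][i], data[k][i]) for j, k in pairs for i in range(n)
    ) / (n * scale)
    # Expected delta: distances over ALL pairings of objects between observers.
    expected = sum(
        dist(a, b) for j, k in pairs for a in data[j] for b in data[k]
    ) / (n * n * scale)
    return observed, expected, 1 - observed / expected


if __name__ == "__main__":
    for ref in (None, 1):
        observed, expected, r = universal_r(DATA, reference=ref)
        print(f"reference={ref}: observed={observed:.6f} "
              f"expected={expected:.6f} R={r:.6f}")
```

Under these assumptions the sketch should agree with the reported R values of about 0.805 (no reference) and 0.788 (observer 1 as reference). It does not compute the variance or skewness of delta, so it gives no P value.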

Comparison example

Suppose two independent groups of observers were assessed using the R coefficient above, giving the following results: R(1) = 0.11578; R(2) = 0.19780; mean delta(1) = 1.27050; mean delta(2) = 1.60240; variance delta(1) = 0.4678E-03; variance delta(2) = 0.1010E-02; skewness delta(1) = -0.34145; skewness delta(2) = -0.28425.

Select menu item Analysis_Agreement_Compare two universal R values...

Comparison of two universal agreement R statistics

| | Group 1 | Group 2 | Difference |
|---|---|---|---|
| R | 0.11578 | 0.1978 | -0.08202 |
| Mean delta | 1.2705 | 1.6024 | -0.3319 |
| Variance delta | 0.000468 | 0.00101 | 0.000683 |
| Skewness delta | -0.34145 | -0.28425 | -0.029847 |
| Significance | P < 0.0001 | P < 0.0001 | P = 0.002 |

The results indicate a statistically significant difference between the two groups of observers' ratings of the same set of objects.