Universal Agreement R

Menu location: Analysis_Agreement_Universal R.

The function calculates the Berry-Mielke Universal R coefficient of agreement and/or effect size (Mielke and Berry, 2007). It is a generalisation of Cohen's kappa to interval and ordinal measurement scales, and it can handle more than two raters. With categorical data R is equivalent to a linearly weighted kappa statistic.

R is chance-corrected and appropriate for the measurement of reliability. It is based on Euclidean distances in a multivariate framework, and its significance is tested using a Pearson Type III distribution (Berry and Mielke, 1988).

In addition to multiple observers, this function can handle multiple aspects or dimensions of observation per observer.

If one of the observers represents a gold standard or reference, you can specify that observer; in this case a slightly different calculation is performed (Berry and Mielke, 1997b).

A calculator is also provided for testing the significance of the difference between two R values from two independent sets of raters (Berry and Mielke, 1997a).

Example

From Berry and Mielke (2007).

Test workbook (Agreement worksheet: Measurement, Rater, Dimension).

Five objects are measured by three observers, each assessing the height, weight and depth of each object:

Measurement	Object	Observer	Dimension
8.0	1	1	h
10.5	2	1	h
17.6	3	1	h
9.0	4	1	h
14.6	5	1	h
9.2	1	1	w
2.5	2	1	w
4.5	3	1	w
12.0	4	1	w
6.0	5	1	w
6.0	1	1	d
11.0	2	1	d
13.0	3	1	d
14.2	4	1	d
7.5	5	1	d
8.2	1	2	h
11.2	2	2	h
20.0	3	2	h
9.0	4	2	h
14.2	5	2	h
9.0	1	2	w
3.0	2	2	w
4.5	3	2	w
12.5	4	2	w
6.0	5	2	w
6.5	1	2	d
11.5	2	2	d
15.0	3	2	d
14.0	4	2	d
8.0	5	2	d
8.2	1	3	h
9.5	2	3	h
21.4	3	3	h
9.5	4	3	h
14.5	5	3	h
9.0	1	3	w
2.8	2	3	w
4.5	3	3	w
13.5	4	3	w
5.5	5	3	w
6.5	1	3	d
12.5	2	3	d
17.0	3	3	d
14.4	4	3	d
9.2	5	3	d

To analyse these data using StatsDirect you must first enter them into a workbook or open the test workbook. Then select Universal R from the Agreement section of the Analysis menu.

Universal agreement (Berry-Mielke) R

Measurement:	Measurement (height, weight, depth)	Measurement (height, weight, depth)
Number of measurements:	45	45
Number of observers:	3	3
Number of objects:	5	5
Number of dimensions:	3	3
Reference standard:	None	Observer 1

Observed (realised) delta:	1.607036	3.432041
Expected (mean) delta:	8.257518	16.218413
Variance of delta:	1.166045	6.363892
Skewness of delta:	-0.777166	-0.492177
Agreement coefficient R:	0.805385	0.788386
Significance:	P < 0.0001	P < 0.0001

The results indicate 81% chance-corrected agreement between observers (R = 0.81), which is significantly greater than expected by chance (P < 0.0001). If the first observer is considered the gold standard then the agreement is 79%.
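The delta and R values in the output above can be reproduced by hand, which may help to show what the coefficient measures. The following is a minimal Python sketch, not StatsDirect's own code: the function name `universal_r` and the data layout are illustrative assumptions. It takes observed delta as the mean, over objects, of the Euclidean distance between observers rating the same object; expected delta as the same quantity averaged over all pairings of objects; and R as 1 - observed/expected. The scaling used when a reference observer is given (a sum rather than a mean over observer pairs) is inferred from the output above rather than from documented internals, and it cancels in R in any case. The variance, skewness and P value need the permutation moments of delta and are not attempted here.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)

# ratings[observer][object] = (h, w, d), transcribed from the worked example
ratings = [
    [(8.0, 9.2, 6.0), (10.5, 2.5, 11.0), (17.6, 4.5, 13.0),
     (9.0, 12.0, 14.2), (14.6, 6.0, 7.5)],
    [(8.2, 9.0, 6.5), (11.2, 3.0, 11.5), (20.0, 4.5, 15.0),
     (9.0, 12.5, 14.0), (14.2, 6.0, 8.0)],
    [(8.2, 9.0, 6.5), (9.5, 2.8, 12.5), (21.4, 4.5, 17.0),
     (9.5, 13.5, 14.4), (14.5, 5.5, 9.2)],
]


def universal_r(ratings, reference=None):
    """Return (observed delta, expected delta, R) for the given ratings.

    reference is the zero-based index of a gold-standard observer, or None.
    """
    n_observers = len(ratings)
    n_objects = len(ratings[0])
    if reference is None:
        # Every pair of observers, averaged over the pairs
        pairs = list(combinations(range(n_observers), 2))
        scale = 1 / len(pairs)
    else:
        # Only pairs that include the reference observer, summed
        # (assumed scaling, chosen because it matches the output above)
        pairs = [(reference, k) for k in range(n_observers) if k != reference]
        scale = 1.0

    # Observed delta: distances between observers rating the SAME object
    observed = scale * sum(
        dist(ratings[a][i], ratings[b][i])
        for a, b in pairs for i in range(n_objects)
    ) / n_objects

    # Expected delta: distances over ALL pairings of objects between observers
    expected = scale * sum(
        dist(ratings[a][i], ratings[b][j])
        for a, b in pairs for i in range(n_objects) for j in range(n_objects)
    ) / n_objects ** 2

    return observed, expected, 1 - observed / expected


for label, ref in (("None", None), ("Observer 1", 0)):
    delta, mu, r = universal_r(ratings, reference=ref)
    print(f"Reference {label}: delta={delta:.6f} mean delta={mu:.6f} R={r:.6f}")
```

Run on the example data, the sketch agrees with the observed delta, expected delta and R shown in both columns of the output to the precision printed.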

Comparison example

Suppose that two independent groups of observers were assessed using the R coefficient above, giving the following results: R(1) = 0.11578; R(2) = 0.19780; mean delta(1) = 1.27050; mean delta(2) = 1.60240; variance delta(1) = 0.4678E-03; variance delta(2) = 0.1010E-02; skewness delta(1) = -0.34145; skewness delta(2) = -0.28425.

Select menu item Analysis_Agreement_Compare two universal R values...

Comparison of two universal agreement R statistics

	Group 1	Group 2	Difference
R:	0.11578	0.1978	-0.08202
Mean delta:	1.2705	1.6024	-0.3319
Variance delta:	0.000468	0.00101	0.000683
Skewness delta:	-0.34145	-0.28425	-0.029847
Significance:	P < 0.0001	P < 0.0001	P = 0.002

The results indicate a statistically significant difference in agreement (P = 0.002) between the two groups of observers rating the same set of objects.
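The Difference column for variance and skewness is not a simple subtraction; the figures appear to be the variance and skewness of the difference R(1) - R(2). The sketch below is an illustration under stated assumptions, not StatsDirect's implementation: the function name `compare_r` is hypothetical. Because R = 1 - delta/mean delta, it takes the variance of each R as variance delta divided by mean delta squared, and gives each R the skewness of its delta with the sign reversed; the two groups are treated as independent. The P value printed is a two-sided normal approximation added for orientation only, whereas the P value in the output above would come from the Pearson Type III distribution.

```python
from math import erfc, sqrt


def compare_r(r1, mu1, var1, skew1, r2, mu2, var2, skew2):
    """Moments of R(1) - R(2) for two independent groups of observers."""
    # Var(R) = Var(delta) / mu^2, since R = 1 - delta / mu
    var_r1 = var1 / mu1 ** 2
    var_r2 = var2 / mu2 ** 2
    # Third central moment of R: sign of delta's skewness is reversed
    m3_r1 = -skew1 * var_r1 ** 1.5
    m3_r2 = -skew2 * var_r2 ** 1.5

    diff = r1 - r2
    var_diff = var_r1 + var_r2  # variances add for independent groups
    skew_diff = (m3_r1 - m3_r2) / var_diff ** 1.5
    z = diff / sqrt(var_diff)  # standardised difference
    p_normal = erfc(abs(z) / sqrt(2))  # two-sided normal approximation only
    return diff, var_diff, skew_diff, z, p_normal


diff, var_diff, skew_diff, z, p_normal = compare_r(
    0.11578, 1.27050, 0.4678e-03, -0.34145,
    0.19780, 1.60240, 0.1010e-02, -0.28425,
)
print(f"difference in R     = {diff:.5f}")
print(f"variance            = {var_diff:.6f}")
print(f"skewness            = {skew_diff:.6f}")
print(f"standardised diff   = {z:.3f}")
print(f"approx. P (normal)  = {p_normal:.4f}")
```

With the example inputs, the variance and skewness computed this way agree with the Difference column above to the precision shown, and the normal approximation leads to the same conclusion as the reported P value.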