Menu location: Data_Grouping_Categorise.
This function enables you to categorise any set of data into groups that you specify, for example ages into age groups.
Typically, a continuous variable might be divided into categories or groups. Take the IgM variable in the parametric sheet of the test workbook for example; this has 298 observations which you might want to summarise in ranges of values. In order to do this, simply select the Data_Grouping_Categorise menu item then select the IgM column of data. You are presented with different ways to group your data into bins (intervals) of counts:
Quartiles: 4 bins (< lower quartile, lower quartile to median, median to upper quartile, >= upper quartile)
Quintiles: 5 bins (< first quintile… >= fourth quintile)
Deciles: 10 bins (< first decile… >= ninth decile)
Age groups: one of four common groupings (<15, 15-19… five yearly bands to 85+; <15, 15-24… ten yearly bands to 85+; <1, 1-4… five yearly bands to 85+; <1, 1-4… ten yearly bands to 75+)
User-defined: starting from a minimum value (min), in k intervals of equal size (step): < min + 1 * step, >= min + 1 * step to < min + 2 * step… in k intervals to >= min + k * step
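The grouping options above can be sketched in Python. This is only an illustration, not the software's own implementation: the function names are hypothetical, the quantile definition (`method="inclusive"` in the standard `statistics` module) is an assumption and may differ from the one the software uses, and the equal-width helper assumes that k bins are separated by k - 1 cut-offs, which is how the worked IgM example with 10 intervals is laid out. The values in the usage lines are made up.

```python
from bisect import bisect_right
from statistics import quantiles


def label_bins(cutoffs):
    """Build labels such as '< 0.5', '>= 0.5; < 1', '>= 1' from sorted cut-offs."""
    labels = [f"< {cutoffs[0]:g}"]
    for lo, hi in zip(cutoffs, cutoffs[1:]):
        labels.append(f">= {lo:g}; < {hi:g}")
    labels.append(f">= {cutoffs[-1]:g}")
    return labels


def count_by_cutoffs(values, cutoffs):
    """Count values per bin; a value equal to a cut-off goes into the upper bin."""
    counts = [0] * (len(cutoffs) + 1)
    for v in values:
        counts[bisect_right(cutoffs, v)] += 1
    return list(zip(label_bins(cutoffs), counts))


def equal_width_cutoffs(minimum, step, k):
    """Assumed convention: k bins use the cut-offs minimum + i * step, i = 1..k-1."""
    return [minimum + i * step for i in range(1, k)]


def quantile_cutoffs(values, n):
    """n - 1 cut points giving n bins (n=4 quartiles, n=5 quintiles, n=10 deciles)."""
    return quantiles(values, n=n, method="inclusive")


# Made-up illustrative values, not the IgM data.
sample = [0.2, 0.5, 0.7, 0.9, 1.0, 1.4, 2.6, 4.5]
for label, count in count_by_cutoffs(sample, equal_width_cutoffs(0, 0.5, 10)):
    print(label, count)
for label, count in count_by_cutoffs(sample, quantile_cutoffs(sample, 4)):
    print(label, count)
```

Using `bisect_right` keeps the boundary rule in one place: a value exactly on a cut-off is counted in the ">=" bin, matching the category labels.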
Using the IgM example in quartiles:
| category | count |
| --- | --- |
| < 0.5 | 56 |
| >= 0.5; < 0.7 | 67 |
| >= 0.7; < 1 | 98 |
| >= 1 | 77 |
Using the IgM example with user-defined bins (10 intervals of 0.5):
| category | count |
| --- | --- |
| < 0.5 | 56 |
| >= 0.5; < 1 | 165 |
| >= 1; < 1.5 | 54 |
| >= 1.5; < 2 | 14 |
| >= 2; < 2.5 | 6 |
| >= 2.5; < 3 | 2 |
| >= 3; < 3.5 | 0 |
| >= 3.5; < 4 | 0 |
| >= 4; < 4.5 | 0 |
| >= 4.5 | 1 |
A quick look at the counts above gives a similar picture to the one you would see from a histogram, namely that the data are not evenly spread across the ranges of values, i.e. they are skewed. The text-based histogram will also give you counts, but note that the bin values in a histogram are the mid-points of the bins and not the cut-off values between bins, i.e. each is the same as the corresponding user-defined bin cut-off value minus half of the step size.
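The relationship between cut-offs and histogram mid-points can be checked with a short Python sketch. The function name is hypothetical, and the sketch assumes equal-width bins in which every bin, including the last, is treated as having a width of one step.

```python
def bin_midpoints(minimum, step, k):
    """Mid-point of each of k equal-width bins: upper cut-off minus half a step."""
    return [minimum + i * step - step / 2 for i in range(1, k + 1)]


# Bins of width 0.5 starting at 0: the first bin (0 to 0.5) has mid-point 0.25.
print(bin_midpoints(0, 0.5, 10))
```

Each mid-point is the upper cut-off of its bin (0.5, 1, 1.5 and so on) less half the step size (0.25).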