The Chi-Square Test
Brahms’s Hemiolas
A common statistical test is the chi-square test. The test takes its name from the Greek letter χ (chi, pronounced “kye,” rhyming with “pie”). It is used to compare counts: it helps determine whether the number of occurrences of some feature in a data sample is significantly greater or less than what would otherwise be expected.
By way of example, let us test the hypothesis that Brahms uses lots of hemiolas. Notice, first of all, that “lots of hemiolas” is ambiguous. If we count three hemiolas in a short piano work, is that a small or a big number? To answer that question we need a comparison, or some sense of what is normal or expected. An appropriate approach is to compare the number of hemiolas in Brahms with the number of hemiolas in the music of his contemporaries.
So, to set up the test we first need to count (a) the number of hemiolas in a sample of Brahms’s music, and (b) the number of hemiolas in the comparison population, that is, in a comparable sample of music by other composers who were active at the same time. Let us suppose that the raw data look like this:
| COMPOSER(S) | MEASURES SEARCHED | HEMIOLAS FOUND |
|-------------|-------------------|----------------|
| Various     | 5000              | 23             |
| Brahms      | 500               | 9              |
Proportionally, hemiolas occurred in 1.8% of the measures for Brahms (9 out of 500) but in only about 0.46% of the measures by other composers (23 out of 5000). Is this difference within the range of what one might expect by chance? That is, is this difference statistically significant? A chi-square test will tell us.
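The two proportions can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are ours and do not come from any particular library.

```python
# Counts from the raw data above.
brahms_measures, brahms_hemiolas = 500, 9
others_measures, others_hemiolas = 5000, 23

# Proportion of measures containing a hemiola in each sample.
brahms_rate = brahms_hemiolas / brahms_measures
others_rate = others_hemiolas / others_measures

print(f"Brahms: {brahms_rate:.2%}")   # Brahms: 1.80%
print(f"Others: {others_rate:.2%}")   # Others: 0.46%
```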
[]{#Calculating Chi-Square}
Calculating Chi-Square
The test entails three parts: setting a confidence level, calculating the value of chi-squared, and determining whether that value is statistically significant. First, we draw our line in the sand. That is, we establish our confidence level in advance of the test. Let us use a conventional confidence level of 95 percent. This corresponds to a significance level of .05. Second, we calculate the value of chi-squared. For each category, we subtract the number of expected occurrences from the number of observed occurrences, square the result, and divide it by the number of expected occurrences; we then add up these values across all categories. Where O is the observed number of occurrences and E is the expected number, the formula is:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
The summation symbol means that we must perform the calculation for each expected element. There are two elements to the expectation: how many measures we expect to contain a hemiola, and how many measures we expect not to contain a hemiola. If we would normally expect 23 hemiolas in 5000 measures of music of the period, then we would expect about 2.3 hemiolas in the 500 measures of Brahms’s music, i.e., E1 = 2.3. At the same time, in our sample of various composers we observed that 4977 measures of music (out of 5000) did not contain a hemiola. If Brahms were like other composers, we would expect 497.7 of the 500 measures not to contain a hemiola. Accordingly, E2 = 497.7. The actual number of hemiolas encountered in his music was 9, and the actual number of non-hemiola measures was 491. So the numeric substitutions would be as follows:

$$\chi^2 = \frac{(9 - 2.3)^2}{2.3} + \frac{(491 - 497.7)^2}{497.7} \approx 19.52 + 0.09 \approx 19.6$$
Our value of chi-squared is 19.6.
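The same arithmetic can be sketched in Python. The function name `chi_square` and the variable names are our own choices for illustration; the code simply sums (O − E)²/E over the two categories described above.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Rate of hemiolas in the comparison sample: 23 in 5000 measures.
baseline_rate = 23 / 5000
brahms_measures = 500

# Expected counts if Brahms matched the comparison sample.
expected = [
    brahms_measures * baseline_rate,        # measures with a hemiola (2.3)
    brahms_measures * (1 - baseline_rate),  # measures without one (497.7)
]
observed = [9, 491]

chi_sq = chi_square(observed, expected)
print(round(chi_sq, 1))  # 19.6
```

Note that almost all of the total comes from the first term: the hemiola count is far from its expected value, whereas the non-hemiola count differs from its much larger expected value by the same 6.7 measures.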
[]{#Determining Statistical Significance}
Determining Statistical Significance
Now we need to determine whether our chi-squared value is statistically significant. Recall that our confidence level is 95 percent—and that this corresponds to a significance level of .05. We need to determine whether the chi-squared value is sufficient to achieve statistical significance at the .05 significance level. In order to answer this question, we will make use of a table of critical values for chi-square. Such tables have been pre-computed by statisticians, and are easily found in the appendices of any statistics book or on the web.
In order to determine the critical value for chi-square, we need to know the significance level (or confidence level) and the so-called degrees of freedom (abbreviated df). For this task there are only two categories (measures with a hemiola and measures without), so there is only a single degree of freedom. From a table of critical values, we can see that for the 95% confidence level (= .05 significance level), with one degree of freedom, the critical value of chi-squared is 3.841. Our calculated value of chi-squared, 19.6, is well above this critical value, which means that our result is statistically significant. That is, the observations are not consistent with the null hypothesis (that Brahms uses hemiolas at the same rate as his contemporaries); instead, they are consistent with our research hypothesis.
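For readers who would rather compute than consult a table, the comparison can be sketched in Python. With one degree of freedom, a chi-square variable is the square of a standard normal variable, so its upper-tail probability can be obtained from the complementary error function in the standard library. The helper name `p_value_df1` is ours, and this shortcut is valid only when df = 1; other degrees of freedom need a statistics library or a table.

```python
import math

def p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal variable,
    # so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_sq / 2))

CRITICAL_05_DF1 = 3.841  # critical value quoted in the text
chi_sq = 19.6            # value calculated for the Brahms data

print(chi_sq > CRITICAL_05_DF1)                 # True
print(p_value_df1(chi_sq) < 0.05)               # True
print(round(p_value_df1(CRITICAL_05_DF1), 3))   # 0.05
```

The last line is a sanity check: the critical value 3.841 is, by definition, the point at which the tail probability equals the .05 significance level.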
The value of χ² is sensitive to the number of observations. For a given proportional difference, the greater the number of observations, the more likely the result is to be considered significant. If a coin were flipped 6 times and came up heads on 4 of them, this outcome would be considered to lie within the realm of chance. But if the coin came up heads on 400 of 600 tosses (preserving the same proportion), the value of p would be less than 0.001.
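The coin example can be worked through in the same way. The sketch below assumes a fair coin as the expectation (half heads, half tails) and uses the one-degree-of-freedom shortcut for the p value; the function names are hypothetical.

```python
import math

def chi_square_fair_coin(heads, tosses):
    """Chi-square statistic for observed heads and tails against a fair coin."""
    expected = tosses / 2          # expected count for heads and for tails
    tails = tosses - heads
    return ((heads - expected) ** 2 + (tails - expected) ** 2) / expected

def p_value_df1(chi_sq):
    """Upper-tail probability for a chi-square statistic with df = 1."""
    return math.erfc(math.sqrt(chi_sq / 2))

for heads, tosses in [(4, 6), (400, 600)]:
    stat = chi_square_fair_coin(heads, tosses)
    print(f"{heads}/{tosses}: chi-squared = {stat:.2f}, p = {p_value_df1(stat):.3g}")
```

For 4 heads in 6 tosses the statistic is about 0.67, far below the critical value of 3.841; for 400 heads in 600 tosses it is about 66.67, far above it. With expected counts as small as 3, the chi-square approximation is itself rough, so the first figure should be read only as an illustration.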
Tips
The chi-square test is about counts, not percentages. The values used in the calculation must be actual counts and expected counts. The percentages are useful only as a way to identify the expected counts.
The test is called chi square (no ‘d’), but the calculated value is referred to as chi squared (with the ‘d’).
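To see why the first tip matters, the sketch below feeds the Brahms data into the formula twice: once as counts, and once (incorrectly) as percentages. The function name is ours. Substituting percentages amounts to pretending the sample contained exactly 100 measures, so the result shrinks by a factor of five.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected values."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Correct: actual and expected counts out of 500 measures.
from_counts = chi_square([9, 491], [2.3, 497.7])

# Incorrect: the same data expressed as percentages.
from_percentages = chi_square([1.8, 98.2], [0.46, 99.54])

print(round(from_counts, 1), round(from_percentages, 1))  # 19.6 3.9
```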