Cluster Analysis in Orchestration
Cluster analysis is a form of statistical data analysis in which subsets (called “clusters”) are formed according to some notion of similarity. There are many variants of cluster analysis, and many of them are hierarchical: low-level clusters are successively joined together to make larger clusters, and so on, until everything is clustered into one large group. The result is a cluster tree, or dendrogram.
Randolph Johnson (2007) used cluster analysis to investigate how instruments are grouped together in orchestral works. He examined a large number of scores from 19th-century orchestral music. To limit possible experimenter bias, Johnson used David Dubal’s The Essential Canon of Classical Music (2001) as his source for sampling “orchestral music.” For each work in Dubal’s book, Johnson sampled just one randomly chosen vertical sonority, that is, a single sounding moment, and coded which instruments were playing at that moment. For example, at a given moment in some score, the first and second violins might be active, along with the violas, ’cellos and a solo clarinet. Johnson ended up with several hundred coded sonorities, one from each work. Because no work contributes more than one sonority, the data points are largely independent of one another.
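The coding step described above can be pictured as turning each sampled moment into a row of 0s and 1s, one column per instrument. The following is only a minimal sketch: the instrument list, the function name `encode_sonority` and the data layout are assumptions made for illustration, not Johnson's actual coding scheme.

```python
# Hypothetical, shortened instrument list; a real study would list the full orchestra.
INSTRUMENTS = ["violin 1", "violin 2", "viola", "cello",
               "double bass", "clarinet", "flute"]

def encode_sonority(active, instruments=INSTRUMENTS):
    """Return a 0/1 list marking which instruments sound at the sampled moment."""
    unknown = set(active) - set(instruments)
    if unknown:
        raise ValueError(f"unknown instruments: {sorted(unknown)}")
    return [1 if name in active else 0 for name in instruments]

# The example from the text: both violin sections, violas, cellos and a solo clarinet.
example = encode_sonority({"violin 1", "violin 2", "viola", "cello", "clarinet"})
print(example)  # [1, 1, 1, 1, 0, 1, 0]
```

Stacking one such row per work gives a table with several hundred rows, which is the kind of input a clustering procedure can work from.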
Johnson then used cluster analysis to analyze the instrumental combinations. The computer first grouped together the instruments that most commonly appeared together. As can be seen in the dendrogram, the first clusters include (1) violin + viola, (2) double bass + ’cello, (3) flute + clarinet, and (4) bassoon + oboe.
The second group of clusters includes (5) horn + trumpet, and (6) bass clarinet + English horn/cor anglais. In addition, two of the earlier clusters merge with each other: (7) the violin/viola + the bassoon/oboe. You can see all of the groupings and hierarchical groupings in the tree below.
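The successive pairing described above can be sketched in a few lines of code. The text does not say which similarity measure or linkage rule Johnson used, so this sketch assumes Jaccard similarity (how often two instruments sound together, out of the moments where at least one of them sounds) and average linkage. The sonorities are made up for illustration and are not Johnson's data.

```python
from itertools import combinations

# Made-up sonorities (sets of sounding instruments), not Johnson's data.
SONORITIES = [
    {"violin", "viola", "cello", "double bass"},
    {"violin", "viola", "flute"},
    {"violin", "viola", "cello", "double bass", "horn", "trumpet"},
    {"horn", "trumpet"},
    {"horn", "trumpet", "flute"},
    {"violin", "viola", "cello", "double bass", "flute"},
]

def jaccard(a, b, sonorities):
    """Share of sonorities containing either instrument that contain both."""
    both = sum(1 for s in sonorities if a in s and b in s)
    either = sum(1 for s in sonorities if a in s or b in s)
    return both / either if either else 0.0

def agglomerate(sonorities):
    """Average-linkage agglomerative clustering; returns the merge history."""
    instruments = sorted(set().union(*sonorities))
    clusters = [(name,) for name in instruments]  # start with one cluster per instrument
    merges = []

    def linkage(c1, c2):
        # Mean pairwise similarity between the members of two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(jaccard(a, b, sonorities) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two most similar clusters, then repeat on the reduced list.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, round(linkage(c1, c2), 3)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

for left, right, score in agglomerate(SONORITIES):
    print(left, "+", right, score)
```

On this toy data the first three merges are the pairs that always sound together (’cello + double bass, horn + trumpet, viola + violin), after which the two string pairs join each other at a similarity of 0.75. The merge history is the information a dendrogram draws.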
Johnson found that there were three over-arching instrumental groupings in 19th-century orchestral music (identified in the dendrogram by the solid vertical line). The first group consists of the violin, viola, ’cello, double bass, flute, oboe, clarinet and bassoon. The second group consists of the horn, trumpet, trombone, tuba, piccolo and timpani. The third group consists of the bass clarinet, English horn, harp, cornet and contrabassoon.
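Drawing a line across a dendrogram amounts to stopping the merging process once a chosen number of groups remains. A minimal sketch of that idea follows; the function `cut_tree`, the six-instrument list and the merge order are hypothetical and chosen only to illustrate the mechanics, not to reproduce Johnson's tree.

```python
def cut_tree(leaves, merges, k):
    """Replay merges in order and stop once only k groups remain."""
    groups = [frozenset([leaf]) for leaf in leaves]
    for left, right in merges:
        if len(groups) <= k:
            break
        # Each merge names one member of each of the two groups being joined.
        a = next(g for g in groups if left in g)
        b = next(g for g in groups if right in g)
        if a == b:
            continue  # already in the same group
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return sorted(sorted(g) for g in groups)

# Hypothetical leaves and merge order, most similar pair first.
LEAVES = ["violin", "viola", "cello", "horn", "trumpet", "harp"]
MERGES = [("violin", "viola"), ("horn", "trumpet"), ("violin", "cello"),
          ("violin", "horn"), ("violin", "harp")]

print(cut_tree(LEAVES, MERGES, 3))
# [['cello', 'viola', 'violin'], ['harp'], ['horn', 'trumpet']]
```

Choosing a different `k` moves the cut and yields coarser or finer groupings from the same tree.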
Cluster analysis does not tell you the “meaning” or origin of the groups. It simply tells you that there are statistically pertinent natural groupings that arise depending on the way you define similarity. In this case, Johnson proposed a musically plausible interpretation of his three over-arching groups: the first group he dubbed “Standard” instrumentation; the second group he dubbed “Power” instrumentation; and the third group he dubbed “Color” instrumentation. The results of his analysis are consistent with the idea that 19th-century orchestral composers conceived of or deployed instruments in terms of three broad categories: (1) core or standard instruments consisting of the strings and woodwinds, (2) loud, energetic or power instruments consisting of the brass, piccolo and timpani, and (3) novelty or color instruments such as the English horn, cornet and harp. Notice that the color instruments further cluster into two groups: (i) quieter color instruments such as the English horn, bass clarinet and harp, and (ii) louder color instruments such as the cornet and contrabassoon. Johnson calls his resulting theory of orchestration the SPC theory (for Standard, Power and Color).
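The remark that groupings depend on how similarity is defined can be made concrete with two common measures for 0/1 data. The sketch below uses invented presence vectors for two rarely used instruments; neither the data nor the choice of measures is taken from Johnson's thesis.

```python
def simple_matching(x, y):
    """Fraction of sonorities where the two instruments agree (both present or both absent)."""
    return sum(a == b for a, b in zip(x, y)) / len(x)

def jaccard(x, y):
    """Fraction of sonorities with either instrument present in which both are present."""
    both = sum(a and b for a, b in zip(x, y))
    either = sum(a or b for a, b in zip(x, y))
    return both / either if either else 0.0

# Invented data: each instrument sounds in one of ten sonorities, never together.
harp         = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
english_horn = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

print(simple_matching(harp, english_horn))  # 0.8
print(jaccard(harp, english_horn))          # 0.0
```

Simple matching treats shared silence as agreement, so two rare instruments look alike even though they never sound together, while the Jaccard measure ignores joint absences. A cluster tree built on one measure can therefore differ from a tree built on the other.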
References
Johnson, R. (2007). Towards a Theory of Orchestration: A Quantitative Study of Instrumental Combinations in Romantic-Era Symphonic Works. M.A. thesis. Ohio State University.