Similarity Coefficients: Measures of Co-Occurrence and Association or Simply Measures of Occurrence?
Published 1989 (top 10% of 1989 papers by citations)
Abstract
Data on the presence or absence of 25 fish species in a survey of 52 lakes from the watersheds of the Black and Hollow rivers of south-central Ontario were analyzed with eight similarity coefficients. Comparisons were made of Jaccard, Ochiai, Phi, Rogers-Tanimoto, Russell and Rao, Simple Matching, Sorensen-Dice, and Yule similarity coefficients using results from R-mode cluster analysis, principal-coordinates analysis (PCoA), and nonmetric multidimensional scaling. Coefficients were grouped into those representing measures of co-occurrence and those measuring association. Coefficients of co-occurrence (i.e., Jaccard, Rogers-Tanimoto, Russell and Rao, Simple Matching, and Sorensen-Dice) incorporate information associated with the frequency of occurrence of the fish species analyzed. Dendrograms faithfully revealed this size effect. Similarly, first axes of PCoA were linear or curvilinear functions of species' frequency of occurrence. Measures of association (i.e., Phi and Yule) and Ochiai's coefficient were less affected by the frequency of occurrence. The first axes of PCoA, based on centered coefficients (i.e., Phi, Yule, and Ochiai), were highly correlated with the second axes from ordinations using co-occurrence coefficients. The second axes from analyses of centered coefficients were correlated with the third axes based on non-centered measures. We propose that co-occurrence coefficients reflect a general size effect similar to that commonly found in principal-components analysis. Measures of association and Ochiai's coefficient incorporate implicit centering transformations that reduce the size influence associated with the frequency of occurrence. Cluster analyses using co-occurrence coefficients are most susceptible to this size effect. We believe that the interpretations of many dendrograms fail to recognize size effects that arise from employing non-centered similarity coefficients (e.g., Strauss 1982; Nemec and Brinkhurst 1987). 
Additionally, arguments contrasting phenetic and phylogenetic methods may unknowingly debate the utility of centered versus non-centered coefficients, since the size effect may contribute to the apparent strength of phylogenetic approaches.
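For readers who want to see how the eight coefficients differ, the following is a minimal sketch using their standard textbook definitions on a 2x2 contingency table, where a counts lakes with both species, b and c count lakes with only one of the two, and d counts lakes with neither. The function names and the two presence/absence vectors are hypothetical illustrations, not data or code from the study, and the sketch assumes every denominator is nonzero.

```python
from math import sqrt


def contingency(x, y):
    """Return (a, b, c, d) for two presence/absence (0/1) vectors."""
    a = sum(1 for i, j in zip(x, y) if i and j)          # both present
    b = sum(1 for i, j in zip(x, y) if i and not j)      # only first present
    c = sum(1 for i, j in zip(x, y) if not i and j)      # only second present
    d = sum(1 for i, j in zip(x, y) if not i and not j)  # both absent
    return a, b, c, d


def coefficients(x, y):
    """Compute the eight similarity coefficients named in the abstract.

    Assumes no denominator is zero (e.g. neither species is present in
    all lakes or absent from all lakes).
    """
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    return {
        # Co-occurrence (non-centered) coefficients
        "jaccard": a / (a + b + c),
        "rogers_tanimoto": (a + d) / (a + d + 2 * (b + c)),
        "russell_rao": a / n,
        "simple_matching": (a + d) / n,
        "sorensen_dice": 2 * a / (2 * a + b + c),
        # Ochiai: grouped with the centered coefficients in the abstract
        "ochiai": a / sqrt((a + b) * (a + c)),
        # Measures of association
        "phi": (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d)),
        "yule": (a * d - b * c) / (a * d + b * c),
    }


# Hypothetical presence/absence records for two species in eight lakes
species_x = [1, 1, 1, 0, 0, 0, 1, 0]
species_y = [1, 1, 0, 0, 0, 1, 1, 0]

result = coefficients(species_x, species_y)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Only Phi and Yule have the numerator ad - bc, which subtracts the co-occurrence expected from the marginal frequencies; Russell and Rao, by contrast, is bounded above by the frequency of the rarer species, which is one way to see the size effect the abstract describes.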