Plenary Talk
- Title: Tests for validity on principal canonical correlation analysis
- Presenter: Professor Takakazu Sugiyama, Department of Mathematics, Chuo University, Japan
- Abstract:
We consider two sets of variables with a joint distribution and study their relationship by canonical correlation analysis. As an alternative approach, we first apply principal component analysis to each set separately and then calculate the canonical correlation coefficients between the two sets of principal components. Principal components are easier to interpret than canonical variables. Consequently, in some cases canonical correlation analysis on the two sets of principal components is more useful for understanding the relationships in the given data sets.
For example, we investigate the correlations between academic records in senior high school and scores on the common first-stage university entrance examination. From the academic records and scores of 147 students, we obtain the following canonical correlation coefficients:
r_1^2 = 0.52547, r_2^2 = 0.32277, r_3^2 = 0.21834, r_4^2 = 0.05387, r_5^2 = 0.00992.
Next, we use the first and second principal components of each group to summarize the scores on the common first-stage university entrance examination and the academic records in senior high school, and we apply canonical correlation analysis to those principal components. We then obtain the following canonical correlation coefficients:
r_1^{*2} = 0.50774, r_2^{*2} = 0.30770.
These coefficients, based on two principal components per set, carry almost the same information about the relationship between x and y as those based on all the variables, because the differences between the corresponding canonical correlation coefficients, r_1^2 and r_1^{*2}, and r_2^2 and r_2^{*2}, are small. Furthermore, the canonical variables are easier to interpret, since they are written in terms of the uncorrelated principal components. In this case, canonical correlation analysis based on principal components gives clear results.
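The comparison described above can be sketched numerically. The following is a minimal illustration only, not the procedure used in the talk: it runs on synthetic stand-in data (the examination data are not reproduced here), it assumes the principal components are taken from the correlation matrix of each set, and the function names `canonical_correlations` and `principal_component_scores`, the set sizes (5 and 6 variables), and the noise level are all hypothetical choices made for the example.

```python
import numpy as np


def canonical_correlations(x, y):
    """Return the sample canonical correlations between x and y, largest first."""
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Orthonormal bases of the centered column spaces (assumes full column rank).
    qx, _ = np.linalg.qr(xc)
    qy, _ = np.linalg.qr(yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


def principal_component_scores(x, k):
    """Return scores on the first k principal components of the correlation matrix.

    Using the correlation matrix (rather than the covariance matrix) is an
    assumption of this sketch.
    """
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(x, rowvar=False))
    leading = np.argsort(eigvals)[::-1][:k]
    return z @ eigvecs[:, leading]


# Synthetic stand-in data (NOT the examination data from the talk):
# two latent factors drive both variable sets, plus independent noise.
rng = np.random.default_rng(0)
n = 147
latent = rng.standard_normal((n, 2))
x = latent @ rng.standard_normal((2, 5)) + 0.8 * rng.standard_normal((n, 5))
y = latent @ rng.standard_normal((2, 6)) + 0.8 * rng.standard_normal((n, 6))

# Canonical correlations from all variables.
r_full = canonical_correlations(x, y)

# Canonical correlations from the first two principal components of each set.
px = principal_component_scores(x, 2)
py = principal_component_scores(y, 2)
r_pc = canonical_correlations(px, py)

print("all variables:       ", np.round(r_full, 5))
print("principal components:", np.round(r_pc, 5))
```

Because each principal component is a linear combination of the original variables, the k-th coefficient obtained from the principal components cannot exceed the k-th coefficient obtained from all variables, so the gap between the two printed vectors indicates how much information the reduction gives up.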
We want to examine whether the canonical correlation coefficients based on the principal components lose information compared with those based on all variables. Applying a permutation test procedure, we discuss the validity of the above principal canonical correlation analysis.
Presented jointly by
the Department of Statistics and Finance, University of Science and Technology of China and the Forum for Interdisciplinary Mathematics.
IMST 2007 - FIM XV (Shanghai, China)
May 20-23, 2007, S.I.A.S. of USTC