Correspondence Analysis
4. The Analysis II

A. Dimensionality of the problem

Let's look at the cloud of points that represents the rectangular table of figures, whose rows and columns we have transformed so that each sums to one (that is, into percentages). This cloud is contained in a space of dimension card(I)-1 or card(J)-1, whichever is lower. For the rest of this paper we will assume that card(I) < card(J), so the problem is of dimension card(I)-1. In our example the table is 8x12, so after its transformation into percentages the cloud is contained in a 7-dimensional space.
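
The dimension count can be checked numerically. The sketch below is only an illustration: it assumes NumPy and uses a made-up 8x12 table of counts (not the data of this paper), and the helper name `centred_profiles` is hypothetical. It measures the dimension of each cloud as the rank of its profiles once the mean profile has been subtracted.

```python
import numpy as np

# Hypothetical 8x12 table of counts (NOT the data used in this paper).
rng = np.random.default_rng(0)
table = rng.integers(1, 50, size=(8, 12)).astype(float)

def centred_profiles(counts):
    """Profiles of the rows of `counts` (each sums to 1) minus the mean profile."""
    profiles = counts / counts.sum(axis=1, keepdims=True)
    mean_profile = counts.sum(axis=0) / counts.sum()
    return profiles - mean_profile

# The rank of a centred cloud is the number of dimensions it really occupies.
# An explicit tolerance is used so that rounding noise is not counted as a dimension.
dim_rows = np.linalg.matrix_rank(centred_profiles(table), tol=1e-10)
dim_cols = np.linalg.matrix_rank(centred_profiles(table.T), tol=1e-10)
print(dim_rows, dim_cols)
```

For a table without special structure, both ranks should come out as 7, that is card(I)-1: the 8 row points lose one dimension to centring, and the 12 column points live in an 8-coordinate space but their profiles sum to one.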

B. Geometric principles

Basically, the insight of any factor analysis, and of correspondence analysis in particular, is that the cloud of points we are trying to describe does not stretch equally in every direction; on the contrary, it has a definite shape which is not a hyperball, because there is affinity between rows and columns. We are therefore going to define a new, more "economical" system of orthogonal coordinates. More precisely, for a cloud of points N(I) (remember that each point is located in space by its card(J) coordinates on J, that is, its profile on J), we seek the representation which is as good as possible in as few dimensions as possible (that is, with the minimum number of axes). If we want a graphical representation on paper, the problem can be formulated as follows: determine the subspace L of dimension 2 which passes through the center of gravity of the cloud (i.e. its mean profile) and which maximizes the inertia of N(I) parallel to L. Software does not stop at 2 dimensions: the standard output gives all card(I)-1 dimensions of the problem. But we have to be more general now.
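
The search for the best plane can be illustrated with a short sketch. It assumes NumPy, a made-up 8x12 table, and the usual chi-square weighting of correspondence analysis (the matrix `S` and the helper `projected_inertia` are names chosen here, not taken from the text). The plane spanned by the first two singular directions of `S` is compared with randomly drawn planes through the center of gravity.

```python
import numpy as np

rng = np.random.default_rng(0)
table = rng.integers(1, 50, size=(8, 12)).astype(float)  # hypothetical counts

P = table / table.sum()
r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses

# Row i of S is the centred profile of row i, rescaled so that its ordinary
# squared length equals its mass times its chi-square distance to the mean profile.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
total_inertia = float(np.sum(S ** 2))

def projected_inertia(basis):
    """Inertia of N(I) carried by the subspace whose orthonormal basis is the columns of `basis`."""
    return float(np.sum((S @ basis) ** 2))

# Candidate for the best plane: the first two right singular vectors of S.
_, _, Vt = np.linalg.svd(S, full_matrices=False)
best = projected_inertia(Vt[:2].T)

# 1000 random planes through the center of gravity (orthonormalised with QR).
random_planes = [np.linalg.qr(rng.normal(size=(12, 2)))[0] for _ in range(1000)]
best_random = max(projected_inertia(B) for B in random_planes)
print(best, best_random, total_inertia)
```

None of the random planes should carry more inertia than the plane of the first two axes, which is the sense in which that plane is the best two-dimensional picture of the cloud.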

C. Factorial Axes

. If
we denote by .
Each axis is associated with an eigenvalue, and the sum of these eigenvalues equals the inertia of the cloud. Each eigenvalue is at most 1. You can see that if N(I) had only one point, there would be no axis, and if it had only 2 points, it would have a single axis; with 3 points we would need at most 2 perpendicular axes, and with n points at most n-1 axes, never more than card(J)-1.
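
These properties of the eigenvalues can be verified on an example. The following sketch assumes NumPy and a made-up 8x12 table; the standardised matrix `S` is the usual construction in correspondence analysis and is an assumption here, since the text does not spell it out. The total inertia is computed independently as the Pearson chi-square statistic divided by the grand total.

```python
import numpy as np

rng = np.random.default_rng(0)
table = rng.integers(1, 50, size=(8, 12)).astype(float)  # hypothetical counts

n = table.sum()
P = table / n
r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

# Eigenvalues of the symmetric 8x8 matrix S S^T, largest first.
eigenvalues = np.linalg.eigvalsh(S @ S.T)[::-1]

# Total inertia computed another way: Pearson chi-square divided by n.
expected = n * np.outer(r, c)
total_inertia = ((table - expected) ** 2 / expected).sum() / n

# Number of axes with a non-negligible eigenvalue.
n_axes = int(np.sum(eigenvalues > 1e-12))
print(eigenvalues.sum(), total_inertia, n_axes)
```

The two inertia figures should agree up to rounding, no eigenvalue should exceed 1, and the 8 row points should give 7 usable axes, in line with the n-1 rule.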

D. Symmetry of both analyses

We have so far talked only about the analysis of one side of the table, rows or columns. We can either project the row points in the 11-dimensional community space or project the column points in the 7-dimensional schooling-level space. We would then obtain two representations of the two clouds. Are these representations different? Not much. The analyses are symmetric in three ways.
Statisticians have decided not to differentiate between the two systems of factorial axes born from the two analyses, and to represent all points on the same graph. The algorithm represents the points (see below how) in the space created by the first k factorial axes. The distance between points will be a

Besides the graphs, the software gives us material to answer some of the scientist's questions, such as: Which part of the total inertia is accounted for by the first k axes? Which part of the variation of a given point is accounted for by this particular graph? What is the contribution of each point to the construction of the axis system?
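
The three questions above correspond to three quantities that can be computed from the decomposition. The sketch below is an illustration under assumptions, not the output of any particular package: it uses NumPy, a made-up 8x12 table, and the conventional definitions of share of inertia, squared cosine (quality) and contribution for the row points; the variable names are chosen here.

```python
import numpy as np

rng = np.random.default_rng(0)
table = rng.integers(1, 50, size=(8, 12)).astype(float)  # hypothetical counts

P = table / table.sum()
r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

U, sigma, _ = np.linalg.svd(S, full_matrices=False)
keep = sigma > 1e-12                   # drop the null axis left by centring
U, sigma = U[:, keep], sigma[keep]
eigenvalues = sigma ** 2
F = (U * sigma) / np.sqrt(r)[:, None]  # coordinates of the row points on the axes

k = 2  # number of axes kept for the graph

# 1. Part of the total inertia accounted for by the first k axes.
share = eigenvalues[:k].sum() / eigenvalues.sum()

# 2. Part of each point's variation shown by the k-axis graph (squared cosine).
quality = (F[:, :k] ** 2).sum(axis=1) / (F ** 2).sum(axis=1)

# 3. Contribution of each point to each axis (each column sums to 1).
contributions = r[:, None] * F ** 2 / eigenvalues

print(round(float(share), 3))
print(np.round(quality, 3))
print(np.round(contributions[:, :k], 3))
```

A point with a low quality value is poorly represented on the graph even if it appears close to other points, and a point with a large contribution is one that pulled the axis toward itself.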