Linear algebra is fundamental to machine learning and data mining research. The Essence of Linear Algebra video series gives a very good description of the basic concepts. In this article, I add some comments and thoughts of my own, mainly to help myself understand and memorise the details better.

I. Linear Transformation of a Coordinate System:


What I would like to add concerning linear transformations is another explanation of where the linear transformation matrix comes from. Typically, for a coordinate system, the basis vectors are known, e.g., $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$, where each $\mathbf{e}_i$ is a basis vector and $n$ is the number of basis vectors, which is usually also the number of dimensions. This set of basis vectors defines the coordinate system, and any vector $\mathbf{v}$ in this coordinate system can be expressed as:

$$\mathbf{v} = \sum_{i=1}^{n} x_i \mathbf{e}_i,$$

in which the column vector $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T$ containing all the $x_i$ is the coordinate of the given vector.

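A linear transformation of a coordinate system can also be considered as a linear transformation of its basis vectors, which maps the original set of basis vectors, e.g., $\{\mathbf{e}_1, \ldots, \mathbf{e}_n\}$, to another set, e.g., $\{\mathbf{e}'_1, \ldots, \mathbf{e}'_n\}$, defining a new coordinate system. The relationship between these two sets can be formalised as:

$$\mathbf{e}'_j = \sum_{i=1}^{n} a_{ij} \mathbf{e}_i, \quad j = 1, \ldots, n,$$

where one can see that each new basis vector is a linear combination of the original basis vectors. With the linear transformation of the whole space, a vector with coordinates $x_j$ in the original space keeps the same coefficients, now attached to the new basis vectors, such that:

$$\mathbf{v}' = \sum_{j=1}^{n} x_j \mathbf{e}'_j = \sum_{j=1}^{n} x_j \sum_{i=1}^{n} a_{ij} \mathbf{e}_i = \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} a_{ij} x_j \Big) \mathbf{e}_i,$$

showing that the new coordinate $\mathbf{x}'$ of a vector (from the perspective of the original space), with entries $x'_i = \sum_j a_{ij} x_j$, is equal to the linear transformation matrix $A$ (the matrix with all the $a_{ij}$s) times the original coordinate $\mathbf{x}$, i.e., in matrix form:

$$\mathbf{x}' = A\mathbf{x}.$$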
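
The derivation above can be checked numerically. The following is only a minimal sketch in plain Python, assuming a hypothetical 2-D example: the matrix `A`, the coordinates `x`, and the helper names `combine` and `matvec` are all made up for illustration. It builds the new basis vectors as linear combinations of the original ones and compares the result with a direct matrix-vector product.

```python
# Minimal sketch with made-up numbers; all names here are illustrative.

def combine(coeffs, vectors):
    """Return sum_k coeffs[k] * vectors[k] for equal-length vectors."""
    dim = len(vectors[0])
    return [sum(c * v[d] for c, v in zip(coeffs, vectors)) for d in range(dim)]

def matvec(matrix, vec):
    """Multiply a matrix (given as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]

# Original (standard) basis vectors e_1 and e_2.
e = [[1, 0], [0, 1]]

# Hypothetical transformation matrix A with entries a_ij (row i, column j).
A = [[2, 1],
     [0, 3]]

# New basis vector e'_j = sum_i a_ij * e_i, which is column j of A.
new_basis = [combine([A[i][j] for i in range(2)], e) for j in range(2)]

x = [4, 5]  # original coordinates (assumed for the example)

# Route 1: keep the coefficients x_j and attach them to the new basis vectors.
via_basis = combine(x, new_basis)

# Route 2: multiply the transformation matrix by the original coordinates.
via_matrix = matvec(A, x)

print(new_basis)              # [[2, 0], [1, 3]]
print(via_basis, via_matrix)  # [13, 15] [13, 15]
```

Both routes give the same coordinates, which is the statement that the new coordinate equals the matrix times the original coordinate.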

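Note that this explanation does not apply well when the linear transformation matrix is non-square, since the transformed vectors then live in a space of a different dimension and cannot be written as combinations of the original basis vectors. Therefore, the intuition provided by the video shown above may be a better way to memorise linear transformation, in which: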

  • each column vector in the linear transformation matrix is the target vector into which the corresponding original basis vector is transformed. More formally, this can be interpreted as directly changing the basis vectors into another set containing the same number of vectors:
    $$\mathbf{e}_j \rightarrow \mathbf{a}_j, \quad j = 1, \ldots, n,$$
    such that we have:
    $$A = [\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n].$$
    The new basis vectors $\mathbf{a}_j$ could be of a different dimension from the original ones, and the coordinate of the original vector after transformation becomes:
    $$\mathbf{x}' = \sum_{j=1}^{n} x_j \mathbf{a}_j = A\mathbf{x},$$
    i.e., the direct multiplication of the transformed basis vectors and the original coordinates.

  • the new coordinate of a vector (from the perspective of the original space) after transformation is simply the transformation matrix times the original coordinate.
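
The column view also works for a non-square matrix. Below is a small sketch in plain Python, assuming a hypothetical 3x2 matrix `A` that maps 2-D coordinates into 3-D; the numbers and the helper `matvec` are illustrative only. It shows that each column is where a standard basis vector lands, and that the transformed coordinate is the original coordinates weighting those columns.

```python
# Minimal sketch with a made-up 3x2 matrix; names are illustrative.

def matvec(matrix, vec):
    """Multiply a matrix (given as a list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]

# Hypothetical non-square matrix: 2-D input coordinates, 3-D output.
A = [[1, 0],
     [0, 1],
     [1, 1]]

# Columns of A, read off by transposing the list of rows.
columns = [list(col) for col in zip(*A)]

# Each standard basis vector is sent to the corresponding column.
image_e1 = matvec(A, [1, 0])
image_e2 = matvec(A, [0, 1])

x = [2, 3]  # original coordinates (assumed for the example)

# Weighted sum of the columns, using the original coordinates as weights.
weighted = [x[0] * c1 + x[1] * c2 for c1, c2 in zip(*columns)]

# Direct matrix-vector product for comparison.
transformed = matvec(A, x)

print(columns)                # [[1, 0, 1], [0, 1, 1]]
print(image_e1, image_e2)     # [1, 0, 1] [0, 1, 1]
print(weighted, transformed)  # [2, 3, 5] [2, 3, 5]
```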

II. Linear Transformation of Two Coordinate Systems:

In the last part, all the coordinates we used, e.g., $\mathbf{x}$, are from the perspective of the original space, i.e., they are expressed in the standard basis vectors that form the standard coordinate system. Note that to communicate with a different coordinate system, the coordinates must be converted, because the two coordinate systems use different coordinates to describe the same vector. This is explained in the following video:


Basically, there are three operations that are worth emphasising. Note that we use $\tilde{\mathbf{x}}$ to denote the coordinates in a new coordinate system other than the standard coordinate system.

  • from the coordinate in a new coordinate system to that in the standard coordinate system:
    $$\mathbf{x} = \sum_{i=1}^{n} \tilde{x}_i \mathbf{b}_i = B\tilde{\mathbf{x}},$$
    where $\tilde{\mathbf{x}}$ is the coordinate in the new coordinate system defined by the $\mathbf{b}_i$s, the $\mathbf{b}_i$s (the columns of $B$) are the coordinates of the basis vectors of the new coordinate system from the perspective of the standard coordinate system, and $\mathbf{x}$ is the coordinate in the standard coordinate system. This process has the same formulation as a linear transformation, but a very different geometric meaning. In a linear transformation, the operation changes the vector itself, e.g., by rotation or scaling. In a change of coordinate system, however, the operation translates the description of one and the same vector from one system to another, i.e., the vector itself is intrinsically unchanged;

  • from the coordinate in the standard coordinate system to that in the new coordinate system:
    $$\tilde{\mathbf{x}} = B^{-1}\mathbf{x}.$$
    Here $B$ is the matrix whose columns are the new basis vectors expressed in standard coordinates. This process follows directly from the previous one, under the assumption that the inverse $B^{-1}$ exists. Therefore, one has to make sure that the two coordinate systems are of the same dimension and convertible to each other through a linear transformation.

  • from the coordinate in the new coordinate system to that in the standard coordinate system, apply the linear transformation, and then convert back to the new coordinate system, i.e., performing a linear transformation in the new coordinate system:
    $$\tilde{\mathbf{x}}' = B^{-1} A B \tilde{\mathbf{x}},$$
    where the $a_{ij}$s form the linear transformation matrix $A$ from the perspective of the standard coordinate system, $B$ holds the new basis vectors (in standard coordinates) as its columns, and $\tilde{\mathbf{x}}'$ is the target coordinate in the new coordinate system after the linear transformation. A very helpful use case of this transformation is provided in the above video.
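
The three operations can be sketched together in a few lines. This is only an illustration in plain Python under stated assumptions: a 2-D space, a made-up new basis with vectors (2, 1) and (-1, 1) in standard coordinates, and a 90-degree rotation as the transformation. The names `B`, `A`, `x_new`, and the helpers `matvec`, `matmul`, and `inverse_2x2` are hypothetical. Exact fractions from the standard library are used so that the results are not blurred by floating-point rounding.

```python
from fractions import Fraction

def matvec(M, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def matmul(M, N):
    """Multiply two matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def inverse_2x2(M):
    """Invert a 2x2 matrix exactly; raise if it is singular."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of B are the new basis vectors in standard coordinates (assumed).
B = [[2, -1],
     [1,  1]]
B_inv = inverse_2x2(B)

# Hypothetical transformation in standard coordinates: 90-degree rotation.
A = [[0, -1],
     [1,  0]]

x_new = [-1, 2]  # coordinates in the new system (assumed)

# Operation 1: new coordinates -> standard coordinates.
x_std = matvec(B, x_new)

# Operation 2: standard coordinates -> new coordinates.
x_back = matvec(B_inv, x_std)

# Operation 3: the same transformation, expressed in the new system.
A_new = matmul(B_inv, matmul(A, B))
y_new = matvec(A_new, x_new)

# Cross-check: convert, transform in standard coordinates, convert back.
y_check = matvec(B_inv, matvec(A, x_std))

print(x_std)             # [-4, 1]
print(x_back == x_new)   # True
print(y_new == y_check)  # True
```

The last comparison confirms that applying $B^{-1} A B$ to the new coordinates agrees with converting to standard coordinates, transforming there, and converting back.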