Linear Algebra for Machine Learning – Part 2


This is part 2 of the linear algebra series. The goal of every article in the series is to make linear algebra more understandable from both geometric and applied perspectives.

Linear algebra is a branch of mathematics that is widely used throughout science and engineering. A good understanding of linear algebra is essential for working with many machine learning algorithms, especially deep learning. Linear algebra is a very broad subject; covering all of it would take a full semester course, and many of its topics are not especially relevant to machine learning. So I will omit the topics that are not essential for understanding machine learning algorithms. If you are a beginner or have no prior experience with linear algebra, follow this series of articles in order.

Note: If you haven’t read part 1, click here.

The goal of this article is to help you understand some of the most useful linear transformations in machine learning. A linear transformation can be thought of as a function that, instead of taking a single value, takes a vector or matrix and produces a transformed output vector. These transforms are called linear because they scale, rotate, and shear space without bending it.


After completing this article, you will know:

  • What the transpose, addition, and dot product operations are.
  • What the geometric outcomes of these operations are.

Let’s begin,

Transpose of a Matrix or Vector

Transpose is one of the most basic yet fundamental operations we apply to matrices, often without realizing what it does to a matrix geometrically. Even most courses on linear algebra fail to cover this topic from first principles. Taking the transpose of a matrix means taking its mirror image across the main diagonal, called the principal diagonal in the case of a square matrix (fig 1.1). Geometrically, transposition does many things to a matrix. Consider the example of a matrix A of size 3×2 given below,

Before the transpose, the matrix consists of data points in 2-dimensional space, but after the transpose the resulting matrix consists of vectors in 3-dimensional space. In the case of a square matrix, however, the dimension does not change.

There are some interesting properties of transpose operation,

  • (A^{T})^{T} = A
  • (kA)^{T} = kA^{T}
  • (AB)^{T} = B^{T}A^{T}

here T denotes the transpose of a matrix.
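The shape change and the properties above can be checked in a few lines of NumPy. This is a sketch added for illustration; the particular matrices A and B are made up for the example.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])          # 3x2: three data points in 2-D space
print(A.shape, A.T.shape)       # (3, 2) (2, 3): now two vectors in 3-D space

B = np.array([[1, 0],
              [2, 1]])          # 2x2, so the product A @ B is defined

assert np.array_equal(A.T.T, A)              # (A^T)^T = A
assert np.array_equal((3 * A).T, 3 * A.T)    # (kA)^T = k A^T
assert np.array_equal((A @ B).T, B.T @ A.T)  # (AB)^T = B^T A^T
```

Note that `.T` does not copy the data; NumPy returns a view with the axes swapped.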

Addition of Matrices and Vectors

Addition of vectors is of great importance, yet the geometric interpretation of vector addition is missing from most linear algebra courses. Consider two vectors v and w with coordinates v = [3, 2] and w = [1, 2], and let z be the vector obtained by adding v and w, which is z = [4, 4]. Before going any further, let's understand what just happened.

Addition of vectors with tail at origin. fig 1.2

This is one type of vector addition, with both vectors' tails at the origin. But what if a vector's tail is not at the origin and instead starts at the tip of another vector? In that situation the interpretation of addition changes (fig 1.3).

Addition of vectors with tail not at origin. fig 1.3

The above type of vector addition is not commonly used in the context of machine learning, but it is worth knowing for the sake of understanding.
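The tail-at-origin addition above is just component-wise addition, which NumPy performs directly. A minimal sketch using the vectors from the example:

```python
import numpy as np

v = np.array([3, 2])
w = np.array([1, 2])

# Component-wise addition: geometrically, place w's tail at v's tip
# (or vice versa); z is the diagonal of the resulting parallelogram.
z = v + w
print(z)  # [4 4]
```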

Multiplication of Matrices and Vectors

Matrix multiplication is one of the most popular linear transformations. The idea behind it is simple: change the basis vectors of the current space into some other form. Let's understand this with an example. Consider a basis matrix b = [[1, 0], [1, 1]]; the 2-D plane corresponding to these basis vectors is shown in fig 1.4.

fig 1.4

Now apply a linear transformation, i.e. multiply this basis matrix by the column vector [3, 2].

dot product fig 1.5

This specific matrix multiplication is called the dot product. There is another type of vector multiplication called the cross product, but we will not cover it, as it is rarely used in machine learning; frankly, I have never read a research paper that uses the cross product in its approach. But the question remains: what does this multiplication do to the basis matrix? To understand, take a look at the figure below (fig 1.6).

dot product result . fig 1.6

In the image above you can clearly observe how the plane has changed under the linear transformation. To locate any vector that was present in the old plane, you have to apply the same linear transformation to that vector as well. The operation is linear, i.e. it only scales, stretches, and rotates; it never bends a straight line into a curve or a sphere. The mathematical formula for the dot product is given below.
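The basis-change step can be sketched in NumPy. The article's bracket notation leaves the layout ambiguous, so I assume here that the columns of b are the basis vectors; under that assumption, multiplying b by [3, 2] means "take 3 of the first basis vector plus 2 of the second."

```python
import numpy as np

b = np.array([[1, 0],
              [1, 1]])      # columns [1,1] and [0,1] taken as basis vectors
v = np.array([3, 2])        # coordinates expressed in the b basis

# b @ v forms the linear combination 3*[1,1] + 2*[0,1]
print(b @ v)  # [3 5]
```

Applying the same `b @ ...` to any other vector locates it in the transformed plane, which is exactly the "apply the same linear transformation" rule stated above.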

dot product = ||w|| · ||v|| · cos θ, where w and v are vectors and θ is the angle between them. In a general sense, the dot product tells us how aligned two vectors are with respect to each other: it is large when they point in similar directions and zero when they are perpendicular. ||w|| and ||v|| are the magnitudes of the vectors and can be calculated using the Pythagorean theorem. If this sounds unfamiliar, stick with the row-times-column, element-wise multiplication view.
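Both views of the dot product give the same number, which we can verify numerically. In this sketch the angle θ is measured directly from each vector's direction (via `atan2`), so the geometric formula is computed independently of the algebraic one; the vectors are the v and w used earlier in the article.

```python
import math
import numpy as np

w = np.array([1, 2])
v = np.array([3, 2])

# Algebraic form: sum of element-wise products (row times column)
algebraic = np.dot(w, v)                      # 1*3 + 2*2 = 7

# Geometric form: ||w|| * ||v|| * cos(theta), with theta measured
# as the difference between each vector's angle from the x-axis
theta = math.atan2(2, 1) - math.atan2(2, 3)
geometric = np.linalg.norm(w) * np.linalg.norm(v) * math.cos(theta)

assert math.isclose(algebraic, geometric)
print(algebraic)  # 7
```

Here `np.linalg.norm` is the Pythagorean magnitude mentioned above: for w = [1, 2] it returns √(1² + 2²) = √5.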


We have now covered some of the most fundamental operations in linear algebra, which we will use when deriving some of the most popular machine learning algorithms. I hope you enjoyed reading.

To read part 3, click here.
