Linear Algebra for Machine Learning – part 1


In this and the upcoming series of articles I will be covering linear algebra for applied machine learning. The goal of every article is to make linear algebra more understandable in its geometric and applied aspects.

Linear algebra is a branch of mathematics that is widely used throughout science and engineering. A good understanding of linear algebra is essential for understanding and working with many machine learning algorithms, especially deep learning. Linear algebra is a very wide field: covering all of it is a semester course in itself, and many topics are not that relevant to machine learning. So I will omit some topics that are important in linear algebra but not essential for understanding machine learning algorithms. If you are a beginner or have no prior experience with linear algebra, follow this series of articles in the given order.

Note : This series uses standard mathematical notation.

After completing this article you will know:

  • What scalars, vectors, matrices and tensors are
  • Their geometric interpretation
  • Their role in machine learning algorithms

Let’s begin,

Scalars :

A scalar is just a single number, in contrast to most other objects studied in linear algebra, which are usually arrays of multiple numbers. We write scalars in lower-case italics. A scalar represents only a magnitude and not a direction: it can be understood as a value on the number line, with no coordinates that would locate it in an n-dimensional space. When introducing a scalar we also state what kind of number it is, for example “Let n \in \mathbb{N} be the number of units”, which defines a natural-number scalar. If you have ever looked at a machine learning algorithm, chances are you have seen this expression:

W^{T}X + b

What this expression represents is beyond the scope of this first article; we will cover it in upcoming articles. The important point here is b, a scalar known as the bias (or intercept). It shifts the decision boundary away from the origin without changing its slope, which is determined by W. That single number is pretty powerful.
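To make the role of b concrete, here is a minimal Python sketch. The weights, inputs and bias are made-up values for illustration only, and `linear_score` is a hypothetical helper name, not something from a library. It computes the dot product of W and X and then adds the scalar b.

```python
# Hypothetical values chosen only for illustration.
W = [2.0, -1.0]   # weight vector
X = [3.0, 4.0]    # input vector
b = 0.5           # bias: a single scalar


def linear_score(weights, inputs, bias):
    """Return the dot product of weights and inputs, plus the scalar bias."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


score = linear_score(W, X, b)
print(score)  # 2*3 + (-1)*4 + 0.5 = 2.5
```

Changing b adds the same constant to every score, whatever the input is, while W and X stay untouched.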


Vectors :

In textbooks you may have found two types of vectors: a) vectors that sit anywhere in space, whose tails may or may not lie at the origin, and b) vectors whose tails lie at the origin and whose heads are in space. In the context of machine learning we are only talking about type ‘b’ vectors; type ‘a’ vectors are not important to us, although they are very relevant in physics. So before defining a vector in the context of linear algebra, let’s first understand it as a point in 2-d space.

Consider a point, say x = (x_{1}, x_{2}). Here x_{1} defines its position on the x-axis and x_{2} defines its position on the y-axis; together they give its position in 2-dimensional space.

We can represent the point x in vector form as x = [x_{1},x_{2}]. You may be wondering why this matters. We humans can only visualize up to 3 dimensions, but the vector representation lets us describe a point in more than 3 dimensions even though we cannot visualize it, e.g.

  • A point in 2-d space x = [x_{1},x_{2}]
  • A point in 3-d space x = [x_{1},x_{2},x_{3}]
  • A point in 4-d space x = [x_{1},x_{2},x_{3},x_{4}]
  • A point in n-d space x = [x_{1},x_{2},x_{3},x_{4},\dots,x_{n}], beyond the scope of human visualization.

In simple words, we can think of vectors as identifying points in space, with each element giving the coordinate along a different axis. A vector can be written in two ways:

a) row vector : a vector written as a single row of numbers.

b) column vector : a vector written as a single column of numbers.

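A vector can be written as, for example, s \in \mathbb{N}^{d}, where d is the dimension of the vector and \mathbb{N} indicates that all the numbers in the vector are natural numbers. In machine learning the entries are usually real numbers, in which case we write s \in \mathbb{R}^{d}.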
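As a rough sketch of these ideas, a vector can be stored as a plain Python list of coordinates. The helper names `dimension`, `as_row` and `as_column` are hypothetical, introduced only for this illustration; in practice a numerical library would normally handle this.

```python
# A point in 2-d, 3-d and 5-d space: only the number of coordinates changes.
point_2d = [1.0, 2.0]
point_3d = [1.0, 2.0, 3.0]
point_5d = [1.0, 2.0, 3.0, 4.0, 5.0]


def dimension(vector):
    """The dimension d of a vector is its number of coordinates."""
    return len(vector)


def as_row(vector):
    """Row vector: one row containing d numbers (shape 1 x d)."""
    return [list(vector)]


def as_column(vector):
    """Column vector: d rows containing one number each (shape d x 1)."""
    return [[value] for value in vector]


print(dimension(point_5d))   # 5
print(as_row(point_3d))      # [[1.0, 2.0, 3.0]]
print(as_column(point_3d))   # [[1.0], [2.0], [3.0]]
```

The same code works for any number of dimensions, which is exactly what makes the vector form useful beyond what we can draw.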


Matrices :

A matrix can be thought of as a collection of vectors, or a vector of vectors. It is a 2-d array of numbers, so each element is identified by two indices instead of just one. A matrix can also be thought of as a collection of points in an n-d space: each row is one point, and each column holds the coordinate along a particular axis. In the context of machine learning, a dataset can be thought of as a matrix in which the i^{th} row is a sample, i.e. a vector in n-dimensional space, and each column represents the coordinate along a particular axis.
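The dataset-as-matrix view can be sketched in a few lines of Python. The numbers below are invented for illustration and do not come from any real dataset; the matrix is stored as a list of rows.

```python
# Hypothetical dataset: 3 samples (rows), each with 2 coordinates (columns).
dataset = [
    [5.1, 3.5],
    [4.9, 3.0],
    [6.2, 2.9],
]

num_rows = len(dataset)      # m: number of samples
num_cols = len(dataset[0])   # n: number of axes per sample

# One row is one sample, i.e. one point in 2-d space.
second_sample = dataset[1]

# One column collects the coordinate on a single axis across all samples.
first_column = [row[0] for row in dataset]

# A single element needs two indices: row first, then column.
element = dataset[2][1]

print(num_rows, num_cols)   # 3 2
print(second_sample)        # [4.9, 3.0]
print(first_column)         # [5.1, 4.9, 6.2]
print(element)              # 2.9
```

Note that Python counts from 0, so `dataset[1]` is the second sample.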

A diagrammatic view of a matrix makes this easy to see (for illustration see fig.).

Matrices are also widely used to represent systems of linear equations in an n-d space. This is very useful in machine learning, for example when representing derivatives in backpropagation. A matrix is usually written as s \in \mathbb{N}^{m \times n}, where m is the number of rows and n is the number of columns; the rest of the notation is the same as for a vector.
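As a small illustration of matrices representing linear equations, the sketch below encodes the made-up system 2x + y = 5 and x - y = 1 as a coefficient matrix and a right-hand-side vector. The function `mat_vec` is a hypothetical helper written for this example. It only checks a candidate solution; it does not solve the system.

```python
# Coefficients of the system:
#   2x + 1y = 5
#   1x - 1y = 1
A = [
    [2.0, 1.0],
    [1.0, -1.0],
]
rhs = [5.0, 1.0]


def mat_vec(matrix, vector):
    """Multiply an m x n matrix by a length-n vector, returning a length-m vector."""
    for row in matrix:
        if len(row) != len(vector):
            raise ValueError("each row must have as many entries as the vector")
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


candidate = [2.0, 1.0]            # x = 2, y = 1
result = mat_vec(A, candidate)

print(result)          # [5.0, 1.0]
print(result == rhs)   # True: the candidate satisfies both equations
```

Each row of the matrix holds the coefficients of one equation, so the whole system becomes a single matrix-vector product.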


Tensors :

In some cases we will need an array with more than two axes. In the general case, an array of numbers arranged on a regular grid with a variable number of axes is known as a tensor. We denote a tensor named “A” with this typeface: A. We identify the element of A at coordinates (i, j, k) by writing A_{i,j,k}.
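A tensor with three axes can be sketched in plain Python as a list of matrices. The helpers `make_tensor` and `shape` are hypothetical names for this illustration, and the entries are chosen so that each value spells out its own indices.

```python
def make_tensor(depth, rows, cols):
    """Build a depth x rows x cols nested list whose entry at (i, j, k) is 100*i + 10*j + k."""
    return [
        [[100 * i + 10 * j + k for k in range(cols)] for j in range(rows)]
        for i in range(depth)
    ]


def shape(tensor):
    """Return the length of each axis, assuming a regular (non-ragged) nested list."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)


A = make_tensor(2, 3, 4)

print(shape(A))      # (2, 3, 4)
print(A[1][2][3])    # 123: the element at i=1, j=2, k=3
```

Python indices start at 0, whereas mathematical notation usually starts at 1, so `A[1][2][3]` here is the last element along every axis.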

Linear algebra lets us apply mathematical operations to representations that we ourselves cannot even visualize, but machines can still work with them, which gives machines a huge advantage over humans. In the next article we will cover operations on linear algebra objects. If you have any doubts, feel free to leave a comment.

To read part 2, click here.
