Thursday, 16 November 2017

Time vs Memory trade-off in Software

In software, there is almost always a trade-off between memory and time. You can improve the run time of your software at the expense of some extra RAM, and similarly, to reduce the memory footprint of your software, you may have to compromise on speed. Let us try to understand this with a very simple example:

Let us say you are implementing a singly linked list. A simple implementation would look like this:

class Node {
public:
   int data;
   Node *next;
};

class List {
  Node *start;
};
If you want to count the number of nodes in the linked list, how would you do it? Very simple: iterate over all the nodes:


class Node {
public:
   int data;
   Node *next;
};

class List {
  Node *start;
public:
   int size() {
      int count = 0;
      Node *temp = start;
      while (temp) {
         count++;
         temp = temp->next;
      }
      return count;
   }
};

What is the time complexity? It is O(n), i.e. linear. But imagine the linked list has millions of nodes. Can we reduce the run time of this algorithm, or even do it in constant time?

Yes, we can, at the expense of a very small amount of memory: just 4-8 bytes. How?

Instead of counting the nodes every time, keep the count in the class itself. When you add a new element to the list, increment this count, and similarly, when you delete an element from the list, decrement it.

class Node {
public:
   int data;
   Node *next;
   Node(int value) : data(value), next(nullptr) {}
};

class List {
  Node *start;
  int count; // number of nodes, initialized to 0
public:
   void addNode(int value) {
      Node *node = new Node(value);
      // add at the front or the end
      count++;
   }
   // similarly, decrement count on deletion

   int size() { return count; } // now O(1)
};

With just one extra integer (typically 4 bytes), you have reduced the run time of the algorithm from linear to constant.
This is a very simple and small example. In large software, such simple optimizations can improve run times from days to minutes or seconds; at the same time, the extra memory required may sometimes be too large to justify the improvement. So one has to make such decisions wisely.


Sunday, 12 November 2017

Machine learning in layman language

Machine learning is a buzzword these days. Let us try to understand it from a beginner's point of view, in layman's language.

Machine learning, as its name suggests, is a method of teaching machines to learn from given and future data. You give your algorithm some initial data and ask it to make predictions about the future. It is completely inspired by human beings: just as humans become proficient at making decisions as they gain experience, machine learning algorithms become better at decision making over time. If you have noticed on Facebook, it gives you suggestions like "People you may know" or "Videos you may like". It never says "people you should know" or "videos you will like", because machine learning algorithms work on a probabilistic model. And the longer you use Facebook, the better and more refined its suggestions become.

At first thought, devising machine learning algorithms seems a very challenging and fascinating task, but the basic idea behind them is not as complicated as it seems.

Let us say you want to devise a machine learning algorithm to predict the price of a house. You have been given data for 10 houses: the price and all the details, like area, number of rooms, locality, and whether it is furnished or unfurnished. Now I ask you to predict the prices of another 2 houses based on this data.

What would you do? You would try to find trends in price with respect to the various parameters, e.g. price per square meter, price versus number of rooms, etc.

Say two houses have all the same parameters except locality, and they have the same price. It may be tempting to conclude that location does not affect the price of a house, but it is also possible that the data is simply not sufficient. Still, we can make at least some prediction based on this observation.

How would a machine predict? In the simplest model of machine learning, it will assume a linear model.

Price of house, P = A0 (basic property rate) + A × number of rooms + B × area of house + C × floor on which it is located + D × locality

A, B, C, and D are the weights of the various parameters. If some parameter does not contribute to the price of a house, its corresponding weight will be zero. A0 is the bias; in simple words, you can say this is the base price, independent of any parameter.

The job of the algorithm is to find optimal values of A0, A, B, C, and D. You have to find these parameters such that the equation fits the given data and gives good predictions for future data.

In the simplest approach, we can minimize the error:


Error in prediction, E = (real price of house - predicted price of house)^2

We have used the squared error function here.

Once the weights are calculated, your model is ready for predictions: just plug different input data into the equation, and you will get an approximate price for the house.

Isn't it really cool and easy? :) I never knew in my 11th and 12th standard that what I was studying would one day power such great technology.

What is an array

An array is one of the simplest and most sophisticated data structures. In an array, data is stored at contiguous memory locations. E.g. say you h...