Thursday, 10 September 2015

Quick Introduction to Machine Learning

Today, let us talk about machine learning, one of the most popular terms we hear these days. Though the term might be new, it is self-explanatory.

Machine learning means making the machine learn. Learn what? The data. Why? To give users a better experience by producing the best results. In machine learning, you feed the machine the data you have and make it learn from that data, so that the output it gives to users is based on what it has learned.

Consider stock markets. You might have come across apps that predict stock prices. But how do they predict? Simply by learning from previous stock data. The machine is fed this data and analyzes it to generate predictions.

There are two main types of machine learning: supervised learning and unsupervised learning.

Supervised Learning


Here the data given is clear: every sample comes with the value we want to predict. For example, suppose we want to predict the rent of a house in a particular locality. We are given sample data consisting of two parameters: the rent and the floor area (in sq. feet).

Now we can plot the given data as points on a graph, taking area on the x-axis and rent on the y-axis, since rent is the value we want to predict.
The points will trace out some shape, which may be a straight line or some other curve.

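We can then look for an equation whose curve passes close to these points. If you are familiar with graphs, you might have plotted polynomials, and polynomials of different degrees give differently shaped curves. Once we have an equation that fits the sample data, we can use it to predict the rent for an area that is not in the sample.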
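To make the idea of finding an equation for the plotted points concrete, here is a minimal sketch that fits a straight line (a degree-1 polynomial) by ordinary least squares. The sample values and the names fit_line and predict_rent are made up for illustration, and a straight line is only an assumption about the shape of the data.

```python
# Hypothetical sample data (values are made up): floor area in sq. feet
# and the rent observed for a house of that area.
areas = [500.0, 750.0, 1000.0, 1250.0, 1500.0]
rents = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]


def fit_line(xs, ys):
    """Least-squares fit of the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_rent(area, slope, intercept):
    """Use the fitted equation to estimate rent for an unseen area."""
    return slope * area + intercept


slope, intercept = fit_line(areas, rents)
print(predict_rent(1100.0, slope, intercept))
```

Because the made-up data lies exactly on a line, the fitted equation is rent = 2 * area, and the prediction for 1100 sq. feet is 2200. Real rent data will be noisier, and a higher-degree polynomial may fit it better.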

Unsupervised Learning


In this type of learning, we are not clearly given the parameters. Instead, the data is mostly unorganized and we need to classify it ourselves. For example, consider Google News. Google collects a lot of articles written on the web. When you search for something on Google, you might have come across search results categorized under News. So, how does it detect them?
It has a lot of articles. Google knows when each article was published (the date) and it also knows what type it belongs to, i.e. what the article is talking about. It needs to detect whether a given article is a general blog post or news. For example, consider this article: it cannot be news, because it is just a tutorial that doesn't discuss anyone and is not tied to any particular time. An article named 'Obama's visit to India', on the other hand, can be news because it is about an event that took place at a particular time.
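As a rough illustration of grouping data without labels, here is a minimal sketch of k-means clustering on plain numbers. This is not how Google News works; k-means is swapped in as one simple technique, and the function name k_means_1d and the sample values are hypothetical.

```python
def k_means_1d(values, k=2, iterations=10):
    """Group numbers into k clusters (k >= 2) with a basic k-means loop."""
    # Deterministic starting centers: k values spread evenly across the
    # sorted data (for k=2 these are the smallest and largest values).
    ordered = sorted(values)
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster,
        # keeping the old center if the cluster ended up empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled measurements: nothing says which group each belongs to.
data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = k_means_1d(data, k=2)
print(clusters)
```

No labels are supplied anywhere. The algorithm discovers on its own that the small values and the large values form two separate groups.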

Simply put,

Supervised learning: The data is labeled, i.e. for a given input value there is a known output value. You need not worry about what the data means, because it is clearly given.

Unsupervised learning: The data is not labeled, i.e. you need to find the hidden structure in the data yourself.