Best Types of learning in Machine Learning: Supervised and Unsupervised

Machine learning consists, in essence, of using algorithms to automate the identification of patterns or trends that are “hidden” in data. It is therefore important not only to choose the most appropriate algorithm (and to tune its parameters for each specific problem), but also to have a large volume of data of sufficient quality.
In recent years, machine learning has gained great importance in the business world, since the intelligent use of data analytics is key to business success.
In this post we are going to explain what machine learning consists of, what types of learning there are, how they work and what they are used for.

Really, what is Machine learning?

It is a branch of artificial intelligence that began to gain importance in the 1980s. It is a type of AI in which the computer no longer depends on rules written by a programmer; instead, it can establish its own rules and learn on its own.
Machine learning occurs through algorithms. An algorithm is nothing more than a series of ordered steps taken to perform a task.
The objective of machine learning is to create a model that allows us to solve a given task. The model is then trained using large amounts of data. The model learns from this data and is able to make predictions. Depending on the task you want to perform, it will be more appropriate to work with one algorithm or another.
Choosing the algorithm is not easy. If we search for information on the Internet, we can find a veritable avalanche of very detailed articles, which sometimes, rather than helping us, confuse us. Therefore, we are going to try to give some basic guidelines to start working.
There are two fundamental questions that we must ask ourselves. The first is:

What do we want to do?

The crux of the matter is to define the objective clearly. To solve our problem, we first consider what type of task we have to undertake. This could be, for example:
  • Classification problems, such as spam detection.
  • Clustering problems, such as recommending a book to a user based on their previous purchases (recommendation system).
  • Regression problems, such as finding out how much a customer will use a certain service (determining a value).
If we consider the classic customer retention problem, we see that we can approach it in different ways. We want to segment customers, yes, but what strategy is the most appropriate? Is it better to treat it as a classification, clustering or even regression problem? The key clue comes from asking the second question.

What information do I have to achieve my goal?

If I ask myself, “Do my customers group together in some way, naturally?”, I have not defined any objective (target) for the grouping.
However, if I ask the question another way, “Can we identify groups of customers with a high probability of requesting cancellation of the service as soon as their contract ends?”, we have a perfectly defined objective (will the customer cancel?) and we want to take action based on the answer we get.
In the first case, we are faced with an example of unsupervised learning, while the second is an example of supervised learning.
In the initial phases of the Data Science process, it is very important to decide whether the “attack strategy” will be supervised or unsupervised and, if it is supervised, to define precisely what the target variable will be. Depending on what we decide, we will work with one family of algorithms or another.
Once the above is settled, published selection guides can help you choose which algorithm to work with. Among the best known are the scikit-learn algorithm cheat sheet and the Machine Learning Algorithm Cheat Sheet, among others.

Types of learning in Machine Learning

Machine learning implementations can be classified into three categories, based on the nature of the data they receive:
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning

Supervised Learning

In supervised learning, algorithms work with “labeled” data, trying to find a function that, given the input variables, assigns them the appropriate output label. The algorithm is trained on “historical” data and thus “learns” to assign an output label to a new value; that is, it predicts the output value (Simeone, 2018).
For example, a spam detector analyzes the message history to find a function that maps the defined input parameters (the sender, whether the recipient is an individual or part of a list, whether the subject contains certain terms, etc.) to the label “spam” or “not spam”. Once this function is defined, the algorithm can assign a predicted label to a new, unlabeled message.
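The spam example can be sketched in a few lines of plain Python. This is a minimal illustration using a naive Bayes classifier over subject-line words, not a production filter; the `HISTORY` messages and the `train` and `predict` helper names are invented for this sketch.

```python
import math
from collections import Counter

# Hypothetical labeled history of (subject line, label) pairs, invented for illustration
HISTORY = [
    ("win a free prize now", "spam"),
    ("free money claim now", "spam"),
    ("limited offer win cash", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("project status report", "not spam"),
    ("lunch on monday", "not spam"),
]

def train(history):
    """Count words per label and messages per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in history:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the prior: the share of training messages with this label
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(counts.values())
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing the probability
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(HISTORY)
print(predict("claim your free prize", word_counts, label_counts))
print(predict("status report for monday", word_counts, label_counts))
```

With this toy history, the first message is labeled “spam” and the second “not spam”. A real detector would need far more data and richer input parameters than subject words alone.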

Supervised learning is typically used in:

  • Classification problems (digit identification, diagnostics, or identity fraud detection).
  • Regression problems (weather predictions, life expectancy, growth, etc.).
These two main types of supervised learning, classification and regression, are distinguished by the type of target variable: categorical in classification, numerical in regression.
The most common algorithms used in supervised learning are:
  1. Decision trees.
  2. Naïve Bayes classification.
  3. Least squares regression.
  4. Logistic Regression.
  5. Support Vector Machines (SVM).
  6. “Ensemble” methods (Sets of classifiers).
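To make the regression case concrete, here is a minimal sketch of least squares regression with a single input variable, written in plain Python. The data (months as a customer versus hours of service used) and the `fit_line` helper are hypothetical, chosen only to show a numerical target being predicted.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data, invented for illustration
months = [1, 2, 3, 4, 5]
hours = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(months, hours)
predicted = slope * 6 + intercept  # predicted usage for month 6
print(slope, intercept, predicted)
```

The toy data lie exactly on a line, so the fit recovers it; real usage data would be noisy and the line would only approximate the trend.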

Unsupervised Learning

Unsupervised learning occurs when “labeled” data is not available for training. We know only the input data; there is no output data corresponding to each input. We can therefore only describe the structure of the data and try to find some type of organization that simplifies the analysis. These methods are, accordingly, exploratory in nature.
For example, clustering tasks look for groupings based on similarities, but there is no guarantee that these have any meaning or usefulness. Sometimes, when exploring data without a defined objective, you can find curious, but impractical, spurious correlations.
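As an illustration of clustering, the following is a minimal sketch of k-means, one common clustering algorithm, in plain Python. The customer data (monthly visits, average spend) are invented, and starting from two hand-picked centroids is a simplification; practical implementations usually choose the starting points randomly.

```python
def k_means(points, centroids, iterations=10):
    """Basic k-means on 2D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled customers: (monthly visits, average spend)
customers = [(1, 10), (2, 12), (1, 11), (9, 50), (10, 52), (8, 49)]

# Simplifying assumption: use two of the data points as starting centroids
centroids, clusters = k_means(customers, [customers[0], customers[3]])
print(clusters)
```

Note that no labels are passed in. The algorithm returns two groups, but deciding whether those groups mean anything useful is left to the analyst.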

Unsupervised learning is often used in:

  • Clustering problems
  • Groupings of co-occurrences
  • Profiling.
However, problems involving similarity finding, link prediction, or data reduction tasks may or may not be supervised.
The most common types of algorithms in unsupervised learning are:
  1. Clustering algorithms
  2. Principal component analysis
  3. Singular value decomposition
  4. Independent component analysis

So what is reinforcement learning?

Not all ML algorithms can be classified as supervised or unsupervised learning algorithms. There is a “no man’s land” which is where reinforcement learning techniques fit.
This type of learning is based on improving the model’s response through a feedback process. The algorithm learns by observing the world around it: its input is the feedback it obtains from the outside world in response to its actions. The system therefore learns through trial and error.
It is not a type of supervised learning, because it is not strictly based on a set of labeled data, but on monitoring the response to the actions taken. It is also not unsupervised learning, since when we model our “learner” we know in advance what the expected reward is.
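The trial-and-error loop can be sketched with an epsilon-greedy strategy on a simple “multi-armed bandit” problem, one of the most basic reinforcement learning settings. This is a minimal illustration in plain Python; the reward probabilities, the `run_bandit` name and the parameter values are assumptions made for this sketch.

```python
import random

def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner: estimate the reward of each action from feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)  # learned value of each action
    pulls = [0] * len(reward_probs)        # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action that currently looks best
            action = estimates.index(max(estimates))
        # Feedback from the environment: reward 1 with the action's hidden probability
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        pulls[action] += 1
        # Incremental average of the rewards observed for this action
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls

# Hypothetical environment: three actions with hidden success probabilities
estimates, pulls = run_bandit([0.2, 0.5, 0.8])
print(estimates, pulls)
```

The learner never sees the hidden probabilities or any labeled examples. It only observes the reward that follows each action, and over time it concentrates on the action that pays off most often.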
If you want to know more about types of learning, don’t miss this other post, where we explain what transfer learning is.

Practical uses of Machine learning

Finally, let’s look at some of the most common practical uses of machine learning.
  • Computer security, attack diagnosis, online fraud prevention, anomaly detection, etc.
  • Recognition of images or patterns (facial, fingerprint, objects, voice, etc.)
  • Autonomous driving, using deep learning algorithms: real-time image identification, detection of obstacles and traffic signs, accident prevention…
  • Health: automatic evaluation of diagnostic tests, medical robotics, etc.
  • Stock market analysis (financial predictions, market evolution, etc.)
  • Recommendation engines
It is essential to be clear at all times about the objectives the company is pursuing with these techniques, so that the right questions can be asked of the data. And, of course, always work with quality data.

 
