Support Vector Machines (SVM)

Rahull Trehan
May 30, 2021

SVM is one of the most important algorithms to learn, but also one of the more difficult ones given all the jargon that comes with it. Through this article, I would like to discuss a few key concepts that build up this algorithm. I will not be discussing any mathematical equations here, since I am only trying to explain the algorithm from a conceptual standpoint.

a. The concept behind SVM — Support Vector Machine (SVM) as a concept is used in classification problems, where the algorithm tries to define a clear boundary between the two classes.

This boundary could be a straight line or a hyperplane in multiple dimensions, but the basic idea is to divide the two classes in such a way that the separation between them is maximum.

In fig. A, the two classes are separated by a straight line. In fig. B, the two classes are divided by a 2-dimensional plane in a 3-dimensional space; when that boundary is projected onto a 2-dimensional chart, it appears as a circle.
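To make the idea concrete, here is a minimal sketch using scikit-learn's SVC with a linear kernel; the toy data and parameter values are my own illustrative assumptions, not anything from the figures:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated blobs: a clear boundary exists between the classes.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=0.8, random_state=42)

clf = SVC(kernel="linear").fit(X, y)

# The fitted boundary is the hyperplane w.x + b = 0; the support vectors
# are the few boundary points that define the maximum-margin separator.
print("w:", clf.coef_[0], "b:", clf.intercept_[0])
print("support vectors per class:", clf.n_support_)
```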

b. Where is this algorithm used? This algorithm can be used for both classification and regression challenges, but it is mainly used in classification problems.
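For completeness, scikit-learn exposes a regression variant (SVR) alongside the classifier; this is just a quick illustrative sketch with made-up toy data:

```python
from sklearn.datasets import make_regression
from sklearn.svm import SVR

# SVR fits a tube around the data instead of separating two classes.
X, y = make_regression(n_samples=100, n_features=3, noise=5.0, random_state=0)

reg = SVR(kernel="linear").fit(X, y)
print("R^2 on the training data:", reg.score(X, y))
```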

c. The concept of Hard and Soft Margins — Before talking about Hard and Soft Margins, let us see what a Margin is:

A margin is a line that runs parallel to the line or hyperplane separating the two classes. Margins are built on both sides of this hyperplane and are equidistant from it.

The target of SVM is to maximize this margin. Practically speaking, in real-world scenarios we would rarely get easily separable data; the majority of the time the data is inseparable, and hence the concept of Hard and Soft Margins becomes important to understand.

A Hard Margin tries to maximize the margin without allowing any kind of error.

A Soft Margin also tries to maximize the margin; however, it allows some margin or classification errors to occur. The idea behind the soft margin follows the same logic as the bias-variance trade-off: accepting a few errors buys a simpler model that generalizes better.

The ultimate goal in SVM is to have a maximum margin with minimum errors.
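In scikit-learn, the hardness of the margin is controlled through the C parameter (more on C below); a very large C approximates a hard margin. Here is a small illustrative sketch under that assumption:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Overlapping classes: a true hard margin is impossible here,
# so the optimizer has to trade margin width against errors.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=1)

hard_ish = SVC(kernel="linear", C=1e6).fit(X, y)  # near-hard margin
soft = SVC(kernel="linear", C=1.0).fit(X, y)      # soft margin (default C)

print("near-hard margin training errors:", np.sum(hard_ish.predict(X) != y))
print("soft margin training errors:     ", np.sum(soft.predict(X) != y))
```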

d. Errors in SVM — There are two types of errors in SVM:

i. Classification Error — a misclassification that occurs because a point lies on the opposite side of the hyperplane dividing the classes is called a classification error.

ii. Margin Error — a misclassification that occurs because a point lies inside the soft margin is called a margin error. Even if such a point is on the correct side of the hyperplane, it is still tagged as misclassified, because the main target is to ensure that there are no points inside the margin.

The total error in SVM is calculated as the sum of the Classification Error and the Margin Error.
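If you want to see how those two counts could be separated in code, here is a minimal NumPy sketch; it assumes a linear decision function f(x) = w·x + b with labels in {-1, +1} and margin lines at f(x) = ±1, which are my assumptions rather than anything from the article:

```python
import numpy as np

def svm_errors(w, b, X, y):
    """Count the two SVM error types for a linear boundary.

    Assumes labels y are in {-1, +1}, the decision function is
    f(x) = w.x + b, and the margin lines sit at f(x) = +1 and -1.
    """
    scores = y * (X @ w + b)  # positive when a point is on its correct side

    classification_errors = np.sum(scores < 0)            # wrong side of hyperplane
    margin_errors = np.sum((scores >= 0) & (scores < 1))  # correct side, inside margin
    return classification_errors, margin_errors, classification_errors + margin_errors
```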

e. C Parameter — The C parameter is conceptually very similar to the lambda (λ) regularization parameter in regression (in fact it acts roughly as its inverse): we use C to decide how heavily errors are punished, and we tune it up or down as per the model requirement.

If we increase the C parameter (also called the penalty parameter), the margin narrows, which restricts points from entering it and ultimately results in a lower margin error. In the opposite scenario, if we decrease the value of C, the margin widens, more points are allowed inside it, and we get a more generalized model.
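One way to see this effect, sketched with illustrative values of C on overlapping toy data (scikit-learn's support vector count is a rough proxy for how many points sit on or inside the margin):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping blobs: the data is not perfectly separable, so C matters.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Smaller C -> wider margin -> more points on or inside it,
    # which shows up as a larger number of support vectors.
    print(f"C={C:>6}: total support vectors = {clf.n_support_.sum()}")
```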

f. Kernel Trick — As we have already discussed above, it is very rare that we get data that is easily separable; in the majority of cases we get inseparable data, where it is practically impossible to draw a linear hyperplane between the two classes.

The SVM algorithm has a technique called the Kernel Trick, which takes a low-dimensional input, transforms it into a higher-dimensional space, and then separates the previously inseparable data there.
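A classic way to see the trick in action is the concentric-circles toy dataset, where no straight line can work; this sketch, with illustrative parameters, compares a linear kernel against an RBF kernel:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no straight line can split these two classes in 2-D.
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)  # kernel trick: implicit higher-dimensional mapping

print("linear kernel accuracy:", linear.score(X, y))  # near chance level
print("RBF kernel accuracy:   ", rbf.score(X, y))     # near perfect
```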

I hope you all liked this article. Thank you for reading.
