Regression Techniques

Vishvanath Metkari
4 min read · Jan 9, 2021

Overview:

Learn about the different regression types in machine learning, including linear regression and logistic regression.

Each regression technique has its own regression equation and regression coefficients.

What is Regression Analysis?

Regression analysis is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables. This technique is used for forecasting, time-series modelling, and finding causal relationships between variables.

Why do we use Regression Analysis?

There are multiple benefits of using regression analysis. They are the following:

  • It indicates the significant relationships between the dependent variable and the independent variables.
  • It indicates the strength of the impact of multiple independent variables on a dependent variable.

Regression analysis also allows us to compare the effects of variables measured on different scales, such as the effect of price changes versus the number of promotional activities. These benefits help market researchers, data analysts, and data scientists evaluate and select the best set of variables for building predictive models.

1. Linear Regression:

  • Linear regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line).
  • In this technique the dependent variable is continuous, the independent variables can be continuous or discrete, and the nature of the regression line is linear.
  • It is represented by the equation: Y = a + b*X + e

where a is the intercept, b is the slope of the line, and e is the error term.

  • This equation can be used to predict the value of the target variable based on the given predictor variable(s) (see the sketch after this list).
  • Multiple linear regression has more than one independent variable, whereas simple linear regression has only one independent variable.
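As an illustration, here is a minimal sketch of simple linear regression on synthetic data. It assumes NumPy and scikit-learn are available; the data and library choice are not from the original article.

```python
# Minimal sketch (assumed setup): fit Y = a + b*X + e with scikit-learn.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))      # one predictor -> simple linear regression
e = rng.normal(0, 1, size=100)             # error term
y = 2.0 + 3.0 * X[:, 0] + e                # true intercept a = 2, slope b = 3

model = LinearRegression().fit(X, y)
print("estimated intercept a:", model.intercept_)
print("estimated slope b:", model.coef_[0])
print("prediction at X = 5:", model.predict([[5.0]])[0])
```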

2. Logistic Regression:

  • Logistic regression is used to find the probability of an event being a success or a failure.
  • We should use logistic regression when the dependent variable is binary (0/1, True/False, Yes/No) in nature. Here the value of Y ranges from 0 to 1, and the model can be represented by the following equations.

odds = p / (1 - p) = probability of event occurrence / probability of event not occurring

ln(odds) = ln(p / (1 - p))

logit(p) = ln(p / (1 - p)) = b0 + b1*x1 + b2*x2 + b3*x3 + … + bk*xk

  • Above, p is the probability of the presence of the characteristic of interest (see the sketch below).
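As an illustration, here is a minimal sketch of logistic regression on a synthetic binary outcome. It assumes NumPy and scikit-learn; the data and library choice are not from the original article.

```python
# Minimal sketch (assumed setup): logistic regression with a logit link.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                    # two predictors x1, x2
logit = 0.5 + 1.5 * X[:, 0] - 2.0 * X[:, 1]      # b0 + b1*x1 + b2*x2
p = 1 / (1 + np.exp(-logit))                     # invert the logit to get probabilities
y = rng.binomial(1, p)                           # binary (0/1) outcome

model = LogisticRegression().fit(X, y)
print("estimated b0:", model.intercept_[0])
print("estimated (b1, b2):", model.coef_[0])
print("P(y = 1) for the first sample:", model.predict_proba(X[:1])[0, 1])
```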

3. Polynomial Regression:

  • A regression equation is a polynomial regression equation if the power of the independent variable is greater than 1. The equation below represents a polynomial equation.

y = a + b*X²

  • In this regression the best-fit line is not a straight line. It is rather a curve that fits the data points (see the sketch below).
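As an illustration, here is a minimal sketch of polynomial regression built by expanding the predictor into polynomial features and fitting a linear model on them. It assumes NumPy and scikit-learn; the synthetic data is not from the article.

```python
# Minimal sketch (assumed setup): fit y = a + b*X^2 via polynomial features.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 1.0 + 0.5 * X[:, 0] ** 2 + rng.normal(0, 0.2, size=100)   # quadratic relationship plus noise

# degree=2 adds X and X^2 as features; the model is still linear in its coefficients
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print("prediction at X = 2:", model.predict([[2.0]])[0])
```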

4. Ridge Regression:

  • Ridge regression is a technique used when the data suffers from multicollinearity (the independent variables are highly correlated).
  • This method performs L2 regularization. When multicollinearity occurs, least-squares estimates are unbiased but their variances are large, so the predicted values can be far from the actual values.
  • The cost function for ridge regression:

Min(||Y - X·theta||² + λ||theta||²)

λ (lambda) is the penalty term. The λ given here is denoted by an alpha parameter in the ridge function, so by changing the value of alpha we control the penalty term. The higher the value of alpha, the bigger the penalty, and therefore the more the magnitude of the coefficients is reduced.

  • It shrinks the parameters; therefore, it is used to prevent multicollinearity.
  • It reduces model complexity by coefficient shrinkage (a minimal sketch follows below).
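As an illustration, here is a minimal sketch of ridge regression on two nearly identical (highly correlated) predictors. It assumes NumPy and scikit-learn, where the alpha parameter plays the role of λ above; the synthetic data is not from the article.

```python
# Minimal sketch (assumed setup): ridge (L2) regression under multicollinearity.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(0, 0.05, size=200)          # x2 nearly duplicates x1 -> multicollinearity
X = np.column_stack([x1, x2])
y = 3.0 * x1 + 1.0 * x2 + rng.normal(0, 0.5, size=200)

for alpha in (0.1, 1.0, 100.0):                  # larger alpha -> stronger penalty
    coef = Ridge(alpha=alpha).fit(X, y).coef_
    print(f"alpha={alpha}: coefficients {coef}") # coefficient magnitudes shrink as alpha grows
```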

5. Lasso Regression:

  • Similar to ridge regression, Lasso (Least Absolute Shrinkage and Selection Operator) also penalizes the absolute size of the regression coefficients.
  • In addition, it is capable of reducing the variability and improving the accuracy of linear regression models.
  • The LASSO method puts a constraint on the sum of the absolute values of the model parameters: the sum has to be less than a fixed value (an upper bound). To do so, the method applies a shrinking (regularization) process that penalizes the coefficients of the regression variables, shrinking some of them to zero.
  • During feature selection, the variables that still have a non-zero coefficient after the shrinking process are selected to be part of the model. The goal of this process is to minimize the prediction error. In practice the tuning parameter λ, which controls the strength of the penalty, is of great importance: when λ is sufficiently large, coefficients are forced to be exactly zero, and in this way dimensionality can be reduced.
  • The larger the parameter λ, the more coefficients are shrunk to zero. On the other hand, if λ = 0 we have an OLS (Ordinary Least Squares) regression.
  • Lasso regression shrinks coefficients to exactly zero, which certainly helps in feature selection.
  • Lasso is a regularization method and uses L1 regularization.
  • If a group of predictors is highly correlated, lasso picks only one of them and shrinks the others to zero (a minimal sketch follows below).
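As an illustration, here is a minimal sketch of lasso regression on synthetic data with several irrelevant predictors. It assumes NumPy and scikit-learn, where alpha again plays the role of λ; the data is not from the article.

```python
# Minimal sketch (assumed setup): lasso (L1) regression driving some coefficients to zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                    # five predictors, only the first two matter
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, size=200)

model = Lasso(alpha=0.5).fit(X, y)               # alpha controls the strength of the penalty
print("coefficients:", model.coef_)              # irrelevant predictors are typically exactly 0.0
```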
