Logistic Regression: From Linear Relationships to Binary Classifications (with Full Explanation)

SharkYun
4 min read · Aug 27, 2023


  • Intro
  • Quick Recap of Linear Regression
  • What is Logistic Regression
  • Why Logistic Regression is powerful
  • How to interpret Logistic Regression
  • Conclusion

Intro

Before delving into Logistic Regression, let's examine the formulas of Linear Regression and Logistic Regression side by side:

  • Linear Regression: y = β0 + β1x1 + … + βnxn + ε
  • Logistic Regression: P(y=1|x) = 1 / (1 + e^(-(β0 + β1x1 + … + βnxn)))

Comparing the two, we can observe that Logistic Regression takes the linear combination from Linear Regression and feeds it into the sigmoid function. Before we get into Logistic Regression, let's get a quick recap of what Linear Regression is:

Quick recap — Linear Regression

Linear regression is used for predicting a continuous numeric output based on input features. It assumes a linear relationship between the input features and the output. The model equation in linear regression is of the form:

y = β0 + β1x1 + β2x2 + … + βnxn + ε

where:

  • y is the continuous output.
  • x1, x2, …, xn are the input features.
  • β0, β1, …, βn are the coefficients to be estimated.
  • ε represents the error term.
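
To make the recap concrete, here is a minimal sketch of fitting such a model with scikit-learn; the toy features and target values below are made up purely for illustration and are not from any real dataset.

# A minimal sketch of fitting a linear regression with scikit-learn.
# The toy features and targets below are illustrative, not real data.
import numpy as np
from sklearn.linear_model import LinearRegression

# Two input features (x1, x2) and a continuous target y.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([3.1, 3.9, 7.2, 7.8, 10.1])

model = LinearRegression()
model.fit(X, y)

print("beta_0 (intercept):", model.intercept_)
print("beta_1, beta_2 (coefficients):", model.coef_)
print("prediction for x = [6.0, 6.0]:", model.predict([[6.0, 6.0]]))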

I will cover Linear Regression in more detail in another article. For now, let's take a closer look at how Logistic Regression differs from Linear Regression and how the two are related.

What is Logistic regression

Logistic regression, on the other hand, is used for binary classification, which involves predicting one of two classes (0 or 1). While it's called "regression," it doesn't predict a continuous output directly. Instead, it models the probability that a given input belongs to a certain class. The logistic regression model uses the logistic function (also called the sigmoid function) to transform the linear combination of input features:

P(y=1|x) = 1 / (1 + e^(-(β0 + β1x1 + β2x2 + … + βnxn)))

where P(y=1|x) is the probability that the output is 1 given the input features.

In logistic regression:

  • The linear combination β0 + β1x1 + β2x2 + … + βnxn is often referred to as the "logit."
  • The logistic function 1 / (1 + e^(-logit)) transforms the logit to a probability value between 0 and 1.

In other words, the model expresses the probability that y equals 1 as a sigmoid function of a linear combination of x. So, while logistic regression uses the concept of a linear combination of input features like linear regression, the transformation through the logistic function fundamentally changes its purpose and characteristics, making it suitable for binary classification tasks.
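
To see the transformation in action, here is a small sketch that computes the logit for one observation and passes it through the sigmoid; the coefficient and feature values are made up for illustration.

import numpy as np

def sigmoid(logit):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-logit))

beta_0 = -0.3                        # intercept (illustrative value)
beta = np.array([0.5, 1.2, -0.8])    # beta_1 ... beta_3 (illustrative values)
x = np.array([1.0, 2.0, 0.5])        # one observation's feature values

logit = beta_0 + np.dot(beta, x)     # linear combination: can be any real number
prob = sigmoid(logit)                # probability: always between 0 and 1

print(f"logit = {logit:.3f}, P(y=1|x) = {prob:.3f}")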

Why Logistic Regression is powerful

  • Interpretability: Logistic Regression produces interpretable results, allowing you to understand the impact of each predictor variable on the probability of the binary outcome. The coefficients of the predictor variables provide insight into the direction and magnitude of their effects.
  • Probability Estimation: Logistic Regression models directly estimate the probability of a binary outcome, which is useful when you need to assess the likelihood of an event occurring.
  • Efficiency: Logistic Regression is computationally efficient and can handle a large number of predictor variables. It doesn’t require complex computations compared to some other machine learning algorithms.

How to interpret Logistic Regression

  • Interpret output

The sigmoid function maps the output to a probability within [0, 1], which makes binary classification straightforward: if P(y=1|x) > 0.5, predict y as 1; otherwise, predict y as 0.
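
Here is a short sketch of that 0.5 decision rule using scikit-learn's LogisticRegression; the toy one-feature dataset is made up for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy dataset: small values of x tend to be class 0, large values class 1.
X = np.array([[0.2], [0.8], [1.5], [2.3], [3.1], [3.9]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

probs = clf.predict_proba(X)[:, 1]   # estimated P(y=1|x) for each row
preds = (probs > 0.5).astype(int)    # predict 1 when P(y=1|x) > 0.5, else 0

print("P(y=1|x):", np.round(probs, 3))
print("thresholded predictions:", preds)
print("clf.predict for comparison:", clf.predict(X))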

  • Interpret coefficients

In Python or R, we can easily print a summary of the fitted logistic regression model to see the coefficients and their p-values, which tell us whether each predictor is statistically significant. Be cautious with interpretation if multicollinearity is present, as it can affect the stability and interpretability of the coefficients.
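
As a sketch of what that summary looks like in Python, the example below uses the statsmodels package on simulated data; the "true" coefficient values used to generate the data are made up purely for illustration.

import numpy as np
import statsmodels.api as sm

# Simulate a binary outcome from two predictors; the "true" coefficients
# (0.5, 1.5, -1.0) are illustrative, not from any real study.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
logit = 0.5 + 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_with_const = sm.add_constant(X)    # add the intercept column
result = sm.Logit(y, X_with_const).fit()

# The summary table lists each coefficient, its standard error, and its p-value.
print(result.summary())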

Conclusion

In conclusion, Logistic Regression is a powerful and versatile technique in the realm of machine learning. It distinguishes itself from Linear Regression by its focus on binary classification, where the goal is to predict one of two classes (0 or 1). Unlike Linear Regression, which predicts a continuous numeric output, Logistic Regression models the probability that a given input belongs to a particular class.

Overall, Logistic Regression’s combination of interpretability, probability estimation, and computational efficiency makes it a valuable tool for various binary classification tasks. By understanding the mechanics and interpretive elements of Logistic Regression, you can harness its power to make informed decisions in data-driven applications.
