Introduction To Logistic Regression

Topic: Business Consulting
Published October 12, 2008

Researchers are often interested in setting up a model to analyze the relationship between some predictors (i.e., independent variables) and a response (i.e., dependent variable). Linear regression is commonly used when the response variable is continuous. One assumption of linear models is that the residual errors follow a normal distribution. This assumption fails when the response variable is categorical, so an ordinary linear model is not appropriate. This newsletter presents a regression model for a response variable that is dichotomous, that is, one that has exactly two categories. Examples are common: whether a plant lives or dies, whether a survey respondent agrees or disagrees with a statement, or whether an at-risk child graduates from or drops out of high school.

In ordinary linear regression, the response variable (Y) is a linear function of the coefficients (B0, B1, etc.) that correspond to the predictor variables (X1, X2, etc.). A typical model would look like:

Y = B0 + B1*X1 + B2*X2 + B3*X3 + ... + E

For a dichotomous response variable, we could set up a similar linear model to predict individuals' category memberships if numerical values are used to represent the two categories. Arbitrary values of 1 and 0 are chosen for mathematical convenience. Using the first example, we would assign Y = 1 if a plant lives and Y = 0 if a plant dies.

This linear model does not work well for a few reasons. First, the response values, 0 and 1, are arbitrary, so modeling the actual values of Y is not of much interest. Second, what we are really interested in modeling is the probability that each individual in the population responds with 0 or 1. For example, we may find that plants with a high level of a fungal infection (X1) fall into the category "the plant lives" (Y = 1) less often than plants with a low level of infection. Thus, as the level of infection rises, the probability of a plant living decreases.

Thus, we might consider modeling P, the probability, as the response variable. Again, there are problems. Although the probability generally decreases as the infection level increases, we know that P, like all probabilities, can only fall between 0 and 1. A straight line has no such limits, so it will eventually predict values below 0 or above 1. Consequently, it is better to assume that the relationship between X1 and P is sigmoidal (S-shaped), rather than a straight line.
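To make the boundary problem concrete, here is a minimal Python sketch of a straight-line model for P. The coefficients B0 and B1 and the infection levels are hypothetical values chosen only for illustration; they are not estimated from any data.

```python
# Hypothetical straight-line model for the probability that a plant lives.
# B0 and B1 are made-up illustration values, not estimates from real data.
B0 = 0.9     # assumed intercept
B1 = -0.15   # assumed slope: P drops as the infection level rises


def linear_probability(x1):
    """Return the straight-line 'probability' B0 + B1*X1 for infection level x1."""
    return B0 + B1 * x1


for x1 in [0, 2, 4, 6, 8]:
    p = linear_probability(x1)
    note = "" if 0 <= p <= 1 else "  <- outside the 0-to-1 range"
    print(f"X1 = {x1}: P = {p:.2f}{note}")
```

With these assumed values the line gives sensible results at low infection levels but drops below 0 once X1 is large enough, which is not possible for a probability.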

It is possible, however, to find a linear relationship between X1 and a function of P. Although a number of functions work, one of the most useful is the logit function. It is the natural log of the odds that Y is equal to 1, where the odds are simply the probability that Y is 1 divided by the probability that Y is 0. The relationship between the logit of P and P itself is sigmoidal in shape. The regression equation that results is:

ln[P/(1-P)] = B0 + B1*X1 + B2*X2 + ...
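The logit and its inverse can be sketched in a few lines of Python. The function names logit and inverse_logit are our own labels for illustration, and the example inputs are arbitrary.

```python
import math


def logit(p):
    """Natural log of the odds P/(1-P); p must be strictly between 0 and 1."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Map any real number z back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))


# The logit stretches probabilities onto the whole real line...
for p in [0.1, 0.5, 0.9]:
    print(f"P = {p}: logit = {logit(p):.3f}")

# ...and the inverse maps any value of B0 + B1*X1 + ... back into (0, 1).
for z in [-10, -2, 0, 2, 10]:
    print(f"z = {z}: P = {inverse_logit(z):.5f}")
```

Because the inverse always returns a value between 0 and 1, the right side of the regression equation can take any value without producing an impossible probability.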

Although the left side of this equation looks intimidating, this way of expressing the probability results in the right side of the equation being linear and looking familiar to us. This helps us understand the meaning of the regression coefficients. The coefficients can easily be transformed so that their interpretation makes sense.
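One common transformation is to exponentiate a coefficient, which gives the factor by which the odds are multiplied when that predictor increases by one unit. The sketch below illustrates this for a single predictor; B0 and B1 are hypothetical values picked for illustration, not results from a fitted model.

```python
import math

# Hypothetical coefficients for ln[P/(1-P)] = B0 + B1*X1 (illustration only).
B0 = 2.0
B1 = -0.5


def odds(x1):
    """Odds that Y = 1 at infection level x1, i.e. P/(1-P)."""
    return math.exp(B0 + B1 * x1)


def probability(x1):
    """Probability that Y = 1 at infection level x1."""
    o = odds(x1)
    return o / (1 + o)


# exp(B1) is the multiplicative change in the odds per one-unit increase in X1.
odds_ratio = math.exp(B1)
print(f"exp(B1) = {odds_ratio:.3f}")
print(f"odds(3) / odds(2) = {odds(3) / odds(2):.3f}")

for x1 in [0, 2, 4, 6]:
    print(f"X1 = {x1}: odds = {odds(x1):.3f}, P = {probability(x1):.3f}")
```

Under these assumed values, each one-unit rise in X1 multiplies the odds of the plant living by exp(B1), a factor below 1 because B1 is negative, no matter where on the scale the increase happens.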

The logistic regression equation can be extended beyond the case of a dichotomous response variable to responses with ordered categories and to polytomous responses (those with more than two unordered categories).

About the Author

Copyright © 2008, Karen Grace-Martin

Karen Grace-Martin, founder of The Analysis Factor, has helped social science researchers practice statistics for 9 years, as a statistical consultant at Cornell University and in her own business. She knows the kinds of resources and support that researchers need to practice statistics confidently, accurately, and efficiently, no matter what their statistical background. To answer your questions, receive advice, and view a list of resources to help you learn and apply appropriate statistics to your data, visit www.analysisfactor.com.
