
Binary Logistic Regression Analysis using SPSS: What it is, How to Run, and Interpret the Results.

Research With Fawad · 6 min read

Based on Research With Fawad's video on YouTube.

TL;DR

Use binary logistic regression when the dependent variable is dichotomous (0/1), such as private vs public bank choice.

Briefing

Binary logistic regression is the go-to method when the outcome is dichotomous (coded 0/1), so researchers can estimate how one or more predictors shift the odds of belonging to one group versus another. Unlike linear regression, which models a continuous dependent variable, logistic regression predicts group membership by calculating the probability of "success" relative to "failure," reporting results as odds ratios. That makes it useful for questions such as whether customers will return, whether applicants are admitted, whether candidates win elections, or whether job applicants are selected.

The transcript frames logistic regression as a tool for assessing the strength and direction of relationships between independent variables and a binary dependent variable. The dependent variable is coded with one category as 1 (the “target” group) and the other as 0 (the reference group). For example, private bank choice can be coded as 1 and public bank choice as 0. Predictors can include factors like technology, interest rates, value-added services, perceived risks, and reputation. The model estimates how changes in these predictors affect the log-odds of choosing the target group, and the odds ratio translates those effects into an interpretable metric.
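The conversion from a model coefficient (log-odds) to an odds ratio is a single exponentiation. The sketch below uses hypothetical coefficient values for two of the predictors mentioned above (the numbers are illustrative, not taken from the video's output); SPSS reports the exponentiated coefficient in the Exp(B) column.

```python
import math

# Hypothetical coefficients (log-odds) from a fitted model; in SPSS these
# appear in the "B" column of the Variables in the Equation table.
coefficients = {
    "value_added_services": 0.62,   # illustrative value, assumed
    "perceived_risk": -0.45,        # illustrative value, assumed
}

for name, b in coefficients.items():
    odds_ratio = math.exp(b)  # Exp(B): multiplies the odds per one-unit increase
    direction = "raises" if odds_ratio > 1 else "lowers"
    print(f"{name}: OR = {odds_ratio:.2f} -> a one-unit increase "
          f"{direction} the odds of choosing the target group")
```

An odds ratio of 1.86 means each one-unit increase in the predictor multiplies the odds of the target outcome by 1.86; an odds ratio of 0.64 shrinks those odds by about a third per unit.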

Assumptions differ from linear regression: logistic regression does not require a linear relationship between dependent and independent variables, and predictors do not need to be interval-scaled or normally distributed. The dependent variable must be dichotomous, with mutually exclusive and exhaustive categories—each case belongs to exactly one group. Sample size guidance is also emphasized: larger samples are needed than in linear regression, with commonly cited rules such as at least 50 cases per predictor (Field, Lee, and Blank, 2000) or at least 30 observations per independent variable.
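The sample-size rules above are simple multiplications; a minimal helper makes the arithmetic explicit for the five predictors used in the bank-choice example (technology, interest rates, value-added services, perceived risk, reputation).

```python
def minimum_sample_size(n_predictors, cases_per_predictor=50):
    """Rule-of-thumb minimum N: a fixed number of cases per predictor."""
    return n_predictors * cases_per_predictor

# Five predictors under the two rules cited in the text:
print(minimum_sample_size(5))       # 50-cases-per-predictor rule -> 250
print(minimum_sample_size(5, 30))   # 30-observations-per-predictor rule -> 150
```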

Running the analysis in SPSS follows a standard workflow: Analyze → Regression → Binary Logistic. The dependent variable (here, the preferred bank choice) goes in the Dependent box, while the independent variables are entered as covariates. Options can include the Hosmer–Lemeshow goodness-of-fit test, confidence intervals for the odds ratios, and related diagnostics.

Interpretation starts with the output structure. The Case Processing Summary confirms the number of observations (341 respondents in the example) and shows how the dependent variable is coded (public sector bank = 0, private sector bank = 1). Block 0 provides a baseline model with no predictors. Block 1 adds predictors and is evaluated using the Omnibus test of model coefficients (significant improvement over the null model) and the Hosmer–Lemeshow test (a good fit when the result is not significant, i.e., significance ≥ 0.05). A contingency table further checks whether observed and predicted classifications align.
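The Hosmer–Lemeshow test works by sorting cases by predicted probability, splitting them into groups (usually deciles), and comparing observed versus expected counts in each group. A minimal stdlib sketch of the statistic is below; SPSS additionally converts it to a p-value using a chi-square distribution with (groups − 2) degrees of freedom, which this sketch omits.

```python
def hosmer_lemeshow_stat(y, p, groups=10):
    """Chi-square-style statistic comparing observed vs expected counts
    within groups of cases ordered by predicted probability."""
    pairs = sorted(zip(p, y))              # order cases by predicted probability
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        obs1 = sum(yy for _, yy in chunk)  # observed "successes" in this group
        exp1 = sum(pp for pp, _ in chunk)  # expected "successes" in this group
        obs0 = len(chunk) - obs1
        exp0 = len(chunk) - exp1
        stat += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
    return stat

# Perfectly calibrated predictions give a statistic of 0 (good fit);
# larger values mean observed and predicted classifications diverge.
print(hosmer_lemeshow_stat([1, 0] * 5, [0.5] * 10, groups=1))  # 0.0
```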

Model performance is assessed via pseudo R-square measures (e.g., Nagelkerke’s R-square as an adjusted Cox & Snell statistic) and, crucially, the classification table. In the example, the model correctly classifies about 75.7% of cases, with sensitivity reported around 96.4% for predicting private bank choice and specificity around 17.8% for predicting public bank choice—indicating strong detection of the target group but weaker identification of the reference group.
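Sensitivity, specificity, and accuracy all fall out of the 2×2 classification table. The counts below are a hypothetical reconstruction chosen to be consistent with the rates reported in the example (they are not the video's raw output):

```python
# Hypothetical 2x2 classification counts for 341 cases,
# consistent with the reported ~75.7% accuracy:
tp = 242  # private-bank choosers correctly predicted as private
fn = 9    # private-bank choosers predicted as public
tn = 16   # public-bank choosers correctly predicted as public
fp = 74   # public-bank choosers predicted as private

sensitivity = tp / (tp + fn)                 # target-group (private) hit rate
specificity = tn / (tn + fp)                 # reference-group (public) hit rate
accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall correct classification

print(f"sensitivity = {sensitivity:.1%}")    # ~96.4%
print(f"specificity = {specificity:.1%}")    # ~17.8%
print(f"accuracy    = {accuracy:.1%}")       # ~75.7%
```

The gap between the three numbers is exactly the point made above: high overall accuracy can coexist with very poor detection of the reference group.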

Finally, the odds ratio table identifies which predictors significantly affect the outcome. Odds ratios greater than 1 increase the odds of choosing the target group as the predictor rises; odds ratios less than 1 decrease those odds. Confidence intervals that do not cross 1 signal statistical significance. In the example, value-added services shows odds greater than 1 (with a 95% confidence interval entirely above 1), while perceived risk shows an odds ratio below 1 (with a confidence interval entirely below 1), meaning higher perceived risk reduces the likelihood of choosing private banks.
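The "interval does not cross 1" check can be computed directly from each coefficient and its standard error: the 95% CI for the odds ratio is exp(B ± 1.96·SE). The coefficient and SE values below are hypothetical, chosen only to mirror the positive and negative effects described above.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Odds ratio and 95% CI from a coefficient (log-odds) and its SE."""
    return math.exp(b), (math.exp(b - z * se), math.exp(b + z * se))

# Hypothetical values: one positive effect, one negative effect
for name, b, se in [("value_added_services", 0.62, 0.20),
                    ("perceived_risk", -0.45, 0.15)]:
    or_, (lo, hi) = odds_ratio_ci(b, se)
    significant = lo > 1 or hi < 1   # CI entirely above 1 or entirely below 1
    print(f"{name}: OR={or_:.2f}, 95% CI [{lo:.2f}, {hi:.2f}], "
          f"significant={significant}")
```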

Cornell Notes

Binary logistic regression is used when the dependent variable is binary (0/1), such as choosing a private bank (1) versus a public bank (0). The method estimates how predictors like technology, interest rates, value-added services, and perceived risk change the odds of belonging to the target group. SPSS output is interpreted through model fit tests (Omnibus test and Hosmer–Lemeshow), pseudo R-square measures, and a classification table that reports accuracy, sensitivity, and specificity. The odds ratio table identifies significant predictors: values above 1 increase target-group odds, values below 1 decrease them, and confidence intervals that do not cross 1 indicate significance. This approach helps quantify which factors meaningfully shift binary decisions.

Why does logistic regression produce odds ratios instead of the kind of coefficients used in linear regression?

Because the dependent variable is dichotomous, the model predicts the probability of being in one group (coded 1) relative to the other group (coded 0). Logistic regression works with log-odds and then converts results into odds ratios, which directly express how a one-unit change in a predictor affects the odds of choosing the target category versus the reference category.
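The probability → odds → log-odds chain described above is a short, reversible computation; the probability value here is hypothetical.

```python
import math

p = 0.8                    # hypothetical probability of choosing a private bank
odds = p / (1 - p)         # ~4.0: "success" is four times as likely as "failure"
log_odds = math.log(odds)  # the scale on which the model is linear

# Inverse: the logistic function recovers the probability from the log-odds
p_back = 1 / (1 + math.exp(-log_odds))
print(round(odds, 3), round(log_odds, 3), round(p_back, 3))
```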

What do the Omnibus test and Hosmer–Lemeshow test tell you about model fit?

The Omnibus test of model coefficients checks whether adding predictors significantly improves fit compared with Block 0 (the null model with no predictors). A significant Omnibus result indicates improved fit. The Hosmer–Lemeshow goodness-of-fit test is interpreted in the opposite direction: a good fit corresponds to a non-significant result (significance ≥ 0.05), meaning observed outcomes and predicted probabilities do not differ meaningfully.
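The Omnibus chi-square is the drop in −2 log-likelihood from the Block 0 (null) model to the Block 1 model, with degrees of freedom equal to the number of added predictors. A minimal sketch, using hypothetical outcomes and predictions:

```python
import math

def neg2ll(y, p):
    """-2 log-likelihood of outcomes y under predicted probabilities p."""
    return -2 * sum(math.log(pp if yy == 1 else 1 - pp)
                    for yy, pp in zip(y, p))

y = [1, 1, 0, 0]                 # hypothetical observed outcomes
p_null = [0.5] * 4               # Block 0: intercept-only predictions
p_model = [0.9, 0.9, 0.1, 0.1]   # Block 1: predictions using covariates

# Omnibus chi-square = improvement in fit when predictors are added
chi_square = neg2ll(y, p_null) - neg2ll(y, p_model)
print(round(chi_square, 2))      # positive => predictors improved the fit
```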

How should the classification table be interpreted beyond overall accuracy?

Overall accuracy (e.g., ~75.7% correct classification) indicates how often the model predicts the correct category. Sensitivity (true positive rate) measures correct prediction of the target group (private bank choice) and was reported as about 96.4% in the example. Specificity (true negative rate) measures correct prediction of the reference group (public bank choice) and was reported as about 17.8%, showing the model is much better at detecting private-bank choosers than public-bank choosers.

What does an odds ratio greater than 1 mean, and how does confidence interval placement affect significance?

An odds ratio greater than 1 means higher predictor values increase the odds of the event (choosing the target group). Statistical significance is assessed using the 95% confidence interval: if the interval does not cross 1, the effect is significant. In the example, value-added services had an odds ratio above 1 with a confidence interval entirely above 1, indicating a significant positive effect on choosing private banks.

What does an odds ratio less than 1 mean for a predictor like perceived risk?

An odds ratio less than 1 means higher values of the predictor decrease the odds of the event. For perceived risk, the odds ratio was below 1 and the confidence interval did not cross 1, indicating a significant negative relationship—higher perceived risk reduces the likelihood of choosing private banks.

What assumptions matter most for binary logistic regression?

The dependent variable must be dichotomous with mutually exclusive and exhaustive categories (each case belongs to exactly one group). Unlike linear regression, logistic regression does not require a linear relationship between dependent and independent variables, and predictors do not need normal distribution or equal variance across groups. Sample size should be sufficiently large, with commonly cited rules such as at least 50 cases per predictor or at least 30 observations per independent variable.

Review Questions

  1. In an SPSS binary logistic regression output, which tests would you check first to confirm that adding predictors improves model fit, and what significance direction would you expect for each?
  2. If a predictor has an odds ratio of 0.75 with a 95% confidence interval of 0.60 to 0.95, how would you interpret the direction and significance of its effect?
  3. How do sensitivity and specificity differ in meaning, and what does it imply if sensitivity is high but specificity is very low?

Key Points

  1. Use binary logistic regression when the dependent variable is dichotomous (0/1), such as private vs public bank choice.

  2. Interpret results using odds ratios: values above 1 increase target-group odds, values below 1 decrease them.

  3. Check model fit with the Omnibus test (significant improvement over the null model) and Hosmer–Lemeshow (a non-significant result indicates adequate fit).

  4. Evaluate predictive performance with the classification table, focusing on sensitivity (target-group accuracy) and specificity (reference-group accuracy), not just overall accuracy.

  5. Confirm that the dependent categories are mutually exclusive and exhaustive, and remember logistic regression does not require normality or linearity assumptions for predictors.

  6. Use confidence intervals for odds ratios: significance is supported when the 95% interval does not cross 1.

  7. Plan for adequate sample size, commonly using rules like 50 cases per predictor or at least 30 observations per independent variable.

Highlights

Logistic regression predicts group membership by modeling the odds of success (probability of target group) relative to failure (probability of reference group), producing odds ratios as the main effect size.
A significant Omnibus test paired with a non-significant Hosmer–Lemeshow test supports a model that fits the data reasonably well.
Sensitivity and specificity can diverge sharply: the example shows very high sensitivity (~96.4%) alongside very low specificity (~17.8%).
Odds ratios become interpretable only with confidence intervals: intervals that don’t cross 1 indicate statistically meaningful effects.
In SPSS, the workflow is Analyze → Regression → Binary Logistic, then interpret Block 0 vs Block 1, followed by fit tests, classification accuracy, and the odds ratio table.
