Regression Analysis
Based on the Research With Fawad video on YouTube. If you like this content, support the original creator by watching, liking, and subscribing.
Regression analysis quantifies how much variance in a dependent variable is explained by one or more independent variables, supporting both prediction and explanation.
Briefing
Regression analysis is a statistical method used to measure how strongly one dependent variable relates to one or more independent variables—and to quantify that relationship in ways that support prediction and explanation. It estimates how much variance in the dependent variable can be accounted for by the independent variable(s), producing a regression equation that links inputs to an expected outcome. In practice, regression is used to test whether changes in a predictor are associated with changes in an outcome, while also providing coefficients and fit statistics that indicate strength and significance.
A key distinction separates regression from correlation. Correlation focuses on association between two variables using a correlation coefficient, without labeling one variable as dependent or independent. Regression, by contrast, is built around prediction and explanation: it clearly distinguishes dependent and independent variables and uses inferential tests tied to regression outputs—such as regression coefficients, the intercept, and t statistics—to determine whether the relationship is statistically meaningful.
The transcript distinguishes two main forms. Bivariate regression involves exactly two variables: one dependent variable and one independent variable. It is often used to see how well scores on the dependent variable can be predicted from data on the independent variable. Multiple regression expands this framework to three or more variables, allowing researchers to evaluate the impact of several independent variables on a single dependent variable at once.
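The two forms can be written as regression equations. This is the standard textbook notation rather than notation taken from the transcript:

```latex
% Bivariate regression: one dependent variable, one independent variable
y = b_0 + b_1 x_1 + \varepsilon

% Multiple regression: one dependent variable, k independent variables
y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k + \varepsilon
```

Here \(b_0\) is the intercept, each \(b_i\) is an unstandardized regression coefficient, and \(\varepsilon\) is the error term capturing variance the predictors do not explain.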
Before running models, several terms matter. Regression coefficients quantify how strongly each independent variable predicts the dependent variable. The unstandardized coefficient is used directly in the regression equation, while the standardized coefficient (beta value) expresses effects in standard deviation units; with a single predictor, beta equals the correlation coefficient between the dependent and independent variables. Model fit is summarized with R and R². R represents the correlation between observed and predicted values, while R² (R squared) indicates the proportion of variance in the dependent variable explained by the chosen predictors. Because R² tends to be inflated as more predictors are added, particularly in smaller samples, adjusted R² offers a more conservative measure.
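The adjustment behind adjusted R² is a standard formula that penalizes R² for the number of predictors (k) relative to the sample size (n). A minimal helper, with hypothetical numbers in the usage comments:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2: shrinks R^2 based on predictors (k) vs. sample size (n).

    r2 -- plain R^2 from the model
    n  -- number of observations
    k  -- number of independent variables
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical illustration: the same R^2 = .50 is penalized more
# heavily when the sample is small relative to the predictor count.
print(adjusted_r2(0.50, n=100, k=3))   # mild shrinkage with n = 100
print(adjusted_r2(0.50, n=20, k=3))    # larger shrinkage with n = 20
```

The penalty grows with k and shrinks as n grows, which is why adjusted R² is the more conservative fit statistic to report when comparing models with different numbers of predictors.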
The walkthrough then demonstrates bivariate regression using servant leadership as the independent variable and life satisfaction as the dependent variable. The model summary reports R = .526, yielding R² = .276, meaning 27.6% of the variance in life satisfaction is accounted for by servant leadership. Significance is checked via the ANOVA table, where the p-value is reported as below .05 (displayed as 0.0 in the output; statistical software truncates very small p-values, so a displayed .000 means p < .001 rather than literally zero). The coefficient interpretation follows: a positive beta (reported as .579) indicates that higher servant leadership is associated with higher life satisfaction, and the overall model fit is supported by an F statistic (reported around 83.599) with p < .01.
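The statistics the walkthrough reads off the software output (slope, intercept, r, R², F) all follow from closed-form formulas in the bivariate case. A minimal sketch using only the standard library; the data below are made up for illustration and are not the transcript's dataset:

```python
from math import sqrt

def bivariate_ols(x, y):
    """Fit y = b0 + b1*x by least squares and return key fit statistics."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    b1 = sxy / sxx                  # unstandardized slope
    b0 = ybar - b1 * xbar           # intercept
    r = sxy / sqrt(sxx * syy)       # with one predictor, beta equals r
    r2 = r * r                      # proportion of variance explained
    f = (r2 / (1 - r2)) * (n - 2)   # overall-model F with df = (1, n - 2)
    return b0, b1, r, r2, f

# Hypothetical scores (e.g., predictor vs. outcome on small scales):
b0, b1, r, r2, f = bivariate_ols([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
print(f"b0={b0:.2f} b1={b1:.2f} r={r:.2f} R2={r2:.2f} F={f:.2f}")
```

A positive b1 plays the role of the transcript's positive coefficient: higher predictor scores go with higher outcome scores, and F (with its p-value) tests whether the overall model is significant.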
Finally, the transcript shows how multiple regression changes the output and reporting. When adding self-efficacy and job satisfaction alongside servant leadership, the overall model fit increases, with R² reported as .581 (58.1% variance explained). Significance is assessed again using the ANOVA p-value (< .05), while the coefficient table determines which predictors matter individually through t values and p values. Reporting guidance emphasizes writing results in text using a hypothesis template and presenting key statistics (beta, t, p, R², and F) in a table rather than copying raw software tables into a document.
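With several predictors the closed-form shortcut no longer applies; the coefficients come from solving the normal equations. A self-contained sketch using only the standard library (in practice one would use a statistics package); the data are hypothetical, not the transcript's dataset:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y):
    """Multiple regression: return [b0, b1, ...] and R^2 for y ~ 1 + X."""
    Z = [[1.0] + list(row) for row in X]  # prepend intercept column
    p = len(Z[0])
    # Normal equations: (Z'Z) beta = Z'y
    XtX = [[sum(Z[i][a] * Z[i][b] for i in range(len(Z))) for b in range(p)]
           for a in range(p)]
    Xty = [sum(Z[i][a] * y[i] for i in range(len(Z))) for a in range(p)]
    beta = solve(XtX, Xty)
    yhat = [sum(b * z for b, z in zip(beta, row)) for row in Z]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Two hypothetical predictors; with noiseless data the true
# coefficients (2, 3, -1) are recovered and R^2 is 1.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [2 + 3 * x1 - 1 * x2 for x1, x2 in X]
beta, r2 = fit_ols(X, y)
print(beta, r2)
```

This mirrors the transcript's workflow: R² summarizes the overall model, while the individual entries of beta (with their t and p values in real software output) show which predictors matter on their own.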
Cornell Notes
Regression analysis quantifies how well one dependent variable can be predicted (and partially explained) by one or more independent variables. It differs from correlation because regression labels dependent vs. independent variables and uses inferential tests tied to regression coefficients and t statistics. Bivariate regression uses one predictor, producing R and R² to show explained variance; the transcript’s example reports R = .526 and R² = .276, meaning 27.6% of life satisfaction variance is accounted for by servant leadership. Multiple regression adds more predictors, increasing explained variance (example R² = .581) and requiring coefficient-level checks to see which predictors are individually significant. Reporting should translate outputs into hypothesis-based text plus a clean table of beta, t, p, R², and F.
How does regression analysis differ from correlation in purpose, variable labeling, and inferential testing?
What do R and R² mean in regression output, and why might adjusted R² be needed?
In bivariate regression, how should the regression equation be interpreted, including the role of the error term?
Using the servant leadership example, what do the reported R, R², beta, and p-values imply?
How does multiple regression change interpretation and reporting compared with bivariate regression?
What is a practical way to report regression results in text using a hypothesis template?
Review Questions
- When would correlation be insufficient and regression be the better choice, based on the transcript’s distinction between association and prediction/explanation?
- How do you interpret a positive beta coefficient in the context of a regression model, and how do you verify whether that effect is statistically significant?
- Why can R² become misleading as more independent variables are added, and what alternative does the transcript recommend?
Key Points
- 1
Regression analysis quantifies how much variance in a dependent variable is explained by one or more independent variables, supporting both prediction and explanation.
- 2
Correlation measures association without labeling dependent vs. independent variables, while regression requires clear dependent/independent roles and uses coefficient-based inferential tests.
- 3
Bivariate regression uses one independent variable; multiple regression uses two or more independent variables and requires checking each predictor's individual significance.
- 4
R² represents the proportion of variance explained; the transcript interprets R² = .276 as 27.6% variance explained and emphasizes that adjusted R² can correct inflation.
- 5
Significance is assessed using ANOVA p-values for the overall model and coefficient-table p-values (with t statistics) for individual predictors.
- 6
Reporting should translate regression outputs into hypothesis-based text plus a clean summary table (beta, t, p, R², F) rather than pasting raw software tables.