Regression & linear modeling : best practices and modern methods / Jason W. Osborne, University of Louisville
- Author
- Osborne, Jason W.
- Additional Titles
- Regression and linear modeling
- Published
- Los Angeles : SAGE, [2017]
- Copyright Date
- ©2017
- Physical Description
- xxv, 457 pages : illustrations ; 26 cm
- Contents
- Machine generated contents note: The Variables Lead the Way -- Ordinality -- Equal Intervals -- True Zero Point -- Different Classifications of Measurement -- Ratio Measurement -- Interval Measurement -- Ordinal Measurement -- Nominal Measurement -- It's All About Relationships! -- A Brief Review of Basic Algebra and Linear Equations -- The GLM in One Paragraph -- A Brief Consideration of Prediction Versus Explanation in Linear Modeling -- A Brief Primer on Null Hypothesis Statistical Testing -- A Trivial and Silly Example of Hypothesis Testing -- A Tale of Two Errors -- What Conclusions Can We Draw Based on NHST Results? -- So What Does Failure to Reject the Null Hypothesis Mean? -- Moving Beyond NHST -- Other Pieces of Information Necessary to Draw Proper Conclusions -- The Importance of Replication and Generalizability -- Where We Go From Here -- Enrichment -- References -- Estimation and the GLM -- What Is OLS Estimation? -- ML Estimation-A Gentle but Deeper Look -- Assumptions for OLS and ML Estimation -- Model -- Variables -- Residuals and Distributions -- Simple Univariate Data Cleaning and Data Transformations -- Data Screening -- Missing Data -- Transformation of Data -- University Size and Faculty Salary in the United States -- What If We Cannot Meet the Assumptions? -- Where We Go From Here -- Enrichment -- References -- Advance Organizer -- It's All About Relationships! -- Basics of the Pearson Product-Moment Correlation Coefficient -- Calculating r -- Effect Sizes and r -- A Real Data Example -- The Basics of Simple Regression -- Basic Calculations for Simple Regression -- Standardized Versus Unstandardized Regression Coefficients -- Hypothesis Testing in Simple Regression -- A Real Data Example -- The Assumption That the Model Is Correctly Specified -- Assumptions About the Variables -- Assumptions About Residuals -- Summary of Results -- Does Centering or z-Scoring Make a Difference? 
-- Some Simple Multivariate Data Cleaning -- What Is a Bivariate Outlier? -- Standardized Residuals -- Studentized Residuals -- Global Measures of Influence: DfFit or Cook's Distance (Cook's D) -- Specific Measures of Influence: DfBetas -- Summary -- Enrichment -- References -- Advance Organizer -- It's All About Relationships! (Part 2) -- Analyzing These Data via t-Test -- Analyzing These Data via ANOVA -- ANOVA Within an OLS Regression Framework -- When Your IV Has More Than Two Groups: Dummy Coding Your Unordered Polytomous Variable -- Define the Reference Group -- Set Up the Dummy-Coded Variables -- Evaluating the Effects of the Categorical Variable in the Regression Model -- Smoking and Diabetes Analyzed via ANOVA -- Smoking and Diabetes Analyzed via Regression -- What If the Dummy Variables Are Coded Differently? -- Unweighted Effects Coding -- Weighted Effects Coding -- Common Alternatives to Dummy or Effects Coding -- Simple Contrasts -- Difference (Reverse Helmert) Contrasts -- Helmert Contrasts -- Repeated Contrasts -- Summary -- Enrichment -- References -- Advance Organizer -- It's All About Relationships! (Part 3) -- Why Is Logistic Regression Necessary? -- The Linear Probability Model -- How Logistic Regression Solves This Issue: The Logit Link Function -- A Brief Digression Into Probabilities, Conditional Probabilities, and Odds -- Simple Logistic Regression Using Statistical Software -- Indicators of Overall Model Fit -- What Is a -2 Log Likelihood? -- The Logistic Regression Equation -- Interpreting the Constant -- What If You Want CIs for the Constant? -- Summary So Far -- Logistic Regression With a Continuous IV -- Some Best Practices When Using a Continuous Variable in Logistic Regression -- Testing Assumptions and Data Cleaning in Logistic Regression -- Deviance Residuals -- DfBetas -- Hosmer and Lemeshow Test for Model Fit -- How Should We Interpret Odds Ratios That Are Less Than 1.0?
-- Summary -- Enrichment -- Appendix 5A: A Brief Primer on Probit Regression -- What Is a Probit? -- The Probit Link -- A Real-Data Example of Probit Regression -- Why Are There Two Different Procedures If They Produce the Same Results? -- Some Nice Features of Probit -- Assumptions of Probit Regression -- Summary and Conclusion -- References -- Advance Organizer -- Understanding Marijuana Use -- Dummy-Coded DVs and Our Hypotheses to Be Tested -- Basics and Calculations -- Multinomial Logistic Regression (Unordered) With Statistical Software -- Multinomial Logistic Regression With a Continuous Predictor -- Multinomial Logistic Regression as a Series of Binary Logistic Regressions -- Data Cleaning and Multinomial Logistic Regression -- Testing Whether Groups Can Be Combined -- Ordered Logit (Proportional Odds) Model -- Assumptions of the Ordinal Logistic Model -- Interpreting the Results of the Ordinal Regression -- Interpreting the Intercepts/Thresholds -- Interpreting the Parameter Estimates -- Data Cleaning and More Advanced Models in Ordinal Logistic Regression -- The Measured Variable Is Continuous, Why Not Just Use OLS Regression for This Type of Analysis?
-- A Brief Note on Log-Linear Analyses -- Summary and Conclusions -- Enrichment -- References -- Advance Organizer -- Zeno's Paradox, a Nerdy Science Joke, and Inherent Curvilinearity in the Universe -- A Brief Review of Simple Algebra -- Hypotheses to Be Tested -- Illegitimate Causes of Curvilinearity -- Model Misspecification: Omission of Important Variables -- Poor Data Cleaning -- Detection of Nonlinear Effects -- Theory -- Ad Hoc Testing -- Box-Tidwell Transformations -- Basic Principles of Curvilinear Regression -- Occam's Razor -- Ordered Entry of Variables -- Each Effect Is One Part of the Entire Effect -- Centering -- Curvilinear OLS Regression Example: Size of the University and Faculty Salary -- Data Cleaning -- Interpreting Curvilinear Effects Effectively -- Reality Testing This Effect -- Summary of Curvilinear Effects in OLS Regression -- Curvilinear Logistic Regression Example: Diabetes and Age -- Curvilinear Effects in Multinomial Logistic Regression -- Replication Becomes Important -- More Fun With Curves: Estimating Minima and Maxima as Well as Slope at Any Point on the Curve -- Summary -- Enrichment -- References -- Advance Organizer -- The Basics of Multiple Predictors -- What Are the Implications of This Act? -- Hypotheses to Be Tested in Multiple Regression -- Assumptions of Multiple Regression and Data Cleaning -- Predicting Student Achievement From Real Data -- Where Is the Missing Variance? -- Testing Assumptions and Data Cleaning in the NELS88 Data -- What Does the Intercept Mean When There Are Multiple IVs? -- Methods of Entering Variables -- User-Controlled Methods of Entry -- Hierarchical Entry -- Blockwise Entry -- Software-Controlled Entry -- Using Multiple Regression for Theory Testing -- What Is the Meaning of This Intercept?
-- Logistic Regression With Multiple IVs -- Assessing the Overall Logistic Regression Model: Why There Is No R2 for Logistic Regression -- Summary and Conclusions -- Enrichment -- References -- Advance Organizer -- What Is an Interaction? -- Procedural and Conceptual Issues in Testing for Interactions Between Continuous Variables -- Procedural and Conceptual Issues in Testing for Interactions Containing Categorical Variables -- Hypotheses to Be Tested in Multiple Regression With Interactions Present -- An OLS Regression Example: Predicting Student Achievement From Real Data -- Interpreting the Results From a Significant Interaction -- Graphing Interaction Effects -- Staying Out of Trouble on the X Axis -- Staying Out of Trouble on the Y Axis -- Procedural Issues With Graphing -- An Interaction Between a Continuous and a Categorical Variable in OLS Regression -- Interactions With Logistic Regression -- Example Summary of Interaction Analysis -- Interactions and Multinomial Logistic Regression -- Data Cleaning -- Calculation of Overall Model Statistics -- Example Summary of Findings -- Can These Effects Replicate? -- Post Hoc Probing of Interactions -- Regions of Significance -- Using Statistical Software to Produce Simple Slopes Analyses -- Summary -- Enrichment -- References -- Advance Organizer -- What Is a Curvilinear Interaction? -- A Quadratic Interaction Between X and Z -- A Cubic Interaction Between X and Z -- A Real-Data Example and Exploration of Procedural Details -- Step 1. Create the Terms Prior to Analysis -- Step 2. Build Your Equation Slowly -- Step 3. Clean the Data Thoughtfully to Ensure You Are Not Missing an Interesting Effect -- Step 4. After Influential Cases Are Removed, Perform the Analysis Again -- Step 5. Provide Your Audience With a Graphical Representation of These Complex Results
-- Step 6. Summarize the Results Coherently Using the Graphs as Guides -- Summary -- Curvilinear Interactions Between Continuous and Categorical Variables -- Summary -- Curvilinear Interactions With Categorical DVs (Multinomial Logistic) -- Curvilinear Interaction Effects in Ordinal Regression -- Summary -- Chapter Summary -- Enrichment -- References -- Advance Organizer -- The Basics and Assumptions of Poisson Regression -- Curvilinearity in Poisson Models -- The Nature of the Variables -- Issues With Zeros -- Issues With Variance -- Why Can't We Just Analyze Count Data via OLS, Multinomial, or Ordinal Regression? -- Multinomial or Ordinal Regression -- Hypotheses Tested in Poisson Regression -- Model Fit -- Poisson Regression With Real Data -- Interactions in Poisson Regression -- Data Cleaning in Poisson Regression -- Refining the Model by Eliminating Excess (Inappropriate) Zeros -- A Refined Analysis With Excess Zeros Removed -- Curvilinear Effects in Poisson Regression -- Dealing With Overdispersion or Underdispersion -- Effects of Adjusting the Scale Parameter -- Negative Binomial Model -- Summary and Conclusions -- Enrichment -- References -- Advance Organizer -- The Basics of Log-Linear Analysis -- What Is Different About Log-Linear Analysis? -- Hypotheses Being Tested -- Individual Parameter Estimates -- Assumptions of Log-Linear Models -- A Slightly More Complex Log-Linear Model -- Can We Replicate These Results in Logistic Regression? -- Data Cleaning in Log-Linear Models -- Summary and Conclusions -- Enrichment -- References -- Advance Organizer -- Why HLM Models Are Necessary -- What Is a Hierarchical Data Structure? -- Why Is Hierarchical or Nested Data an Issue? -- The Problem of Independence of Observations -- The Problem of How to Deal With Multilevel Data -- How Do Hierarchical Models Work? A Brief Primer -- Generalizing the Basic HLM Model -- Example 1. Modeling a Continuous DV in HLM
-- Example 2. Modeling Binary Outcomes in HLM -- Residuals in HLM -- Results of DROPOUT Analysis in HLM -- Cross-Level Interactions in HLM Logistic Regression -- So What Would Have Happened If These Data Had Been Analyzed via Simple Logistic Regression Without Accounting for the Nested Data Structure? -- Summary and Conclusions -- Enrichment -- References -- Advance Organizer -- Not All Missing Data Are the Same -- Utility of Legitimately Missing Data for Data Checking -- Categories of Missingness: Why Do We Care If Data Are MCAR or Not? -- How Do You Know If Your Data Are MCAR, MAR, or MNAR? -- What Do We Do With Randomly Missing Data? -- Data MCAR -- Mean Substitution -- Strong and Weak Regression Imputation -- Multiple Imputation (Bayesian) -- Summary -- Data MNAR -- Example 1. Nonrandom Missingness Reverses the Effect -- Example 2. Nonrandom Missingness Dramatically Inflates the Effect -- Summary -- How Missingness Can Be an Interesting Variable in and of Itself -- Summing Up: Benefits of Appropriately Handling Missing Data -- Enrichment -- References -- Advance Organizer -- What Is Power, and Why Is It Important? -- Correctly Rejecting a Null Hypothesis -- Informing Null Results -- Is Power an Ethical Issue? -- Power in Linear Models -- OLS Regression With Multiple Predictors -- Binary Logistic Regression -- Summary of Points Thus Far -- Who Cares as Long as p < .05? Volatility in Linear Models -- Small Samples Versus Large Samples -- A Brief Introduction to Bootstrap Resampling -- Principle 1. Results From Larger Samples Will Be Less Volatile Than Results From Smaller Samples -- Principle 2. Effect Sizes Should Not Affect the Replicability of the Results -- Principle 3. Complex Effects Are Less Likely to Replicate Than Simple Effects, Particularly in Smaller Samples -- Summary and Conclusions -- Enrichment -- References -- Advance Organizer -- A More Modern View of Reliability -- What Is Cronbach's Alpha (and What Is It Not)?
-- Alpha and the Kuder-Richardson Coefficient of Equivalence -- The Correct Interpretation of Alpha -- What Alpha Is Not -- Factors That Influence Alpha -- Length of the Scale -- Average Inter-Item Correlation -- Reverse-Coded Items (Negative Item-Total Correlations) -- Random Responding or Response Sets -- Multidimensionality -- Outliers -- Other Assumptions of Alpha -- What Is "Good Enough" for Alpha? -- Reliability and Simple Correlation or Regression -- Reliability and Multiple IVs -- Reliability and Interactions in Multiple Regression -- Protecting Against Overcorrecting During Disattenuation -- Other (Better) Solutions to the Issue of Measurement Error -- Does Reliability Influence Other Analyses, Such as Analysis of Variance? -- Reliability in Logistic Models -- But Other Authors Have Argued That Poor Reliability Isn't That Important. Who Is Right? -- Sample Size and the Precision/Stability of Alpha-Empirical CIs -- Summary and Conclusions -- References -- Advance Organizer -- Prediction Versus Explanation -- How Is a Prediction Equation Created? -- Methods for Entering Variables Into the Equation -- Shrinkage and Evaluating the Quality of Prediction Equations -- Cross-Validation -- Double Cross-Validation -- An Example Using Real Data -- Double Cross-Validation -- So How Much Shrinkage Is Too Much Shrinkage? -- The Final Step -- How Does Sample Size Affect the Shrinkage and Stability of a Prediction Equation? 
-- Improving on Prediction Models -- Calculating a Predicted Score, and CIs Around That Score -- Prediction (Prognostication) in Logistic Regression (and Other) Models -- Overall Performance -- Concordance or Discrimination -- Estimated Shrinkage in Logistic Models -- Other Proposed Methods of Estimating Shrinkage -- An Example of External Validation of a Prognostic Equation Using Real Data -- External Validation of a Prediction Equation -- Overall Performance (Brier Score) -- Estimated Shrinkage -- Concordance and Discrimination -- Using Bootstrap Analysis to Estimate a More Robust Prognostic Equation -- General Bootstrap Methodology for Internal Validation of a Prognostic Model -- Internal Validation -- Summary -- References -- Advance Organizer -- What Types of Studies Use Complex Sampling? -- Why Does Complex Sampling Matter? -- What Are Best Practices in Accounting for Complex Sampling? -- Does It Really Make a Difference in the Results? -- Conditions Used -- Unweighted -- Weighted -- Scaled Weights -- Appropriately Modeled -- Comparison of Unweighted Versus Weighted Analyses -- Large Effect in Ordinary Least Squares Regression -- Modest Effect in Binary Logistic Regression -- Null Effect in Analysis of Variance -- Null Effect in Ordinary Least Squares Regression -- Summary -- Enrichment -- References -- The Normal (Gaussian) Distribution -- Why Is the Normal Distribution Such a Big Deal?.
- Subject(s)
- ISBN
- 9781506302768 (hardcover ; alk. paper)
- 1506302769 (hardcover ; alk. paper)
- Bibliography Note
- Includes bibliographical references and index.
catkey: 18152633