- Intro
- Preface
- Scope of the Book
- Organization of the Book
- Acknowledgments
- Contents
- List of Figures
- List of Tables
- 1 The Basics of Machine Learning
- 1.1 Introduction
- 1.2 Machine Learning: Definition, Rationale, Usefulness
- 1.3 From Symbolic AI to Statistical Learning
- 1.3.1 Numeral Recognition: Symbolic AI Versus Machine Learning
- 1.4 Non-identifiability of the Mapping: The Curse of Dimensionality
- 1.5 Conclusions
- References
- 2 The Statistics of Machine Learning
- 2.1 Introduction
- 2.2 ML Modeling Trade-Offs
- 2.2.1 Prediction Versus Inference
- 2.2.2 Flexibility Versus Interpretability
- 2.2.3 Goodness-of-Fit Versus Overfitting
- 2.3 Regression Setting
- 2.4 Classification Setting
- 2.5 Training and Testing Error in Parametric and Nonparametric ...
- 2.6 Measures for Assessing the Goodness-of-Fit
- 2.6.1 Dealing with Unbalanced Classes for Classification
- 2.7 Optimal Tuning of Hyper-Parameters
- 2.7.1 Information Criteria
- 2.7.2 Bootstrap
- 2.7.3 K-fold Cross-validation (CV)
- 2.7.4 Plug-in Approach
- 2.8 Learning Modes and Architecture
- 2.9 Limitations and Failures of Statistical Learning
- 2.9.1 ML Limitations
- 2.9.2 ML Failure
- 2.10 Software
- 2.11 Conclusions
- References
- 3 Model Selection and Regularization
- 3.1 Introduction
- 3.2 Model Selection and Prediction
- 3.3 Prediction in High-Dimensional Settings
- 3.4 Regularized Linear Models
- 3.4.1 Ridge
- 3.4.2 Lasso
- 3.4.3 Elastic-Net
- 3.5 The Geometry of Regularized Regression
- 3.6 Comparing Ridge, Lasso, and Best Subset Solutions
- 3.7 Choosing the Optimal Tuning Parameters
- 3.7.1 Adaptive Lasso
- 3.7.2 Plug-in Estimation
- 3.8 Optimal Subset Selection
- 3.8.1 Best (or Exhaustive) Subset Selection
- 3.8.2 Forward Stepwise Selection
- 3.8.3 Backward Stepwise Selection
- 3.9 Statistical Properties of Regularized Regression
- 3.9.1 Rate of Convergence
- 3.9.2 Support Recovery
- 3.9.3 Oracle Property
- 3.10 Causal Inference
- 3.10.1 Partialing-Out
- 3.10.2 Double-Selection
- 3.10.3 Cross-Fit Partialing-Out
- 3.10.4 Lasso with Endogenous Treatment
- 3.11 Regularized Nonlinear Models
- 3.12 Stata Implementation
- 3.12.1 The lasso Command
- 3.12.2 The lassopack Suite
- 3.12.3 The subset Command
- 3.12.4 Application S1: Linear Regularized Regression
- 3.12.5 Application S2: Nonlinear Lasso
- 3.12.6 Application S3: Multinomial Lasso
- 3.12.7 Application S4: Inferential Lasso Under Exogeneity
- 3.12.8 Application S5: Inferential Lasso Under Endogeneity
- 3.12.9 Application S6: Optimal Subset Selection
- 3.12.10 Application S7: Regularized Regression with Time-Series and Longitudinal Data
- 3.13 R Implementation
- 3.13.1 Application R1: Fitting a Gaussian Penalized Regression
- 3.13.2 Application R2: Fitting a Multinomial Ridge Classification
- 3.14 Python Implementation
- 3.14.1 Application P1: Fitting Lasso Using the LassoCV() Method
- 3.14.2 Application P2: Multinomial Regularized Classification in Python
This book presents the fundamental theoretical notions of supervised machine learning together with a wide range of applications in Python, R, and Stata. It balances theory and applications, fostering an understanding of how machine learning methods can be carried out across different software platforms. After introducing the basics of machine learning, the book covers a broad spectrum of topics: model selection and regularization, discriminant analysis, nearest neighbors, support vector machines, tree modeling, artificial neural networks, deep learning, and sentiment analysis. Each chapter is self-contained, opening with a theoretical part that explains the basics of the methodology, followed by an applied part in which the methods are applied to real-world datasets. Numerous examples are included and, for ease of reproducibility, the Python, R, and Stata code used in the text, along with the related datasets, is available online. The intended audience comprises PhD students, researchers, and practitioners from various disciplines, including economics and other social sciences, medicine, and epidemiology, who have a good understanding of basic statistics and a working knowledge of statistical software, and who want to apply machine learning methods in their work.