Data mining methods and models
 Author/Creator
 Larose, Daniel T.
 Language
 English.
 Imprint
 Hoboken, NJ : Wiley-Interscience, c2006.
 Physical description
 xvi, 322 p. : ill. ; 25 cm.
Access
Available online
 ieeexplore.ieee.org IEEE Xplore
 dx.doi.org Wiley Online Library

Stacks

Unknown
QA76.9 .D343 L378 2006

Contents/Summary
 Bibliography
 Includes bibliographical references and index.
 Contents

 Preface.
 1. Dimension Reduction Methods. Need for Dimension Reduction in Data Mining. Principal Components Analysis. Factor Analysis. User-Defined Composites.
 2. Regression Modeling. Example of Simple Linear Regression. Least-Squares Estimates. Coefficient of Determination. Correlation Coefficient. The ANOVA Table. Outliers, High Leverage Points, and Influential Observations. The Regression Model. Inference in Regression. Verifying the Regression Assumptions. An Example: The Baseball Data Set. An Example: The California Data Set. Transformations to Achieve Linearity.
 3. Multiple Regression and Model Building. An Example of Multiple Regression. The Multiple Regression Model. Inference in Multiple Regression. Regression with Categorical Predictors. Multicollinearity. Variable Selection Methods. An Application of Variable Selection Methods. Mallows' Cp Statistic. Variable Selection Criteria. Using the Principal Components as Predictors in Multiple Regression.
 4. Logistic Regression. A Simple Example of Logistic Regression. Maximum Likelihood Estimation. Interpreting Logistic Regression Output. Inference: Are the Predictors Significant? Interpreting the Logistic Regression Model. Interpreting a Logistic Regression Model for a Dichotomous Predictor. Interpreting a Logistic Regression Model for a Polychotomous Predictor. Interpreting a Logistic Regression Model for a Continuous Predictor. The Assumption of Linearity. The Zero-Cell Problem. Multiple Logistic Regression. Introducing Higher-Order Terms to Handle Non-Linearity. Validating the Logistic Regression Model. WEKA: Hands-On Analysis Using Logistic Regression.
 5. Naive Bayes and Bayesian Networks. The Bayesian Approach. The Maximum a Posteriori (MAP) Classification. The Posterior Odds Ratio. Balancing the Data. Naive Bayes Classification. Numeric Predictors for Naive Bayes Classification. WEKA: Hands-On Analysis Using Naive Bayes. Bayesian Belief Networks. Using the Bayesian Network to Find Probabilities. WEKA: Hands-On Analysis Using Bayes Net.
 6. Genetic Algorithms. Introduction to Genetic Algorithms. The Basic Framework of a Genetic Algorithm. A Simple Example of Genetic Algorithms at Work. Modifications and Enhancements: Selection. Modifications and Enhancements: Crossover. Genetic Algorithms for Real-Valued Variables. Using Genetic Algorithms to Train a Neural Network. WEKA: Hands-On Analysis Using Genetic Algorithms.
 7. Case Study: Modeling Response to Direct-Mail Marketing. The Cross-Industry Standard Process for Data Mining: CRISP-DM. Business Understanding Phase. Data Understanding and Data Preparation Phases. The Modeling Phase and the Evaluation Phase.
 Index.
 (source: Nielsen Book Data)
 Publisher's Summary
 Apply powerful Data Mining Methods and Models to leverage your data for actionable results. "Data Mining Methods and Models" provides: the latest techniques for uncovering hidden nuggets of information; the insight into how the data mining algorithms actually work; and the hands-on experience of performing data mining on large data sets. "Data Mining Methods and Models": applies a "white box" methodology, emphasizing an understanding of the model structures underlying the software; walks the reader through the various algorithms and provides examples of the operation of the algorithms on actual large data sets, including a detailed case study, "Modeling Response to Direct-Mail Marketing"; tests the reader's level of understanding of the concepts and methodologies, with over 110 chapter exercises; demonstrates the Clementine data mining software suite, WEKA open source data mining software, SPSS statistical software, and Minitab statistical software; and includes a companion Web site, where the data sets used in the book may be downloaded, along with a comprehensive set of data mining resources. Faculty adopters of the book have access to an array of helpful resources, including solutions to all exercises, a PowerPoint® presentation of each chapter, sample data mining course projects and accompanying data sets, and multiple-choice chapter quizzes. With its emphasis on learning by doing, this is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field. An Instructor's Manual presenting detailed solutions to all the problems in the book is available online.
 (source: Nielsen Book Data)
Bibliographic information
 Publication date
 2006
 Responsibility
 Daniel T. Larose.
 ISBN
 0471666564
 9780471666561