Data mining : practical machine learning tools and techniques / Ian H. Witten, Eibe Frank, Mark A. Hall, Christopher J. Pal
- Author:
- Witten, I. H. (Ian H.)
- Published:
- Amsterdam ; Boston : Elsevier, [2017]
- Edition:
- Fourth Edition.
- Physical Description:
- xxxii, 621 pages ; 24 cm
- Additional Creators:
- Frank, Eibe
- Hall, Mark A. (Mark Andrew)
- Pal, Christopher J.
- Contents:
- Machine generated contents note:
- ch. 1 What's it all about? -- 1.1. Data Mining and Machine Learning -- Describing Structural Patterns -- Machine Learning -- Data Mining -- 1.2. Simple Examples: The Weather Problem and Others -- Weather Problem -- Contact Lenses: An Idealized Problem -- Irises: A Classic Numeric Dataset -- CPU Performance: Introducing Numeric Prediction -- Labor Negotiations: A More Realistic Example -- Soybean Classification: A Classic Machine Learning Success -- 1.3. Fielded Applications -- Web Mining -- Decisions Involving Judgment -- Screening Images -- Load Forecasting -- Diagnosis -- Marketing and Sales -- Other Applications -- 1.4. Data Mining Process -- 1.5. Machine Learning and Statistics -- 1.6. Generalization as Search -- Enumerating the Concept Space -- Bias -- 1.7. Data Mining and Ethics -- Reidentification -- Using Personal Information -- Wider Issues -- 1.8. Further Reading and Bibliographic Notes
- ch. 2 Input: concepts, instances, attributes -- 2.1. What's a Concept? -- 2.2. What's in an Example? -- Relations -- Other Example Types -- 2.3. What's in an Attribute? -- 2.4. Preparing the Input -- Gathering the Data Together -- ARFF Format -- Sparse Data -- Attribute Types -- Missing Values -- Inaccurate Values -- Unbalanced Data -- Getting to Know Your Data -- 2.5. Further Reading and Bibliographic Notes
- ch. 3 Output: knowledge representation -- 3.1. Tables -- 3.2. Linear Models -- 3.3. Trees -- 3.4. Rules -- Classification Rules -- Association Rules -- Rules With Exceptions -- More Expressive Rules -- 3.5. Instance-Based Representation -- 3.6. Clusters -- 3.7. Further Reading and Bibliographic Notes
- ch. 4 Algorithms: the basic methods -- 4.1. Inferring Rudimentary Rules -- Missing Values and Numeric Attributes -- 4.2. Simple Probabilistic Modeling -- Missing Values and Numeric Attributes -- Naive Bayes for Document Classification -- Remarks -- 4.3. Divide-and-Conquer: Constructing Decision Trees -- Calculating Information -- Highly Branching Attributes -- 4.4. Covering Algorithms: Constructing Rules -- Rules Versus Trees -- Simple Covering Algorithm -- Rules Versus Decision Lists -- 4.5. Mining Association Rules -- Item Sets -- Association Rules -- Generating Rules Efficiently -- 4.6. Linear Models -- Numeric Prediction: Linear Regression -- Linear Classification: Logistic Regression -- Linear Classification Using the Perceptron -- Linear Classification Using Winnow -- 4.7. Instance-Based Learning -- Distance Function -- Finding Nearest Neighbors Efficiently -- Remarks -- 4.8. Clustering -- Iterative Distance-Based Clustering -- Faster Distance Calculations -- Choosing the Number of Clusters -- Hierarchical Clustering -- Example of Hierarchical Clustering -- Incremental Clustering -- Category Utility -- Remarks -- 4.9. Multi-instance Learning -- Aggregating the Input -- Aggregating the Output -- 4.10. Further Reading and Bibliographic Notes -- 4.11. WEKA Implementations
- ch. 5 Credibility: evaluating what's been learned -- 5.1. Training and Testing -- 5.2. Predicting Performance -- 5.3. Cross-Validation -- 5.4. Other Estimates -- Leave-One-Out -- Bootstrap -- 5.5. Hyperparameter Selection -- 5.6. Comparing Data Mining Schemes -- 5.7. Predicting Probabilities -- Quadratic Loss Function -- Informational Loss Function -- Remarks -- 5.8. Counting the Cost -- Cost-Sensitive Classification -- Cost-Sensitive Learning -- Lift Charts -- ROC Curves -- Recall-Precision Curves -- Remarks -- Cost Curves -- 5.9. Evaluating Numeric Prediction -- 5.10. MDL Principle -- 5.11. Applying the MDL Principle to Clustering -- 5.12. Using a Validation Set for Model Selection -- 5.13. Further Reading and Bibliographic Notes
- ch. 6 Trees and rules -- 6.1. Decision Trees -- Numeric Attributes -- Missing Values -- Pruning -- Estimating Error Rates -- Complexity of Decision Tree Induction -- From Trees to Rules -- C4.5: Choices and Options -- Cost-Complexity Pruning -- Discussion -- 6.2. Classification Rules -- Criteria for Choosing Tests -- Missing Values, Numeric Attributes -- Generating Good Rules -- Using Global Optimization -- Obtaining Rules From Partial Decision Trees -- Rules With Exceptions -- Discussion -- 6.3. Association Rules -- Building a Frequent Pattern Tree -- Finding Large Item Sets -- Discussion -- 6.4. WEKA Implementations
- ch. 7 Extending instance-based and linear models -- 7.1. Instance-Based Learning -- Reducing the Number of Exemplars -- Pruning Noisy Exemplars -- Weighting Attributes -- Generalizing Exemplars -- Distance Functions for Generalized Exemplars -- Generalized Distance Functions -- Discussion -- 7.2. Extending Linear Models -- Maximum Margin Hyperplane -- Nonlinear Class Boundaries -- Support Vector Regression -- Kernel Ridge Regression -- Kernel Perceptron -- Multilayer Perceptrons -- Radial Basis Function Networks -- Stochastic Gradient Descent -- Discussion -- 7.3. Numeric Prediction With Local Linear Models -- Model Trees -- Building the Tree -- Pruning the Tree -- Nominal Attributes -- Missing Values -- Pseudocode for Model Tree Induction -- Rules From Model Trees -- Locally Weighted Linear Regression -- Discussion -- 7.4. WEKA Implementations
- ch. 8 Data transformations -- 8.1. Attribute Selection -- Scheme-Independent Selection -- Searching the Attribute Space -- Scheme-Specific Selection -- 8.2. Discretizing Numeric Attributes -- Unsupervised Discretization -- Entropy-Based Discretization -- Other Discretization Methods -- Entropy-Based Versus Error-Based Discretization -- Converting Discrete to Numeric Attributes -- 8.3. Projections -- Principal Component Analysis -- Random Projections -- Partial Least Squares Regression -- Independent Component Analysis -- Linear Discriminant Analysis -- Quadratic Discriminant Analysis -- Fisher's Linear Discriminant Analysis -- Text to Attribute Vectors -- Time Series -- 8.4. Sampling -- Reservoir Sampling -- 8.5. Cleansing -- Improving Decision Trees -- Robust Regression -- Detecting Anomalies -- One-Class Learning -- Outlier Detection -- Generating Artificial Data -- 8.6. Transforming Multiple Classes to Binary Ones -- Simple Methods -- Error-Correcting Output Codes -- Ensembles of Nested Dichotomies -- 8.7. Calibrating Class Probabilities -- 8.8. Further Reading and Bibliographic Notes -- 8.9. WEKA Implementations
- ch. 9 Probabilistic methods -- 9.1. Foundations -- Maximum Likelihood Estimation -- Maximum a Posteriori Parameter Estimation -- 9.2. Bayesian Networks -- Making Predictions -- Learning Bayesian Networks -- Specific Algorithms -- Data Structures for Fast Learning -- 9.3. Clustering and Probability Density Estimation -- Expectation Maximization Algorithm for a Mixture of Gaussians -- Extending the Mixture Model -- Clustering Using Prior Distributions -- Clustering With Correlated Attributes -- Kernel Density Estimation -- Comparing Parametric, Semiparametric and Nonparametric Density Models for Classification -- 9.4. Hidden Variable Models -- Expected Log-Likelihoods and Expected Gradients -- Expectation Maximization Algorithm -- Applying the Expectation Maximization Algorithm to Bayesian Networks -- 9.5. Bayesian Estimation and Prediction -- Probabilistic Inference Methods -- 9.6. Graphical Models and Factor Graphs -- Graphical Models and Plate Notation -- Probabilistic Principal Component Analysis -- Latent Semantic Analysis -- Using Principal Component Analysis for Dimensionality Reduction -- Probabilistic LSA -- Latent Dirichlet Allocation -- Factor Graphs -- Markov Random Fields -- Computing Using the Sum-Product and Max-Product Algorithms -- 9.7. Conditional Probability Models -- Linear and Polynomial Regression as Probability Models -- Using Priors on Parameters -- Multiclass Logistic Regression -- Gradient Descent and Second-Order Methods -- Generalized Linear Models -- Making Predictions for Ordered Classes -- Conditional Probabilistic Models Using Kernels -- 9.8. Sequential and Temporal Models -- Markov Models and N-gram Methods -- Hidden Markov Models -- Conditional Random Fields -- 9.9. Further Reading and Bibliographic Notes -- Software Packages and Implementations -- 9.10. WEKA Implementations
- ch. 10 Deep learning -- 10.1. Deep Feedforward Networks -- MNIST Evaluation -- Losses and Regularization -- Deep Layered Network Architecture -- Activation Functions -- Backpropagation Revisited -- Computation Graphs and Complex Network Structures -- Checking Backpropagation Implementations -- 10.2. Training and Evaluating Deep Networks -- Early Stopping -- Validation, Cross-Validation, and Hyperparameter Tuning -- Mini-Batch-Based Stochastic Gradient Descent -- Pseudocode for Mini-Batch Based Stochastic Gradient Descent -- Learning Rates and Schedules -- Regularization With Priors on Parameters -- Dropout -- Batch Normalization -- Parameter Initialization -- Unsupervised Pretraining -- Data Augmentation and Synthetic Transformations -- 10.3. Convolutional Neural Networks -- ImageNet Evaluation and Very Deep Convolutional Networks -- From Image Filtering to Learnable Convolutional Layers -- Convolutional Layers and Gradients -- Pooling and Subsampling Layers and Gradients -- Implementation -- 10.4. Autoencoders -- Pretraining Deep Autoencoders With RBMs -- Denoising Autoencoders and Layerwise Training -- Combining Reconstructive and Discriminative Learning -- 10.5. Stochastic Deep Networks -- Boltzmann Machines -- Restricted Boltzmann Machines -- Contrastive Divergence -- Categorical and Continuous Variables -- Deep Boltzmann Machines -- Deep Belief Networks -- 10.6. Recurrent Neural Networks -- Exploding and Vanishing Gradients -- Other Recurrent Network Architectures -- 10.7. Further Reading and Bibliographic Notes -- 10.8. Deep Learning Software and Network Implementations -- Theano -- Tensor Flow -- Torch -- Computational Network Toolkit -- Caffe -- Deeplearning4j -- Other Packages: Lasagne, Keras, and cuDNN -- 10.9. WEKA Implementations
- ch. 11 Beyond supervised and unsupervised learning -- 11.1. Semisupervised Learning -- Clustering for Classification -- Cotraining -- EM and Cotraining -- Neural Network Approaches -- 11.2. Multi-instance Learning -- Converting to Single-Instance Learning -- Upgrading Learning Algorithms -- Dedicated Multi-instance Methods -- 11.3. Further Reading and Bibliographic Notes -- 11.4. WEKA Implementations
- ch. 12 Ensemble learning -- 12.1. Combining Multiple Models -- 12.2. Bagging -- Bias-Variance Decomposition -- Bagging With Costs -- 12.3. Randomization -- Randomization Versus Bagging -- Rotation Forests -- 12.4. Boosting -- AdaBoost -- Power of Boosting -- 12.5. Additive Regression -- Numeric Prediction -- Additive Logistic Regression -- 12.6. Interpretable Ensembles -- Option Trees -- Logistic Model Trees -- 12.7. Stacking -- 12.8. Further Reading and Bibliographic Notes -- 12.9. WEKA Implementations
- ch. 13 Moving on: applications and beyond -- 13.1. Applying Machine Learning -- 13.2. Learning From Massive Datasets -- 13.3. Data Stream Learning -- 13.4. Incorporating Domain Knowledge -- 13.5. Text Mining -- Document Classification and Clustering -- Information Extraction -- Natural Language Processing -- 13.6. Web Mining -- Wrapper Induction -- Page Rank -- 13.7. Images and Speech -- Images -- Speech -- 13.8. Adversarial Situations -- 13.9. Ubiquitous Data Mining -- 13.10. Further Reading and Bibliographic Notes -- 13.11. WEKA Implementations.
- Summary:
- This work offers a grounding in machine learning concepts combined with practical advice on applying machine learning tools and techniques in real-world data mining situations.
- Subject(s):
- ISBN:
- 9780128042915
0128042915
- Bibliography Note:
- Includes bibliographical references (pages 573-601) and index.
- Source of Acquisition:
- Purchased with funds from the Class of 1962 Libraries Endowment; 2017