Ensemble Methods

1. Explain ensemble methods.
Ensemble methods are a class of machine learning techniques that combine the predictions of multiple models to improve overall performance and generalisation.
2. What is the assumption that ensemble methods adhere to?
Ensemble methods rely on the assumption that the collective decision-making of a larger group of models is better than that of any single model.
3. Why are ensemble methods generally superior to a single model?
They achieve better performance by combining different individual models (“weak learners”) to produce a stronger learner, typically through bagging or boosting. Aggregating the models averages out their individual mistakes, reducing the risk of overfitting while maintaining strong predictive performance.
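As a minimal sketch of this idea (the dataset and the three base models below are arbitrary illustrative choices, not a prescribed recipe), a simple majority-voting ensemble in scikit-learn might look like:

```python
# A minimal sketch of combining several models by majority vote.
# The dataset and base models are illustrative choices only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Three different learners; the ensemble predicts whichever class gets the most votes.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=5000)),
        ("tree", DecisionTreeClassifier(random_state=42)),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
print("Ensemble accuracy:", ensemble.score(X_test, y_test))
```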

Bagging

1. Explain bagging (Bootstrap Aggregation).
Bagging involves training multiple instances of the same learning algorithm on different subsets of the training data. Each model is trained independently, often in parallel.
2. Explain the main steps to perform bagging algorithms.
  1. Bootstrapping: Different subsets of the training set are generated by random sampling with replacement (bootstrap samples).
  2. Parallel Training: A weak or base learner is trained on each bootstrap sample independently, often in parallel.
  3. Aggregate Results: The average (regression) or majority vote (classification) of the individual predictions is taken to produce a more accurate estimate (see the sketch below).
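The three steps can be made concrete with a short hand-rolled sketch (the dataset, base learner, and number of models are illustrative assumptions; in practice a library class such as scikit-learn's BaggingClassifier does the same thing):

```python
# A minimal hand-rolled bagging sketch: bootstrap, train, aggregate.
# Dataset, base learner, and number of models are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_models = 25
models = []

# 1. Bootstrapping: sample the training set with replacement (some rows repeat, some are omitted).
# 2. Parallel Training: fit one base learner per bootstrap sample; the fits are
#    independent, so in practice they can run in parallel.
for _ in range(n_models):
    idx = rng.integers(0, len(X_train), size=len(X_train))
    tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
    models.append(tree)

# 3. Aggregate Results: majority vote across the individual predictions
#    (use the mean instead for regression).
all_preds = np.array([m.predict(X_test) for m in models])   # shape (n_models, n_test)
majority = (all_preds.mean(axis=0) >= 0.5).astype(int)      # binary majority vote
print("Bagged accuracy:", (majority == y_test).mean())
```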
3. What is the method for dividing the data samples for individual training models in bagging?
The subsets are typically created by random sampling with replacement, a technique known as bootstrapping. This means that some data points may be repeated in a subset, while others may be omitted.
4. Explain the importance of the model independence in bagging.
  • Diversity of Models: The diverse training sets lead to the creation of diverse models, capturing different aspects of the underlying data distribution.
  • Overfitting Mitigation: Model independence helps mitigate overfitting because the individual models are less likely to learn the same noise in the training data.
  • Stability and Reliability: If one model makes a prediction error on a specific subset of the data, other models may compensate for it, leading to a more reliable overall prediction.

Random Forest

1. What is the reasoning behind the name “Random Forest”?
  1. It is a tree-based ML algorithm that leverages the power of multiple decision trees for making decisions. Hence, a ‘forest’ of trees.
  2. It is ‘random’ because the forest consists of randomly created decision trees, each built from a different bootstrap sample (and a random subset of features at each split).
2. Explain the steps to perform random forest algorithm.
  1. Bootstrap sampling: A random sample is drawn from the original dataset with replacement.
  2. Parallel Training: Construct a decision tree for every random subset and perform the training process concurrently.
  3. Aggregate: The average (regression) or majority vote (classification) of the individual trees’ predictions is taken to produce a more accurate estimate (see the sketch below).
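A minimal sketch of these steps with scikit-learn, comparing a single tree to a forest (the dataset and hyperparameters are illustrative assumptions):

```python
# A minimal sketch comparing a single decision tree against a random forest.
# The dataset and hyperparameters are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# A single tree is prone to overfitting the training data.
tree = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)

# The forest bootstraps rows, samples features at each split, and averages the trees' votes.
forest = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_train, y_train)

print("Single tree accuracy:", tree.score(X_test, y_test))
print("Random forest accuracy:", forest.score(X_test, y_test))
```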
3. A standalone decision tree is susceptible to overfitting. How does random forest mitigate the overfitting issue?
Random forest deploys bootstrapped sampling, which means each tree in the forest is trained on a different randomised subset of data points and features. This diversification introduces variability in the training sets of the individual trees and thus reduces the risk of overfitting.

Boosting

1. Explain boosting.
Boosting is an ensemble learning method that sequentially combines the predictions of multiple weak learners (typically simple models) to create a strong learner.
2. Explain the key difference in model training between boosting and bagging.
The key difference is that bagging algorithms usually perform parallel training, whereas boosting algorithms perform sequential training, such that each subsequent model corrects the errors of its predecessor.
3. Explain the main steps to perform boosting algorithms.
  1. Initialisation: Boosting starts by training a weak learner on the entire dataset.
  2. Instance Weighting: Each instance in the dataset is assigned a weight (initially uniform). After a model is trained, misclassified instances are given higher weights, while correctly classified instances receive lower weights.
  3. Sequential Training & Optimisation: Boosting trains a series of weak learners sequentially. Each subsequent learner focuses on the instances that the previous ones misclassified.
  4. Weighted Aggregation: At each step, the weak learner’s prediction is combined with the predictions of the previous models, and the instance weights are adjusted based on the model’s performance. Correctly classified instances have their weights reduced, while misclassified instances have their weights increased.
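As one concrete instance of these steps, here is a minimal AdaBoost sketch in scikit-learn (the dataset and the number of estimators are illustrative assumptions; by default the weak learner is a depth-1 decision stump):

```python
# A minimal boosting sketch using AdaBoost; the default weak learner is a
# depth-1 decision tree (a "stump"). Dataset and settings are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

# Stumps are trained one after another; each round re-weights the training
# instances so that later stumps focus on the previously misclassified ones,
# and the final prediction is a weighted vote over all stumps.
booster = AdaBoostClassifier(n_estimators=100, random_state=2)
booster.fit(X_train, y_train)
print("AdaBoost accuracy:", booster.score(X_test, y_test))
```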
4. Describe how boosting mitigates bias in a model.
Boosting reduces bias by iteratively emphasising the correction of errors made by the previous models in the ensemble. The key mechanism that contributes to bias reduction in boosting is the adaptive learning process, which assigns higher weights to instances that are difficult to classify correctly.
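A compact numerical sketch of this adaptive weighting, in the style of AdaBoost (the labels and predictions below are made-up values for illustration):

```python
# Illustrative AdaBoost-style weight update (values are made up for demonstration).
import numpy as np

y_true = np.array([1, 1, -1, -1, 1])        # true labels in {-1, +1}
y_pred = np.array([1, -1, -1, 1, 1])        # predictions of the current weak learner
w = np.full(len(y_true), 1 / len(y_true))   # weights start uniform

# Weighted error of the current learner and its voting weight (alpha).
err = np.sum(w * (y_pred != y_true))
alpha = 0.5 * np.log((1 - err) / err)

# Misclassified instances are up-weighted, correctly classified ones down-weighted,
# then the weights are renormalised so they sum to 1.
w = w * np.exp(-alpha * y_true * y_pred)
w = w / w.sum()
print(w)  # the two misclassified instances now carry larger weights
```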