# Decision Tree Algorithm

## Overview
A **Decision Tree** is a supervised machine learning algorithm used for both classification and regression tasks.
It works by recursively splitting the dataset into smaller subsets based on feature values until a stopping criterion is met.

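As a concrete illustration of this workflow, here is a minimal sketch using scikit-learn's `DecisionTreeClassifier` on the bundled Iris dataset (an assumption for illustration; the library and hyperparameters are not prescribed by this document):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The tree recursively splits the training data until leaves are pure
# or the stopping criterion (here, max_depth=3) is reached.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data
```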
---

IG(S, A) = H(S) - Σ ( |Sv| / |S| ) * H(Sv)

Where:
- `S` = dataset
- `A` = attribute (feature)
- `Sv` = subset after splitting by `A`

A split with the **highest information gain** is chosen.

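The formula above can be sketched directly in Python. This is a toy illustration (the dataset and the "perfect" split are invented for the example); `splits` stands for the label subsets `Sv` produced by splitting on an attribute `A`:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -Σ p(i) * log2(p(i)) over the class probabilities in S."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, splits):
    """IG(S, A) = H(S) - Σ (|Sv| / |S|) * H(Sv); `splits` are the
    label subsets Sv produced by splitting S on attribute A."""
    n = len(labels)
    return entropy(labels) - sum(len(sv) / n * entropy(sv) for sv in splits)

# Toy dataset: a 50/50 class mix has entropy 1.0; a split that
# separates the classes perfectly recovers all of it as gain.
labels = ["yes", "yes", "no", "no"]
perfect = [["yes", "yes"], ["no", "no"]]
print(entropy(labels))                    # 1.0
print(information_gain(labels, perfect))  # 1.0
```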

Gini(S) = 1 - Σ (p(i)²)

Where:
- `p(i)` = probability of class `i` in dataset `S`

A pure dataset has Gini = 0.

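A short Python sketch of the Gini formula, on an invented toy example, shows both the pure case (Gini = 0) and the maximally impure two-class case:

```python
from collections import Counter

def gini(labels):
    """Gini(S) = 1 - Σ p(i)² over the class probabilities in S."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(["a", "a", "a", "a"]))  # 0.0  (pure dataset)
print(gini(["a", "a", "b", "b"]))  # 0.5  (maximally impure for 2 classes)
```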
---

## Practical Use Cases
- **Business**: Predicting customer churn
- **Finance**: Credit scoring / loan approval
- **Healthcare**: Diagnosing diseases based on symptoms
- **Cybersecurity**: Spam / phishing detection

---

## Advantages
- Simple to understand and visualize
- Handles both numerical and categorical data
- Requires little preprocessing (no normalization or scaling)

---

## Limitations
- Prone to overfitting (can be solved using pruning or ensembles like Random Forests)
- Small changes in data can lead to different trees (instability)

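Both mitigations can be sketched with scikit-learn (an illustrative assumption; the hyperparameter values are not prescribed by this document). Depth and leaf-size limits pre-prune a single tree, while a Random Forest averages many randomized trees:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unconstrained tree: grows until every leaf is pure (overfitting risk).
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruned tree: depth and leaf-size limits stop growth early.
pruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5,
                                random_state=0).fit(X, y)
print(full.get_depth(), pruned.get_depth())  # pruned depth is capped at 3

# Ensemble alternative: averaging many trees reduces variance/instability.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```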
---
