Commit ca2f18c (parent 4ddf98f)

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci

1 file changed: docs/decision_tree.md (14 additions, 14 deletions)
# Decision Tree Algorithm

## Overview
A **Decision Tree** is a supervised machine learning algorithm used for both classification and regression tasks.
It works by recursively splitting the dataset into smaller subsets based on feature values until a stopping criterion is met.

---
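The overview's recursive-splitting loop can be sketched in a few lines of plain Python. This is an illustrative toy (the function name and the toy `data` rows are mine, not from docs/decision_tree.md); a real tree would choose the split feature by information gain or Gini impurity rather than taking the first one:

```python
def build_tree(rows, features, depth=0, max_depth=3):
    """Recursively split `rows` until a stopping criterion is met:
    the node is pure, no features remain, or max_depth is reached."""
    labels = [r["label"] for r in rows]
    if len(set(labels)) == 1 or not features or depth >= max_depth:
        # Leaf node: predict the majority label.
        return max(set(labels), key=labels.count)
    feature = features[0]  # a real tree picks the best split (IG / Gini)
    branches = {}
    for value in {r[feature] for r in rows}:
        subset = [r for r in rows if r[feature] == value]
        branches[value] = build_tree(subset, features[1:], depth + 1, max_depth)
    return (feature, branches)

data = [
    {"outlook": "sunny", "label": "yes"},
    {"outlook": "rainy", "label": "no"},
]
feature, branches = build_tree(data, ["outlook"])
print(branches["sunny"], branches["rainy"])  # yes no
```

The three conditions in the leaf check correspond to the "stopping criterion" the overview mentions: purity, exhausted features, and a depth limit.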
IG(S, A) = H(S) - Σ ( |Sv| / |S| ) * H(Sv)

Where:
- `S` = dataset
- `A` = attribute (feature)
- `Sv` = subset after splitting by `A`

A split with the **highest information gain** is chosen.
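The `IG(S, A)` formula above translates directly into stdlib Python; this is a hand-rolled sketch for illustration (the helper names are mine, not code from the repository):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H(S) = -Σ p(i) * log2(p(i)) over the class labels in S."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(dataset, attribute):
    """IG(S, A) = H(S) - Σ (|Sv| / |S|) * H(Sv), where Sv is the
    subset of rows whose value of `attribute` is v."""
    labels = [row["label"] for row in dataset]
    subsets = {}
    for row in dataset:
        subsets.setdefault(row[attribute], []).append(row["label"])
    weighted = sum(
        (len(sv) / len(dataset)) * entropy(sv) for sv in subsets.values()
    )
    return entropy(labels) - weighted

# Toy dataset: "outlook" perfectly separates the labels,
# so splitting on it recovers all the entropy of S.
data = [
    {"outlook": "sunny", "label": "yes"},
    {"outlook": "sunny", "label": "yes"},
    {"outlook": "rainy", "label": "no"},
    {"outlook": "rainy", "label": "no"},
]
print(information_gain(data, "outlook"))  # 1.0 (entropy of a 50/50 split)
```

A perfect split yields `IG = H(S)` because every subset `Sv` is pure and contributes zero weighted entropy.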

Gini(S) = 1 - Σ (p(i)²)

Where:
- `p(i)` = probability of class `i` in dataset `S`

A pure dataset has Gini = 0.

---
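The Gini formula is equally short to implement; a minimal stdlib-only sketch (the function is illustrative rather than the repository's code):

```python
from collections import Counter

def gini(labels):
    """Gini(S) = 1 - Σ p(i)², with p(i) the share of class i in S."""
    total = len(labels)
    return 1 - sum((n / total) ** 2 for n in Counter(labels).values())

print(gini(["yes", "yes", "yes"]))       # 0.0 -> pure dataset, Gini = 0
print(gini(["yes", "no", "yes", "no"]))  # 0.5 -> maximally impure for 2 classes
```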

## Practical Use Cases
- **Business**: Predicting customer churn
- **Finance**: Credit scoring / loan approval
- **Healthcare**: Diagnosing diseases based on symptoms
- **Cybersecurity**: Spam / phishing detection

---

## Advantages
- Simple to understand and visualize
- Handles both numerical and categorical data
- Requires little preprocessing (no normalization or scaling)

---

## Limitations
- Prone to overfitting (can be solved using pruning or ensembles like Random Forests)
- Small changes in data can lead to different trees (instability)

---
