Tree Detection and Segmentation Pipeline (DINO + ViT + SAM + YOLO+)
A deep learning pipeline that combines vision transformers (DINO, ViT), the Segment Anything Model (SAM), and YOLO+ to detect and isolate trees in complex scenes. SAM masks are filtered by shape and size, and results can be scored with IoU, precision, recall, and AP.
Key Components
| Model | Purpose |
|---|---|
| DINO | Vision Transformer-based object detection |
| ViT | Backbone feature extraction |
| SAM | Smart segmentation with mask refinement |
| YOLO+ | Real-time object detection |
- Refined Segmentation Masks: Removed irrelevant masks (e.g., humans, background) generated by SAM.
- Post-processing Isolation: Only tree-like objects are segmented with minimal overhead.
- Mask Filtering: Shape and size-based filtering improves mask quality.
- Evaluation Support: Built-in performance evaluation using:
  - IoU (Intersection over Union)
  - Precision & Recall
  - AP (Average Precision)
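The shape and size-based mask filtering listed above could look roughly like the sketch below. It assumes mask records shaped like the dicts returned by SAM's automatic mask generator (`area` in pixels, `bbox` as `[x, y, w, h]`). The function name `filter_masks` and every threshold are illustrative assumptions, not values taken from this project.

```python
from typing import Dict, List


def filter_masks(
    masks: List[Dict],
    image_area: int,
    min_area_frac: float = 0.01,  # assumed: drop specks under 1% of the image
    max_area_frac: float = 0.80,  # assumed: drop near full-frame background masks
    min_fill: float = 0.30,       # assumed: mask must fill >= 30% of its bounding box
    max_aspect: float = 4.0,      # assumed: reject very elongated bounding boxes
) -> List[Dict]:
    """Keep only masks whose size and shape fall inside the given bounds."""
    kept = []
    for mask in masks:
        _, _, width, height = mask["bbox"]
        if width <= 0 or height <= 0:
            continue  # degenerate box, nothing to measure
        area_frac = mask["area"] / image_area
        fill = mask["area"] / (width * height)
        aspect = max(width / height, height / width)
        if (
            min_area_frac <= area_frac <= max_area_frac
            and fill >= min_fill
            and aspect <= max_aspect
        ):
            kept.append(mask)
    return kept


# Hypothetical mask records for a 100 x 100 image.
example_masks = [
    {"area": 50, "bbox": [0, 0, 10, 10]},       # too small
    {"area": 2000, "bbox": [10, 10, 50, 60]},   # plausible object
    {"area": 9500, "bbox": [0, 0, 100, 100]},   # covers almost the whole frame
    {"area": 500, "bbox": [0, 0, 100, 10]},     # long thin strip
]
kept_masks = filter_masks(example_masks, image_area=100 * 100)
```

Purely geometric rules like these are cheap, but they cannot tell a tree from another object of similar size and shape, so they work best after a detector has already proposed tree regions.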
Quick Start
Requirements
```bash
pip install jupyter
```
Inference
```bash
# Convert the notebook to a script; nbconvert keeps the notebook's base name and case
jupyter nbconvert --to script Sam_Filtered_ViT_Segmentation.ipynb
python Sam_Filtered_ViT_Segmentation.py --image Images/Trees/Tree.jpg
```
Options
| Argument | Description |
|---|---|
| --model | Choose model: dino, yolo+, sam |
| --evaluate | Run evaluation metrics |
| --refine | Apply shape/size mask filtering |
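For readers adapting the script, the options above could be declared with `argparse` along the following lines. This is a hypothetical sketch: the notebook's actual argument handling is not shown here, and the `build_parser` name, the default model, and the help strings are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line options listed in the table (assumed wiring)."""
    parser = argparse.ArgumentParser(description="Tree detection and segmentation")
    parser.add_argument("--image", required=True, help="Path to the input image")
    parser.add_argument(
        "--model",
        choices=["dino", "yolo+", "sam"],
        default="sam",  # assumed default; the source does not state one
        help="Model to run",
    )
    parser.add_argument("--evaluate", action="store_true", help="Run evaluation metrics")
    parser.add_argument("--refine", action="store_true", help="Apply shape/size mask filtering")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(
    ["--image", "Images/Trees/Tree.jpg", "--model", "dino", "--refine"]
)
```

Using `choices` makes `argparse` reject unknown model names up front, and `store_true` flags default to `False` when omitted.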
Evaluation
We evaluate performance using:
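- IoU - Overlap between the predicted and ground-truth regions, divided by their union
- Precision / Recall - Share of predictions that are correct / share of ground truth that is recovered
- AP - Average precision, summarizing the precision-recall curve across confidence thresholds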
Results are printed and logged automatically.
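To make the metrics concrete, here is a minimal sketch that computes them with the standard library only. It assumes binary masks given as rows of 0/1 pixels and, for AP, a list of `(confidence, is_true_positive)` detections. All function names are hypothetical, and the project's own evaluation code may differ (for example, by using interpolated AP).

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixels


def pixel_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) over pixels."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union; two empty masks score 1.0 (assumed convention)."""
    tp, fp, fn = pixel_counts(pred, truth)
    union = tp + fp + fn
    return tp / union if union else 1.0


def precision_recall(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Pixel-level precision and recall; 0.0 when the denominator is empty."""
    tp, fp, fn = pixel_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(detections: List[Tuple[float, bool]], num_truth: int) -> float:
    """Uninterpolated AP: mean of precision at the rank of each true positive."""
    if num_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            tp += 1
            precision_sum += tp / rank
    return precision_sum / num_truth


pred_mask = [[1, 1, 0], [0, 1, 0]]
truth_mask = [[1, 0, 0], [0, 1, 1]]
print(iou(pred_mask, truth_mask))               # 0.5
print(precision_recall(pred_mask, truth_mask))
print(average_precision([(0.9, True), (0.8, False), (0.7, True)], num_truth=2))
```

In the example, two pixels overlap out of four in the union, so the IoU is 0.5.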
Project Structure
```
├── Images/
│   ├── Trees/
│   └── NotTrees/
├── Labels/
│   ├── Trees/
│   └── NotTrees/
├── dataset.yaml
├── Sam_Filtered_ViT_Segmentation
├── README.md
└── LICENSE
```
Authors & Contributions
- DINO & ViT Integration - @Zack4DEV
- SAM Post-Processing - @Zack4DEV
- Evaluation Engine - @Zack4DEV
License
MIT License - see LICENSE for details.
Future Work
- ✅ Integrate image captioning for detected tree regions
- ⏳ Multi-class support (e.g., tree species)
- ⏳ Web UI with Streamlit or Gradio