Computer Vision

Computer vision is an interdisciplinary field focused on enabling computers to gain high-level understanding from images and video. It covers automatically extracting, analyzing, and interpreting visual information to produce outputs such as labels, measurements, 3D structure, or decisions.

In practice, computer vision methods combine geometry, physics, statistics, and machine learning to connect pixel data to semantic concepts like objects, actions, and scenes.

Image processing vs. computer vision

Image processing primarily transforms images (e.g., denoising, contrast enhancement, geometric warping); the output is another image.

Computer vision uses images/video as input but often outputs information about the scene (e.g., detections, segmentation masks, pose estimates, tracking results, or a decision), which may then drive downstream behavior in a larger system.
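
The distinction can be illustrated with a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows of 0-255 intensities. The function names `stretch_contrast` and `bright_region_box` and the threshold of 128 are hypothetical choices made for this example, and the "vision" step is a deliberately toy stand-in for a real detector.

```python
from typing import List, Optional, Tuple

# Assumption for this sketch: a grayscale image is a list of rows of 0-255 ints.
Image = List[List[int]]


def stretch_contrast(image: Image) -> Image:
    """Image processing: image in, image out (linear contrast stretch)."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy.
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


def bright_region_box(
    image: Image, threshold: int = 128
) -> Optional[Tuple[int, int, int, int]]:
    """Toy computer vision: image in, information about the scene out.

    Returns the (top, left, bottom, right) bounding box of all pixels at or
    above the threshold, or None if no pixel qualifies.
    """
    coords = [
        (r, c)
        for r, row in enumerate(image)
        for c, p in enumerate(row)
        if p >= threshold
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


# A dim 4x4 image with a slightly brighter 2x2 square in the middle.
image = [
    [10, 10, 10, 10],
    [10, 90, 90, 10],
    [10, 90, 90, 10],
    [10, 10, 10, 10],
]

enhanced = stretch_contrast(image)   # still an image
box = bright_region_box(enhanced)    # a description of the scene, not an image
print(box)
```

Here the first function returns another image, while the second returns coordinates that a larger system could act on. On the raw, low-contrast image the toy detector finds nothing at this threshold, which also shows how an image processing step often serves as preprocessing for a vision step.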