
📝 Computer Vision Tasks: Classification vs. Detection vs. Segmentation vs. Recognition
1. Image Classification
- What it does: Identifies the main object or scene in an image.
- Output: A single label (class).
- Example: Input → Image of a cat; Output → "Cat"
- Use Case: Photo tagging, disease diagnosis from X-rays.
- Model: VGG, ResNet, EfficientNet.
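The "single label" output can be sketched as follows. This is a minimal illustration, assuming the classifier has already produced one raw score (logit) per class; the class names, the logit values, and the helper names `softmax` and `classify` are hypothetical and not tied to any particular model listed above.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, class_names):
    """Return the single most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical logits for one image over three hypothetical classes.
label, confidence = classify([2.0, 0.5, -1.0], ["cat", "dog", "bird"])
print(label)  # cat
```

Whatever the backbone, the final step is usually this kind of argmax over per-class scores, which is why the task yields exactly one label per image.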
2. Object Detection
- What it does: Locates and identifies multiple objects in an image.
- Output: Labels + bounding boxes for each object.
- Example: Input → Street image; Output → [{"label": "car", "bbox": [...]}, {"label": "person", "bbox": [...]}]
- Use Case: Self-driving cars, surveillance systems.
- Model: YOLO, Faster R-CNN, SSD.
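Detectors typically emit several overlapping candidate boxes per object, so a common post-processing step is non-maximum suppression (NMS) based on intersection over union (IoU). The sketch below assumes boxes are `[x_min, y_min, x_max, y_max]` and that each detection carries a `score` field; both are assumptions for illustration, since the example above leaves the box contents unspecified. The coordinates are made up.

```python
def iou(a, b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_threshold=0.5):
    """Keep the highest-scoring box among same-label boxes that overlap heavily."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        overlaps_kept = any(
            det["label"] == k["label"]
            and iou(det["bbox"], k["bbox"]) >= iou_threshold
            for k in kept
        )
        if not overlaps_kept:
            kept.append(det)
    return kept

# Hypothetical raw detections: two near-duplicate "car" boxes and one "person".
raw = [
    {"label": "car", "bbox": [0, 0, 10, 10], "score": 0.9},
    {"label": "car", "bbox": [1, 1, 11, 11], "score": 0.8},
    {"label": "person", "bbox": [20, 20, 30, 40], "score": 0.7},
]
final = nms(raw)
print([d["label"] for d in final])  # ['car', 'person']
```

The greedy loop is quadratic in the number of boxes, which is fine for a sketch; production libraries ship optimized versions.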
3. Semantic Segmentation
- What it does: Classifies every pixel in the image.
- Output: Pixel-wise class labels (same label for same-class objects).
- Example: Input → Cityscape image; Output → Each pixel labeled as "road", "sky", "building", etc.
- Use Case: Medical imaging, autonomous navigation.
- Model: U-Net, DeepLab, FCN.
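Pixel-wise classification amounts to running the classification argmax at every pixel. The sketch below assumes the network output is a height × width grid of per-class scores; the tiny 2×2 score grid, the `CLASS_NAMES` list, and the `segment` helper are hypothetical.

```python
CLASS_NAMES = ["road", "sky", "building"]  # hypothetical class list

def segment(scores):
    """Map an H x W grid of per-class scores to an H x W grid of class indices."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# Hypothetical scores for a 2 x 2 image: one [road, sky, building] triple per pixel.
scores = [
    [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1]],
    [[0.9, 0.05, 0.05], [0.3, 0.1, 0.6]],
]
label_map = segment(scores)
named = [[CLASS_NAMES[i] for i in row] for row in label_map]
print(named)  # [['sky', 'sky'], ['road', 'building']]
```

Note that the two "sky" pixels receive the same label with nothing to tell them apart, which is exactly the "same label for same-class objects" property.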
4. Instance Segmentation
- What it does: Like semantic segmentation but differentiates between instances of the same class.
- Output: Pixel-wise labels with unique IDs per object.
- Example: Two dogs in image → dog1 and dog2 are separately labeled.
- Use Case: Crowd analysis, robotics.
- Model: Mask R-CNN.
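To illustrate the output format (a unique ID per object), the sketch below splits a binary "dog" mask into instances using connected-component labeling. This is a deliberately simple stand-in, not how Mask R-CNN works: it only separates objects that do not touch, whereas learned instance segmentation can also separate overlapping objects. The mask and the `label_instances` helper are hypothetical.

```python
from collections import deque

def label_instances(mask):
    """Assign a distinct positive ID to each 4-connected blob of foreground pixels."""
    h, w = len(mask), len(mask[0])
    ids = [[0] * w for _ in range(h)]  # 0 means background
    next_id = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and ids[y][x] == 0:
                next_id += 1
                ids[y][x] = next_id
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one blob
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and ids[ny][nx] == 0:
                            ids[ny][nx] = next_id
                            queue.append((ny, nx))
    return ids, next_id

# Hypothetical semantic mask: 1 = "dog" pixel, 0 = background.
dog_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
instance_ids, count = label_instances(dog_mask)
print(count)  # 2
```

Here both blobs share the semantic class "dog", but the ID map distinguishes dog1 (ID 1) from dog2 (ID 2).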
5. Object Recognition
- What it does: Recognizes the specific identity of an object or person.
- Output: Identity of the object or person.
- Example: Input → Face image; Output → "Barack Obama"
- Use Case: Face unlock, biometric security.
- Model: FaceNet, ArcFace, Siamese Networks.
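Embedding-based recognizers generally map each image to a vector and then compare that vector against a gallery of known identities. The sketch below shows only the matching step, assuming embeddings are already computed; the 3-dimensional vectors, the identity names, the 0.8 threshold, and the `identify` helper are all hypothetical (real embeddings have far more dimensions).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(query, gallery, threshold=0.8):
    """Return the best-matching identity, or "unknown" if nothing is similar enough."""
    best_name, best_score = None, -1.0
    for name, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return "unknown", best_score
    return best_name, best_score

# Hypothetical gallery of enrolled identities and their embeddings.
gallery = {"person_a": [1.0, 0.0, 0.0], "person_b": [0.0, 1.0, 0.0]}

match, _ = identify([0.9, 0.1, 0.0], gallery)
stranger, _ = identify([0.5, 0.5, 0.7], gallery)
print(match, stranger)  # person_a unknown
```

The threshold is what separates recognition from plain classification: a query that resembles no enrolled identity is rejected rather than forced into the nearest class.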
🧠 Summary Table
Task | Output Type | Level of Understanding | Example Use |
---|---|---|---|
Classification | Single label | Global | Identify if image is a cat or dog |
Detection | Labels + Bounding Boxes | Local | Find where cats and dogs are |
Semantic Segmentation | Pixel-wise labels (class only) | Detailed | Mark all road pixels |
Instance Segmentation | Pixel-wise labels (unique per object) | Very detailed | Distinguish between two dogs |
Recognition | Object/Person ID | Specific | Recognize a face or product |