Computer Vision Week 1 Introduction
What is computer vision
Extracting descriptions of the world from images
Related Disciplines
- Image processing
- Computer graphics
image processing: image -> image
computer vision: image -> description
computer graphics: description -> image
- Pattern recognition
- recognising and classifying stimuli in images and other datasets
- Photogrammetry
- obtaining measurements from images
- Biological vision
- understanding visual perception in humans and animals (studied in Neuroscience, Psychology, Psychophysics)
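The three mappings listed under Related Disciplines (image -> image, image -> description, description -> image) can be sketched as three functions. This is only an illustration, not part of the lecture material: the function names `render`, `invert` and `describe`, the dict-based "description", and the tiny grayscale grid are all assumptions made for the example.

```python
# Illustrative sketch only: an image is a small grayscale grid (list of rows),
# and a "description" is a dict holding the image size and one bright box.

def render(description):
    """Computer graphics: description -> image."""
    height, width = description["height"], description["width"]
    top, left, bottom, right = description["box"]
    return [[255 if top <= r <= bottom and left <= c <= right else 0
             for c in range(width)]
            for r in range(height)]

def invert(image):
    """Image processing: image -> image."""
    return [[255 - pixel for pixel in row] for row in image]

def describe(image, threshold=128):
    """Computer vision: image -> description (bounding box of bright pixels)."""
    bright = [(r, c) for r, row in enumerate(image)
              for c, pixel in enumerate(row) if pixel >= threshold]
    box = None
    if bright:
        rows = [r for r, _ in bright]
        cols = [c for _, c in bright]
        box = (min(rows), min(cols), max(rows), max(cols))
    return {"height": len(image), "width": len(image[0]), "box": box}

scene = {"height": 4, "width": 5, "box": (1, 1, 2, 3)}
image = render(scene)        # description -> image
negative = invert(image)     # image -> image
recovered = describe(image)  # image -> description
```

In this toy setting `describe` undoes `render` exactly, which is only possible because the scene is a single box on a black background; real images do not allow such a clean inversion.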
Applications
- Handwriting detection
- OCR: optical character recognition
- Face detection/Smile detection
- Face recognition
- Biometrics: Iris Recognition, Fingerprint Recognition
- People tracking/Object tracking
- Content-based Image Retrieval
- Reverse image search
- Landmark recognition
- Driver assistance
- Space exploration
- Medical imaging
- 3D models from images
Why is vision difficult
~50% of the cerebral cortex is devoted to vision
Vision consumes ~10% of total human energy consumption.
Major challenges
- One image -> many interpretations
- problem is ill-posed
- One object -> many images
- problem is exponentially large
Vision is an ill-posed problem
mapping from world to image is unique (well-posed)
This is a “forward problem”
mapping from image to world is NOT unique (ill-posed)
This is an “inverse problem”
There are multiple interpretations of an image
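A minimal sketch of why the forward problem is well-posed and the inverse problem is not, assuming an idealised pinhole camera (the notes do not name a camera model; the pinhole equations x = fX/Z, y = fY/Z and the names `project` and `back_project` are assumptions for illustration). Each 3D point projects to exactly one image point, but every point along the same ray through the camera centre projects to that same image point, so depth cannot be recovered from one pixel alone.

```python
def project(point, focal_length=1.0):
    """Forward problem: a 3D point (X, Y, Z) maps to exactly one image point."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)

def back_project(pixel, depth, focal_length=1.0):
    """Inverse problem: a pixel only fixes a ray; a depth must be supplied."""
    x, y = pixel
    return (x * depth / focal_length, y * depth / focal_length, depth)

# Three different world points that lie on the same ray.
near = (0.5, 1.0, 2.0)
middle = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)

pixel = project(middle)

# Every choice of depth gives a different, equally valid interpretation.
candidates = [back_project(pixel, depth) for depth in (2.0, 4.0, 8.0)]
```

Here `project(near)`, `project(middle)` and `project(far)` all return the same pixel, while `candidates` contains three distinct 3D points consistent with it.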
- Vision scales exponentially: one object can generate many images
- Viewpoint affects appearance
- Illumination affects appearance
- Non-rigid deformations affect appearance
- Within-category variation in appearance
- Discrimination despite variation
- Other objects affect appearance
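To make the "exponentially large" claim concrete, the sketch below counts the distinct appearances of a single object when several of the factors above vary independently. The factor names and the three settings per factor are made-up values for illustration; the point is only that the count is the product of the number of settings, so it grows exponentially with the number of factors.

```python
from itertools import product

# Hypothetical, coarsely discretised sources of variation for one object.
factors = {
    "viewpoint": ["front", "side", "top"],
    "illumination": ["bright", "dim", "backlit"],
    "deformation": ["rest", "bent", "stretched"],
    "occluder": ["none", "partial", "heavy"],
}

def count_appearances(factors):
    """Count every combination of one setting per factor."""
    return sum(1 for _ in product(*factors.values()))

def predicted_count(settings_per_factor, number_of_factors):
    """With k settings for each of n independent factors there are k**n images."""
    return settings_per_factor ** number_of_factors

total = count_appearances(factors)
```

With 3 settings for each of 4 factors the count is 3**4 = 81; adding a fifth factor with 3 settings triples it to 243, and finer discretisation of any factor makes it grow further.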
Tackling the problem of vision
Biological approach
Computational approach
Need for constraints
To solve these challenges we need to employ constraints/priors/expectations
- Effects of inference:
- illumination
- perspective
- prior knowledge: learned familiarity with certain objects / knowledge of the image formation process in general
- prior exposure/motion/priming: recent/preceding sensory input
- current context: surrounding visual scene (and concurrent input in other sensory modalities)
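One way to see how a prior resolves an ambiguous image is a small Bayes' rule calculation. The notes do not commit to a Bayesian formulation, so this framing is an assumption, as are the scenario (a shaded patch that could be a bump or a dent), the hypothesis names, and all the numbers. The image evidence is taken to fit both interpretations equally well, so only the prior expectation can tip the balance.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}

# Hypothetical numbers: the shading pattern is explained equally well by
# a convex bump or a concave dent, so the image alone is ambiguous.
likelihood = {"convex bump": Fraction(1, 2), "concave dent": Fraction(1, 2)}

# No expectation at all: both interpretations are equally probable.
flat_prior = {"convex bump": Fraction(1, 2), "concave dent": Fraction(1, 2)}

# Assumed prior expectation that favours the bump interpretation.
informed_prior = {"convex bump": Fraction(9, 10), "concave dent": Fraction(1, 10)}

without_prior = posterior(flat_prior, likelihood)
with_prior = posterior(informed_prior, likelihood)
```

With the flat prior both interpretations stay at 1/2; with the informed prior the bump interpretation rises to 9/10, even though the image evidence is unchanged.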
Two Main Approaches
- Engineering Approach
- determine what the system needs to do (requirements)
- design a system to perform this task
- implement the system, test and refine it.
- “top-down”: start with computational theory and fill out details
- Reverse Engineering Approach
- find a system that performs the task (e.g. the brain)
- analyse the system to determine how it does it.
- implement a new system using the same mechanisms.
- “bottom-up”: start with mechanisms and build a model.