Computer Vision Week 1 Introduction
What is computer vision?
Extracting descriptions from images
Related Disciplines
- Image processing
- Computer graphics
 
image processing: image -> image
computer vision: image -> description
computer graphics: description -> image
- Pattern recognition
  - recognising and classifying stimuli in images and other datasets
- Photogrammetry
  - obtaining measurements from images
- Biological vision
  - understanding visual perception in humans and animals (studied in Neuroscience, Psychology, Psychophysics)
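
The three mappings listed earlier (image -> image, image -> description, description -> image) can be read as function signatures. The sketch below is only an illustration: the `Image` and `Description` types, the function names, and the choice of "mean brightness" as a description are all assumptions made for this example, not part of the course material.

```python
from typing import Dict, List

Image = List[List[int]]          # grayscale image as rows of pixel intensities (0-255)
Description = Dict[str, float]   # hypothetical, deliberately tiny "scene description"


def invert(image: Image) -> Image:
    """Image processing: image -> image (here, a photographic negative)."""
    return [[255 - p for p in row] for row in image]


def describe(image: Image) -> Description:
    """Computer vision: image -> description (here, just the mean brightness)."""
    pixels = [p for row in image for p in row]
    return {"mean_brightness": sum(pixels) / len(pixels)}


def render(description: Description, height: int, width: int) -> Image:
    """Computer graphics: description -> image (a uniform image of that brightness)."""
    level = round(description["mean_brightness"])
    return [[level] * width for _ in range(height)]


image = [[0, 50], [100, 250]]
negative = invert(image)            # still an image
summary = describe(image)           # no longer an image, but a description
synthetic = render(summary, 2, 2)   # an image generated from a description
```

Note that `render(describe(image), ...)` does not give back the original image: the description discards information, which is one reason the vision direction is harder than the graphics direction.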
 
 
Applications
- Handwriting detection
- OCR: optical character recognition
- Face detection / smile detection
- Face recognition
- Biometrics: iris recognition, fingerprint recognition
- People tracking / object tracking
- Content-based image retrieval
- Reverse image search
- Landmark recognition
- Driver assistance
- Space exploration
- Medical imaging
- 3D models from images
 
Why is vision difficult?
~50% of the cerebral cortex is devoted to vision.
Vision accounts for ~10% of a human's total energy consumption.
Major challenges
- One image -> many interpretations
  - the problem is ill-posed
- One object -> many images
  - the problem is exponentially large
 
 
Vision is an ill-posed problem
The mapping from world to image is unique (well-posed).
This is a “forward problem”.
The mapping from image to world is NOT unique (ill-posed).
This is an “inverse problem”.
There are multiple interpretations of an image.
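
A minimal sketch of why the inverse problem is ill-posed, assuming an idealised pinhole camera (a standard simplification that these notes do not themselves specify). The function names `project` and `back_project` and the default `focal_length` are hypothetical choices for this example.

```python
def project(point, focal_length=1.0):
    """Forward problem: one 3D point (X, Y, Z) maps to exactly one image point."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def back_project(image_point, depth, focal_length=1.0):
    """Inverse problem: an image point only fixes a ray; depth must be guessed."""
    u, v = image_point
    return (u * depth / focal_length, v * depth / focal_length, depth)


near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)    # twice as far away along the same ray

image_near = project(near_point)
image_far = project(far_point)  # identical to image_near

# Every assumed depth gives a different 3D point that explains the same image point.
candidates = [back_project(image_near, depth) for depth in (1.0, 4.0, 8.0)]
```

Here `project` always returns a single answer, while `back_project` needs an extra `depth` argument that the image alone does not supply. Each value of `depth` yields a different, equally valid interpretation.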
- Vision scales exponentially: one object can generate many images
- Viewpoint affects appearance
- Illumination affects appearance
- Non-rigid deformations affect appearance
- Within-category variation in appearance
- Discrimination despite variation
- Other objects affect appearance
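
A rough worked calculation of the "exponentially large" claim. If the factors above vary independently, the number of distinct appearances of one object is the product of the number of settings per factor, so it grows exponentially with the number of factors. The figure of 100 settings per factor is an arbitrary assumption chosen only to make the arithmetic concrete.

```python
from math import prod

# Hypothetical number of distinguishable settings for each factor of variation.
settings_per_factor = {
    "viewpoint": 100,
    "illumination": 100,
    "non-rigid deformation": 100,
    "other objects (occlusion, clutter)": 100,
}


def count_appearances(settings):
    """Number of combinations when every factor varies independently."""
    return prod(settings)


total = count_appearances(settings_per_factor.values())

# Growth as factors are added one at a time: 100, 100**2, 100**3, 100**4.
growth = [count_appearances([100] * k) for k in range(1, 5)]
```

With these made-up numbers, four factors already give 100 million possible images of a single object, which is why storing and matching every appearance is not a practical strategy.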
 
Tackling the problem of vision
Biological approach
Computational approach
Need for constraints
To solve these challenges we need to employ constraints/priors/expectations
- Influences on inference:
  - illumination
  - perspective
  - prior knowledge: learned familiarity with certain objects / knowledge of the image formation process in general
  - prior exposure/motion/priming: recent/preceding sensory input
  - current context: surrounding visual scene (and concurrent input in other sensory modalities)
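
One common way to formalise how a prior resolves an ambiguous image is Bayes' rule. The notes do not name this method, so treat the sketch below as an assumption. It uses an image that is equally consistent with two interpretations, and a prior expectation that light usually comes from above. All probabilities are made-up values for illustration.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of interpretations."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalised.values())
    return {h: p / evidence for h, p in unnormalised.items()}


# The image alone cannot tell the two interpretations apart (equal likelihoods).
likelihood = {"bump lit from above": 0.5, "dent lit from below": 0.5}

# Hypothetical prior expectation: light usually comes from above.
prior = {"bump lit from above": 0.9, "dent lit from below": 0.1}

beliefs = posterior(prior, likelihood)
best = max(beliefs, key=beliefs.get)
```

Because the likelihoods are equal, the posterior simply equals the prior, so the expectation about illumination is what picks the interpretation. With a flat prior the ambiguity would remain.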
 
 
Two Main Approaches
- Engineering Approach
  - determine what the system needs to do (requirements)
  - design a system to perform this task
  - implement the system, test and refine it
  - “top-down”: start with computational theory and fill out details
- Reverse Engineering Approach
  - find a system that performs the task (e.g. the brain)
  - analyse the system to determine how it does it
  - implement a new system using the same mechanisms
  - “bottom-up”: start with mechanisms and build a model