Computer Vision Week 1 Introduction

What is computer vision

Extracting descriptions from images

  • Image processing
  • Computer graphics

image processing: image -> image
computer vision: image -> description
computer graphics: description -> image
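
The three mappings above can be sketched with a toy example. This is only an illustration, assuming a grayscale image stored as a list of rows of 0-255 intensities; the function names (invert, describe, render) and the bright/dark threshold are hypothetical and chosen for this sketch.

```python
# Toy illustration: a grayscale image is a list of rows of 0-255 intensities.

def invert(image):
    """Image processing: image -> image (photographic negative)."""
    return [[255 - pixel for pixel in row] for row in image]


def describe(image, threshold=128):
    """Computer vision: image -> description (a crude bright/dark label)."""
    pixels = [pixel for row in image for pixel in row]
    mean = sum(pixels) / len(pixels)
    label = "bright" if mean >= threshold else "dark"
    return {"mean_intensity": mean, "label": label}


def render(description, width=4, height=3):
    """Computer graphics: description -> image (a uniform patch)."""
    value = 255 if description["label"] == "bright" else 0
    return [[value] * width for _ in range(height)]


image = [[10, 20, 30], [40, 50, 60]]
negative = invert(image)      # still an image
summary = describe(image)     # a description, no longer an image
synthetic = render(summary)   # an image generated from the description
```

Note that render(describe(image)) does not give back the original image: the description discards most of the pixel data.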

  • Pattern recognition
    • recognising and classifying stimuli in images and other datasets
  • Photogrammetry
    • obtaining measurements from images
  • Biological vision
    • understanding visual perception in humans and animals (studied in Neuroscience, Psychology, Psychophysics)

Applications

  • Handwriting recognition
  • OCR: optical character recognition
  • Face detection/Smile detection
  • Face recognition
  • Biometrics: Iris Recognition, Fingerprint Recognition
  • People tracking/Object tracking
  • Content-based Image Retrieval
  • Reverse image search
  • Landmark recognition
  • Driver assistance
  • Space exploration
  • Medical imaging
  • 3D models from images

Why is vision difficult

~50% of the cerebral cortex is devoted to vision
Vision accounts for ~10% of total human energy consumption

Major challenges

  • One image -> many interpretations
    • problem is ill-posed
  • One object -> many images
    • problem is exponentially large

Vision is an ill-posed problem

Mapping from world to image is unique (well-posed)
This is a “forward problem”
Mapping from image to world is NOT unique (ill-posed)
This is an “inverse problem”
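
A minimal sketch of why the inverse direction is ill-posed, assuming an idealised pinhole camera at the origin looking along the Z axis. The pinhole model, the focal length of 1.0, and the function names are assumptions made for this illustration.

```python
def project(point, focal_length=1.0):
    """Forward problem: a 3D point (X, Y, Z) maps to exactly one image point."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def back_project(image_point, depth, focal_length=1.0):
    """Inverse problem: every guessed depth gives a different 3D candidate."""
    u, v = image_point
    return (u * depth / focal_length, v * depth / focal_length, depth)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)   # twice as far away and twice as large an offset

# Both world points land on the same image point.
same_pixel = project(near) == project(far)

# Going back from that image point, depth is a free parameter.
candidates = [back_project(project(near), depth) for depth in (2.0, 4.0, 8.0)]
```

Here near and far project to the same image point, and every entry of candidates is an equally valid 3D explanation of it. Depth cannot be recovered from a single image point without extra assumptions.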

There are multiple interpretations of an image

  • Vision scales exponentially: one object can generate many images
  • Viewpoint affects appearance
  • Illumination affects appearance
  • Non-rigid deformations affect appearance
  • Within-category variation in appearance
  • Discrimination despite variation
  • Other objects affect appearance
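
A rough sketch of how the sources of variation listed above multiply. The factor names and their coarse settings are hypothetical; real factors are continuous, so the true number of images is far larger.

```python
from itertools import product

# Hypothetical, coarsely discretised sources of variation for ONE object.
factors = {
    "viewpoint": ["front", "side", "back", "top"],
    "illumination": ["left", "right", "overhead"],
    "deformation": ["relaxed", "bent"],
    "occlusion": ["none", "partial"],
}

# Every combination of settings is a different image of the same object.
conditions = list(product(*factors.values()))


def num_conditions(settings_per_factor, num_factors):
    """With k independent factors of n settings each, there are n**k images."""
    return settings_per_factor ** num_factors
```

Even this tiny example gives len(conditions) == 48 distinct appearances, and the count grows exponentially with the number of independent factors, e.g. num_conditions(10, 6) is one million.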

Tackling the problem of vision

Biological approach
Computational approach

Need for constraints

To solve these challenges we need to employ constraints/priors/expectations

  • Influences on inference:
    • illumination
    • perspective
    • prior knowledge: learned familiarity with certain objects / knowledge of the image formation process in general
    • prior exposure/motion/priming: recent/preceding sensory input
    • current context: surrounding visual scene (and concurrent input in other sensory modalities)
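
One common way to formalise the use of priors is Bayes' rule: posterior ∝ likelihood × prior. The notes do not commit to this formulation, so the sketch below is an assumption, and the two interpretations and all probabilities are made-up numbers for illustration.

```python
def posterior(likelihood, prior):
    """Combine image evidence with prior expectations via Bayes' rule."""
    unnormalised = {h: likelihood[h] * prior[h] for h in likelihood}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}


# Hypothetical: the shading in the image fits both interpretations equally well,
# so the image alone cannot decide between them.
likelihood = {
    "convex bump lit from above": 0.5,
    "concave dent lit from below": 0.5,
}

# Assumed prior knowledge: light usually comes from above.
prior = {
    "convex bump lit from above": 0.9,
    "concave dent lit from below": 0.1,
}

beliefs = posterior(likelihood, prior)
best = max(beliefs, key=beliefs.get)
```

With an uninformative likelihood the prior alone breaks the tie, so best is the "convex bump lit from above" interpretation. A different prior (or different context) would select a different interpretation of the same image.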

Two Main Approaches

  • Engineering Approach
    • determine what the system needs to do (requirements)
    • design a system to perform this task
    • implement the system, test and refine it
    • “top-down”: start with computational theory and fill out details
  • Reverse Engineering Approach
    • find a system that performs the task (e.g. the brain)
    • analyse the system to determine how it does it
    • implement a new system using the same mechanisms
    • “bottom-up”: start with mechanisms and build a model

Tutorial

Tutorial for week 1