Instructor: Daniel Kersten. Office: S212 Elliott Hall. Phone: 612 625-2589. Email: kersten@umn.edu
Office hours: Wednesdays 11:00 am to 12:00 noon, or by appointment.
TA: Qiujie Weng. Office: N12 Elliott Hall. Phone: 651 600-6351 (call this # for access to basement office, N12). Email: wengx022@umn.edu
Office hours: Mondays 11:00 am to noon, or by appointment.
The visual perception of what is in the world is accomplished continually, instantaneously, and usually without conscious thought. The very effortlessness of perception disguises the underlying richness of the problem. We can gain insight into the processes and functions of human vision by studying the relationship between neural mechanisms and visual behavior through computer analysis and simulation. Students will learn about the anatomy and neurophysiology of vision and how they relate to the phenomena of perception. An underlying theme will be to treat vision as a process of statistical inference. There will be in-class programming exercises using the language Mathematica. No prior programming experience is required; however, a background in calculus and linear algebra is helpful.
Readings & software
Grade Requirements
There will be a mid-term, a final examination, programming assignments, and a final project.
The grade weights are:
Assignments are due BEFORE class start time on the day due.
Late Policy: Assignments turned in within 24 hours following the due date will have 15% deducted from the assignment score. Assignments turned in between 24 and 48 hours following the due date will have 30% deducted from the score. Assignments more than 48 hours late will receive a score of zero.
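As a worked illustration of the late policy, here is a minimal sketch in Python (chosen only for illustration; course work itself uses Mathematica). The function name `late_adjusted_score` and its parameters are hypothetical. Two points are assumptions rather than stated policy: that the deduction is a percentage of the earned score, and that an assignment exactly 24 hours late still falls in the 15% bracket.

```python
def late_adjusted_score(score: float, hours_late: float) -> float:
    """Return an assignment score after applying the late deduction.

    hours_late is measured from the due time (class start on the due date);
    zero or a negative value means the assignment was on time.

    Assumptions (not stated in the policy text):
      - the deduction is a percentage of the earned score;
      - exactly 24 hours late counts as "within 24 hours".
    """
    if hours_late <= 0:
        deduction = 0.0      # on time: no deduction
    elif hours_late <= 24:
        deduction = 0.15     # within 24 hours: 15% deducted
    elif hours_late <= 48:
        deduction = 0.30     # between 24 and 48 hours: 30% deducted
    else:
        return 0.0           # more than 48 hours late: score of zero
    return score * (1.0 - deduction)


if __name__ == "__main__":
    # Print the adjusted score for an earned score of 80 at several delays.
    for hours in (0, 10, 30, 50):
        print(hours, late_adjusted_score(80, hours))
```

Under these assumptions, an assignment that earned 80 points keeps its full score if on time, loses 15% of it when up to a day late, loses 30% on the second day, and receives zero after that.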
Check this section before each class for recent additions and revisions.
To see what the course looked like last time, see Psy 5036W SPRING2010 Web Pages.
University Calendar | Date | Lecture | Main Readings | Supplementary Material | Assignments |
I. Introduction |
Sep 4 |
1. Introduction to Computational Vision |
1.IntroToComputationalVision.nb Accessing Mathematica Kersten, D., & Yuille, A. (2003). Bayesian models of object perception. Current Opinion in Neurobiology, 13(2), 1-9. (pdf) |
Screencast: http://www.wolfram.com/broadcast/screencasts/handsonstart/ (WITH AUDIO) |
|
Sep 9 |
2. Limits to Vision |
Hecht, S., Shlaer, S., & Pirenne, M. H. (1942). Energy, quanta, and vision. Journal of General Physiology, 25, 819-840. (pdf) |
Barlow, H. B. (1981). Critical Limiting Factors in the Design of the Eye and Visual Cortex. Proc. Roy. Soc. Lond. B, 212, 1-34. (pdf) Baylor, D. A., Lamb, T. D., & Yau, K. W. (1979). Responses of retinal rods to single photons. Journal of Physiology, Lond., 288, 613-634. (pdf) |
||
Sep 11 |
3. The Ideal Observer |
|
Griffiths, T. L., & Yuille, A. (2008). A primer on probabilistic inference. In M. Oaksford and N. Chater (Eds.). The probabilistic mind: Prospects for rational models of cognition. Oxford: Oxford University Press (pdf). |
Email Assignment #0 to TA (1% bonus) Assignment_0_Mathematica.nb |
|
Sep 16 |
4. Ideal observer analysis: Humans vs. ideals |
Burgess, A. E., Wagner, R. F., Jennings, R. J., & Barlow, H. B. (1981). Efficiency of human visual signal discrimination. Science, 214(4516), 93-94. (pdf) |
Kersten and Mamassian (2008), Ideal observer theory. The New Encyclopedia of Neuroscience, Squire et al., editors (pdf). Deneve, S., Latham, P. E., & Pouget, A. (1999). Reading population codes: a neural implementation of ideal observers. Nature Neuroscience, 2(8), 740–745. (pdf) |
||
II. Image formation, |
Sep 18 |
5. Psychophysics: tools & techniques |
|
Farell, B. & Pelli, D. G. (1999) Psychophysical methods, or how to measure a threshold and why. In R. H. S. Carpenter & J. G. Robson (Eds.), Vision Research: A Practical Guide to Laboratory Methods, New York: Oxford University Press. (pdf) http://psych.nyu.edu/pelli/ Morgenstern, Y., & Elder, J. H. (2012). Local Visual Energy Mechanisms Revealed by Detection of Global Patterns. Journal of Neuroscience, 32(11), 3679–3696. For an excellent and free Matlab psychophysics package, see: http://psychotoolbox.org |
|
Sep 23 |
6. Bayesian decision theory & perception | 6.BayesDecisionTheory.nb Geisler, W. S., & Kersten, D. (2002). Illusions, perception and Bayes. Nat Neurosci, 5(6), 508-510. (pdf) |
Email Assignment #1 to TA (7%) Assignmt_1IdealDetector.nb (pdf) |
||
Sep 25 |
7. Limits to spatial resolution, image modeling, introduction to linear systems | 7.ImageModelLinearSystems.nb Campbell, F. W., & Green, D. (1965). Optical and retinal factors affecting visual resolution. Journal of Physiology (Lond.), 181, 576-593. (pdf) |
Williams, D. R. (1986). Seeing through the photoreceptor mosaic. Trends in Neurosciences, 9(5), 193-197. (pdf) |
|
|
III. Early visual coding |
Sep 30 |
8. Linear systems analysis |
Tutorials: |
||
Oct 2 |
9. Features or filters? Spatial filter models of early human vision | Campbell, F. W., & Robson, J. G. (1968). Application of Fourier Analysis to the Visibility of Gratings. Journal of Physiology 197, 551-566. (pdf) De Valois, R. L., Albrecht, D. G., & Thorell, L. G. (1982). Spatial frequency selectivity of cells in macaque visual cortex. Vision Res, 22(5), 545-559. (pdf) Watson, A. B. (1987). Efficiency of a model human image code. J Opt Soc Am A, 4(12), 2401-2417. (pdf) |
|||
Oct 7 |
10. Features or filters? Local processing & image analysis |
Gollisch, T., & Meister, M. (2010). Eye Smarter than Scientists Believed: Neural Computations in Circuits of the Retina. Neuron, 65(2), 150–164. (pdf)
|
Albrecht, D. G., De Valois, R. L., & Thorell, L. G. (1980). Visual cortical neurons: are bars or gratings the optimal stimuli? Science, 207(4426), 88-90. (pdf) Adelson, E. H., & Bergen, J. R. (1991). The plenoptic function and the elements of early vision. In M. S. Landy & J. A. Movshon (Eds.), Computational Models of Visual Processing. Cambridge, MA: The MIT Press: A Bradford Book. (pdf) ClassificationImage demo (ReverseCorrelation.nb) Ahumada, A. J., Jr. (2002). Classification image weights and internal noise level estimation. J Vis, 2(1), 121-131. (pdf) |
Assignment 2 (7%) Assignmt_2_Convolve.nb |
|
Oct 9 |
11. Coding efficiency: Retina |
Geisler, W. S. (2008). Visual perception and the statistical properties of natural scenes. Annu Rev Psychol, 59, 167-192. (pdf)
|
Laughlin, S. (1981). A simple coding procedure enhances a neuron's information capacity. Z Naturforsch [C], 36(9-10), 910-912. (pdf) Meister, M., & Berry, M. J., 2nd. (1999). The neural code of the retina. Neuron, 22(3), 435-450. (pdf) Srinivasan, M. V., Laughlin, S. B., & Dubs, A. (1982). Predictive coding: a fresh view of inhibition in the retina. Proc R Soc Lond B Biol Sci, 216(1205), 427-459. (pdf)
|
||
Oct 14 |
12. Coding efficiency: Cortex |
12.SpatialCodingEfficiency.nb Simoncelli, E. P., & Olshausen, B. A. (2001). Natural image statistics and neural representation. Annu Rev Neurosci, 24, 1193-1216. (pdf) |
ContrastNormalizationNotes.nb
Laughlin, S. B., de Ruyter van Steveninck, R. R., & Anderson, J. C. (1998). The metabolic cost of neural information. Nat Neurosci, 1(1), 36-41. (pdf) Lennie, P. (2003). The cost of cortical computation. Curr Biol, 13(6), 493-497. (pdf) Multi-resolution, image pyramids, and efficient coding: |
||
IV. Intermediate-level vision, integration, grouping |
Oct 16 |
13. Edge detection | 13.EdgeDetection.nb (pdf) |
Hubel, D. H., & Wiesel, T. N. (1977). Ferrier lecture. Functional architecture of macaque monkey visual cortex. Proc R Soc Lond B Biol Sci, 198(1130), 1-59. (pdf) |
|
Oct 21 |
MID-TERM |
|
MID-TERM Study guide |
||
Oct 23 |
14. Objects and scenes from images |
von der Heydt R (2003) Image parsing mechanisms of the visual cortex. In: The Visual Neurosciences (Werner JS, Chalupa LM, eds.), pp 1139-1150. Cambridge, Mass.: MIT Press. (pdf) |
Zhou H, Friedman HS, von der Heydt R (2000) Coding of border ownership in monkey visual cortex. J Neuroscience 20: 6594-6611. (pdf) |
|
|
Oct 28 |
15. Scene-based generative models |
15.SurfaceGeometryDepth.nb Kersten, D., Mamassian, P., & Yuille, A. (2004). Object perception as Bayesian Inference. Annual Review of Psychology, 55, 271-304. (pdf) |
|
||
Oct 30 |
16. Shape-from-X | Reflectance map: Shape from shading: Horn BKP (1986) Robot Vision. Cambridge MA: MIT Press. Ch 11 (pdf). Barron, J. T., & Malik, J. (2012). Shape, Albedo, and Illumination from a Single Image of an Unknown Object. CVPR, 1–8. cube.mov random.mov |
|
||
Nov 4 |
17. Shape-from-shading | Belhumeur, P. N., Kriegman, D. J., & Yuille, A. (1997). The Bas-Relief Ambiguity. (pdf) Johnson, M. K., & Adelson, E. H. (2011). Shape Estimation in Natural Illumination. Computer Vision and Pattern Recognition (CVPR), 2553–2560. penrose.obj ProjectIdeasF2013.nb |
Assignment 3 |
||
Nov 6 |
18. Motion: optic flow | Horn, B. K. P., & Schunck, B. G. (1981). Determining Optical Flow. Artificial Intelligence, 17, 185-203. (pdf)
Borst, A. (2007). Correlation versus gradient type motion detectors: the pros and cons. Philos Trans R Soc Lond B Biol Sci, 362(1479), 369-374. (pdf) http://web.mit.edu/persci/people/adelson/illusions_demos.html |
|||
Nov 11 |
19. Motion: biological, human perception | 19.MotionHumanPerception.nb Weiss, Y., Simoncelli, E. P., & Adelson, E. H. (2002). Motion illusions as optimal percepts. Nat Neurosci, 5(6), 598-604. (pdf) |
Heeger, D. J., Simoncelli, E. P., & Movshon, J. A. (1996). Computational models of cortical visual processing. Proc Natl Acad Sci U S A, 93(2), 623-627. (pdf) aperturedemomovie.mov (quicktime) |
||
Nov 13 |
20. Material perception |
V1 and lightness (pdf) Doerschner, K., Fleming, R. W., Yilmaz, O., Schrater, P. R., Hartung, B., & Kersten, D. (2011). Visual motion and the perception of surface material. Current Biology, 21(23), 2010–2016.
|
Fleming, R. W., Dror, R. O., & Adelson, E. H. (2003). Real-world illumination and the perception of surface reflectance properties. J Vis, 3(5), 347-368. (link) http://web.mit.edu/persci/people/adelson/checkershadow_illusion.html |
Email Final project title & paragraph outline to TA (2%) | |
Nov 18 |
21. Texture.
|
|
Heeger DJ and Bergen JR, Pyramid Based Texture Analysis/Synthesis, Computer Graphics Proceedings, p. 229-238, 1995. (pdf). |
|
|
Nov 20 |
22. Science writing
|
|
Gopen & Swan, 1990 (pdf) For practical advice from someone in our field see: http://psych.nyu.edu/pelli/style.html For an example of good writing on a topic relevant to this course see Pelli et al., 2006 (pdf)
|
|
|
V. High-level vision |
Nov 25 |
23. Perceptual integration | McDermott, J., Weiss, Y., & Adelson, E. H. (2001). Beyond junctions: nonlocal form constraints on motion interpretation. Perception, 30(8), 905-923. (pdf) Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining sensory information: mandatory fusion within, but not between, senses. Science, 298(5598), 1627-1630. (pdf) Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429-433. (pdf)
|
||
Nov 27 (Thanksgiving week) |
24. Object recognition
|
DiCarlo, J. J., Zoccolan, D., & Rust, N. C. (2012). How does the brain solve visual object recognition? Neuron, 73(3), 415–434. (pdf)
|
Liu, Z., Knill, D. C., & Kersten, D. (1995). Object Classification for Human and Ideal Observers. Vision Research, 35(4), 549-568. (pdf) Tjan, B., Braje, W., Legge, G. E., & Kersten, D. (1995). Human efficiency for recognizing 3-D objects in luminance noise. Vision Research, 35(21), 3053-3069. (pdf) Tanaka K (2003) Columns for complex visual object features in the inferotemporal cortex: clustering of cells with similar but slightly different stimulus selectivities. Cerebral Cortex 13:90-99. (pdf) Serre, T., Oliva, A., & Poggio, T. (2007). A feedforward architecture accounts for rapid categorization. Proc Natl Acad Sci U S A, 104(15), 6424-6429. |
Assignment 4 |
|
Dec 2 |
25. Object recognition, clutter, learning categories
|
25RecognitionClutter.key.pdf
|
Grill-Spector, K. (2003). The neural basis of object perception. Curr Opin Neurobiol, 13(2), 159-166. (pdf) Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat Neurosci, 2(1), 79-87. (pdf) Bullier, J. (2001). Integrated model of visual processing. Brain Res Brain Res Rev, 36(2-3), 96-107. (pdf) Tenenbaum JB: Bayesian modeling of human concept learning. In Advances in Neural Information Processing Systems. Edited by Kearns MS, Solla SA, Cohn DA: Cambridge, MA: MIT Press: 1999. (pdf) |
|
|
Dec 4 |
26. Vision for action, spatial layout, heading. | 26.SpatialLayoutScenes.nb |
Longuet-Higgins, H. C., & Prazdny, K. (1980). The Interpretation of a Moving Retinal Image. Proceedings of the Royal Society of London B, 208, 385-397. (pdf) |
Email a complete DRAFT of FINAL PROJECT to TA by Friday, December 6th, 5 PM. |
|
Dec 9 |
27. Spatial layout continued. |
|
Tomaso Poggio and Christian R. Shelton (1999). "Machine Learning, Machine Vision, and the Brain." AI Magazine, 20(3), 37-55. (pdf) Torralba, A., Oliva, A., Castelhano, M. S., & Henderson, J. M. (2006). Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychol Rev, 113(4), 766-786. (pdf) Chikkerur, S., Serre, T., Tan, C., & Poggio, T. (2010). What and where: A Bayesian inference theory of attention. Vision Research, 50(22), 2233–2247. |
Email your peer comments to TA by Dec 12th (5%) |
|
Dec 11 (Last day of class) |
FINAL EXAM | Final Study Guide | FINAL EXAM (16%) Study Guide (pdf) Drafts returned to you with Instructor comments |
||
Dec 16
|
Email Final Revised Draft of Project to TA (28%) |
Goal: This course integrates the behavioral, neural and computational principles of perception. Students often find the interdisciplinary integration to be the most challenging aspect of the course. Through writing, you will learn to synthesize results from diverse and typically isolated disciplines. By writing about your project work, you will learn to think through the broader implications of your project, and to effectively communicate the rationale and results of your contribution in words. You will write a final research report in which you will describe, in the form of a scientific paper, the results of an original computer program on a topic in computational vision.
Your final project will involve: 1) a computer program; and 2) a 2000-3000 word final paper describing your project. For your computer project, you will do one of the following: 1) write a program to simulate a model from the computer vision literature; 2) design and program a method for solving some problem in perception; or 3) design and program a psychophysical experiment to study an aspect of human visual perception. The results of your final project should be written up in the form of a short scientific paper or Mathematica Notebook, describing the motivation, methods, results, and interpretation.
If you choose to write your program in Mathematica, your paper and program can be combined and formatted as a single Mathematica notebook. See: Books and Tutorials on Notebooks.
Your paper will be critiqued and returned for you to revise and resubmit in final form. You should write for an audience consisting of your class peers.
Completing the final paper involves 4 steps. Each step requires that you email a document to the teaching assistant.
Some Resources:
Student Writing Support: Center for Writing, 306b Lind Hall and satellite locations (612.625.1893) http://writing.umn.edu.
NOTE: Plagiarism, a form of scholastic dishonesty and a disciplinary offense, is described by the Regents as follows: Scholastic dishonesty means plagiarizing; cheating on assignments or examinations; engaging in unauthorized collaboration on academic work; taking, acquiring, or using test materials without faculty permission; submitting false or incomplete records of academic achievement; acting alone or in cooperation with another to falsify records or to obtain dishonestly grades, honors, awards, or professional endorsement; altering, forging, or misusing a University academic record; or fabricating or falsifying data, research procedures, or data analysis. http://www1.umn.edu/regents/policies/academic/Code_of_Conduct.html. See also: http://writing.umn.edu/tww/plagiarism/ and http://writing.umn.edu/tww/plagiarism/definitions.html