Penn Research Helps Develop Predictive Model of How Humans Estimate Speed

Most studies of psychological mechanisms involve watching those mechanisms in action and then devising a theory for how they work.

Psychologists at the University of Pennsylvania and the University of Texas at Austin have reversed this process. Operating more like physicists, they analyzed all of the steps involved in estimating how fast an object is moving, from light bouncing off the object, passing through the eye’s lens and hitting the retina to information traveling to the brain through the optic nerve, in order to build an optimal model.

Such a model, which uses all the available information in the best way possible, is known as an “ideal observer.” The researchers then tested this ideal observer model against people’s performance in a speed-estimation experiment.

That people are about as good as the optimal model at this task means that the neural mechanisms associated with speed estimation can be very precisely understood and predicted. It also suggests that engineers can similarly optimize technological applications that need to estimate the speed of a moving object, like cameras on a self-driving car, by mimicking biological systems.

Most previous studies of this aspect of visual processing used only artificial images. Because it was built on small patches of natural images, the researchers’ model is more generally applicable to how speed estimation is accomplished under natural conditions.

The research was conducted by Johannes Burge, assistant professor in the Department of Psychology in Penn’s School of Arts and Sciences, and Wilson Geisler, professor and director of the Center for Perceptual Systems at UT-Austin.

It was published in Nature Communications.

“There have been many descriptions of what visual systems do when estimating motion, but there have not been many predictions for how they should do it,” Burge said. “We use a best-case scenario as a starting point for understanding what the visual system actually does. If we get a close match between the performance of the ideal observer model and the performance of humans, then we have evidence that humans are using the visual information in the best way possible.”

The aspect of the visual system that Burge and Geisler set out to model was its ability to estimate the speed of images of objects in motion.

Because this ability is critical to survival, there was reason to believe that evolutionary pressures had selected for visual systems that make very accurate estimates.

Burge and Geisler began by modeling the individual steps involved in processing moving images, such as the optics of the eye’s lens, how the retina translates stimuli into nerve impulses and how the early visual cortex interprets them.

The main challenge was determining which features in stimuli are truly critical for that last step, the interpretation of the signals by the early visual cortex. Different sensory neurons have different receptive fields, which determine the stimulus features that cause the neuron to fire a signal. For example, one neuron might fire when it senses a bright patch of an image moving from right to left but not from left to right. Another neuron might have the opposite arrangement, firing only in response to images with bright patches that move left to right.

“We determine the small population of these different types of receptive fields that best supports accurate motion estimation,” Burge said. “We argue that these receptive fields constitute the population of receptive fields that visual systems ought to have if they want to maximize the accuracy of estimates of motion.” 

By combining the receptive fields with the well-understood physical model of how photons reach these receptive fields in the first place, the researchers were able to predict how a person would estimate the speed of motion in natural images. This was in contrast to previous studies of the topic, which tested models on abstract images in motion, such as black bars drifting across a white background. While accurate in those cases, such models begin to fail when applied to natural images.

To make their ideal observer as realistic and generalizable as possible, Burge and Geisler trained it on small patches of natural scenes, similar to those that would be seen by looking out a moving car window through a straw. The speed of the image on the retina depends on the distance to the object in the scene. Images associated with more distant objects move more slowly. Images associated with near objects move more quickly. How to combine local estimates of image speed to obtain accurate estimates of self-motion and object motion is a big question for future research.
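
The relationship between distance and image speed can be illustrated with a rough geometric sketch. It assumes an observer moving in a straight line and a stationary object directly to the side, in which case the image sweeps across the eye at an angular rate equal to the observer's speed divided by the distance to the object. The function name and the example numbers are illustrative assumptions and do not come from the study.

```python
import math


def retinal_speed_deg_per_s(observer_speed_m_s, distance_m):
    """Angular speed (degrees/second) of a stationary object located
    directly to the side of an observer moving in a straight line.

    Simplifying assumption: the object is exactly perpendicular to the
    direction of travel, so angular speed = speed / distance (radians/s).
    """
    return math.degrees(observer_speed_m_s / distance_m)


# Hypothetical example: a car travelling at 20 m/s.
near = retinal_speed_deg_per_s(20.0, 10.0)    # roadside sign, 10 m away
far = retinal_speed_deg_per_s(20.0, 1000.0)   # hillside, 1 km away
print(near > far)  # the nearer object's image moves faster
```

The same car speed produces very different image speeds at different distances, which is why local speed estimates alone do not settle how fast the observer or the object is moving.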

“With good local estimates, one will be in a better position to integrate them into an accurate global estimate of speed,” Burge said. 

To compare human behavior to their model, the researchers had experiment participants view thousands of pairs of moving natural image patches. Each movie in the pair moved at a slightly different speed. Participants would indicate which movie in the pair was moving faster.
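
A comparison task of this kind is often summarized with a standard signal-detection sketch, shown here only as an illustration and not as the researchers' ideal observer model. It assumes that each movie's speed estimate is corrupted by independent Gaussian noise with the same standard deviation; the function name, the noise level and the speeds are hypothetical.

```python
import math


def prob_judge_second_faster(speed_a, speed_b, noise_sd):
    """Probability that the noisy estimate of movie B exceeds that of movie A.

    Assumption: each estimate equals the true speed plus independent
    Gaussian noise with standard deviation noise_sd, so the difference of
    the two estimates has standard deviation noise_sd * sqrt(2).
    """
    z = (speed_b - speed_a) / (noise_sd * math.sqrt(2.0))
    # Standard normal cumulative distribution, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical speeds in degrees per second, with noise_sd = 1.0.
for speed_b in (4.0, 4.5, 5.0, 6.0):
    print(speed_b, round(prob_judge_second_faster(4.0, speed_b, 1.0), 3))
```

Under these assumptions, identical speeds give chance performance (0.5), and the probability of picking the faster movie rises smoothly as the speed difference grows. Curves of this general kind are what allow a model's predictions to be compared with participants' choices.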

The participants’ responses closely matched what the ideal observer model predicted, both when the two speeds were nearly identical and when they were quite different.

“It is unusual to see data this clean in perceptual psychology experiments,” Burge said. “The close match between the performance of the ideal observer model and the performance of the humans suggests that we really understand the computations that are leading to humans’ speed estimation performance. We can apply that understanding to improving technology.”

Beyond the applications to future research on biological and machine vision systems, the researchers feel that this theory-driven approach to psychological research represents a better way of understanding the brain.

“This work is integrative in a way that much research is not,” Burge said. “Modeling how light falls on the retina, modeling how the light gets captured by neurons, selecting the relevant features and measuring behavior in the experiments — each of these steps requires a different set of skills and know-how. Each could constitute a stand-alone project. We’ve put them all together in an attempt to improve our understanding of how the visual system works.”

The research was supported by the National Science Foundation through grant IIS-1111328 and by the National Institutes of Health through grants EY011747 and EY021462.
