Monday, April 11, 2016

Understanding Foveated Rendering





Foveated rendering is a rendering technique that takes advantage of the fact that that the resolution of the eye is highest in the fovea (the central vision area) and lower in the peripheral areas. As a result, if one can sense the gaze direction (with an eye tracker), GPU computational load can be reduced by rendering an image that has higher resolution at the direction of gaze and lower resolution elsewhere.

The challenge in turning this from theory to reality is to find the optimal function and parameters that maximally reduce GPU computation while maintaining highest quality visual experience. If done well, the user shouldn’t be able to tell that foveated rendering is being used. The main questions to address are:
  1. In what angle around the center of vision should we keep the highest resolution? 
  2. Is there a mid-level resolution that is best to use? 
  3. What is the drop-off in “pixel density” between central and peripheral vision? 
  4. What is the maximum speed that the eye can move? This question is important because even though the eye is normally looking at the center of the image, the eye can potentially rotate so that the fovea is aimed at image areas with lower resolution.
Let's address these questions:

1. In what angle around the center of vision should we keep the highest resolution?

Source: Wikipedia
The macula portion of the retina is responsible for fine detail. It spans the central 18˚ around the gaze point, or 9˚ eccentricity (the angular distance away from the center of gaze). This would be the best place to put the boundary of the inner layer. Fine detail is processed by cones (as opposed to rods), and at eccentricities past 9˚ you see a rapid fall off of cone density, so this makes sense biologically as well. Furthermore, the “central visual field” ends at 30˚ eccentricity, and everything past that is considered periphery. This is a logical spot to put the boundary between the middle and outermost layer for foveated rendering.

2. Is there a mid-level resolution that is best to use?  and 3. What is the drop-off in “pixel density” between central and peripheral vision? 

Some vendors such as Sensomotoric Instruments (SMI) use an inner layer at full native resolution, a middle layer at 60% resolution, and an outer layer at 20% resolution. When selecting the resolution dropoff, it is important to ensure that at the layer boundaries, the resolution is at or above the eye’s acuity at that eccentricity. At 9˚ eccentricity, acuity drops to 20% of the maximum acuity, and at 30˚ acuity drops to 7.4% of the max acuity. Given this, it appears that SMI’s values work, but are generous compared to what the eye can see.




4.    What is the maximum speed that the eye can move?


Source: Indiana University
A saccade is a rapid movement of the eye between fixation points. Saccade speed is determined by the distance between the current gaze and the stimulus. If the stimulus is as far as 50˚ away, then peak saccade velocity can get up to around 900˚/sec. This is important because you want the high resolution layer to be large enough so that the eye can’t move to the lower resolution portion in the time it takes to get the gaze position and render the scene. So if system latency is 20 msec, and assume eye can move at 900˚/sec – eye could move 18˚ in that time, meaning you would want the inner (higheslayer radius to be greater than that – but that is only if the stimulus presented is 50˚ away from current gaze. 

Additional thoughts

Source: Vision and Ocular Motility by Gunter Noorden

Visual acuity decreases on the temporal side (e.g. towards the ear) somewhat more rapidly than on the nasal side. It also decreases more sharply below and, especially, above the fovea, so that lines connecting points of equal visual acuity are elliptic, paralleling the outer margins of the visual field. Following this, it might make sense to render the different layers in ellipses rather than circles. The image shows the lines of equal visual acuity for the visual field of the left eye – so one can see that it extends farther to the left (temporal side) for the left eye, and for the right eye visual field would extend farther to the right.

For additional reading

This paper from Microsoft research is particularly interesting. 
They approach the foveated rendering problem in a more technical way – optimizing to find layer parameters based on a simple but fundamental idea: for a given acuity falloff line, find the eccentricity layer sizes which support at least that much resolution at every eccentricity, while minimizing the total number of pixels across all layers. It explains their methodology though does not give their results for the resolution values and layer sizes.

Note: special thanks to Emma Hafermann for her research on this post

For additional VR tutorials on this blog, click here
Expert interviews and tutorials can also be found on the Sensics Insight page here

No comments: