Below is a brief summary of our rendering framework and several key demonstrations of its power. To read the original paper, click on the title above. For a more detailed summary, including additional results not presented in the paper, as well as MATLAB and Python (Theano) implementations of the NLP distance model and code to reproduce the results presented here, please click here.
Abstract
We develop a framework for rendering photographic images by directly optimizing their perceptual similarity to the original visual scene. Specifically, over the set of all images that can be rendered on a given display, we minimize the normalized Laplacian pyramid distance (NLPD), a measure of perceptual dissimilarity that is derived from a simple model of the early stages of the human visual system. When rendering images acquired with a higher dynamic range than that of the display, we find that the optimization boosts the contrast of low-contrast features without introducing significant artifacts, yielding results of comparable visual quality to current state-of-the-art methods, but without manual intervention or parameter adjustment. We also demonstrate the effectiveness of the framework for a variety of other display constraints, including limitations on minimum luminance (black point), mean luminance (as a proxy for energy consumption), and quantized luminance levels (halftoning). We show that the method may generally be used to enhance details and contrast, and, in particular, can be used on images degraded by optical scattering (e.g., fog).
Rendering Framework
Here, we formulate a general solution for perceptually accurate rendering, directly optimizing the rendered image to minimize perceptual differences with the light intensities of the original scene, subject to all constraints imposed by the display. This constrained optimization formulation relies on four ingredients: knowledge of the original scene luminances (or calibration information that allows calculation of those luminances), a measure of the perceptual similarity between images, knowledge of the display constraints, and a method for optimizing the image to be rendered. We use a model of perceptual similarity loosely based on the transformations of the early stages of the human visual system [specifically, the retina and lateral geniculate nucleus (LGN)], which has previously been fit to a database of human psychophysical judgments. Because this model is continuous and differentiable, the optimization can be solved efficiently with first-order constrained optimization techniques. We show that the solution is well defined and general, and therefore represents a framework for solving a wide class of rendering problems.
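The structure of this constrained optimization can be sketched as projected gradient descent: take a step that reduces the perceptual distance to the scene, then project back onto the set of images the display can show. The sketch below is illustrative only — it substitutes a squared-error distance for the NLPD, and the display constraint, step size, and luminance values are assumptions, not values from the paper.

```python
import numpy as np

def render(scene, grad, project, step=0.1, n_iter=500):
    # Projected gradient descent: each step moves the rendered image to
    # reduce the distance to the scene, then projects it back onto the
    # set of displayable images.
    img = project(np.copy(scene))
    for _ in range(n_iter):
        img = project(img - step * grad(img, scene))
    return img

def mse_grad(img, scene):
    # Gradient of a squared-error distance -- a toy stand-in for the
    # gradient of the NLPD, which the paper uses instead.
    return 2.0 * (img - scene) / img.size

def clip_to_display(img, lo=5.0, hi=300.0):
    # Example display constraint: luminances restricted to [lo, hi] cd/m^2
    # (assumed values for illustration).
    return np.clip(img, lo, hi)

scene = np.array([[1.0, 50.0], [400.0, 1000.0]])  # scene luminances (cd/m^2)
rendered = render(scene, mse_grad, clip_to_display)
```

With the squared-error stand-in, the optimum is simply the clipped scene; it is the perceptual distance model that makes the real method's solutions more interesting than a clip.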
Computing Perceptual Distance
Varying Image Acquisition Conditions
We performed a set of experiments to test the capabilities of our optimization framework over different image acquisition conditions. We begin with calibrated images, for which we know the exact luminance values (in cd/m²) of the original scene.
We then consider uncalibrated HDR images, for which we must make an educated guess about the luminance range of the original scene.
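An uncalibrated HDR image records luminance ratios but not absolute levels, so the missing scale must be guessed before rendering. A minimal sketch of such a guess, assuming a peak luminance of 3000 cd/m² (an illustrative value, not one taken from the paper):

```python
import numpy as np

def assume_calibration(relative, peak_luminance=3000.0):
    # Rescale relative (uncalibrated) HDR values so that the brightest
    # pixel maps to an assumed peak luminance; ratios are preserved.
    return relative * (peak_luminance / relative.max())

rel = np.array([1e-3, 0.2, 1.0])   # relative values from an HDR file
S = assume_calibration(rel)        # guessed scene luminances in cd/m^2
```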
Detail Enhancement and Haze Removal
We showed in the preceding sections that using knowledge about the image acquisition process helps greatly in automatically rendering images, given the display constraints. In some cases, however, detail visibility in the scene might be unsatisfactory. Intuitively, photographers know that the amount of detail visible in a scene depends on the amount of available light. If the image has already been acquired, it is of course not possible to alter the light sources. However, since the scene luminances scale linearly with the intensity of the light sources, our method allows us to simulate increased intensity post hoc, by linearly re-scaling the luminances of the scene S.
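Since luminances scale linearly with light-source intensity, the post hoc simulation amounts to a single multiplication of S before re-running the rendering optimization. A trivial illustration (the luminance values and boost factor are assumed):

```python
import numpy as np

# Scene luminances S scale linearly with the light-source intensity, so a
# brighter acquisition is simulated by rescaling S before re-rendering.
S = np.array([10.0, 20.0, 80.0])   # original scene luminances (cd/m^2)
alpha = 2.0                        # simulated boost in light intensity
S_boosted = alpha * S

# Luminance ratios are unchanged; only the absolute level rises, which the
# nonlinear perceptual model converts into greater rendered detail.
```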
Surprisingly, this same method of detail enhancement can also be used for the problem of haze removal. In a hazy scene, the local contrast has effectively been reduced (roughly speaking, by adding a constant level of scattered light), which makes detail more difficult to discern.
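The contrast reduction can be seen in a simple additive scattering model (an illustrative assumption, not the paper's formulation), in which a constant airlight is mixed into the scene:

```python
import numpy as np

def add_haze(S, t=0.5, airlight=200.0):
    # Mix a fraction (1 - t) of a constant scattered airlight into the
    # scene luminances; t and airlight are assumed illustrative values.
    return t * S + (1.0 - t) * airlight

S = np.array([50.0, 100.0])        # two neighbouring scene luminances
hazy = add_haze(S)

def weber_contrast(x):
    # Contrast of the brighter point relative to the darker one.
    return (x.max() - x.min()) / x.min()
```

Because the added airlight raises both luminances by the same amount, the Weber contrast between neighbouring points drops, which is exactly the loss of detail visibility that the enhancement counteracts.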