New Framework by Beeble Researchers Promises to Bring Realistic Glow to Digital Portraits Using AI

by @autoencoder

Too Long; Didn't Read

Researchers at Beeble AI have developed a method for improving how light and shadows can be applied to human portraits in digital images.

Authors:

(1) Hoon Kim, Beeble AI (contributed equally to this work);

(2) Minje Jang, Beeble AI (contributed equally to this work);

(3) Wonjun Yoon, Beeble AI (contributed equally to this work);

(4) Jisoo Lee, Beeble AI (contributed equally to this work);

(5) Donghyun Na, Beeble AI (contributed equally to this work);

(6) Sanghyun Woo, New York University (contributed equally to this work).

Editor's Note: This is Part 1 of 14 of a study introducing a method for improving how light and shadows can be applied to human portraits in digital images. Read the rest below.




Figure 1. Be Anywhere at Any Time. SwitchLight processes a human portrait by decomposing it into detailed intrinsic components, and re-renders the image under a designated target illumination, ensuring a seamless composition of the subject into any new environment.

Abstract

We introduce a co-designed approach for human portrait relighting that combines a physics-guided architecture with a pre-training framework. Drawing on the Cook-Torrance reflectance model, we have carefully configured the architecture to precisely simulate light-surface interactions. Furthermore, to overcome the scarcity of high-quality light stage data, we have developed a self-supervised pre-training strategy. This novel combination of accurate physical modeling and an expanded training dataset establishes a new benchmark in relighting realism.

1. Introduction

Relighting is more than an aesthetic tool; it unlocks infinite narrative possibilities and enables seamless integration of subjects into diverse environments (see Fig. 1). This advancement resonates with our innate desire to transcend the physical constraints of space and time, while also providing tangible solutions to practical challenges in digital content creation. It is particularly transformative in virtual reality (VR) and augmented reality (AR) applications, where relighting facilitates the real-time adaptation of lighting, ensuring that users and digital elements coexist naturally within any environment and offering a new level of telepresence.


In this work, we focus on human portrait relighting. While the relighting task fundamentally demands an in-depth understanding of geometry, material properties, and illumination, the challenge is further compounded when addressing human subjects, due to the unique characteristics of skin surfaces as well as the diverse textures and reflectance properties of a wide array of clothing, hairstyles, and accessories. These elements interact in complex ways, necessitating advanced algorithms capable of simulating the subtle interplay of light with these varied surfaces.


Currently, the most promising approach involves the use of deep neural networks trained on pairs of high-quality relit portrait images and their corresponding intrinsic attributes, which are sourced from a light stage setup [10]. Initial efforts approached the relighting process as a ‘black box’ [45, 48], without delving into the underlying mechanisms. Later advancements adopted a physics-guided model design, incorporating the explicit modeling of image intrinsics and image formation physics [32]. Pandey et al. [34] proposed the Total Relight (TR) architecture, also physics-guided, which decomposes an input image into surface normals and albedo maps, and performs relighting based on the Phong specular reflectance model. The TR architecture has become a foundational model for image relighting, with most recent and advanced architectures building upon its principles [23, 31, 52].
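For reference, the Phong specular term mentioned above is an empirical model of the form k_s · max(0, R·V)^α, where R is the light direction reflected about the surface normal. The sketch below is a minimal, generic implementation for illustration only; the function name and default parameters are assumptions, not taken from the Total Relight implementation.

```python
import numpy as np

def phong_specular(normal, light_dir, view_dir, shininess=32.0, k_s=1.0):
    """Classic Phong specular term: k_s * max(0, R.V)^shininess.

    All direction vectors are assumed to be unit length; `shininess`
    and `k_s` are illustrative defaults, not values from the paper.
    """
    # Reflect the light direction about the surface normal.
    r = 2.0 * np.dot(normal, light_dir) * normal - light_dir
    return k_s * max(0.0, np.dot(r, view_dir)) ** shininess
```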


Following the physics-guided approach, our contribution lies in a co-design of architecture with a self-supervised pre-training framework. First, our architecture evolves towards a more accurate physical model by integrating the Cook-Torrance specular reflectance model [8], representing a notable advancement from the empirical Phong specular model [37] employed in the Total Relight architecture. The Cook-Torrance model adeptly simulates light interactions with surface microfacets, accounting for spatially varying roughness and reflectivity. Secondly, our pre-training framework scales the learning process beyond the typically hard-to-obtain light stage data. By revisiting the masked autoencoder (MAE) framework [19], we adapt it for the task of relighting. These modifications are crafted to address the unique challenges posed by this task, enabling our model to learn from unlabelled data and refine its ability to produce realistic relit portraits during fine-tuning. To the best of our knowledge, this is the first application of self-supervised pre-training to the relighting task.
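As a rough illustration of the Cook-Torrance model referenced above, the sketch below evaluates a generic microfacet specular BRDF, D·F·G / (4 (n·l)(n·v)), using the common GGX distribution, Schlick Fresnel, and Smith-Schlick geometry terms. It is a textbook formulation with assumed parameter names (roughness, f0), not the exact parameterization used in SwitchLight.

```python
import numpy as np

def cook_torrance_specular(n, l, v, roughness=0.5, f0=0.04):
    """Generic Cook-Torrance specular BRDF: D * F * G / (4 * (n.l) * (n.v)).

    `roughness` and `f0` stand in for the spatially varying roughness and
    reflectivity maps discussed above; the values here are illustrative.
    """
    h = (l + v) / np.linalg.norm(l + v)              # half vector
    n_l = max(np.dot(n, l), 1e-4)
    n_v = max(np.dot(n, v), 1e-4)
    n_h = max(np.dot(n, h), 0.0)
    v_h = max(np.dot(v, h), 0.0)

    a = roughness ** 2
    d = a ** 2 / (np.pi * ((n_h ** 2 * (a ** 2 - 1.0) + 1.0) ** 2))    # GGX normal distribution
    f = f0 + (1.0 - f0) * (1.0 - v_h) ** 5                             # Schlick Fresnel
    k = (roughness + 1.0) ** 2 / 8.0
    g = (n_v / (n_v * (1.0 - k) + k)) * (n_l / (n_l * (1.0 - k) + k))  # Smith-Schlick geometry

    return d * f * g / (4.0 * n_l * n_v)
```

Because roughness and reflectivity can vary per pixel, a term of this form can distinguish, say, glossy skin highlights from matte fabric, which a single-exponent Phong model struggles to represent.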


To summarize, our contribution is twofold. Firstly, by enhancing the physical reflectance model, we have introduced a new level of realism in the output. Secondly, by adopting self-supervised learning, we have expanded the scale of the training data and enhanced the expression of lighting in diverse real-world scenarios. Collectively, these advancements have led the SwitchLight framework to achieve a new state-of-the-art in human portrait relighting.


This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.