Researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and Google Research have developed a diffusion model that can alter the material characteristics of objects in photographs.
The system, known as Alchemist, lets users modify four characteristics of both real and AI-generated images: roughness, metallicity, albedo (an object's base colour), and transparency. Because it is an image-to-image diffusion model, users can input any photograph and adjust each characteristic within a continuous range of -1 to 1 to produce a new image. These photo editing capabilities could improve the models in video games, expand the capabilities of AI in visual effects, and enrich robotic training data.
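To make the parameter range concrete, here is a minimal Python sketch of how an edit request of this kind might be represented. It is an illustration only: the names `MaterialEdit`, `ATTRIBUTES`, and `sweep` are hypothetical and do not correspond to any released Alchemist code or API. The sketch only validates the attribute name and the -1 to 1 range described above; it does not run a diffusion model.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; not part of any released Alchemist API.
ATTRIBUTES = ("roughness", "metallicity", "albedo", "transparency")


@dataclass(frozen=True)
class MaterialEdit:
    """One requested change: which attribute to edit, and by how much."""
    attribute: str
    strength: float  # must lie in the continuous range [-1.0, 1.0]

    def __post_init__(self):
        if self.attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {self.attribute!r}")
        if not -1.0 <= self.strength <= 1.0:
            raise ValueError(f"strength {self.strength} is outside [-1, 1]")


def sweep(attribute, steps=5):
    """Return evenly spaced edits covering the full range from -1 to 1."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [
        MaterialEdit(attribute, -1.0 + 2.0 * i / (steps - 1))
        for i in range(steps)
    ]
```

For example, `sweep("roughness", 5)` yields five edits whose strengths run from -1.0 to 1.0 in steps of 0.5, which is one way a user might preview an attribute across its whole range.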
Alchemist builds on a denoising diffusion model: the researchers used Stable Diffusion 1.5, a text-to-image model known for its photorealistic results and editing capabilities. Prior research built on this widely used model to let users make higher-level modifications, such as swapping objects or adjusting an image's depth. In contrast, the CSAIL and Google Research approach uses the model to target low-level characteristics, revising the finer details of an object's material properties through a distinctive interface that outperforms comparable methods.
Previous diffusion systems could already create impressive visual effects, but Alchemist can go further, for instance by making an animal in an image appear translucent. The system can also give a rubber duck a metallic appearance, remove the golden colouration of a goldfish, and enhance the shine of an old shoe. Software such as Photoshop offers comparable functions, but this model can modify material properties more directly and simply. For example, changing how metallic an object looks in a photograph takes multiple steps in the widely used software.
Additionally, the technique might improve robotic training data for manipulation tasks. If the machines are exposed to a broader range of textures, they can better understand the variety of objects they will handle in the real world. Alchemist can also assist with image classification by identifying cases in which a neural network fails to recognize material changes in an image.
However, Alchemist currently has a few constraints. The model struggles to infer lighting accurately, so it occasionally fails to follow a user's input. Looking ahead, the researchers aim to explore how such a model could enhance 3D assets for graphics at the scene level. Alchemist could also be used to infer material properties from visual data, and the researchers suggest that this line of work may eventually reveal connections between the visual and mechanical characteristics of objects.
Source: Alchemist: Parametric Control of Material Properties
Image source: Copilot