Generating realistic image edits, like changing someone's hair or facial features, used to require professional photo editing skills. But with AI models like Diffusion Autoencoders (DiffAE), anyone can seamlessly modify portraits with just a few lines of code.
In this post, we'll learn how to use DiffAE to edit portraits by adding or removing various attributes like hair, facial hair, makeup, and more. Whether you're a photographer looking to streamline edits or a designer wanting to prototype ideas, this model makes it simple to get creative results.
We'll go over real-world examples of using DiffAE, explain how the model works, and provide sample code to help you get started. We'll also see how we can use AIModels.fyi to find similar models and decide which one we like. Let's begin.
Here are just a few examples of how DiffAE can be used to edit portraits:
These AI-powered edits open up new creative opportunities for all kinds of artists and creators. Apps built with DiffAE could allow everyday users to have fun modifying selfies and profile pictures as well.
The model can make striking changes to real images of real people! Some of these edits - turning old people young, making sad people smile, or adding and removing accessories like glasses - look remarkably realistic.
DiffAE is an image-to-image model implemented and maintained by cjwbw and hosted on Replicate. It leverages a technique called diffusion models to manipulate images. It has an average runtime of 43 seconds, costs about $0.02365 per run, and runs on an Nvidia T4 GPU. More technical details and API specifications are available on its detail page.
The model takes in an input photo and optional parameters like "add bangs" or "make bald." It outputs a modified version of the original image with the requested changes. The diffusion process allows it to generate realistic, high-quality results.
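To make that concrete, here's a rough sketch of what a request payload might look like. The parameter values below are illustrative examples, not the model's defaults:

```javascript
// Illustrative DiffAE input payload; the values are examples, not defaults
const input = {
  image: 'data:image/jpeg;base64,/9j/...', // the portrait, base64-encoded
  target_class: 'Bangs',                   // attribute to add, e.g. bangs
  manipulation_amplitude: 0.3,             // how strongly to apply the edit
};

console.log(Object.keys(input));
```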
Under the hood, DiffAE uses an autoencoder architecture. The encoder portion learns to represent the key features of an image, like hair and facial shapes. The decoder then transforms those features while preserving overall realism. This allows seamless modifications.
Let me first give you a technical explanation, then explain how the model works in plain English.
Technical explanation first! The diffusion autoencoder comprises two main parts: a semantic encoder and a conditional Denoising Diffusion Implicit Model (DDIM).
The semantic encoder distills the image into a semantic subcode that captures the overarching themes, while the conditional DDIM captures the minute details as a stochastic subcode. Together, the two subcodes can recreate the original image almost exactly.
To create new samples, the autoencoder uses a latent DDIM that is trained on the distribution of the semantic subcodes. During sampling, new subcodes and fine-grained forms are generated from a standard Gaussian distribution and then decoded to create a new image.
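That sampling step can be loosely sketched in code. The helpers below are placeholders for illustration, not the real DiffAE internals:

```javascript
// Conceptual sketch of the sampling procedure described above.
// sampleGaussian stands in for the model's noise sources; the real latent
// DDIM learns the distribution of semantic subcodes rather than sampling
// them raw, so treat this purely as an outline.
function sampleGaussian(dim) {
  // Box-Muller transform: draw standard normal samples
  return Array.from({ length: dim }, () => {
    const u = 1 - Math.random();
    const v = Math.random();
    return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  });
}

// 1. Sample a new semantic subcode (in the real model, via the latent DDIM)
const semanticSubcode = sampleGaussian(512);
// 2. Sample a fine-grained stochastic subcode from a standard Gaussian
const noiseMap = sampleGaussian(64 * 64 * 3);
// 3. The conditional DDIM then decodes (semanticSubcode, noiseMap) into a new image
```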
Make sense? If not, let's look at it again in plain English:
Imagine the diffusion autoencoder as an advanced art student who excels at capturing the essence and minute details of a painting.
So basically, one part understands what the image is about (the "big picture"), and the other part gets how the image is made (the "tiny details"). When you put both together, you can recreate the original image or variants that match very closely.
To make new art, our student would use their understanding of many different paintings to create a new masterpiece by combining a new big-picture idea with new tiny details.
While DiffAE is versatile and effective, it has its set of limitations:
Here's a breakdown of the input properties and output schema for this face-manipulation model. This is critical information for anyone, especially founders and developers building AI products, who needs to know exactly what they're working with.

The model accepts the following inputs:

- image (file): the photo you want to edit
- target_class (string): the attribute to manipulate
- manipulation_amplitude (number)
- T_step (integer): pick an integer from the allowed values
- T_inv (integer)

Your result comes in a JSON format with the following structure:
{
"type": "array",
"items": {
"type": "object",
"title": "ModelOutput",
"required": [
"image"
],
"properties": {
"image": {
"type": "string",
"title": "Image",
"format": "uri"
}
}
},
"title": "Output"
}
Here's what each element means:

- type: "array" - the output is a list of results
- items - describes each entry in that list
- type: "object", title: "ModelOutput" - every entry is an object
- required - the image field must always be present
- properties - each entry's image field is a string in URI format, pointing to a generated image
- title: "Output" - the name of the overall output schema
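In other words, a successful run returns an array of objects, each holding a URI for a generated image. Here's a minimal sketch of reading such a response; the sample URL below is made up:

```javascript
// A response shaped like the schema above; the URL is a made-up example
const output = [
  { image: 'https://replicate.delivery/example/output_0.png' },
];

// Pull out every generated image URL from the result array
const imageUrls = output.map((item) => item.image);

console.log(imageUrls);
```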
You can read more information in the Diffae Replicate API spec here.
Okay, ready to fire up DiffAE to build your own face-modifying app? You've come to the right part of the guide!
In this example, we'll build a basic script we can use in a product that artificially ages people to show them what they'll look like when they're older. Aging filters are popular in applications like Snapchat, and ours will turn an input image into that of a much older person.
Here's a step-by-step guide with code snippets to help you pull this off. This example uses Node.js and the Replicate API to manage the whole operation.
You'll need an input image to work with (young_face.jpg in our example). First things first, you need to install the Replicate package for Node.js:
npm install replicate
Go get your Replicate API token and set it as an environment variable:
export REPLICATE_API_TOKEN=<paste-your-token-here>
Create a new Node.js file (make_old_with_webhook.js) and let's get started with the coding part. We'll be using webhooks, so also include axios for HTTP calls related to the webhook:
npm install axios
import Replicate from 'replicate';
import axios from 'axios'; // available for HTTP calls in your webhook flow
import fs from 'fs';

const replicate = new Replicate({
  auth: process.env.REPLICATE_API_TOKEN,
});

// Read the input photo and encode it as a base64 data URI
const imageFilePath = './young_face.jpg';
const imageBase64 = fs.readFileSync(imageFilePath, { encoding: 'base64' });
const imageDataUri = `data:image/jpeg;base64,${imageBase64}`;

// Option 1: the `run` method waits for the model and returns the output directly
const output = await replicate.run(
  'cjwbw/diffae:5d917b91659e117aa8b0c5d6213077e9132083e4a8a272f344cc52c3ba2f6e98',
  {
    input: {
      image: imageDataUri,
      // add your other parameters here, e.g. a target_class of Gray_Hair
    },
  }
);

// Option 2: the `predictions.create` method returns immediately and fires
// your webhook when the prediction completes
const prediction = await replicate.predictions.create({
  version: '5d917b91659e117aa8b0c5d6213077e9132083e4a8a272f344cc52c3ba2f6e98',
  input: {
    image: imageDataUri,
  },
  webhook: 'https://example.com/your-webhook',
  webhook_events_filter: ['completed'],
});

// Implement your webhook to receive the processed image asynchronously.
// Assuming an Express.js setup with JSON body parsing:
app.post('/your-webhook', (req, res) => {
  const completedPrediction = req.body; // the webhook body is the prediction object
  // Handle the completed prediction here
  res.status(200).send('Received');
});
A couple of variables to take note of:

- version: The specific version of the model you are using.
- input: The actual data you are sending for processing. Here it's the base64-encoded image.
- webhook: The URL to call when the prediction is complete.
- webhook_events_filter: List of events that will trigger the webhook. Here, we are using completed to indicate that the webhook should fire when the prediction is done.

Execute your script:
node make_old_with_webhook.js
Once the prediction is complete, your webhook URL will be triggered, and you'll receive the processed image there.
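The webhook body is the full prediction object, so the generated image URLs sit under its output field. Here's a hedged sketch of extracting them; the payload shape is an assumption based on the output schema above, and the sample values are made up:

```javascript
// Pull image URLs out of a webhook payload shaped like a Replicate prediction.
// The field layout here follows the output schema shown earlier; adjust if
// your payload differs.
function extractImageUrls(prediction) {
  if (prediction.status !== 'succeeded' || !Array.isArray(prediction.output)) {
    return [];
  }
  return prediction.output.map((item) => item.image);
}

// Made-up sample payload for illustration
const samplePayload = {
  status: 'succeeded',
  output: [{ image: 'https://replicate.delivery/example/aged_face.png' }],
};

console.log(extractImageUrls(samplePayload));
```

Guarding on the status field means the same handler safely ignores failed or still-processing events if you ever widen the webhook_events_filter.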
Encountering issues with your script? Check the following:
So, we've taken a deep dive into DiffAE, understanding its capabilities, technical specs, and even how to get your hands dirty with the code. Now, what's next? You've got this powerful tool at your disposal, and there are limitless possibilities.
Optimization: Now that you know the basics, the next step is to optimize your workflow. Can you speed up the run time? Can you batch-process multiple images to make the most out of your investment?
Combine Models: Use AIModels.fyi to explore other models that you can integrate with DiffAE for even more powerful image manipulations. Maybe combine it with a model that enhances image quality or does background removal?
Build an App: Consider building an application or integrating it into an existing one. With the API setup, it's totally doable to have an app where users can upload a photo and tweak it based on preset categories like 'smiling,' 'tired,' etc.
Stay Informed: If you haven't already, consider subscribing to the AIModels.fyi mailing list. We offer concise digests, expert guidance, and a community of AI enthusiasts to help you stay on the cutting edge.
With that, you're well-equipped to forge ahead into the creative universe that DiffAE opens up. So go forth and hack away!
The resources below can help you out if you get stuck or just want to learn more about DiffAE.
Also published here.