Anantaya Pornwichianwong

AI Generating and Transforming Images: How does it work?

Recently, a fascinating trend in AI technologies has emerged: AI that generates and transforms images. We have witnessed the rise of AI image generators like Midjourney and DALL·E, which create unique images based on text descriptions. Another innovation is Generative Fill in Photoshop Beta, an AI-powered feature that expands and fills the background of an image. Among the recent viral tools is Snow AI, an AI-powered face transformation feature that turns our portraits into Korean idol lookalikes, as if taken in a professional studio. The output images are impressive enough that many users adopt them as their profile pictures on social media platforms.

As a tech company that highly values technology literacy and utilization, Sertis encourages everyone to keep up with innovations and learn to understand the mechanics behind these technologies while being mindful of the potential implications.

In this article, Sertis explains how three of these AI-powered innovations work: AI Image Generator, Generative Fill AI, and Face Transformation AI. Understanding them will help you stay connected with the ever-evolving technological world.

How does AI generate and transform images?

AI that generates and transforms images comes in several types. Some create images from input descriptions, some expand the original image, and some transform the original image, such as a face-aging app that shows an older version of you.

In this article, we will categorize these AI technologies into three types: AI Image Generator, Generative Fill AI, and Face Transformation AI.

Let's delve into how each of these three technologies works.

AI Image Generator

This category includes AI technologies like Midjourney and DALL·E, which can generate unique images from descriptions. The AI model is trained on labeled images, such as a picture of a table labeled as 'Table.'

Using this approach, the AI learns to recognize various objects and concepts. For instance, when it encounters the word 'Apple,' it associates the word with a round, red fruit. It can even identify the distinctive features of Van Gogh's paintings and learn how to compose a picture realistically.
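To make the idea of learning from labeled images concrete, here is a minimal sketch. It is a deliberately tiny stand-in, not how Midjourney or DALL·E actually work: those systems rely on large neural networks trained on far more data. In this sketch each "image" is reduced to a single average color, and the data, labels, and function names are all hypothetical.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors seen for each label."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for features, label in examples:
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: tuple(total / counts[label] for total in totals)
        for label, totals in sums.items()
    }


def predict(centroids, features):
    """Return the label whose learned average is closest to the input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical toy data: each image is summarized by its average (R, G, B)
# color on a 0..1 scale, paired with a human-written label.
LABELED_IMAGES = [
    ((0.9, 0.1, 0.1), "apple"),
    ((0.8, 0.2, 0.1), "apple"),
    ((0.9, 0.8, 0.1), "banana"),
    ((1.0, 0.9, 0.2), "banana"),
]

centroids = train_centroids(LABELED_IMAGES)
print(predict(centroids, (0.85, 0.15, 0.12)))  # prints "apple"
```

After seeing only four labeled examples, the sketch already links a reddish color to 'apple'. Real models learn far richer associations, but the principle of pairing images with labels is the same.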

When we input descriptions, the AI analyzes the elements and generates new, unique images based on its understanding and training data. The remarkable feature of these applications is that the AI doesn't simply copy and paste from the images it was trained on. Instead, it creates unique images just like humans do when drawing from their imagination and contextual understanding.

For example, if we describe 'a person carrying a yellow umbrella walking on a rainy night,' the AI immediately knows that it needs to create an image with elements such as a person, a yellow umbrella, and a background depicting a rainy night. The AI then generates these elements with unique characteristics and styles that fit the context, resulting in a novel and one-of-a-kind image.

Generative Fill AI

A buzzworthy AI feature making headlines lately is Generative Fill in Photoshop Beta. This powerful feature can add or remove objects in images and fill or expand backgrounds so that the result blends in with the original. The model is trained on a vast dataset of images to recognize different objects and their distinctions.

The process begins with the AI analyzing the image and identifying the areas that require filling. It then generates new elements that seamlessly blend with the original image.

When adding an object, the AI carefully studies the original image and creates the object with matching shape, color, features, and even shadows, ensuring a harmonious fit with the background.

In the case of removing an object, such as eliminating a person from a street scene, users can select the object, and the AI will analyze the surrounding background, considering factors like the street's color, texture, lighting, buildings, and sky. It skillfully connects the dots and fills in the missing part without a trace.

Expanding the background follows a similar approach. Although the AI doesn't know precisely how the real place looks, it leverages the data it was trained on and the original image to determine how the background should be expanded. For instance, it predicts the shape of tree branches or extends the road in a way that creates a perfect, seamless background expansion.
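Both removing an object and expanding a background come down to the same problem: filling unknown pixels using the known pixels around them. The sketch below illustrates that idea with a much simpler, classical technique, in which each missing pixel is repeatedly replaced by the average of its neighbors. This is not the generative model behind Photoshop's feature, which can invent new detail; the function name and the tiny grayscale image are assumptions made for illustration.

```python
def fill_masked(image, mask, iterations=200):
    """Fill masked pixels by repeatedly averaging their 4-connected neighbors.

    image: list of rows of numbers (a grayscale image).
    mask:  same shape, True where the pixel is missing.
    Assumes at least one pixel is not masked.
    """
    height, width = len(image), len(image[0])
    known = [image[y][x] for y in range(height) for x in range(width) if not mask[y][x]]
    start = sum(known) / len(known)  # initial guess for every missing pixel
    result = [
        [start if mask[y][x] else float(image[y][x]) for x in range(width)]
        for y in range(height)
    ]
    for _ in range(iterations):
        updated = [row[:] for row in result]
        for y in range(height):
            for x in range(width):
                if not mask[y][x]:
                    continue  # known pixels are never changed
                neighbors = [
                    result[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width
                ]
                updated[y][x] = sum(neighbors) / len(neighbors)
        result = updated
    return result


# A 5x5 image that gets brighter from left to right, with a 3x3 hole in the middle.
image = [[x * 10 for x in range(5)] for _ in range(5)]
mask = [[1 <= y <= 3 and 1 <= x <= 3 for x in range(5)] for y in range(5)]

filled = fill_masked(image, mask)
print([round(value, 3) for value in filled[2]])
```

The filled values in the hole settle back onto the left-to-right gradient of the surrounding pixels, which is the "connecting the dots" behavior described above in its simplest form. A generative model goes further by drawing on its training data to produce plausible textures and objects rather than smooth averages.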

Face Transformation AI

The AI technology behind the viral face transformation feature is known as Face Transformation AI. It powers popular applications like 'Snow,' which turns our faces into Korean idol lookalikes, and 'FaceApp,' which gained widespread attention a few years ago for revealing older versions of ourselves. This AI leverages deep learning models and computer vision technology to achieve its remarkable transformations, following these key steps.

In the first step, the AI is trained on a vast collection of final result images. For example, if we want a face transformation resembling Anime characters, the model needs to learn from a substantial dataset of Anime pictures to identify and memorize their distinct features. Once we upload our original image to the application, the face recognition system detects our face and performs feature extraction. This process captures the unique and distinctive features of our face, including the shape of our eyes, nose, and mouth. By preserving these essential characteristics in the final image, the AI ensures that we can still recognize ourselves even after the transformation.

After that, the AI proceeds with the mapping and manipulation process. By carefully studying the connections between the elements on our face and the target transformation, the AI creates a detailed map of our facial features. This map is then overlaid onto the final result image, and the AI performs reshaping and adjustments to align our face with the target transformation.

For instance, if we wish to resemble an Anime character, the AI system will reshape, resize, and reposition our eyes to give them a more Anime-like appearance. For a face-aging app, it may add subtle wrinkles near our eyes to make us look older. Throughout this process, the AI preserves the original distinctive features of our face, so we can still recognize ourselves after the transformation.
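The mapping and reshaping step can be pictured as moving facial landmarks part of the way toward a target style. The following is a minimal sketch under that assumption. Real applications use learned deep learning models rather than a fixed formula, and the landmark names, coordinates, and the `strength` parameter here are all hypothetical.

```python
def blend_landmarks(source, target, strength):
    """Move each source landmark part of the way toward its target position.

    strength = 0 keeps the original face; strength = 1 matches the target.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    if source.keys() != target.keys():
        raise ValueError("source and target must name the same landmarks")
    blended = {}
    for name, (sx, sy) in source.items():
        tx, ty = target[name]
        blended[name] = (sx + strength * (tx - sx), sy + strength * (ty - sy))
    return blended


# Hypothetical (x, y) landmark positions extracted from an uploaded portrait
# and from a target style template.
USER_FACE = {"left_eye": (30.0, 40.0), "right_eye": (70.0, 40.0), "mouth": (50.0, 80.0)}
ANIME_TEMPLATE = {"left_eye": (26.0, 44.0), "right_eye": (74.0, 44.0), "mouth": (50.0, 76.0)}

result = blend_landmarks(USER_FACE, ANIME_TEMPLATE, strength=0.5)
print(result["left_eye"])  # prints (28.0, 42.0)
```

Choosing a `strength` below 1 is what keeps part of the original face in the result, which mirrors the way the transformation preserves the features that let us recognize ourselves.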

After the mapping and manipulation process, the AI applies style transfer, enhancing the transformation by adding features or objects to the background for a convincing result. Once the transformation is complete, the AI generates the final image, giving us Anime-like or Korean idol-like faces, or even a glimpse into our older selves.

Furthermore, these AI applications learn from our usage, becoming more adept as more users engage with them. The collective usage data enhances the AI's capabilities.

Nevertheless, it is essential to remember that these fascinating applications are built on AI systems that are still learning, so some limitations and flaws are inevitable. Additionally, uploading personal images raises concerns about privacy, personal data, consent, and potential misuse, calling for both social and individual responsibility.

While embracing these technologies, we must remain mindful of these risks, ensuring that technology remains a safe tool for creativity and inspiration.

Stay connected with the ever-evolving technological world with Sertis. We are here to partner with you to unlock new possibilities in the world of data and AI.
