Carnegie Mellon engineers made an AI-powered robot that manually paints pictures from text, audio, and visual prompts

Will the real DALL-E please stand up

By Cal Jeffrey, February 20, 2023, 18:06


In a nutshell: Researchers at Carnegie Mellon University's (CMU) Bot Intelligence Group (BIG) have developed a robotic arm that can paint pictures based on spoken, written, and visual prompts. The AI is very similar to DALL-E, except it physically paints the output in real time instead of producing a near-instant digital image.

The BIG team named the robot FRIDA as a nod to Mexican artist Frida Kahlo and as an acronym for Framework and Robotics Initiative for Developing Arts. Currently, the robot requires at least some contextual input and about an hour to prepare its style of brush strokes.

Users can also upload an image to "inspire" FRIDA and influence the output by providing plain language descriptors. For instance, given a bust shot of Elon Musk and the spoken prompt "baby sobbing," the AI created the portrait below (top left). The researchers have experimented with other input types, such as letting the AI listen to a song like ABBA's "Dancing Queen."

Some of our new work on the FRIDA project: Robot Synesthesia, painting from sound and emotion inputs. https://t.co/LrqyGigg5J pic.twitter.com/ouswMrMdyh
--- FRIDA Robot Painter (@FridaRobot) February 12, 2023

Carnegie Mellon Ph.D. student and lead engineer Peter Schaldenbrand was quick to point out that FRIDA cannot perform like a true artist. In other words, the robot is not expressing creativity.

"FRIDA is a robotic painting system, but FRIDA is not an artist," Schaldenbrand said. "FRIDA is not generating the ideas to communicate. FRIDA is a system that an artist could collaborate with. The artist can specify high-level goals for FRIDA, and then FRIDA can execute them."

The robot's algorithms are not unlike those used in OpenAI's ChatGPT and DALL-E 2. The system uses machine learning models to plan a painting and to evaluate its own progress so it can improve the output. Theoretically, with each painting, FRIDA should get better at interpreting the prompt and judging its own product, but since art is subjective, who is to say what is "better"?

Interestingly, FRIDA creates a unique color palette for each portrait but cannot mix the paints. For now, a human must mix and supply the right colors. However, a team in CMU's School of Architecture is working on a method for automating paint mixing. The BIG students could borrow that method to make FRIDA fully self-contained.

The bot's painting process is similar to an artist's and takes hours to generate a completed image. The robotic arm applies paint strokes to the canvas while a camera monitors from above. Periodically, the algorithms evaluate the emerging image to check that it is heading toward the desired output. If the painting gets off track, the AI adjusts its plan to bring it more in line with the prompt, which is why each portrait has its own unique little flaws.
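The paint, observe, and adjust cycle described above can be illustrated with a toy sketch. This is not the FRIDA team's code or method. It is a minimal closed-loop example in which every name (`plan_strokes`, `apply_stroke`, `paint`, the `fidelity` parameter) is hypothetical, the "camera" is simply a copy of the canvas, and the target is a small grid of numbers rather than a text or audio prompt.

```python
from typing import List, Tuple

Canvas = List[List[float]]            # grayscale cells in [0, 1]; 0.0 = blank
Stroke = Tuple[int, int, float]       # (row, col, intended change in value)


def error(canvas: Canvas, target: Canvas) -> float:
    """Sum of absolute per-cell differences between canvas and target."""
    return sum(
        abs(c - t)
        for canvas_row, target_row in zip(canvas, target)
        for c, t in zip(canvas_row, target_row)
    )


def plan_strokes(observed: Canvas, target: Canvas, n: int) -> List[Stroke]:
    """Plan strokes for the n cells that are currently furthest from the target."""
    cells = [
        (abs(target[r][c] - observed[r][c]), r, c)
        for r in range(len(target))
        for c in range(len(target[0]))
    ]
    cells.sort(reverse=True)
    return [(r, c, target[r][c] - observed[r][c]) for _, r, c in cells[:n]]


def apply_stroke(canvas: Canvas, stroke: Stroke, fidelity: float = 0.8) -> None:
    """Simulate an imperfect brush: only part of the intended change lands."""
    r, c, delta = stroke
    canvas[r][c] = min(1.0, max(0.0, canvas[r][c] + fidelity * delta))


def paint(target: Canvas, rounds: int = 10, strokes_per_round: int = 2):
    """Alternate between observing the canvas, replanning, and painting."""
    canvas = [[0.0] * len(target[0]) for _ in target]
    history = [error(canvas, target)]
    for _ in range(rounds):
        observed = [row[:] for row in canvas]   # stand-in for the overhead photo
        for stroke in plan_strokes(observed, target, strokes_per_round):
            apply_stroke(canvas, stroke)
        history.append(error(canvas, target))
    return canvas, history


if __name__ == "__main__":
    target = [[1.0, 0.5], [0.0, 0.25]]
    final_canvas, history = paint(target)
    print([round(e, 4) for e in history])
```

Because each simulated stroke lands imperfectly, the loop never matches the target exactly. It only shrinks the gap each time it looks again and replans, which loosely mirrors why a physically painted result carries small flaws.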

The BIG researchers recently published their research on arXiv, the preprint server hosted by Cornell University. The team has also maintained a FRIDA Twitter account since August 2022, with plenty of the robot's creations and posts on its progress. Unfortunately, FRIDA is not available to the public. The team's next project is to build on what it learned with FRIDA to develop a robot that sculpts.
