Getting Creative with AI
By Elizabeth Shubov

Advances in artificial intelligence (AI) have prompted innovations in fields ranging from
DNA sequencing to chess. Recently, much commentary has focused on progress in AI-
driven artistic creation. AI can now produce photo-realistic images capable of bending
reality and artistic content impossible to distinguish from human-created works. Soon,
we may find that AI has created or inspired much of the content we consume, including
news stories, literature, and even music. These innovations, which warrant both
excitement and trepidation, affirm that technologies are not just engines of productivity
for rote tasks but have the power to expand our creative and problem-solving capacities
– and literally to depict and “see” things in new ways.

To understand the technology behind the state of AI image art today, and learn
something about its viability, I set out to create my own AI-powered images. I’ve
documented my experience and tried to explain the technology in simple terms to break
down the technological barriers.

OpenAI took the internet by storm a few months ago when it released its groundbreaking
art-generating model, DALL-E 2. Last week, Google announced Imagen, a model it claims is
even more advanced. Both models produce photo-realistic images with little to no evidence
of machine creation. Where prior models typically had trouble forming things like human
and animal faces, these new models represent a profound technological leap forward.

This image was created using DALL-E 2 with the text prompt “A Shiba Inu dog wearing a
beret and black turtleneck.” The features of the dog’s face are remarkably realistic and
uniform. There is no smudging or smearing around the transitional areas, as was typical
with earlier AI art. You could easily assume this is a real dog - that is, unless you have
ever tried to take a picture of a real dog in a hat.

Not to be outdone, Google released the image below last week showing a sunglass-wearing,
bike-riding dog generated from a text prompt. Since Imagen has not been released broadly
yet, it is unclear how consistent the results are, but the images are quite impressive
nonetheless.

These images are so realistic, powerful, and intriguing that, for some, they stoke fears
over the ways AI technology might replace human image artists, confuse a consuming public
about what is “real” content and what is “synthetic” content, or lead to new and immersive
ways for people to exploit and harm each other. All of these are critical questions. For
purposes of this piece, however, I am focused on how the technology works and how to work
with it. From there, we can discuss policy and governance, but playing with it and learning
how it works was an important journey too, with important lessons.

How Does AI Generate Image Art?

There are several different ways AI can generate image art. Much of the AI image art
available over the last five years has been generated using deep learning systems called
generative adversarial networks (GANs). In this type of deep learning network, the system
is trained using two competing neural networks that learn from each other. In simple
terms, a “generator” creates new images from random input, user input, or input generated
using language models such as GPT-3, while a “discriminator,” which is also fed real image
data, tries to tell the generated images apart from the real ones. As the training process
continues, the discriminator learns to reject images of poor quality or inaccurate subject
matter. The generator takes negative feedback from the rejected images and positive
feedback from the ones that pass, and continues to learn from this process to create
higher-fidelity images.
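
To make that back-and-forth concrete, here is a minimal sketch in Python. It is a toy under loud assumptions, not the code behind any model named in this piece: each “image” is a single number, the “real” data are numbers near 4, the generator only learns one shift value, and the discriminator is a simple logistic scorer. All function and parameter names are made up for illustration.

```python
import math
import random


def sigmoid(value):
    # Clamp the input so math.exp cannot overflow on extreme scores.
    value = max(-60.0, min(60.0, value))
    return 1.0 / (1.0 + math.exp(-value))


def discriminator_step(w, c, real, fake, lr=0.05):
    """One update that pushes D(real) toward 1 and D(fake) toward 0."""
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    # Gradients of -log D(real) - log(1 - D(fake)) with respect to w and c.
    grad_w = (d_real - 1.0) * real + d_fake * fake
    grad_c = (d_real - 1.0) + d_fake
    return w - lr * grad_w, c - lr * grad_c


def generator_step(shift, w, c, z, lr=0.05):
    """One update that moves the fake sample toward what D rates as real."""
    d_fake = sigmoid(w * (z + shift) + c)
    # Gradient of -log D(fake) with respect to the generator's shift.
    grad_shift = (d_fake - 1.0) * w
    return shift - lr * grad_shift


def train(steps=2000, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates; return the learned shift."""
    rng = random.Random(seed)
    w, c, shift = 0.0, 0.0, 0.0
    for _ in range(steps):
        real = rng.gauss(real_mean, 0.5)   # a sample of "real" data
        z = rng.gauss(0.0, 0.5)            # random input to the generator
        w, c = discriminator_step(w, c, real, z + shift)
        shift = generator_step(shift, w, c, z)
    return shift


if __name__ == "__main__":
    print("generator shift after training:", train())
```

In this toy, the generator's shift typically drifts from 0 toward the real data's center, though adversarial training is known to wobble rather than settle cleanly. Real GANs follow the same alternating pattern with large neural networks and actual images.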

Moving beyond GANs, popular AI image art algorithms such as DALL-E 2, Imagen, Disco
Diffusion, and Midjourney are diffusion model deep learning networks. In a diffusion
model, the AI is given a training image, and noise or static is added step by step until
the image is unrecognizable. The network is then trained to remove the noise and recreate
the original image. Through each step of the process, it learns how to create more
cohesive images, so that once trained it can start from pure noise and denoise its way to
a new image. These models are typically slower than GANs due to the many steps in the
denoising process, but they are promising in their training stability and ability to
produce ultra-realistic images.
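
The noising half of that process is simple enough to sketch in a few lines of Python. This is an illustrative toy, not any product's actual code: the “image” is a short list of pixel values, the linear noise schedule and its default numbers are assumptions borrowed from common practice, and the function names are invented here. The last function shows why the network is trained to predict the noise: if the noise is known, the original pixels can be recovered exactly.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal that survives after each noising step."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        # Linearly increase the per-step noise amount from beta_start to beta_end.
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Blend the image with Gaussian noise; return the noisy image and the noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    keep, mix = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    noisy = [keep * p + mix * n for p, n in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Undo add_noise, given a prediction of the noise that was mixed in."""
    keep, mix = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - mix * n) / keep for x, n in zip(noisy, predicted_noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    bars = alpha_bar_schedule(1000)
    image = [0.2, 0.8, 0.5]
    noisy, noise = add_noise(image, bars[500], rng)
    print("noisy pixels:", noisy)
    print("restored with the true noise:", remove_noise(noisy, noise, bars[500]))
```

In a real diffusion model, a neural network stands in for the "true noise" used above: it is trained to guess the noise from the noisy image alone, and generation repeats that guess-and-subtract step many times starting from pure static.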

Along with these different types of neural networks, deep learning systems used to
classify images may be used to double-check that the image output matches the original
text prompt. Combining image recognition with natural language processing (NLP), one such
network, CLIP (Contrastive Language-Image Pre-training), acts as a bridge between text and
images. Trained on pairs of images and captions, CLIP “views” the image, converts both the
image and the original text input into comparable numerical descriptors, and scores how
closely the two match. If they do not match, the system uses this negative feedback to try
to reform the image until it matches the text, or it discards errant images.
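
The matching step can be sketched as a similarity score between two lists of numbers. The Python below is only an illustration of that idea and does not call CLIP itself: it assumes some model has already turned the prompt and each candidate image into numerical descriptors (embeddings), and the function names and the 0.8 threshold are hypothetical choices made for this example.

```python
import math


def cosine_similarity(a, b):
    """Score from -1 to 1 for how closely two descriptor vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(text_embedding, image_embeddings, threshold=0.8):
    """Pick the candidate image closest to the text, or None if all score too low."""
    scores = {
        name: cosine_similarity(text_embedding, vector)
        for name, vector in image_embeddings.items()
    }
    name = max(scores, key=scores.get)
    score = scores[name]
    # A low top score is the "negative feedback": nothing matched the prompt well.
    return (name if score >= threshold else None), score


if __name__ == "__main__":
    # Made-up descriptors purely for demonstration.
    prompt = [0.9, 0.1, 0.4]
    candidates = {
        "candidate_a": [0.8, 0.2, 0.5],
        "candidate_b": [0.1, 0.9, 0.2],
    }
    print(best_match(prompt, candidates))
```

A generator guided this way keeps the candidate that scores highest against the prompt and either refines or discards the rest.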

The results of these combined neural networks are mind-blowing machine-created
images such as those released by OpenAI showing a raccoon astronaut and the now
Twitter-famous avocado armchair.

Images similarly created and released by Google include a raccoon in a space helmet
and a cactus wearing sunglasses in the desert.

Accessing and Generating AI Art

Access to some larger models that create hyper-realistic images is restricted, with their
developers citing proprietary issues and concerns over the generation of salacious or
malicious content. OpenAI, for example, states in its content policy that the tool is an
experimental research platform and that the images and content created must remain
G-rated, non-deceptive, not targeted at individuals, and not used for commercial purposes.
(Cue questions about how well those policies can be enforced and where the line will fall
between good clean fun and harm and deception; when and how consumers should be informed
when they are being served AI-generated images; how AI-generated images may be
IP-protected (or not); when and how consumers can opt into or out of more challenging
content; and myriad other challenges.)

Other image-generating algorithms are widely available online, either through websites
that charge a small platform fee or in open-source repositories. Varying degrees of
coding and math knowledge are necessary depending on the type of generator one is
using. However, on some sites, one can create images solely by entering a text prompt and
setting the number of iterations used to evolve the images.

Exploring Creative Potential Using AI

In January, I started experimenting with AI-generated images in order to create art. Some
of the images I created using a GAN-based model known as VQGAN+CLIP are up in a
public gallery on Spatial.io, which you can check out on any device or using a VR
headset.

Since it was shortly before Valentine’s Day, I used the heart theme to learn about the
creation process. Most of the early images went directly into the recycling bin. However,
in time, I discovered how to work with algorithms to produce better art. As illustrated by
the differences in the images below, the order and type of descriptors used in the prompt
made a huge difference in the quality of the work produced.

This is one of the earlier images with the
prompt “valentine overjoyed.” While it is
interesting, it lacks cohesion, is disjointed,
and does not look very polished.

This one using the prompt “broken heart”
is fractured and includes both valentine
and real anatomical hearts.

In this later image I used the prompt “robotic
long stem rose coming out of the road with an
oil painting in steampunk engine.” The image
has a steampunk feel and you can identify a
rose in the image with its robotic counterpart.

Improvements are also noticeable in “abstract
lotus as a deep neural network in the water
with bright colors.”

And “fireworks over a boat on the ocean with
northern lights.”

Using the prompt “earth on fire in Unreal Engine 3D shading shadow depth,” you can see
more photorealistic molten lava burning on the earth in the foreground and an earthly
room in the background.

AI is trained on databases of images and
styles, so incorporating style references
such as “oil painting” or “Unreal Engine”
into the keywords drastically improved the
outputs.
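
One way to experiment with this systematically is to assemble prompts from a subject plus a list of style keywords and then try the keywords in different orders. The short Python sketch below is a hypothetical helper written for this piece, not part of any image generator; the function names and the “subject in keywords” pattern are assumptions modeled on the prompts shown above.

```python
from itertools import permutations


def build_prompt(subject, style_keywords=()):
    """Join a subject with optional style keywords into one text prompt."""
    if not style_keywords:
        return subject
    return f"{subject} in {' '.join(style_keywords)}"


def prompt_variants(subject, style_keywords):
    """Every ordering of the style keywords, for side-by-side comparison."""
    return [build_prompt(subject, order) for order in permutations(style_keywords)]


if __name__ == "__main__":
    styles = ["Unreal Engine", "3D shading", "shadow depth"]
    for prompt in prompt_variants("earth on fire", styles):
        print(prompt)
```

Each variant can then be pasted into a text-to-image tool to compare how keyword order and style references change the result.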

Through the process of learning to work
with AI to create recognizable and
cohesive images, I found that my idea of
creativity changed. Learning how to work
with AI became a crucial part of the
creative process. Despite many frustrating
failed attempts, this exercise opened my
creative outlets and pushed my boundaries
in unexpected ways. Rather than replace
my creativity, AI helped supplement and
expand it.

What does this mean for the future?

As we look at processes in the workplace and our lives that can be improved with AI, we
should not discount the potential this technology has to unleash creativity in people. It
is not just art that can be autonomously created, but also music, literature, and news. As
AI gets better at generating content, we can continue to look for ways to use it to increase
productivity or spur our own creativity. Healthy skepticism is normal and necessary when
examining new technologies, and some of these questions and risks are serious and
essential to sort out. At the same time, we can have a touch of optimism and appreciate
the possibilities of the future. After spending this time creating with AI, I remain hopeful
that human artists will continue to evolve with and without AI, and that AI art capabilities
will help push humans to new heights.

Written By Elizabeth Shubov. Elizabeth is an Emerging Technology Consultant, Attorney
and an Advisor with The Cantellus Group.

These blogs by TCG Advisors express their views and insights. The strength and beauty
of our team is that we encompass many opinions and perspectives, some of which will
align, and some of which may not. These pieces are selected for their thoughtfulness,
clarity, and humor. We hope you enjoy them and that they start conversations!