AI Artistry Unveiled

Robert Dawson
p:AI:nt
Published in
5 min readNov 6, 2023

--

Inspired by this article, I asked DALL-E and MidJourney to recreate a few of my drawings.

Assumptions

It’s the prompt, silly.

One of my now embarrassing assumptions going into this experiment was that it would test and compare the capabilities of DALL-E and MidJourney to recreate my drawings. It turns out that the actual test was of each technology’s ability to abide by the prompt, which in turn makes this a test of ChatGPT’s ability to understand the content and context of an image.

Process

  1. Upload a drawing to ChatGPT.
  2. Instruct ChatGPT to write a prompt describing the drawing for DALL-E.
  3. Instruct DALL-E to generate the drawing.
  4. Repeat step 3 with MidJourney.

Methodology

I chose three drawings of human faces, two in pencil and one in charcoal. My goal was to test how accurately DALL-E and MidJourney are able to recreate my drawings.

Tests

1. Pencil Drawing of Human Eye

Prompt

A detailed pencil sketch of a human eye, rendered with realism. The eye should be centered and possess a well-defined black pupil surrounded by a complex-patterned iris. Eyelashes should sprout unevenly from the rims of both the slightly creased upper and smoother lower eyelids. Above the eye, hints of a brow bone should be visible, leading to the onset of an eyebrow. The skin surrounding the eye should exhibit a combination of hatching, cross-hatching, and smooth shading techniques, providing depth and texture. The tear duct should be subtly suggested on the inner corner. The drawing’s mood should evoke a sense of depth and introspection, capturing the idea of the eye as a window to the soul.

My Drawing

DALL-E

MidJourney

Reflections

MidJourney’s image looks more like a traditional analog drawing. DALL-E’s looks more like a digital illustration. Neither captures the style of my drawing, but this reflects the accuracy of the prompt. I could have added descriptors like “crosshatching” and “sculptural,” but I wanted to test this idea with minimal intervention.

Also, I drew the right eye, and both systems captured the left one. But the prompt doesn’t mention which eye. Even if it did, my drawings often lack clear delineation.

2. Pencil Drawing of Sculpture Bust

Prompt

A meticulous pencil sketch of a classical Greco-Roman bust fragment, portraying the upper part of a human figure. The figure possesses defined facial features such as a pronounced nose, slightly parted lips, and partially visible ears. The top and sides of the bust are chiseled off, revealing rough, jagged edges contrasted against the smooth, detailed face. Notable details include the gentle curve of the chin, the subtle shadows cast by the nose and lips, and the intricate shading on the neck and broken edges. The skin texture is expertly rendered with a combination of smooth shading, hatching, and cross-hatching techniques, illustrating depth and volume. The overall mood evokes antiquity, capturing the essence and elegance of classical sculpture.

My Drawing

DALL-E

MidJourney

Reflections

Both generated images resemble traditional drawings, even if DALL-E’s looks more professional or commercial, just like before. Neither captured the viewing angle, but the prompt doesn’t note that.

3. Charcoal Drawing of Masked Woman

Prompt

A charcoal sketch of a woman wearing a surgical mask and AirPods in her right ear, depicted in a semi-realistic style. The mask covers her nose, mouth, and chin, adhering to the face’s contours, with ear loops visible. Her eyes are expressive and half-veiled by a wisp of hair. The mask’s texture is detailed with folds and creases. Her hair is styled back, exposing the ear with the AirPod. The artwork’s shading suggests a light source to the left, casting subtle shadows across her facial features, with a focus on the masked area and the earbud. The background is muted to draw attention to the portrait’s subject.

My Drawing

DALL-E

MidJourney

Reflections

Looks like MidJourney (version 5.2) might not know about AirPods. However, “earbud” is mentioned in reference to an AirPod, so this could reflect a misapprehension of, or lack of training material to learn, the difference between earbuds and on- or over-ear earphones.

Also, DALL-E’s image again looks more professional and digital, while MidJourney’s looks less like charcoal than a marker, especially in the bottom right (the subject’s left shoulder).

Discussion

Good:

  • Both DALL-E and MidJourney generated beautiful art that would require a talented artist to render traditionally.
  • Either of these tools works well for most image-related creative needs.

Bad:

  • DALL-E’s images look too professional. These tools seem to default to generating professional-looking imagery unless you specify otherwise.
  • MidJourney isn’t as accurate. This distinction was one of the biggest initial selling points of DALL-E 3.

Other:

  • ChatGPT amazes me with its ability to understand images. But it isn’t perfect, and these tests confirm as much. Manually adjusting the generated prompts would have likely given DALL-E an advantage.

Takeaways

  • Both DALL-E and MidJourney excel at text-to-image generation.
  • Create sufficiently detailed prompts to capture essential details of the image you want the tool to recreate.

--

--