A Beginner's Insights on Text-to-Image from Midjourney's Oddities

I set out with the simple goal of using AI to create images of my favorite extinct creature: the sabertooth cat. These majestic predators, known for their impressive canine teeth, have fascinated me for over a decade. With the advent of new digital tools, I found an exciting opportunity to revive these ancient beings through art.

Midjourney, a text-to-image generator available on Discord, was my chosen platform. While I've dabbled with Stable Diffusion, it will remain less accessible to me until I can run it locally. My initial attempts at generating AI art of sabertooths had resulted in disappointing outcomes.

Midjourney boasts numerous time-saving features that have been well-documented by others, such as the commands for "remix," "remaster," and "test." However, I decided to experiment with starting from a basic prompt rather than using the advanced commands right away. Some of the images I present later did use those commands, but even they could not prevent the bizarre results I encountered.

Eventually, I managed to produce a few stunning sabertooth cat images, which I will display at the conclusion of this article. First, though, I will delve into the unsettling variations I encountered on this journey.

This experience has taught me a great deal about how Midjourney operates. As a newcomer who has used this platform for less than two months while juggling the responsibilities of caring for three young children, I’ve begun to grasp the AI’s processing capabilities based on its outputs. The most significant lesson I learned is this: prompt engineering is fundamentally about communicating effectively with an AI that possesses impressive technical capabilities but lacks a true understanding of the real world. It often struggles to interpret our desires in the same way a human artist would. You can request an image on these AI platforms, but that doesn't guarantee the outcome will match your expectations.

Creating satisfactory images of sabertooth cats was a lengthy and often perplexing process. The journey was filled with oddities!

The Disturbing Results

The results I obtained often leaned towards skeletal forms. It seems that Midjourney’s training data on sabertooth cats includes many images of skeletal remains. Even when I refrained from using advanced commands, many initial images depicted skeletal figures, despite my prompt indicating a desire to see living animals. These images showed spiky skulls with jagged teeth (far too many for any existing creature), accompanied by sparse or missing fur and empty eye sockets. With each new variation, the AI would attempt to add a bit more fur and a vaguely feline shape, resembling some half-preserved Pleistocene saber cat discovered in a cave, albeit a rather unfortunate version.

The transformation from prompt to image appeared almost like a reverse decomposition, evolving from bones to ragged furred figures, and ideally culminating in a lovely reconstruction. However, more often than not, my experiments yielded more bizarre creatures!

Initially, Midjourney produced various toothy and partial skull images when I requested sabertooth cats. Instead of layering flesh over the bones as a human might, the AI sought to transform the entire skeletal structure into the animal, resulting in some comically grotesque outcomes. Even when the AI seemed to understand the prompt, it often "forgot" the original intention by the end, leading to bizarre interpretations. The "remaster" feature could either refine a peculiar piece to align with the prompt or veer it into the surreal. When the output diverged significantly from the intended vision, Midjourney would still eagerly fill in the background around the oddly rendered creature.

This experience served as a reminder that Midjourney lacks the intuitive understanding of context that humans possess. While machine learning has advanced significantly, it still has a long way to go. The AI may not fully grasp the context of its creations, as its understanding differs from ours. Interestingly, I’ve observed that as long as the creature has legs, Midjourney tends to place it in an appropriate setting. (I should eventually write an article about my collection of floating prehistoric creatures, from tyrannosaurs to cephalopods.)

I must clarify: I am not an expert in machine learning! This is simply an observation I've made. Many others possess extensive knowledge in this area. My expertise lies primarily in art, not AI.

That said, it does appear that having feet grounds the creatures in Midjourney’s outputs. This particular creation lacked paws but was still categorized as an animal. After I hit "Remaster" on the black jawbone, it introduced a new environment. It seemed logical to place it in the sky, right?

Enough with the floating heads! At least this one had a body, right? It even featured two long teeth! You might think that remastering these images would finally yield the desired sabertooths. Some came close, but most still lacked lower jaws or had an excess of upper teeth.

As I attempted to refine my prompt to include a lower jaw, the results promptly lost all of their elongated teeth. Of course.

In an effort to save time, I used the "--test" and "--testp" parameters during my quest for sabertooths, but these too produced peculiar results, albeit more polished ones.
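
For readers who have not seen one, a Midjourney prompt is just a line of text typed into Discord, with parameters such as `--test` or `--testp` appended at the end. The Python sketch below shows one way to assemble prompt variants so the same description can be tried with and without those parameters. The subject and detail phrases are hypothetical stand-ins, not the prompts used for the images in this article, and `build_prompt` is a helper invented purely for illustration; Midjourney itself only ever sees the final text.

```python
from typing import List, Optional


def build_prompt(subject: str, details: List[str], flag: Optional[str] = None) -> str:
    """Join a subject, descriptive details, and an optional trailing parameter."""
    # Drop empty detail strings so the prompt has no stray commas.
    parts = [subject.strip()] + [d.strip() for d in details if d.strip()]
    prompt = ", ".join(parts)
    # Parameters go at the very end of the prompt text.
    return f"{prompt} {flag}" if flag else prompt


# Hypothetical wording, chosen only to illustrate the structure.
subject = "living sabertooth cat"
details = ["full body", "thick fur", "two long upper canine teeth"]

# One variant with no parameter, then one for each test parameter.
for flag in (None, "--test", "--testp"):
    print(build_prompt(subject, details, flag))
```

Keeping the description fixed and changing only the trailing parameter makes it easier to judge what the parameter itself contributes, although results still vary from one run to the next.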

The above creation raises questions: Is it a sabertooth or a hulking alien lemur, perhaps the next antagonist in the Kung Fu Panda series? Midjourney can pile on details until the subject appears meticulously crafted, but it falls short when the fundamental representation is unclear. I'm intrigued by the unusual facial markings, though their origins remain a mystery.

Occasionally, simply mentioning teeth in the prompt results in an overwhelming number of them. Crafting prompts involves a degree of randomness. You might either not receive what you specified or end up with an excess.

Midjourney is doing its best, but it can't always interpret nuances. It’s as if it exclaimed, "I heard you mention teeth? You want teeth? Here you go!"

I generated a skull while seeking a living sabertooth; upon remastering, what had appeared to be its sabers turned into the legs of this creature.

The "remaster" feature can sometimes produce astonishing results, while at other times, it feels like decorating a pile of refuse with gold leaf. Predicting the outcome is often challenging. Frequently, when I generate a skeletal Smilodon approximation, upscaling and remastering it results in a striking depiction of a cat. Sometimes it appears fierce, while other times it looks serene (though usually without the desired teeth). Other times, the output is simply bizarre.

This creature resembles a bipedal sloth cryptid from a nightmare. Its features, from the peculiar yellow beard to the black-spiked tongue and strange birdlike posture, make it one of my most nightmarish creations, all the more so because of its polished appearance.

Bonus: Diego, is that you? This resemblance may be coincidental, or perhaps it was influenced by various images of characters from the Ice Age franchise.

This creature exemplifies a recurring issue I face where the sabers warp and merge with the lower jaw.

Finally, Sabertooths!

Were all those uncanny creatures worth the few stunning sabertooth images I eventually produced? Personally, I believe they were invaluable for the learning curve they provided. I feel more adept now at generating the images I envision, even though the process still involves a degree of luck. It took considerable time and numerous attempts to create something resembling an actual sabertooth cat. Perhaps next time, the process will be swifter. Even so, none of these creations are based on specific scientifically documented species of sabertooth cats, which varied widely. These images merely resemble what could potentially be real animals.

This one appears somewhat believable, though the canine teeth lack proper tapering, and there seems to be a second row of upper incisors.

The first sabertooth I created with mostly appropriate teeth, this one seems notably small, with oversized eyes!

This result is quite good, though the vertical line down the center of its face is rather strange, and the double bottom teeth are inaccurate.

The upper teeth appear decent, but the rest of the mouth looks oddly smooth. However, the fur texture is beautifully rendered.

This strange result has its charms, especially since it appears to possess a tongue, unlike many of my previous creations.

Watercolors by the author.

All digitally rendered cats in this article were generated on Midjourney, with no human modification.

To sum up, using "--test" and "--testp" can save significant time, but they won't eliminate the need for adjustments when striving for specific outcomes.

I may be preaching to the choir here, but while Midjourney is a valuable tool, it is just that: a tool. It is not designed to create scientifically accurate illustrations from mere words. Crafting effective prompts requires a different skill set than creating art from scratch. Midjourney is not a mind reader. For now, it remains quicker and more effective for me to sketch or paint sabertooth cats by hand, using either digital or traditional methods.

When I create a sabertooth cat or any other machairodontine creature, I aim to capture accurate details from the outset. Handmade art grants complete control to the artist. I don't consider myself a control freak, but when it comes to my scientific art, I cannot compromise. If I claim, "This is such-and-such an animal," it must be depicted as accurately as possible.

I have a passion for illustrating extinct animals, particularly those that are often overlooked. Many people lack mental images of these creatures. I believe that lesser-known prehistoric animals, though obscure, deserve broader recognition because they once roamed the Earth. How can I contribute to the paleoart conversation without striving for the most accurate representation? While AI text-to-image generators can be entertaining, they cannot replace the dedication and expertise of human paleoartists.
