Ask A Genius 1443: AI Art Revolution: MidJourney’s Impact and the Rise of Immersive Visual Worlds

2025-07-22

Author(s): Rick Rosner and Scott Douglas Jacobsen

Publication (Outlet/Website): Ask A Genius

Publication Date (yyyy/mm/dd): 2025/07/11

Rick Rosner and Scott Douglas Jacobsen explore MidJourney’s advancements in AI-generated visual art, from cinematic stills to the cusp of immersive, real-time worlds. As tools like Runway and Pika expand video capabilities, they discuss the creative potential, logical flaws, and societal implications of this rapidly evolving technology.

Scott Douglas Jacobsen: What are your thoughts on the recent announcements and advancements from MidJourney?

Rick Rosner: MidJourney is an AI tool that specializes in generating high-quality visual art from text prompts. It started off producing still images—a few years ago—and has become incredibly refined. While MidJourney itself does not yet generate video natively, other AI platforms like Runway, Pika, and Sora are making strides in that direction. That said, MidJourney’s image outputs are so detailed that people sometimes animate them or integrate them into video workflows.

On my other computer, I’m looking at one of its outputs. I don’t usually make art with AI myself, but someone typed in a prompt like, “Gladiators fighting three lions while the whole coliseum cheers.” It’s striking. The composition is epic—gods looking down, gladiators roaring in victory.

The image shows three gladiators taking turns swinging their swords and shields at a lion in the center of the arena. The coliseum is packed with spectators and classical sculptures. MidJourney has a firm visual grasp of what the coliseum might have looked like in its prime. You’ve got tiers of onlookers, columns, even shafts of sunlight piercing through. Were you able to see it?

Jacobsen: No.

Rosner: Probably because I was holding the screen at a weird angle. Anyway, it’s fantastically intricate. Just from a short prompt, it produces this layered, cinematic result. People sometimes take sequences of these images and animate them—either manually or using tools like Kaiber or Runway—to turn them into video-like experiences.

Jacobsen: However, and this is important—there’s a mistake we often make with these kinds of AI outputs: they can become more visually stunning, more photorealistic, even more pixel-accurate… but still be logically or physically flawed.

Rosner: What kind of flaws?

Jacobsen: Well, you’ve probably seen some. The physics might be off, even though the water, fabric, or shadows look real. That’s the thing.

Rosner: Yeah, I saw an AI-generated video clip—created using something like Runway or Pika—of a girl doing a skateboard trick off a stair set. She flips the board in the air, lands on it, and it looks smooth. Most of the motion is believable. But after watching it five or six times, I noticed a break in the logic of the physics—the way the board snaps perfectly into place midair seemed off.

Her hair looked real, too, long and blonde. As she descended, it flowed just like real hair would. The physics of that seemed spot-on. And most of the skateboarding motion was well done, but there were a few minor visual cheats.

So the question becomes: do you care if there’s a bit of cheating—if, say, you can type a couple of sentences and get a beautifully rendered, near-realistic video of cats in safety vests picking up litter along the highway, like it’s their job in a civic campaign?

The video I showed you—they’re putting up a hundred of those every day. And they rotate them out over 24 hours. They’re incredibly well-rendered.

They posted something yesterday saying their next big step is immersive, real-time video, where they create entire worlds. It could be a sci-fi world. It could be a world where cats are picking up litter. It could be a 1990s-style party with fashion models, shot using the aesthetic of a specific Kodak film stock.

And in those worlds, you’ll be able to walk around. That’s what they’re working on next.

Jacobsen: Wild. And how long has this level of AI-generated art even been public?

Rosner: That’s the thing. AI only started making amazing images—at least ones the general public had access to—maybe two and a half years ago? Maybe even less. And now we’re seeing entire explorable worlds. Stuff that would have taken an art director weeks to conceptualize and build—this technology pulls straight out of its training set in seconds.

Here’s an example: a futuristic world that looks like a European city. The description reads: “Hundreds of gigantic, strange, tall creatures walk through dystopian London streets, dressed in dirty yellow clothes. It’s foggy and gloomy.” The render is hyper-realistic and convincing. These creatures look like tree people—like Groot—but in yellow raincoats, walking among regular Londoners.

Jacobsen: That sounds wild.

Rosner: And it generated that in no more than five seconds. So I don’t know what to think.

On one hand, it’s thrilling. On the other hand, it is not very encouraging. Why be an art director if someone can type three sentences and get a fully realized, cinematic rendering of a made-up world?

Sure, there will probably be rules to protect human jobs, but primarily for union roles in places like Hollywood. If you want to make something for TikTok or Instagram? No rules. Anyone can do this.

Here’s another one: “Detailed view of a future city designed by social workers. Dinosaurs walk the streets. Everyone is sharing empathy and unconditional positive regard.”
It’s Dinotopia. The render looks like Victorian London crossed with Mexico City, populated with dinosaurs and people in unusual, stylish clothing. All of it generated in five seconds.

Jacobsen: So what are we supposed to think about this stuff?

Rosner: That’s where we are. We’re in an era where the line between replication, where you generate content similar to what already exists, and generativity, where something truly novel is created, is beginning to blur.

We know it’s derivative. It’s not pure imagination—it’s the result of a massive training dataset. But the outcomes are still visually and conceptually stunning.

But when Carole and I sit down to watch several hours of TV every night, we’re watching shows that are still the product of human imagination. The creators of those shows use their databases too—their memories, instincts, artistic training, or research.

The end product—at least in quality productions—is imaginative but informed. I don’t watch much Star Wars anymore because a lot of it feels lazy and derivative. But I started watching Andor—have you seen it?

It’s a heist movie spread across eight episodes. It’s set in the Star Wars universe, and a group of characters comes together to steal the Empire’s payroll for some sector of the galaxy. So yes, it’s a heist plot—but it looks fantastic.

The city where much of it takes place has its distinct architecture. It looks retro-futuristic. It’s well-designed. Still, it’s as derivative as anything else. They made stylistic choices, like settling on a particular kind of brick. Everything in that city is made from these big, thin bricks—maybe six by eight inches and an inch thick. There are lots of arches. These were design decisions, and they work. They look great and convincing.

But now, a database can do the same thing—make a set of aesthetic choices that also look good.

I used to look at early Star Trek or even the original Star Wars movies and think, “Wow, this looks kind of cheap.” That first Star Wars movie came out in 1977, and they didn’t have today’s tech or resources. But now? We’re going to be surrounded by great-looking content. We’re going to live in it.

Take MidJourney—it gives you hundreds of images to choose from. And through platforms like Runway or Pika, you can generate short video clips. Each one runs for about three seconds. About 10% of the videos feature beautiful women modelling, dancing in clubs, and walking through cities. And for every two of those, there’s usually one with an attractive man.

So if you want to walk around a disco with your VR headset, surrounded by supermodels, you’ll be able to do that within a few months. And many people are going to want to.

Here’s another: a video of a capybara lounging in an inner tube, floating in a swimming pool. It’s an overhead shot, and it looks fantastic. The water is perfect—the reflections, the waves—it’s compelling. The physics is not cheated at all. In a couple of months, you’ll probably be able to jump into that pool—visually, at least. You’ll be able to move down to the capybara’s level and experience that world.

Jacobsen: It’s wild.

Rosner: I don’t have anything profound to say except: people need to be aware of what’s happening with AI.

If you’re in college, high school, or even middle school, you’re probably already spending time experimenting with AI—or at least using TikTok, where AI is running constantly in the background. You’ve got some idea of what’s going on.

But if you’re my age? Maybe you don’t. And you should, because this stuff is coming. And in many ways, it’s already here.

Last updated May 3, 2025. These terms govern all In Sight Publishing content—past, present, and future—and supersede any prior notices. In Sight Publishing by Scott Douglas Jacobsen is licensed under a Creative Commons BY-NC-ND 4.0 license; © In Sight Publishing by Scott Douglas Jacobsen 2012–Present. All trademarks, performances, databases & branding are owned by their rights holders; no use without permission. Unauthorized copying, modification, framing, or public communication is prohibited. External links are not endorsed. Cookies & tracking require consent, and data processing complies with PIPEDA & GDPR; no data from children under 13 (COPPA). Content meets WCAG 2.1 AA under the Accessible Canada Act & is preserved in open archival formats with backups. Excerpts & links require full credit & hyperlink; limited quoting under fair dealing & fair use. All content is informational; no liability for errors or omissions. Feedback welcome, and verified errors corrected promptly. For permissions or DMCA notices, email: scott.jacobsen2025@gmail.com. Site use is governed by BC laws; content is “as-is,” liability limited, users indemnify us; moral, performers’, & database sui generis rights reserved.
