Ask A Genius 1332: ChatGPT, AGI, and the Future of Multimodal AI
Author(s): Rick Rosner and Scott Douglas Jacobsen
Publication (Outlet/Website): Ask A Genius
Publication Date (yyyy/mm/dd): 2025/04/05
Rick Rosner: You had ChatGPT summarize my life using publicly available sources, which it processed into a coherent narrative. There were several minor errors—for example, it claimed I spent ten years in high school. That is inaccurate. I returned to high school a few times over the course of a decade, but I did not attend for ten continuous years.
Despite these inaccuracies, the summary presented a clear and compelling story. It identified key themes in my life, which is notable. I have lived according to certain recurring ideas, and it successfully recognized and organized them. It was impressive. Each claim included a source citation, which added credibility. That said, the quality may reflect strong summarization of sources more than independent insight.
Scott Douglas Jacobsen: We’re at a turning point with AI, particularly in language models. Because these tools work with language—the medium we rely on most—they feel especially impactful.
Artificial General Intelligence (AGI), defined as systems that act in the world rather than only process text, will be fundamentally different. Still, scaling improvements are significant. For example, the jump from GPT-3.5 to GPT-4o might represent a tenfold improvement. Integration across systems is another major step.
Rosner: You asked about multimodality.
Jacobsen: Technically, modality refers to sensory data, but AI developers often use the term differently. If five distinct systems can be integrated so that each fully interoperates with the others, the increase in capability could match or exceed the jump from GPT-3.5 to GPT-4o.
Even small updates—such as shifting the model to a more powerful server—can noticeably improve performance. Integration across varied modalities, and increasing processing power to unify them, is likely the next major leap. At that point, we may begin to approach truly general AI, or something that functions as such through AGIs.
That is, if we use humans as the benchmark, which is typically the default, we take the human brain and body as the standard: movement, language generation, and the integration of both to make plans, act in the world, and communicate.
So, yes, we have the five basic senses. To replicate that in AI might require a similar scaling progression, comparable to the step from GPT-3.5 to GPT-4o. But I would argue motor processing is probably far less computationally intensive for machines than it is for us, even though it has been a challenging problem to solve. I think language is far more processing-intensive, and we are already making significant progress there.
Rosner: You mentioned five senses, but there are many others that could be developed artificially. The five human senses just happen to be well-adapted to our evolutionary needs. But machines can have others. For instance, some forms of artificial intelligence have been designed with magnetic field detection, and that’s just one example.
Jacobsen: Then there are derivative senses—like proprioception—which are combinations or extensions of the basic ones. I think it’s very possible to build on those. But integrating them would just require another order of magnitude in processing power—a 10x scale-up, as with earlier transitions.
Machines will likely be able to take in broader bands of the light spectrum, for example—far beyond what we can naturally perceive.
Rosner: Alright. Let’s move on.
