In November 2022, ChatGPT burst onto the scene and rapidly recalibrated expectations about the facility of computers with natural language inputs and outputs. AI and, more specifically, GenAI have dominated the collective conversation ever since. Yet, for all this talk, if anyone has a functional definition of AI or GenAI, we would be most obliged if they would share it.

While the concepts are slippery, their impact is comprehensible. The digital domain appears to have reached another inflection point: even if the tech itself does not quite live up to the extreme hype, the expectation shift seems material and enduring. The following is meant to help orient those who are still trying to establish some baseline understanding.

The popular conception of AI tends to be ‘things a machine can’t do yet.’ It is a job-killing robot until it is just a dishwasher or ATM. It was The Brain’s Last Stand until we acclimated to the reality that computers could dominate humans at chess. Defining artificial intelligence as “the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings” may be accurate, but it is certainly not precise because the tasks commonly associated with human intelligence shift in concert with the ability of computers to perform those tasks.* AI is, by this definition, forever just over the horizon.

What is currently being broadly termed “Generative AI” suffers from a similar elusiveness. GenAI, or “GAI”, is sometimes used to describe tech that generates new content (whether it be words or images). That usage, however, can exclude other important applications like classifying data. Alternatively, GenAI is sometimes used as an umbrella term for the models enabling the tech, or for the tech incorporating such models—large language models (“LLMs”), large multimodal models (“LMMs”), diffusion models (often used for video and images)—rather than the type of output. Additional descriptors like “foundation model” and “general purpose AI” are also in relatively wide use but do not, so far, really solve the definitional challenges.

While this tech has been decades in development, our focus herein is on recent advances in machine capacity to transmute various forms of input (text, voice, pictures) into various forms of output (text, voice, pictures, video, charts, functional code, structured data) to a degree that has most casual observers responding, “Wait, computers can do what now?” This standard remains fluid, and it is shifting faster than ever before.

While many brilliant minds are writing on this topic, we recommend following Ethan Mollick as the most digestible popularizer for the laity (like us).
