Agents vs. applications

How are agents different from applications?

Krishna Nandakumar
October 20, 2023

When ChatGPT launched in November 2022, we were introduced to reasoning on demand. You can ask it to solve a problem and to show the steps it took to get there.

The natural evolution of a machine that can reason is for it to start taking actions. People have started to call such machines “agents”. But our applications already take actions, so what are agents, and how are they different?

I still have no idea what an AI agent is.
— Suhail (@Suhail)
8:44 PM • Oct 16, 2023

When we first started using computers, we would type commands into a terminal to ask the computer to do something. This was restrictive because you needed to know the syntax to do anything.

A computer terminal

Next, we had graphical user interfaces (GUIs) that continue to be commonplace today. You click a button and something happens. Suddenly, billions of people had the ability to use a computer and complete tasks.

Gmail’s graphical user interface

The tech for this has changed a lot: storing your data on a local server vs. in the cloud, for example. But for the end user, the method of interaction hasn’t changed much.

The complexity of what the software can do has also changed a lot. We went from publishing content to sending money across countries with the click of a button. However, we still need to be specific and explicit about what we want.

An agent, much like in real life, does things for you. You don’t need to click a button to do something; you can simply ask for it.

Like any technology, the shift will be gradual and then asymptotic. In the beginning, you will have agents that are very good at a small number of tasks. You can ask them for something, and they will choose the best course of action.

For example, you might ask your sales agent to find you more leads for your business, or your personal agent to do the grocery shopping for the week. You will likely confirm all of the actions that the agent takes before they are completed.
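To make the flow in the paragraph above concrete, here is a minimal sketch in Python of an agent that takes a request, chooses an action, and asks for confirmation before running it. Everything in it is an assumption made for illustration: the `Action` class, the `choose_action` and `handle_request` functions, the keyword matching, and the placeholder results are all hypothetical. A real agent would most likely use a language model to pick the action, not a keyword lookup.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Action:
    """A hypothetical unit of work the agent knows how to perform."""
    name: str                # identifier for the action
    description: str         # shown to the user before anything runs
    run: Callable[[], str]   # does the work and returns a short result


def choose_action(request: str, actions: Dict[str, Action]) -> Action:
    """Toy stand-in for the agent's reasoning.

    Picks the first action whose keyword appears in the request.
    This keyword lookup is an assumption, used only to keep the sketch small.
    """
    text = request.lower()
    for keyword, action in actions.items():
        if keyword in text:
            return action
    raise ValueError(f"no known action for request: {request!r}")


def handle_request(
    request: str,
    actions: Dict[str, Action],
    confirm: Callable[[Action], bool],
) -> str:
    """Choose an action for the request, then run it only if the user confirms."""
    action = choose_action(request, actions)
    if not confirm(action):
        return f"cancelled: {action.name}"
    return action.run()


# Hypothetical actions with placeholder results; nothing here calls a real service.
ACTIONS: Dict[str, Action] = {
    "leads": Action(
        name="find_leads",
        description="Search for new sales leads",
        run=lambda: "leads search started",
    ),
    "grocer": Action(
        name="order_groceries",
        description="Order this week's groceries",
        run=lambda: "groceries ordered",
    ),
}


if __name__ == "__main__":
    # The user approves the first request and declines the second.
    print(handle_request("Find me more leads for my business", ACTIONS, lambda a: True))
    print(handle_request("Do the grocery shopping for the week", ACTIONS, lambda a: False))
```

The difference from an application shows up in `handle_request`: the user states a goal in plain language and the agent decides which action fits, while the `confirm` callback keeps the user in control of what actually runs.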

Over time, agents will start to take on a broader range of tasks. They may even do tasks for you autonomously. For example, an agent might reach out to you when you have an email or message that requires urgent attention.

Given this evolution, I’ve started to classify agents using the following framework:

Conversational agents: agents that take instructions in natural language and can complete basic actions. ChatGPT is a conversational agent; it can, for example, search the internet.
Domain-specific agents: agents that are restricted to a specific domain. For example, Kili is building a domain-specific agent for customer service.
General-purpose agents: agents that can complete a wide variety of tasks for you. For example, MultiOn is building a personal agent.

To close, agents are different from applications in two ways. First, agents allow the user to do things by simply asking. This saves time because you don’t have to hop through multiple screens to get something done. Second, because natural language is the syntax, an agent can act passively and take on a broad range of tasks.