- Krishna's Newsletter
Software that runs when you sleep
The best software is the kind you never see
The best software is the kind you never see. It runs quietly in the background and interrupts you only when it needs you. Many of the products you use today will move to this paradigm.
Deterministic user interfaces
It’s Friday night and you’re meeting friends. You need to find a restaurant and you’re dreading it. OpenTable is your go-to: open the app, choose your area, then your cuisines. Every time you find a restaurant you like, you review the menu and ratings on Google. It takes you half an hour. Every single time.
Most software today works this way. User interfaces are deterministic. Take a look at the OpenTable website below: the filters, buttons and other elements are pre-defined. The diner has to make a decision based on the inputs available on the interface. If you’re looking for South Indian food and the filter doesn’t exist, tough luck.
Your OpenTable search isn’t going well. You decide to pull the plug and call a friend instead. The conversation is natural. Your friend gives you a couple of suggestions. Within 5 minutes, you have 4 options to choose from. One of them meets your criteria: a lively place with at least a few vegan options.
Why was that so easy? Two reasons: trust and conversation. You trust your friend’s recommendation over anything on the internet. Conversation makes the back and forth easier. It’s natural and free-flowing.
Conversation as input
Conversation will become an input for every product.
To start with, software will do more of the heavy lifting. Instead of you filtering through information and clicking buttons, the product will do that for you.
Large language models (LLMs) make conversation possible with every product. They allow products to mix natural language with a visual interface. It feels easier because that’s how we converse with each other.
As soon as conversation becomes the medium of input, products can start listening in the background. In our restaurant reservation example, your product could listen for events on your calendar.
It does all the things you would do: checks whether the restaurant has a vegan-friendly menu, triangulates reviews across a few different sources, and checks how convenient the location is for all your friends.
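The checks above can be sketched as a small background handler. This is a minimal, hypothetical sketch: the data model, thresholds, and event shape are all assumptions for illustration, not a real product’s API.

```python
from dataclasses import dataclass

# Hypothetical data model: in a real product these fields would come
# from calendar, menu, and review integrations.
@dataclass
class Restaurant:
    name: str
    vegan_options: int
    ratings: dict            # review source -> rating out of 5
    travel_minutes: dict     # friend -> minutes to get there

def meets_criteria(r: Restaurant, min_vegan=3, min_rating=4.0, max_travel=30):
    """Apply the same checks a diner would do by hand."""
    # 1. Vegan-friendly menu?
    if r.vegan_options < min_vegan:
        return False
    # 2. Triangulate reviews across sources by averaging them.
    avg = sum(r.ratings.values()) / len(r.ratings)
    if avg < min_rating:
        return False
    # 3. Convenient location for everyone?
    return all(t <= max_travel for t in r.travel_minutes.values())

def on_calendar_event(event, candidates):
    """Background handler: fires when a dinner event appears on the calendar."""
    if "dinner" not in event["title"].lower():
        return []
    return [r.name for r in candidates if meets_criteria(r)]
```

The point of the sketch is the shape, not the rules: the product reacts to a signal (a calendar event) and runs the tedious checks without being asked.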
Two developments make this possible:
LLMs are excellent at reasoning. I cannot stress this enough. Give them a task and they can think through how this task can be done. With this ability, you don’t need to program “specifics” in products. You have a base set of tasks that a product should do, and LLMs are able to figure out which one to do automatically. You say: “I need a dinner reservation for 10PM” and the product knows exactly what to do.
This unlocks the ability to make decisions on the fly — something we’ve never had before. Sticking with our OpenTable example, here’s a demonstration of reasoning using ChatGPT:
The second breakthrough is the ability to process images and audio. Instead of having to type into an interface, products can simply “listen” or “see”. In the example below, ChatGPT identifies the exact steps needed to complete the task.
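The first development, reasoning over a base set of tasks, can be sketched in a few lines. Here `choose_task` is a keyword stand-in for the model: in a real product you would send the task list and the user’s request to an LLM and let it reason about which task applies. All names here are hypothetical.

```python
# The product exposes a base set of tasks; the model picks one
# from a plain-language request.
TASKS = {
    "book_table": "Reserve a table at a restaurant",
    "order_groceries": "Place a grocery order",
    "set_reminder": "Create a reminder",
}

def choose_task(request: str) -> str:
    """Stand-in for the LLM call: a real implementation would pass
    TASKS and the request to a model and return its choice."""
    text = request.lower()
    if "reservation" in text or "table" in text:
        return "book_table"
    if "grocer" in text:
        return "order_groceries"
    return "set_reminder"

def handle(request: str) -> str:
    """Dispatch the request to the task the model selected."""
    task = choose_task(request)
    return f"running {task}: {TASKS[task]}"
```

So “I need a dinner reservation for 10PM” routes to `book_table` without any reservation-specific code in the dispatch layer; the model supplies the mapping.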
The new device
There is an obvious, very large opportunity to build a new machine that works 24/7 in the background and takes tasks off your plate. For this to happen, you need to tap into every signal you can get. The obvious signals live in the apps you already use: email, calendar, and so on. The unexplored signals, likely the most valuable, are the ones that are never recorded at all. For example, when you’re talking to someone at home about ordering groceries for the week.
Companies are going after this opportunity in a variety of ways. Humane is creating an AI pin that you wear. Meta has partnered with Ray-Ban to create smart glasses. Rewind is making a pendant that users can wear.
There’s a serious concern around privacy and security for any device that listens. I’m less concerned about the user’s privacy and more concerned about the people the user is with. If I’m wearing a device that listens constantly, you may not be okay with it. And if I need to ask for permission each time, it might make things a little awkward: “Hey, do you mind if my sunglasses listen in?”. More awkward than recording a Zoom, for sure.
Phone manufacturers will have a big advantage. They have a distribution network and a brand (important given the privacy and security concerns). On the other hand, phones are designed to be held in your hand, not worn like a pair of glasses or a pin, which is what the device needs in order to “see”. If they can innovate quickly, they definitely have the edge.
Fortunately, that’s not how life works. The next Steve Jobs is sitting somewhere in the world, working away on a device that will define the next digital paradigm, just like the iPhone did.
An exciting future
We’re approaching a world where software does more of the work for us. Today, interaction between humans and computers is mostly active. We’ll move to a world where it is passive. LLMs make this possible by allowing us to converse with products in natural language, and by processing audio-visual inputs.
As this future approaches, we’ll need to think carefully about protecting the security and privacy of users. The amount of productivity this can unlock is immense, and I’m excited for it. We’re approaching a world where software runs when you sleep.