-
Reviewing the development of human-vehicle interaction
-
In 1885, German mechanical engineer Karl Friedrich Benz converted a bicycle prototype, replacing the single rear wheel with two and mounting an Otto engine in the middle, thus giving birth to the world's first gasoline-powered car. Driving this difficult-to-control three-wheeler, Benz often crashed into walls amid the laughter of passersby. At this time, the entire body of the car was exposed, with virtually no interior to speak of.
The Benz Patent-Motorwagen Number 3 of 1888, Wikipedia
In 1908, Ford introduced the Model T in Detroit. As assembly-line mass production scaled up, its price eventually fell to as little as $260, and a Ford worker of the time could buy one with three months' wages. From then on, the car was no longer a luxury reserved for the aristocracy; it became a mass-produced commodity and everyday means of transport, gradually replacing horse-drawn carriages around the world. People interacted with these cars through mechanical gauges and knobs.
1912 Ford Model T Torpedo, Hymanltd.com
As on-board communication, audio, air conditioning, and other systems gradually improved, the need for human-vehicle interaction also grew. Eventually, the car's center console was filled with all kinds of physical buttons.
In 1986, Buick introduced the world's first in-car touchscreen, a computer-controlled display that managed various functions in the car, which was an unprecedented interactive experience at the time. Buick's astonishing foresight had a profound impact on later car designs.
1986 Buick Touch Screen Dash, engineering.com
More than twenty years later, Apple redefined human-computer interaction with simple and intuitive graphical interfaces, vivid and smooth animations, and convenient touch experiences. The habit of operating hardware such as phones and computers through graphical user interfaces (GUIs) spread from consumer electronics to car controls as well.
In 2016, electric cars represented by the Tesla Model S integrated the cockpit instruments and controls, displaying vehicle information and interactive functions on a 17-inch central touchscreen and handling most operations through taps and swipes.
Tesla Model S touchscreen control panel, Wikipedia
Since then, large vertical screens, wide horizontal screens, rotating screens, multi-linked screens... new cars launched on the market have been waging an "arms race", with ever larger and more numerous touchscreens as a key selling point.
In order to completely replace physical buttons, the Mercedes-Benz EQS, launched in 2021, uses a single 1.41-meter-wide Hyperscreen that stretches from the driver's side to the front passenger's side; even the rear audio and air conditioning are controlled through screens, all surrounded by dazzling RGB ambient lights.
EQS SUV, mercedes-benz.com
As the functionality and complexity of electronic products grow, GUIs become more and more complicated. The safety hazards of operating touchscreens while driving, eyes off the road, have also drawn criticism.
-
The emergence of ChatGPT
-
In 2023, artificial intelligence applications, led by OpenAI, entered the public eye, bringing new possibilities for human-machine interaction.
In the same year, Mercedes-Benz announced that its MBUX intelligent system would be integrated with ChatGPT. Starting from June 16th, over 900,000 American customers could participate in this beta program, the first large-scale application of ChatGPT in the automotive space.
MBUX Beta Program
The rise of ChatGPT marks significant progress for AI in areas such as speech recognition and natural language processing, giving rise to conversational user interfaces (CUIs) based on large language models.
These interfaces adopt a more intuitive conversational form, allowing users to interact with the system through voice or text to issue commands. AI can also learn from context and memory to provide a more personalized and intelligent user experience.
NIO NomiGPT, Instagram @nioglobal
Today, an increasing number of electric vehicle manufacturers are integrating large language models into their voice systems. NIO unveiled its in-car artificial intelligence, NomiGPT, on April 12, 2024. It can answer car owners' questions, chat with them in real time, perceive the situation inside and outside the car, and support all manner of travel-related functions.
Looking back at the development of human-vehicle interaction, from physical buttons to touchscreens and from touch input to voice recognition, even as more advanced interaction methods emerge, current in-car intelligent assistants still primarily execute tasks assigned by car owners, providing functional value such as navigation, communication, entertainment, and settings.
Technological advancement should not be merely about accumulation and repetition. Guided by the wave of artificial intelligence technology, large language models will undoubtedly bring about more natural and convenient interaction methods, leading to revolutionary changes in the driving experience.
-
AI Agents that understand how to use tools
-
In his blog post titled "AI is about to completely change how you use computers" on GatesNotes, Bill Gates wrote that within the next five years, people won't need to use different applications for different tasks; they'll simply tell their devices what they want to do in everyday language. Based on how much information users choose to share, the software will have a rich understanding of their lives and be able to give personalized responses. In the near future, anyone online will be able to have a personal assistant driven by artificial intelligence, whose capabilities go far beyond today's technology.
This type of software, which can respond to natural language and complete many different tasks based on its understanding of the user, is called an Agent. Gates says in the post that he has been thinking about Agents for nearly 30 years, and that thanks to advances in AI, the idea has only recently become practical.
AI is about to completely change how you use computers, GatesNotes
With a large language model (LLM) as its brain, an autonomous Agent is built from planning, memory, and tool use. Agents can call external tools to extend the model's capabilities and accomplish tasks that language alone cannot achieve. From physical buttons to touchscreens, from GUIs to CUIs, in a world where attention is dominated by all kinds of electronic devices, people long to be liberated, and Agents are one of the hottest candidate solutions right now.
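To make the planning, memory, and tool-use loop concrete, here is a minimal, hypothetical sketch in Python. The `call_llm` placeholder, the tool registry, and the JSON reply format are illustrative assumptions, not any particular framework's API:

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an HTTP request to a model API)."""
    raise NotImplementedError

# Tool registry: external capabilities the agent may invoke.
TOOLS = {
    "navigate": lambda destination: f"Route planned to {destination}",
    "set_temperature": lambda degrees: f"Cabin set to {degrees} degrees C",
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory = []  # past tool calls and observations the agent can reason over
    for _ in range(max_steps):
        # Planning: the LLM decides the next step from the goal and memory.
        prompt = (
            f"Goal: {goal}\n"
            f"Memory: {json.dumps(memory)}\n"
            'Reply with JSON: {"tool": name, "args": {...}} or {"answer": text}'
        )
        decision = json.loads(call_llm(prompt))
        if "answer" in decision:          # the task is finished
            return decision["answer"]
        # Tool use: execute the chosen tool and store the result in memory.
        observation = TOOLS[decision["tool"]](**decision["args"])
        memory.append({"tool": decision["tool"], "result": observation})
    return "Stopped after reaching the step limit"
```

In an in-car setting, the same loop would simply expose vehicle functions such as navigation, climate, and media as tools, while the LLM handles the conversation with the driver.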
Developing AI and AGI has always been a great dream of the computing industry. Besides learning ability, the ability to use tools is another important distinguishing feature of humans compared to other animals. The birth of AI Agents means that we are one step closer to that dream.
-
One or many?
-
The company behind Pokémon Go, Niantic Labs, and London-based design studio Liquid City recently collaborated to release a video exploring artificial intelligence agents and XR called "Agents". The five-minute short film not only demonstrates how the concept of Agents could fit into the lives of ordinary people but also compares the potential differences between having multiple decentralized, personalized Agents and a single super-Agent.
Agents, Liquid City
In the short film, the protagonist Maya has various Agents that exist for different purposes. Nav provides her with travel advice and guidance; Vibes is a personalized music and IoT lighting assistant that makes choices based on her mood; Well-being focuses on Maya's long-term goals and helps her stay on track; and there are further Agents for handling payments, caring for plants, personal training, recipe inspiration, and more. Maya interacts with her Agents in the same way she interacts with friends or colleagues. In contrast, her date has only one Agent, named One, which can perform all of the tasks mentioned above.
This prompts us to consider whether each device or application should have its own "soul", or whether a single "soul" that communicates across different applications and devices will be more suitable in the coming AI era.
AGI Agent, PIX Moving
In response, PIX is currently developing a modular AGI Agent that can connect to and communicate with cars or other hardware through magnetic attachment. PIX believes that when general artificial intelligence becomes widespread, compatibility with other hardware will become an important product element for Agents. Modular design will be the prerequisite for AGI to be scalable, compatible, configurable, and shareable, with Robo-EV being one such extension.
Anything with the ability to move, whether a human, a car, or a robotic dog, can endow an AGI Agent with the ability to act. By continuously training, learning, and collecting data across different scenarios, such an Agent will not be limited to a single hardware device; it will be able to switch between devices and offer users a more comprehensive experience.
Not only will AGI Agents change the interaction between humans and cars, but they will also reshape the interaction between humans and all future intelligent hardware. As Gates said, "They're also going to upend the software industry, bringing about the biggest revolution in computing since we went from typing commands to tapping on icons."
As Agents become ubiquitous in human society, what changes do you think they will bring to life? Please share your thoughts with us in the comments section.