Levels of AI Agents: from Rules to Large Language Models
Abstract:
AI agents are defined as artificial entities that perceive the environment, make decisions, and take actions. Inspired by the six levels of autonomous driving defined by SAE, AI agents are likewise categorized by utility and strength into the following levels: L0: no AI, with tools (with perception) plus actions; L1: using rule-based AI; L2: replacing rule-based AI with IL/RL-based AI, with additional reasoning and decision making; L3: applying LLM-based AI instead of IL/RL-based AI, additionally setting up memory and reflection; L4: building on L3, facilitating autonomous learning and generalization; L5: building on L4, adding personality (emotion plus character) and collaborative behavior (multi-agents).
1 Introduction
Any entity that is able to perceive its environment and execute actions can be regarded as an agent. Agents can be categorized into five types: simple reflex agents, model-based reflex agents, goal-based agents, utility-based agents, and learning agents.
As AI has advanced, the term "agent" has come to describe entities that exhibit intelligent behavior and possess capabilities such as autonomy, reactivity, pro-activeness, and social interaction. In the 1950s, Alan Turing proposed the renowned Turing Test. It is a cornerstone of AI and aims to explore whether machines can show intelligent behavior comparable to that of human beings. These AI entities are usually called "agents" and form the basic building blocks of AI systems.
Foundation models have taken shape most strongly in NLP. On a technical level, foundation models are enabled by transfer learning and scale. The idea of transfer learning is to take the "knowledge" learned from one task and apply it to another task. Foundation models usually follow a paradigm in which a model is pre-trained on a surrogate task and then adapted to the downstream task of interest via fine-tuning.
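The pre-train-then-fine-tune paradigm can be illustrated with a deliberately tiny sketch. It uses a one-dimensional linear model rather than a foundation model, and every name, dataset, and hyperparameter below is an assumption chosen for illustration. The model is first trained on a surrogate task (y = 2x) with plenty of data, then adapted to a related downstream task (y = 2x + 1) with only three samples and five gradient steps, and compared with a model trained from scratch under the same small budget.

```python
from typing import List, Tuple

Data = List[Tuple[float, float]]

def mse(w: float, b: float, data: Data) -> float:
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(w: float, b: float, data: Data, steps: int, lr: float = 0.05) -> Tuple[float, float]:
    """Plain full-batch gradient descent on the MSE loss."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Surrogate task (assumed): plenty of data for y = 2x.
surrogate: Data = [(k / 10, 2.0 * (k / 10)) for k in range(-20, 21)]
# Downstream task (assumed): only three samples of y = 2x + 1.
downstream: Data = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]

# Pre-training: learn the slope from the surrogate task.
pre_w, pre_b = train(0.0, 0.0, surrogate, steps=200)

# Fine-tuning: start from the pre-trained weights, take a few steps.
ft_w, ft_b = train(pre_w, pre_b, downstream, steps=5)

# Baseline: same small budget, but starting from scratch.
sc_w, sc_b = train(0.0, 0.0, downstream, steps=5)

fine_tuned_loss = mse(ft_w, ft_b, downstream)
scratch_loss = mse(sc_w, sc_b, downstream)
```

In this toy setting the fine-tuned model reaches a lower downstream loss than the from-scratch baseline, because the slope learned during pre-training transfers and only the offset remains to be adapted.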
Most of the large language models (LLMs) that have appeared recently either are foundation models or are built on them. Owing to the remarkable capabilities they have exhibited, LLMs are regarded as a potential route for AI toward Artificial General Intelligence (AGI), offering hope for building general AI agents.
An AI agent mostly refers to an artificial entity that is able to perceive its surroundings using sensors, make decisions, and take actions using actuators. According to the notion of World Scope, which audits the progress of NLP across five levels from NLP to general AI, purely LLM-based agents are built only on the second level, the written Internet world.
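The perceive, decide, act cycle described above can be sketched as a minimal loop. The example below is an illustrative assumption rather than a design from this paper: the environment is a hypothetical room with one temperature value, the "sensor" reads it, a hand-written rule makes the decision, and the "actuator" changes the temperature by one degree.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Room:
    """Hypothetical environment: a room with a single temperature value."""
    temperature: float

def sense(room: Room) -> float:
    """Sensor: read the current temperature from the environment."""
    return room.temperature

def decide(temperature: float, target: float = 21.0, band: float = 0.5) -> str:
    """Decision: a fixed rule that maps the percept to an action."""
    if temperature < target - band:
        return "heat"
    if temperature > target + band:
        return "cool"
    return "idle"

def act(room: Room, action: str) -> None:
    """Actuator: apply the chosen action to the environment."""
    if action == "heat":
        room.temperature += 1.0
    elif action == "cool":
        room.temperature -= 1.0

def run_agent(room: Room, steps: int) -> List[str]:
    """Run the perceive -> decide -> act loop and record each action."""
    actions: List[str] = []
    for _ in range(steps):
        percept = sense(room)
        action = decide(percept)
        act(room, action)
        actions.append(action)
    return actions

room = Room(temperature=18.0)
history = run_agent(room, steps=5)
```

Starting from 18 degrees, the agent heats for three steps and then idles once the temperature falls within the target band. Swapping `decide` for a learned policy or an LLM call would change the decision component while leaving the surrounding loop unchanged.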
Beyond this, LLMs have demonstrated remarkable capabilities in knowledge capture, instruction interpretation, generalization, planning, and reasoning, while supporting natural language interaction with humans. Building on this, LLM-assisted agents with expanded perception and action spaces have the potential to reach the third and fourth levels of World Scope, i.e., Perception AI and Embodied AI, respectively.
Moreover, these LLM-based agents can handle more difficult tasks through collaboration or gaming, during which social phenomena can emerge, thereby reaching the fifth level of World Scope, the Social World.
Section 2 briefly reviews LLMs; Section 3 elaborates on various AI agents; Section 4 analyzes and defines the levels of AI agents; and the conclusion is given at the end.