Levels of AI Agents: from Rules to Large Language Models

Abstract:

AI agents are defined as artificial entities that perceive the environment, make decisions, and take actions. Inspired by the six levels of autonomous driving defined by SAE, AI agents are likewise categorized by utility and capability into the following levels: L0, no AI, with tools (perception plus actions); L1, using rule-based AI; L2, replacing rule-based AI with IL/RL-based AI (imitation learning or reinforcement learning) and adding reasoning and decision making; L3, applying LLM-based AI instead of IL/RL-based AI and additionally setting up memory and reflection; L4, building on L3 and facilitating autonomous learning and generalization; L5, building on L4 and adding personality (emotion plus character) and collaborative behavior (multi-agents).
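The level definitions in the abstract can be read as a small lookup table. The following is a minimal Python sketch of that reading, not the paper's own formalization: the names `AgentLevel`, `LEVELS`, and `capabilities` are hypothetical, and the choice to accumulate capabilities from L0 upward is an assumption (the abstract states it explicitly only for L4 and L5).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AgentLevel:
    level: int
    core: str                # kind of AI that drives the agent at this level
    added: Tuple[str, ...]   # capabilities introduced at this level


# Entries follow the abstract; the field names are illustrative only.
LEVELS = (
    AgentLevel(0, "no AI", ("tools (perception + actions)",)),
    AgentLevel(1, "rule-based AI", ()),
    AgentLevel(2, "IL/RL-based AI", ("reasoning and decision making",)),
    AgentLevel(3, "LLM-based AI", ("memory", "reflection")),
    AgentLevel(4, "LLM-based AI", ("autonomous learning", "generalization")),
    AgentLevel(5, "LLM-based AI", ("personality (emotion + character)",
                                   "collaborative behavior (multi-agents)")),
)


def capabilities(level: int) -> List[str]:
    """Capabilities accumulated from L0 up to and including `level`.

    Assumption: each level keeps what the lower levels introduced.
    """
    if not 0 <= level < len(LEVELS):
        raise ValueError(f"level must be between 0 and {len(LEVELS) - 1}")
    return [cap for entry in LEVELS[: level + 1] for cap in entry.added]


if __name__ == "__main__":
    for entry in LEVELS:
        print(f"L{entry.level} ({entry.core}): {capabilities(entry.level)}")
```

Under this reading, `capabilities(3)` lists tools, reasoning and decision making, memory, and reflection, while the `core` field records which kind of AI replaces the previous one.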

One Introduction

Any entity that can perceive its environment and execute actions can be regarded as an agent. Agents can be categorized into five types: Simple Reflex agents, Model-based Reflex agents, Goal-based agents, Utility-based agents, and Learning agents.
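As an illustration of the simplest of these types, the sketch below shows a Simple Reflex agent that maps the current percept directly to an action through a fixed rule. It is a toy example under stated assumptions: the thermostat scenario, the `room` dictionary, and the names `SimpleReflexAgent`, `read_temperature`, `thermostat_rule`, and `apply_action` are all hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List


class SimpleReflexAgent:
    """Perceives, applies one condition-action rule, then acts."""

    def __init__(
        self,
        sensor: Callable[[], float],
        rule: Callable[[float], str],
        actuator: Callable[[str], None],
    ) -> None:
        self.sensor = sensor
        self.rule = rule
        self.actuator = actuator

    def step(self) -> str:
        percept = self.sensor()      # perceive the environment
        action = self.rule(percept)  # decide using the current percept only
        self.actuator(action)        # act on the environment
        return action


# Hypothetical toy environment: a room with a single temperature value.
room: Dict[str, float] = {"temperature": 18.0}


def read_temperature() -> float:
    return room["temperature"]


def thermostat_rule(temperature: float) -> str:
    return "heat" if temperature < 20.0 else "idle"


def apply_action(action: str) -> None:
    if action == "heat":
        room["temperature"] += 1.0


agent = SimpleReflexAgent(read_temperature, thermostat_rule, apply_action)
actions: List[str] = [agent.step() for _ in range(3)]
print(actions, room["temperature"])
```

The agent keeps no internal state and has no goal representation. The other four types would add a world model, explicit goals, a utility function, or a learning component on top of this loop.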

As AI advanced, the term "agent" came to describe entities that exhibit intelligent behavior and possess capabilities such as autonomy, reactivity, pro-activeness, and social interaction. In the 1950s, Alan Turing proposed the renowned Turing Test, a cornerstone of AI that explores whether machines can show intelligent behavior comparable to that of human beings. Such AI entities are usually called "agents" and form the basic building blocks of AI systems.

Foundation models have taken shape most strongly in NLP. On a technical level, they are enabled by transfer learning and scale. The idea of transfer learning is to take the "knowledge" learned from one task and apply it to another. Foundation models usually follow a paradigm in which a model is pre-trained on a surrogate task and then adapted to the downstream task of interest via fine-tuning.
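The pre-train-then-fine-tune paradigm can be illustrated with a deliberately tiny example. The sketch below is an assumption-laden toy, not how foundation models are actually trained: the "model" is a single weight `w` in `y = w * x`, the surrogate and downstream datasets are synthetic, and the names `train`, `mse`, `surrogate`, and `downstream` are hypothetical. It only shows why starting from pre-trained parameters helps when the downstream task has little data and a small training budget.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]


def mse(w: float, data: Dataset) -> float:
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train(w: float, data: Dataset, lr: float, steps: int) -> float:
    """Plain gradient descent on the mean squared error, starting from w."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


# Surrogate task: plenty of data generated by y = 2.0 * x.
surrogate: Dataset = [(i / 10, 2.0 * i / 10) for i in range(1, 11)]
# Downstream task: a related target (y = 2.2 * x) with only three samples.
downstream: Dataset = [(0.2, 0.44), (0.5, 1.10), (0.9, 1.98)]

# Pre-train on the surrogate task with a long training budget.
w_pretrained = train(0.0, surrogate, lr=0.5, steps=200)
# Fine-tune on the downstream task with a short budget.
w_finetuned = train(w_pretrained, downstream, lr=0.5, steps=5)
# Baseline: the same short budget, starting from scratch.
w_scratch = train(0.0, downstream, lr=0.5, steps=5)

print("fine-tuned loss:  ", mse(w_finetuned, downstream))
print("from-scratch loss:", mse(w_scratch, downstream))
```

With the same five downstream steps, the fine-tuned weight ends closer to the downstream target than the weight trained from scratch, because pre-training already moved it near a related solution.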

Most recently released large language models (LLMs) are foundation models or are built on them. Because of the remarkable capabilities they have exhibited, LLMs are regarded as a potential breakthrough toward Artificial General Intelligence (AGI), offering hope for building general AI agents.

An AI agent mostly refers to an artificial entity that can perceive its surroundings using sensors, make decisions, and take actions using actuators. According to the notion of World Scope, which audits the progress of NLP across five levels from NLP to general AI, purely LLM-based agents are built only on the second level, the written Internet world.

Beyond this, LLMs have demonstrated remarkable capabilities in knowledge capture, instruction interpretation, generalization, planning, and reasoning, while supporting natural language interaction with humans. Building on this, LLM-assisted agents with an expanded perception space and action space have the potential to reach the third and fourth levels of World Scope, i.e., Perception AI and Embodied AI, respectively.

Moreover, these LLM-based agents can handle more difficult tasks through collaboration or gaming, and social phenomena can emerge among them, realizing the fifth level of World Scope, the Social World.

Section two briefly reviews LLMs; section three elaborates on various AI agents; section four analyzes and defines the levels of AI agents; and the conclusion is given at the end.

Two LLMs

Three AI Agents

Four Levels of AI Agents

Four point one Tools (Perception plus Action)

Four point two Reasoning and Decision making

Four point three Memory plus Reflection

Four point four Generalization and Autonomous Learning

Four point five Personality (Emotion plus Character) and Collaborative Behavior (Multi-agents)

Four point six Hierarchical Levels of AI Agents

Five Conclusion

Overview

The document investigates AI agents' progression from simple rule-based systems to sophisticated LLMs, illustrating how these technological advancements impact their capacity for learning, reasoning, and interacting with the environment.

Key Points

1. AI agents are classified into six levels (L0 to L5) based on their capabilities
2. LLMs represent a significant advancement in AI, showing potential for general intelligence
3. The document discusses the importance of human feedback in aligning LLM behavior with human values
4. It explores the collaboration and competition among LLM-based multi-agent systems
5. The role of reasoning and decision-making in enhancing agent performance is emphasized

Details

Authors: Yu Huang
Category: Technology and Engineering
