This section provides a comprehensive exploration of the core concepts and architectural foundations that power the Agents Research Environments (ARE).
Understanding these fundamentals is essential for effectively using ARE and creating your own scenarios.
ARE is a research-focused environment designed to simulate complex, realistic tasks that span several minutes and require multiple steps to complete.
Unlike static simulation frameworks, ARE provides a dynamic, ever-evolving setting where the state of the environment changes over time and new information
is continuously introduced.
Today, agent evaluation is often constrained by several limitations:
There are currently no open environments with a built-in reward signal beyond web-based environments such as GAIA or BrowserComp.
Most existing benchmarks focus on narrow, domain-specific tasks intended for expert systems rather than general agents.
Many simulations are not grounded in real user scenarios or everyday tasks that reflect actual human workflows.
There is no flexible environment for testing new agents directly on real-world applications or APIs.
Meta Agents Research Environments is designed to address these gaps by offering strong foundations and abstractions that are general enough to support a wide range of tasks and benchmarks.
By doing so, it aims to make it easier for researchers to evaluate new directions in agentic research in settings that better reflect the complexity and unpredictability
of the real world.
Finally, ARE offers a clear decoupling between Environment and Agent, which is not always explicit in recent benchmarks, by providing an ARE interface for agents.
ARE is built around four fundamental concepts that work together to create dynamic, realistic simulations. Understanding these concepts is essential for effectively
using ARE and creating your own scenarios. These four concepts are:
Environment: The environment in which agents and users interact and collaborate.
Apps: Similar to a desktop or mobile phone, the environment supports a set of apps that users rely on in their daily lives, and the agent can interact with them. You can add your own apps, and it is also possible to have only one app, or even none beyond the internal ones (e.g., for chat or assistant tasks).
Events: Central to ARE and what makes it dynamic. In ARE, everything is an event: any action taken by the user or the agent, as well as external occurrences generated by the environment itself. The execution of the environment is driven by these events for dynamics, conditional execution, and validation.
Scenarios: A scenario starts from an initial state, with events scheduled to happen at specific points (the first event is usually a user message that gives the task to the agent). Each scenario has its own validation/verification logic that defines how to evaluate the agent's trajectory with respect to the user request and the environment state.
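To make these four abstractions concrete, here is a minimal sketch of how they compose. The class and field names (Event, Scenario, initial_state, scheduled_events) are illustrative stand-ins, not the actual ARE API:

```python
# Hypothetical sketch of the core abstractions; names are illustrative only.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Event:
    time: float                       # simulated time at which the event fires
    kind: str                         # e.g. "USER", "AGENT", "ENV"
    action: Callable[[], object]      # what happens when the event executes

@dataclass
class Scenario:
    initial_state: dict               # the universe's data at t=0
    scheduled_events: list[Event] = field(default_factory=list)

# The first scheduled event is typically a user message carrying the task.
scenario = Scenario(
    initial_state={"inbox": []},
    scheduled_events=[Event(0.0, "USER", lambda: "Book me a flight")],
)
```

In this sketch, apps would populate initial_state, and validation would inspect the final state after all events have run.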
The diagram below shows a high-level view of the ARE abstractions interacting with each other:
The Environment is the central unit in ARE. It manages registered Apps, maintains the global simulation timeline, and coordinates all interactions through an
internal event loop.
The environment provides the necessary abstractions to completely decouple the agent from the environment by offering an ARE Interface to connect external
actors (Agents, Users, …). Key responsibilities of the environment include:
Instantiating and registering Apps and their exposed APIs as tools available to the agent.
Managing the flow of time in discrete increments, evaluating events and scheduling execution accordingly.
Recording all executed Events in an immutable log for later inspection and evaluation.
Exposing the current state of the simulation at any point in time.
The Environment runs its event loop on a separate thread, ensuring that event processing and time progression do not block the main agent process,
just as an agent deployed in a real-world setting would run alongside a live environment.
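The background-thread design can be sketched as follows. This is a toy illustration, not ARE's actual Environment class; the queue, log, and threading details are assumptions:

```python
# Minimal sketch of an event loop running on a background thread.
import queue
import threading
import time

class MiniEnvironment:
    def __init__(self):
        self.event_queue = queue.Queue()   # future events waiting to run
        self.event_log = []                # record of executed events
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self):
        self._thread.start()

    def _loop(self):
        # Runs off the main thread, so the agent is never blocked.
        while not self._stop.is_set():
            try:
                event = self.event_queue.get(timeout=0.05)
            except queue.Empty:
                continue
            event()                        # execute the event's action
            self.event_log.append(event)   # record it for later inspection

    def stop(self):
        self._stop.set()
        self._thread.join()

env = MiniEnvironment()
env.start()
env.event_queue.put(lambda: None)          # schedule a no-op event
time.sleep(0.3)                            # let the background loop process it
env.stop()
```

The main thread (where the agent would run) only enqueues work and reads the log; all processing happens in the loop thread.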
Meta Agents Research Environments is built around fundamental concepts that work together to create dynamic, realistic simulations.
Understanding these concepts is essential for effectively using the system and creating your own scenarios.
Agents
Agents are the AI entities that interact with the environment to complete tasks.
They serve as the intelligent actors in your simulations, capable of reasoning, planning, and taking actions through available tools.
Event-Based Agent Architecture
Agents are built around an event-based architecture that enables them to operate seamlessly within the dynamic simulation environment:
Turn-Based Execution: Agents operate in discrete turns, responding to user messages and environment notifications as they arrive
Notification-Driven: Agents wait for and react to notifications from the environment, including user messages, system events, and environment changes
Synchronous Processing: Each agent turn runs to completion before processing the next set of notifications, ensuring predictable behavior
Event Logging: All agent actions, tool calls, and observations are logged as events for analysis and replay
Message Queue Integration: Agents continuously monitor a notification system for new tasks and environmental changes
ReAct Framework
Agents operate using a ReAct (Reasoning + Acting) framework that cycles through:
Think: Agent analyzes the current situation and plans next steps
Act: Agent executes actions using available tools (app APIs)
Observe: Agent processes the results and updates its understanding
This iterative process continues until the task is completed or termination conditions are met.
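The think/act/observe cycle can be sketched as a small loop. The tool set, the fake model, and the termination convention below are all illustrative assumptions, not ARE's default agent implementation:

```python
# Sketch of a ReAct-style loop: Think -> Act -> Observe until done.
def react_loop(task, tools, llm_think, max_steps=5):
    history = [("task", task)]
    for _ in range(max_steps):
        thought, tool_name, args = llm_think(history)   # Think
        history.append(("thought", thought))
        if tool_name == "finish":                       # termination condition
            return history
        observation = tools[tool_name](*args)           # Act (call a tool)
        history.append(("observation", observation))    # Observe the result
    return history

# Toy stand-in for an LLM: adds two numbers, then finishes.
def fake_llm(history):
    if any(kind == "observation" for kind, _ in history):
        return "I have the answer", "finish", ()
    return "I should add the numbers", "add", (2, 3)

trace = react_loop("add 2 and 3", {"add": lambda a, b: a + b}, fake_llm)
```

Each observation is fed back into the history, so the next Think step reasons over everything seen so far.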
Key Capabilities
Task Execution: Understanding and completing assigned objectives
Tool Usage: Interacting with apps through their exposed APIs
Reasoning: Making decisions based on available information and context
Learning: Adapting behavior based on feedback and results
Communication: Sending messages to users and logging progress
Getting Started with Agents
For most use cases, we recommend using the default agent by specifying the --agent default
parameter when running CLI commands. This provides:
Robust ReAct loop implementation
Comprehensive logging and error handling
Flexible tool integration
Customizable system prompts and behavior
The default agent will automatically handle the event-based interactions with the environment, logging all actions and observations for later analysis.
For more details about the default agent implementation, see the ARE_react_json_agent function in ARE/agents/default_agent/agent_factory.py.
You can also learn more about customizing agents in Agents API and explore practical examples in the tutorials.
Universes
Universes represent distinct, fully-populated instantiations of the simulated environment, each reflecting a specific user’s digital world.
They provide the realistic foundation from which scenarios emerge, containing comprehensive synthetic data that mirrors authentic usage patterns.
Core Concept
Universes embody a data-first approach to scenario development:
Rich Environmental Foundation: Each universe contains realistic application content including message histories, email threads, calendar events, and contacts
Persona-Driven: Built around detailed user personas that ensure coherence across all applications and data
Scenario Foundation: Well-populated universes naturally inspire compelling use cases and challenges for agents
Reusable Context: Multiple scenarios can emerge from a single universe without extensive manual environment construction
Relationship to Scenarios
Universe: Static initial state with all environmental setup and baseline data
Scenario: Dynamic simulation that evolves over time, beginning from a universe’s state at t=0
For comprehensive details about universe generation and the data-first approach, see Universes.
Scenarios
Scenarios are complete tasks given to agents within the simulation environment. They combine universes and all the previous concepts into cohesive, evaluable challenges that unfold dynamically over time.
Scenario Components
Every scenario consists of five main components:
Apps: The applications available to the agent
Data: Initial state and content populating the apps and environment (often from a universe)
Events: Dynamic occurrences that happen during the scenario
Task: A clear prompt defining what the agent needs to accomplish
Validation Function: Logic to determine if the task was completed successfully
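The five components can be pictured as fields of a single structure. The ScenarioSketch class and its field names below are hypothetical, intended only to show how the pieces fit together, not ARE's actual scenario API:

```python
# Illustrative sketch of a scenario's five components.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ScenarioSketch:
    apps: list[str]                       # applications available to the agent
    data: dict[str, Any]                  # initial state (often from a universe)
    events: list[tuple[float, str]]       # (time, description) of dynamic events
    task: str                             # the prompt given to the agent
    validate: Callable[[dict], bool]      # success check on the final state

scenario = ScenarioSketch(
    apps=["email", "calendar"],
    data={"calendar": []},
    events=[(60.0, "user sends a follow-up message")],
    task="Schedule a meeting with Alice tomorrow at 10am",
    validate=lambda state: len(state["calendar"]) == 1,
)
```

Note that validation operates on the environment state (and, in ARE, the agent's trajectory), not just on the agent's final answer.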
Dynamic Nature
Unlike static benchmarks, scenarios are dynamic and potentially require multiple agent steps or user interactions to complete.
The environment state evolves during execution
Agents must explore to gather necessary information
Tasks may not be self-contained and require environmental interaction
Multiple complexity levels can be designed for the same scenario
Scenario Format
The Agents Research Environments supports two main scenario formats:
JSON Scenarios: Ready-to-use scenarios like the Gaia2 benchmark dataset - see Scenario JSON Format
Python Scenarios: Custom scenarios with full programming control - see Working with Scenarios
Learn more through Scenarios for comprehensive details on scenario anatomy, evaluation goals, and creation processes.
Environment
The Environment is the core system that orchestrates the entire simulation.
It acts as the central coordinator that manages all components and ensures the simulation runs smoothly.
Core Responsibilities
The environment is responsible for:
App Management: Registering apps and handling their API call events
Simulation Control: Starting, pausing, and stopping the simulation
Time Management: Managing the flow of time and events through an event loop
Event Processing: Checking the event queue and processing events at each tick
Event Logging: Adding completed events to the event log
State Management: Providing the current state of the simulation at any given step
Event Loop
The environment operates through a discrete time event simulation, essentially a while loop in which time advances by time_increment_in_seconds on each tick.
Within each tick, the event loop:
Checks Event Triggers: Determines if any event_triggers need to be fired
Processes Events: Checks the event_queue for events that need to be processed and processes them
Advances Time: Moves time forward to the next tick
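The tick loop above can be sketched as a short function. This is a simplified stand-in for ARE's event loop (no triggers, single-threaded), keeping only the discrete-time mechanics; the parameter name time_increment_in_seconds is taken from the text:

```python
# Toy discrete-time event loop: time advances by a fixed increment per tick,
# and any event whose fire time has been reached is processed that tick.
import heapq

def run_simulation(events, end_time, time_increment_in_seconds=1.0):
    """events: list of (fire_time, action) pairs; returns the event log."""
    event_queue = list(events)
    heapq.heapify(event_queue)             # future events, ordered by time
    event_log, now = [], 0.0
    while now <= end_time:
        # Process every event due at or before the current tick.
        while event_queue and event_queue[0][0] <= now:
            fire_time, action = heapq.heappop(event_queue)
            event_log.append((fire_time, action()))
        now += time_increment_in_seconds   # advance to the next tick
    return event_log

log = run_simulation(
    [(3.0, lambda: "reminder"), (1.0, lambda: "email")],
    end_time=5.0,
)
```

Because time is simulated rather than real, an event scheduled days apart in scenario time costs nothing extra in wall-clock time.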
Important
This event_loop runs in a thread separate from the main thread, which means event processing happens in the background and does not block the main thread.
For example, an Agent can be running, solving a task, and calling tools while the event_loop handles how the environment should change in parallel.
Important
The simulated environment does not run in real time; it simulates time and can compress long simulations into a short wall-clock period. This lets you run
scenarios whose events span weeks or months in a matter of minutes.
The discrete time approach ensures predictable and reproducible simulations while allowing complex interactions between agents and the dynamic environment.
Apps
Apps are interactive applications that function similarly to apps on your phone.
They provide specific functionality and expose APIs that agents can use as tools to interact with the environment.
Key Characteristics
Data Population: Each app contains relevant data for its domain
API Exposure: Apps provide APIs that agents can call as tools
Event Registration: App interactions generate events that are logged
Extensibility: Anyone can build custom apps and integrate them into the platform
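A custom app boils down to domain data plus methods exposed as tools. The sketch below assumes a hypothetical registration convention (a get_tools method returning named callables); ARE's real app base class differs, so treat this as the shape of the idea only:

```python
# Sketch of a custom app: holds its own data and exposes APIs as tools.
class TodoApp:
    def __init__(self):
        self.items = []                    # the app's domain data

    # Each public method is an API the agent can call as a tool.
    def add_item(self, text: str) -> str:
        self.items.append(text)
        return f"added: {text}"

    def list_items(self) -> list:
        return list(self.items)

    def get_tools(self) -> dict:
        """Expose the app's APIs so the environment can register them."""
        return {
            "todo.add_item": self.add_item,
            "todo.list_items": self.list_items,
        }

app = TodoApp()
tools = app.get_tools()
tools["todo.add_item"]("buy milk")         # an agent tool call mutates app state
```

In ARE, each such call would also be recorded as an event in the environment's log.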
Common App Types
The platform includes various built-in apps:
Email Client: Send, receive, and manage emails
File System: Navigate and manipulate files and directories
Calendar: Schedule and manage appointments
Messaging: Send and receive messages
Shopping: Browse products and make purchases
Learn more about apps through App Implementation Tutorial for hands-on examples, or explore Apps for comprehensive details on their stateful design and creation.
Events
Events are the dynamic elements that make environments evolve over time. They represent things that happen in the simulation and can be triggered in various ways.
Event Types
Events can be categorized by their origin:
Scheduled Events: Happen at predefined times in the simulation
Triggered Events: Fire when specific conditions are met (with optional delays)
Agent-Initiated Events: Result from agent actions through API calls
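Triggered events with optional delays can be sketched as a small state machine. The Trigger class below is illustrative (ARE's trigger API is not shown in this text); it fires once, a fixed delay after its condition starts holding:

```python
# Sketch of a condition-based trigger with an optional delay.
class Trigger:
    def __init__(self, condition, delay=0.0):
        self.condition = condition       # predicate over the environment state
        self.delay = delay               # seconds the condition must hold
        self._armed_at = None            # time the condition first held
        self.fired = False

    def check(self, state, now):
        """Returns True exactly once, `delay` seconds after condition holds."""
        if self.fired:
            return False
        if self.condition(state):
            if self._armed_at is None:
                self._armed_at = now     # arm the trigger
            if now - self._armed_at >= self.delay:
                self.fired = True
                return True
        else:
            self._armed_at = None        # condition must hold continuously
        return False

# Fire 2 simulated seconds after the inbox receives its first email.
t = Trigger(lambda s: s["emails"] >= 1, delay=2.0)
results = [t.check({"emails": 1}, now) for now in (0.0, 1.0, 2.0, 3.0)]
```

The environment would call check once per tick, turning a state condition into a scheduled event.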
Event Categories
EventType.AGENT: Events initiated by agent tool calls
EventType.ENV: Events defined in scenario scripts
EventType.USER: Events simulating user interactions
Event Management
The environment manages events through two main data structures:
Event Queue: Stores future events waiting to be processed
Event Log: Contains the history of completed events
Event Graphs
ARE supports Event Graphs - DAG (Directed Acyclic Graph) representations that enable complex scenario design with:
Event Dependencies: Chain events with timing relationships
Condition Monitoring: Check environment state and trigger responses
Validation Logic: Verify that agents complete expected actions
Here’s an example of a simple event dependency chain after 32 seconds of simulation:
For more complex scenarios, event graphs can represent sophisticated dependency chains:
This complex graph shows how multiple events can be interconnected with various dependencies, timing constraints and validation, allowing for realistic and sophisticated scenario design.
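The core property of such a graph, that an event fires only after all of its parent events have completed, can be sketched as a topological traversal. The graph encoding below is an illustrative simplification (no timing or validation logic), not ARE's event-graph format:

```python
# Illustrative event-dependency DAG: each event waits for all its parents.
from collections import deque

def run_event_graph(graph):
    """graph: {event: set(parent_events)}; returns a valid firing order."""
    pending = {e: set(parents) for e, parents in graph.items()}
    ready = deque(e for e, parents in pending.items() if not parents)
    order = []
    while ready:
        event = ready.popleft()
        order.append(event)              # "fire" the event
        for other, parents in pending.items():
            if event in parents:
                parents.discard(event)   # dependency satisfied
                if not parents and other not in order and other not in ready:
                    ready.append(other)
    return order

# user_message -> agent_reply -> validation; a timer event is independent.
order = run_event_graph({
    "user_message": set(),
    "agent_reply": {"user_message"},
    "validation": {"agent_reply"},
    "timer": set(),
})
```

Timing constraints and validation nodes layer on top of this ordering: a real event graph also records when each dependency may fire and what state checks must pass.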
Learn more about events through Events API Reference for technical API details, or explore Events for comprehensive coverage of the event-driven architecture and lifecycle.
LLM Inference
LLM Inference powers the AI agents in ARE through flexible integration with various language model providers.
Agents use LLMs for reasoning, planning, and generating responses within the simulation environment.
Role in ARE
LLMs serve as the core intelligence layer for agents, enabling them to:
Understand Tasks: Process user requests and scenario context to determine what needs to be accomplished
Reason About Actions: Analyze available tools and environment state to make informed decisions
Generate Tool Calls: Create appropriate API calls to interact with apps and modify the environment
Adapt to Changes: Respond to dynamic events and environmental updates throughout scenario execution
Flexible Provider Support
ARE integrates with multiple LLM providers through LiteLLM, supporting:
Hosted APIs: Including Llama API, Hugging Face providers, and commercial services
Local Models: Self-hosted deployments for privacy and cost control
Custom Endpoints: Integration with private or specialized model deployments
The system automatically handles provider-specific configurations and API differences, allowing you to focus on agent behavior rather than infrastructure details.
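The routing idea, one completion interface in front of interchangeable providers, can be sketched as follows. The providers here are fake in-process stand-ins and the function names are assumptions; no real API calls are made and this is not LiteLLM's or ARE's actual configuration code:

```python
# Sketch of provider routing behind a single completion interface,
# in the spirit of LiteLLM-style "provider/model" identifiers.
def completion(model: str, messages: list, providers: dict) -> str:
    # The prefix of the model string selects the backend.
    provider_name = model.split("/", 1)[0] if "/" in model else "default"
    backend = providers.get(provider_name, providers["default"])
    return backend(messages)

providers = {
    "default": lambda msgs: "hosted response",   # e.g. a hosted API
    "local":   lambda msgs: "local response",    # e.g. a self-hosted model
}

reply = completion("local/my-model", [{"role": "user", "content": "hi"}], providers)
```

Swapping providers then means changing the model string, not the agent code, which is the property the text describes.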
For detailed configuration instructions, provider setup, and CLI usage examples, see LLM Configuration Guide.
Notifications
Notifications serve as the secondary interface between agents and their environment, complementing the primary tool-based interaction model.
Similar to a mobile device notification system, this framework alerts agents to important environmental changes without requiring constant monitoring.
Notification System Architecture
The notification system operates as a selective observability mechanism that implements partial rather than complete environmental awareness:
Filtered Information Flow: Not every event generates a notification - the system filters based on relevance and configured policies
Configurable Verbosity: Three levels (LOW, MEDIUM, HIGH) control how much environmental activity becomes visible to agents
Pull-Based Interaction: Agents retrieve notifications at the beginning of each step, integrating them into their context
Priority Queue: Notifications are ordered by timestamp to ensure temporal consistency
Integration with Agent Workflow
Notifications are injected into the agent’s context at each ReAct step:
Environment Processing: Events occur and are filtered through the notification policy
Queue Management: Relevant events are added to the notification queue with timestamps
Agent Integration: At each agent step, pending notifications are retrieved and added to context
Contextual Awareness: Agents can respond to environmental changes they might otherwise miss
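The filtered, timestamp-ordered, pull-based behavior described above can be sketched with a small priority queue. The importance levels and filtering policy below are illustrative assumptions, not ARE's actual notification implementation:

```python
# Sketch of a pull-based notification queue with importance filtering.
import heapq

LEVELS = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}

class NotificationQueue:
    """Timestamp-ordered queue; the agent drains it at each step."""
    def __init__(self, min_importance="MEDIUM"):
        self.threshold = LEVELS[min_importance]
        self._heap = []                  # (timestamp, message), oldest first

    def push(self, timestamp: float, importance: str, message: str):
        # Filtered information flow: low-importance events never surface.
        if LEVELS[importance] >= self.threshold:
            heapq.heappush(self._heap, (timestamp, message))

    def pull_all(self) -> list:
        """Drain pending notifications in temporal order (pull-based)."""
        out = []
        while self._heap:
            out.append(heapq.heappop(self._heap)[1])
        return out

q = NotificationQueue(min_importance="MEDIUM")
q.push(2.0, "HIGH", "new email arrived")
q.push(1.0, "LOW", "background sync finished")   # filtered out by policy
q.push(0.5, "MEDIUM", "calendar event started")
```

At the start of a ReAct step, the agent would call pull_all and prepend the results to its context.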
This system enables agents to maintain awareness of dynamic environmental changes while focusing on their primary tasks, creating more realistic and responsive agent behavior.
For detailed information about the notification system architecture and implementation, see Notifications.
Understanding these core concepts provides the foundation for effectively using the platform,
whether you’re running existing scenarios, creating benchmarks, or
developing your own custom content.
To fully understand the framework, it’s essential to grasp its core concepts.
The following subsections provide a comprehensive explanation of the foundations of the framework.
We highly encourage you to read through them, starting with Apps.
Once you have a solid understanding of the core concepts, you can move on to the next section, which covers the practical aspects of using the Meta Agents Research Environments.