This guide will help you run your first Meta Agents Research Environments (ARE) scenario in just a few minutes. We’ll walk through the basic steps to get you up and running quickly.
Scenarios are designed to simulate real-world tasks that an agent might encounter.
A scenario is more than just a task description. It’s a complete simulation setup that includes:
Initial Environment State: How the world looks when the scenario starts
Available Applications: Which tools the agent can use
Dynamic Events: Things that happen during the scenario execution
Task Definition: What the agent needs to accomplish
Validation Logic: How success is measured
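To make the five components above concrete, here is a minimal sketch of how they could be grouped in code. `ToyScenario` and all of its field names are hypothetical and are not the library's actual scenario classes. The file listing is made up for illustration; the task text and the `llama.jpg` answer are borrowed from the example output later in this guide.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyScenario:
    """Hypothetical container for illustration; NOT the library's scenario class."""

    initial_state: dict                # how the world looks when the scenario starts
    apps: list                         # which tools the agent can use
    task: str                          # what the agent needs to accomplish
    validate: Callable[[str], bool]    # how success is measured
    events: list = field(default_factory=list)  # things that happen during execution


find_image = ToyScenario(
    initial_state={"files": ["llama.jpg", "notes.txt"]},  # made-up file listing
    apps=["SandboxLocalFileSystem", "AgentUserInterface"],
    task="I need to find the image file in the current directory",
    validate=lambda answer: "llama.jpg" in answer,
)

# A final agent message that names the image passes this toy validation.
print(find_image.validate("The image file in the current directory is llama.jpg"))
```

Real validation is likely richer than a substring check; the point is only that a scenario bundles state, tools, events, a task, and a success criterion.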
By running a scenario, you’ll see how the agent interacts with the environment, makes decisions, and completes the task.
Scenarios are the foundation of the agent benchmarking process, so understanding them is essential for effective testing.
An API key for your chosen model provider (optional for basic testing)
Note
We recommend using uvx to run the Agents Research Environments commands without installing the package locally. If you want to dig deeper into the library or develop custom scenarios, you can install it locally (see Installation).
-s scenario_find_image_file: Specifies the scenario to run
-a default: Uses the Meta OSS agent
--provider mock: Uses the mock model provider, which makes no API calls and returns fake inference results. Your scenario will run, but the task will fail.
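The flags above can be combined into a single invocation. The sketch below assembles that command line from pieces that appear in this guide: the `uvx --from meta-agents-research-environments` prefix, the `are-run` entry point, and the three flags. How they are combined is an assumption, and `build_run_command` is a hypothetical helper, not part of the library. It only prints the command; nothing is executed.

```python
import shlex

# Prefix quoted from this guide; used when the package is not installed locally.
UVX_PREFIX = ["uvx", "--from", "meta-agents-research-environments"]


def build_run_command(scenario, agent="default", provider="mock", use_uvx=True):
    """Return the argument list for one scenario run (hypothetical helper)."""
    cmd = ["are-run", "-s", scenario, "-a", agent, "--provider", provider]
    return UVX_PREFIX + cmd if use_uvx else cmd


command = build_run_command("scenario_find_image_file")
print(shlex.join(command))
# To actually launch it, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell-quoting problems if a scenario name or model name ever contains spaces.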
Hint
In the following command example, we omit uvx --from meta-agents-research-environments to make it easier to read.
If you do not want to run through the installation guide, keep using uvx --from meta-agents-research-environments in your commands.
Initialization: The environment and apps are set up
Agent Actions: The agent’s reasoning and tool calls
Environment Updates: How the environment responds to actions
Results: Whether the scenario was completed successfully
Example output:
======== New task for base_agent ========
Received at: 1970-01-01 00:00:00
Sender: User
Message: I need to find the image file in the current directory
Starting iteration 0...
===== Output message of the LLM: =====
Thought: To find the image file in the current directory, I need to list all the files in the current directory and then filter ...
Action:
{
  "action": "SandboxLocalFileSystem__ls",
  "action_input": {
    "path": ".",
    "detail": true
  }
}
Calling tool: 'SandboxLocalFileSystem__ls' with arguments: {'path': '.', 'detail': True}
Starting iteration 1...
===== Output message of the LLM: =====
Thought: The output of the SandboxLocalFileSystem__ls tool shows a list of files in the current directory. I need to filter ...
Action:
{
  "action": "AgentUserInterface__send_message_to_user",
  "action_input": {
    "content": "The image file in the current directory is llama.jpg"
  }
}
Calling tool: 'AgentUserInterface__send_message_to_user' with arguments: {'content': 'The image file in the current directory is llama.jpg'}
Terminated turn 1 over 1
Max iterations reached - Stopping Agent: after 1 turns
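Each iteration in the output above pairs a Thought with a JSON Action. If you want to inspect a run programmatically, a minimal sketch like the one below pulls the Action objects out of a saved log. It assumes the log keeps the `Action:` marker followed by a JSON object, as in this example, and that tool names follow an `App__tool` pattern, which is inferred from the two names shown rather than documented here. `extract_actions` is a hypothetical helper.

```python
import json

# Shortened copy of the example output above.
LOG_EXCERPT = """\
Thought: To find the image file in the current directory, I need to list all the files ...
Action:
{
"action": "SandboxLocalFileSystem__ls",
"action_input": {
"path": ".",
"detail": true
}
}
Calling tool: 'SandboxLocalFileSystem__ls' with arguments: {'path': '.', 'detail': True}
Thought: The output of the SandboxLocalFileSystem__ls tool shows a list of files ...
Action:
{
"action": "AgentUserInterface__send_message_to_user",
"action_input": {
"content": "The image file in the current directory is llama.jpg"
}
}
"""


def extract_actions(log_text):
    """Return every JSON object that follows an 'Action:' marker."""
    decoder = json.JSONDecoder()
    actions = []
    position = 0
    while True:
        start = log_text.find("Action:", position)
        if start == -1:
            break
        brace = log_text.find("{", start)
        if brace == -1:
            break
        # raw_decode parses one JSON value and reports where it ended.
        obj, position = decoder.raw_decode(log_text, brace)
        actions.append(obj)
    return actions


for action in extract_actions(LOG_EXCERPT):
    # Assumed naming pattern: "<App>__<tool>".
    app, _, tool = action["action"].partition("__")
    print(app, tool, action["action_input"])
```

Using `raw_decode` instead of a regular expression keeps nested braces inside `action_input` from confusing the extraction.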
For a more interactive and visual experience, Agents Research Environments provides a web-based GUI. The interface lets you explore scenarios, monitor agent behavior, and debug interactions in real time.
Try the online demo first! Visit the Hugging Face Space to explore the playground without any local setup. The demo showcases the agent’s capabilities across various tasks and tools.
To start the GUI locally, use the are-gui command:
are-gui
The GUI will start a web server, typically accessible at http://localhost:8080. Open this URL in your browser to begin interacting with the environment.
The GUI supports different view modes optimized for various use cases. You can switch between them using the dropdown menu in the top-left corner.
Playground Mode
The playground mode provides a chat-like interface for direct interaction with agents:
are-gui -s scenario_universe_hf_0
Features:
Direct chat interface with the agent.
Real-time response streaming.
Access to all available tools and applications.
Perfect for testing and experimentation.
Scenarios Mode
The scenarios mode is designed for structured task execution and evaluation. You can load scenarios directly from Hugging Face datasets using the hf:// protocol:
Access scenarios without downloading datasets locally.
Explore community-contributed scenarios.
Exploring Gaia2 Scenarios
You can load individual scenarios from the Gaia2 dataset to check their annotations, see the task and the expected agent actions, and explore the universe’s applications.
Using the Execution Panel you can run the scenario and see the agent’s actions and the environment’s response directly in the UI.
Hint
are-run also supports --hf-url, which takes a URL to a compatible Hugging Face dataset. This allows you to run scenarios from Gaia2 on the CLI,
for example --hf-url "hf://datasets/meta-agents-research-environments/gaia2/adaptability/validation/scenario_universe_21_5e0gvz".
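The example URL in this hint packs several pieces of information into one path. The sketch below splits it into named parts. Reading the segments as organization, dataset, config, split, and scenario ID is an assumption based on this single example, not a documented format, and `parse_hf_scenario_url` and `HfScenarioRef` are hypothetical names.

```python
from typing import NamedTuple


class HfScenarioRef(NamedTuple):
    """Assumed layout: hf://datasets/<org>/<dataset>/<config>/<split>/<scenario_id>."""

    repo_id: str
    config: str
    split: str
    scenario_id: str


def parse_hf_scenario_url(url):
    """Split an hf:// scenario URL into its (assumed) components."""
    prefix = "hf://datasets/"
    if not url.startswith(prefix):
        raise ValueError(f"expected a URL starting with {prefix!r}: {url!r}")
    parts = url[len(prefix):].split("/")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"expected 5 non-empty path segments, got {parts!r}")
    org, dataset, config, split, scenario_id = parts
    return HfScenarioRef(f"{org}/{dataset}", config, split, scenario_id)


ref = parse_hf_scenario_url(
    "hf://datasets/meta-agents-research-environments/gaia2/"
    "adaptability/validation/scenario_universe_21_5e0gvz"
)
print(ref)
```

Validating the URL shape before launching a run gives a clearer error than a failed dataset lookup would.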
# Specify custom port
are-gui -s scenario_name --port 8888
# Use different model providers
are-gui -s scenario_name --provider llama-api --model Llama-4-Maverick-17B-128E-Instruct-FP8
Check the project’s CONTRIBUTING.md guide for community support
Congratulations! You’ve successfully run your first Meta Agents Research Environments scenario. You’re now ready to explore more advanced features and create your own scenarios.