Quickstart

This guide walks through running a first agent with the HUD SDK, using a simple browser task and the OpenAI Operator Agent.

1. Prerequisites

Python: Ensure you have Python 3.10 or later installed.
API Keys: You’ll need API keys for both HUD and the agent you want to use (e.g., OpenAI).
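
One quick way to confirm the interpreter meets the version requirement is a short check like the one below. This is only a sketch: the helper name `meets_minimum` is illustrative and is not part of the HUD SDK.

```python
import sys

def meets_minimum(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum

if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old, need 3.10 or later"
    print(f"Python {current}: {status}")
```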

2. Installation

Install the HUD SDK using pip:

pip install hud-python

For more details, see the Installation Guide.
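
To confirm that the install succeeded, one option is to ask the standard library for the installed distribution's version. The sketch below assumes the distribution name matches the `pip install` command above (`hud-python`); the helper `installed_version` is a hypothetical name introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None

if __name__ == "__main__":
    found = installed_version("hud-python")
    print(f"hud-python {found}" if found else "hud-python is not installed")
```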

3. API Key Setup

The SDK automatically loads API keys from environment variables or a .env file in your project root. Set the following:

HUD_API_KEY: Your key from app.hud.so.
OPENAI_API_KEY: Your OpenAI API key (if using OperatorAgent).
ANTHROPIC_API_KEY: Your Anthropic API key (if using ClaudeAgent).

Example .env file:

HUD_API_KEY=hud_...
OPENAI_API_KEY=sk-...
# ANTHROPIC_API_KEY=sk-ant-...
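
Before running an agent, it can help to check which of these variables are actually visible to the process. The following is a minimal sketch, not SDK functionality: the `AGENT_KEYS` mapping and the `missing_keys` helper are hypothetical names, and the check inspects only the process environment (it does not parse a .env file).

```python
import os

# Variable names come from the list above; the mapping itself is illustrative.
AGENT_KEYS = {
    "OperatorAgent": "OPENAI_API_KEY",
    "ClaudeAgent": "ANTHROPIC_API_KEY",
}

def missing_keys(agent_name, environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    required = ["HUD_API_KEY", AGENT_KEYS[agent_name]]
    return [name for name in required if not environ.get(name)]

if __name__ == "__main__":
    missing = missing_keys("OperatorAgent")
    print("Missing: " + ", ".join(missing) if missing else "All keys are set.")
```

Passing a plain dictionary as `environ` makes the helper easy to exercise without touching real credentials.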

4. Run Your First Agent

This example uses the OperatorAgent to interact with a browser environment. It defines a task, creates an environment, runs the agent, and evaluates the result.

import asyncio
import os
from hud import gym, job                  # Import gym for environments and job decorator
from hud.task import Task                 # Import Task to define the goal
from hud.agent import OperatorAgent       # Import the agent
# hud.settings automatically loads keys from .env or environment variables

# Decorator to group this run under a job named "quickstart-run"
@job("quickstart-run")
async def main():
    # 1. Define a Task: What should the agent do?
    task = Task(
        prompt="Search for 'capybara' on Google",
        gym="hud-browser",               # Use a browser environment
        setup=("goto", "google.com"),    # Action to perform before the agent starts
        evaluate=("contains_text", "capybara") # How to check if the task succeeded
    )

    # 2. Create Environment: The runtime for the task
    print("Creating environment...")
    env = await gym.make(task)          # Creates the environment specified in the task

    # 3. Initialize Agent: The model that will perform the task
    #    API keys are loaded automatically by hud.settings
    print("Initializing agent...")
    agent = OperatorAgent(environment="browser") # Specify environment type for the agent

    # 4. Interaction Loop: Agent observes and acts
    print("Starting interaction loop...")
    # Get the initial observation (screenshot, text, etc.) by resetting the environment
    obs, _ = await env.reset()

    for i in range(5): # Limit to 5 steps for this example
        print(f"--- Step {i+1} ---")
        # Agent predicts the next action(s) based on the observation
        actions, done = await agent.predict(obs)
        print(f"Agent action(s): {actions}")

        if done:
            print("Agent signaled task completion.")
            break

        # Execute the action(s) in the environment
        obs, reward, terminated, info = await env.step(actions)

        if terminated:
            print("Environment terminated.")
            break

    # 5. Evaluate & Close
    print("Evaluating task...")
    result = await env.evaluate()       # Run the evaluation defined in the Task
    print(f"Evaluation result: {result}")

    # Trajectory is automatically saved if a @job decorator is used
    # trajectory = await env.get_trajectory() # You can optionally get trajectory data
    # print(f"Trajectory ID: {trajectory.id}")

    print("Closing environment...")
    await env.close()                   # Clean up environment resources

if __name__ == "__main__":
    # Ensure API keys are set before running
    if not os.getenv("HUD_API_KEY") or not os.getenv("OPENAI_API_KEY"):
        print("Error: Please set HUD_API_KEY and OPENAI_API_KEY environment variables or in a .env file.")
    else:
        asyncio.run(main())

Explanation:

Task: Defines the goal (prompt), the type of environment (gym), initial setup steps (setup), and how success is measured (evaluate).
Environment: gym.make(task) creates the specified browser environment instance.
Agent: OperatorAgent is initialized. It automatically uses the OPENAI_API_KEY found by hud.settings.
Interaction Loop:
- env.reset() returns the initial observation.
- agent.predict(obs) gets the next action(s) from the agent.
- env.step(actions) executes the actions and gets the new observation.
Evaluation & Close: env.evaluate() checks if the task succeeded based on the evaluate definition. env.close() shuts down the environment.
@job Decorator: Wrapping main with @job("quickstart-run") automatically creates a Job. When env.close() is called, the recorded interactions (trajectory) are associated with this Job. You can view the job and its trajectory video on the HUD Jobs page.
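
The control flow of the interaction loop described above can be tried without API keys or a remote environment. The sketch below uses two stand-in classes, `StubEnv` and `StubAgent`, which are hypothetical and only mimic the return shapes used in the quickstart (`reset`, `predict`, and `step`); they are not the HUD implementations. The `run_episode` helper is likewise an illustrative name.

```python
import asyncio

class StubEnv:
    """Hypothetical stand-in for an environment; not the HUD implementation."""
    def __init__(self, max_steps=3):
        self.steps = 0
        self.max_steps = max_steps

    async def reset(self):
        self.steps = 0
        return "obs-0", {}

    async def step(self, actions):
        self.steps += 1
        terminated = self.steps >= self.max_steps
        # Same 4-tuple shape as in the quickstart: obs, reward, terminated, info
        return f"obs-{self.steps}", 0.0, terminated, {}

class StubAgent:
    """Hypothetical agent that signals completion once it sees 'obs-2'."""
    async def predict(self, obs):
        done = obs == "obs-2"
        return [f"act-on-{obs}"], done

async def run_episode(env, agent, max_steps=5):
    """Run the observe/predict/step cycle; return (steps_executed, reason)."""
    obs, _ = await env.reset()
    for executed in range(max_steps):
        actions, done = await agent.predict(obs)
        if done:
            return executed, "agent_done"
        obs, _reward, terminated, _info = await env.step(actions)
        if terminated:
            return executed + 1, "env_terminated"
    return max_steps, "step_limit"

if __name__ == "__main__":
    print(asyncio.run(run_episode(StubEnv(max_steps=10), StubAgent())))
```

Returning the stop reason makes the three exit paths explicit: the agent signals completion, the environment terminates, or the step limit is reached.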

Next Steps

Explore the Core Concepts to understand the SDK architecture in more detail.
Check out the Examples folder in the GitHub repo for more detailed, runnable notebooks covering different agents and environments.
Review the API Reference for comprehensive documentation on specific functions and classes.
