The hud.task module provides the Task class for defining evaluation scenarios.

Classes

Task

class Task(pydantic.BaseModel):
    id: str | None = None
    prompt: str
    setup: HudStyleConfigs | None = None
    evaluate: HudStyleConfigs | None = None
    gym: Gym | None = None
    target: str | list[str] | None = None # Inspect compatibility
    choices: list[str] | None = None      # Inspect compatibility
    files: dict[str, str] | None = None   # Inspect compatibility
    metadata: dict[str, Any] | None = None
    config: dict[str, Any] | None = None

Represents a specific task or scenario for an agent to perform within an Environment.

Task objects define the goal, environment requirements, setup steps, and evaluation criteria. They are typically passed to hud.gym.make() to create configured environments or grouped within a TaskSet.

See the Tasks and TaskSets Concepts page for detailed explanations, examples, and configuration styles.

Attributes:

  • id (str | None): Optional unique identifier, often assigned when loaded from the HUD platform.
  • prompt (str): The main instruction or goal for the agent.
  • setup (HudStyleConfigs | None): Configuration for setup actions executed before the agent starts. See Configuration Styles.
  • evaluate (HudStyleConfigs | None): Configuration defining the evaluation logic executed by env.evaluate(). See Configuration Styles.
  • gym (Gym | None): Specifies the required environment type, e.g., the string "hud-browser" or a CustomGym object. See hud.types.
  • target (str | list[str] | None): Ideal target output (primarily for compatibility with inspect-ai).
  • choices (list[str] | None): Multiple choice options (primarily for compatibility with inspect-ai).
  • files (dict[str, str] | None): Files associated with the task (primarily for compatibility with inspect-ai).
  • metadata (dict[str, Any] | None): Arbitrary dictionary for storing extra task-related information.
  • config (dict[str, Any] | None): Configuration details used primarily for remote task execution.
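
The attribute list above can be illustrated with a minimal sketch. It does not import hud: TaskSketch is a hypothetical stand-in dataclass that only mirrors the field names and defaults listed above, with HudStyleConfigs and Gym replaced by Any because their definitions are not shown in this section. The prompt text and metadata values are made up for illustration. The real Task is a pydantic.BaseModel and may validate its inputs differently.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


# Hypothetical stand-in for hud.task.Task (NOT the real class).
# It mirrors the documented fields so the defaults can be shown in isolation.
@dataclass
class TaskSketch:
    prompt: str                                   # the only required field
    id: Optional[str] = None
    setup: Any = None                             # HudStyleConfigs in the real class
    evaluate: Any = None                          # HudStyleConfigs in the real class
    gym: Any = None                               # Gym in the real class
    target: Optional[Union[str, List[str]]] = None
    choices: Optional[List[str]] = None
    files: Optional[Dict[str, str]] = None
    metadata: Optional[Dict[str, Any]] = None
    config: Optional[Dict[str, Any]] = None


# Only the prompt is mandatory; everything else is opt-in.
task = TaskSketch(
    prompt="Open the pricing page and report the cheapest plan.",  # made-up prompt
    gym="hud-browser",
    metadata={"difficulty": "easy"},                               # made-up metadata
)

# Fields that were never set keep their None defaults.
unset = sorted(name for name, value in asdict(task).items() if value is None)
print(unset)
```

With the real class, the same keyword arguments would presumably be passed to Task(...) directly, but that usage is an assumption here and should be checked against the Tasks and TaskSets Concepts page.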