Advanced Environment Control
Using invoke, execute, and _setup for finer control over environments
While the standard step, evaluate, and close methods cover most interactions, the Environment object provides lower-level methods for more direct control, particularly useful for custom environments, debugging, and complex setup/evaluation scenarios.
invoke
The env._invoke_all() method (and its underlying client.invoke()) is the core mechanism for calling specific functions within the environment’s controller script.
- Purpose: Execute custom functions defined in your environment controller (the Python code running inside the Docker container or remote instance). This is how setup and evaluate configurations in a Task are ultimately executed.
- Usage: You provide a configuration (string, tuple, dict, or list) matching the HudStyleConfigs format. The SDK sends this to the environment controller, which runs the specified function(s) with the given arguments.
- When to Use:
  - Triggering custom evaluation logic not suitable for the standard evaluate attribute.
  - Running specific diagnostic or state-setting functions within your custom environment controller during development or debugging.
  - Implementing complex, multi-step setup or teardown procedures beyond what’s easily defined in the Task setup.
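The configuration shapes listed under Usage can be pictured with a small normalizer. This is only a sketch of the idea, not the SDK's implementation: the helper name normalize_config, the dict keys "function" and "args", and the example function names are assumptions made for illustration, so check the HudStyleConfigs definition in your SDK version for the shapes it actually accepts.

```python
from typing import Any

# Hypothetical helper, not part of the HUD SDK: flattens the config shapes
# described above into a uniform list of (function_name, args) pairs.
def normalize_config(config: Any) -> list[tuple[str, list[Any]]]:
    if isinstance(config, str):
        # A bare string names a function that takes no arguments.
        return [(config, [])]
    if isinstance(config, tuple):
        # A tuple is read as (function_name, arg1, arg2, ...).
        if not config or not isinstance(config[0], str):
            raise ValueError(f"tuple config must start with a function name: {config!r}")
        return [(config[0], list(config[1:]))]
    if isinstance(config, dict):
        # Assumed dict layout: {"function": name, "args": [...]}.
        args = config.get("args", [])
        if not isinstance(args, list):
            args = [args]
        return [(config["function"], args)]
    if isinstance(config, list):
        # A list holds several configs that are run in order.
        calls: list[tuple[str, list[Any]]] = []
        for item in config:
            calls.extend(normalize_config(item))
        return calls
    raise TypeError(f"unsupported config type: {type(config).__name__}")
```

Under these assumptions, a mixed list such as ["reset_state", ("open_tab", 2)] becomes two ordered calls, which is the kind of sequence a controller would dispatch one function at a time.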
execute
The client.execute() method (accessible via env.client.execute() if env.client is a DockerClient subclass like LocalDockerClient or RemoteDockerClient) allows running arbitrary shell commands inside the environment container.
- Purpose: Directly interact with the environment’s shell.
- Availability: Primarily available for Docker-based environments (local or remote custom). Standard remote environments (like "hud-browser") might not support arbitrary command execution via this method.
- When to Use:
  - Debugging: Checking file existence, process status, or network connectivity inside the container.
  - Complex Setup: Running intricate setup scripts or commands that are difficult to express using the standard setup configuration.
  - Local Development: Installing packages or modifying the container state interactively during development of a custom environment.
- Returns: An ExecuteResult typed dictionary containing stdout (bytes), stderr (bytes), and exit_code (int).
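Because stdout and stderr arrive as bytes, callers usually need to decode them and check exit_code before trusting the output. The sketch below shows one way to do that. It defines a local ExecuteResult that mirrors the fields described above so the example is self-contained (the SDK ships its own type), and check_result is a hypothetical helper, not an SDK function.

```python
from typing import TypedDict

# Local mirror of the result shape described above, defined here only so the
# sketch runs on its own; in real code, use the SDK's ExecuteResult type.
class ExecuteResult(TypedDict):
    stdout: bytes
    stderr: bytes
    exit_code: int

# Hypothetical helper: return decoded stdout, or raise if the command failed.
def check_result(result: ExecuteResult) -> str:
    if result["exit_code"] != 0:
        message = result["stderr"].decode("utf-8", errors="replace").strip()
        raise RuntimeError(
            f"command failed with exit code {result['exit_code']}: {message}"
        )
    return result["stdout"].decode("utf-8", errors="replace")
```

Failing loudly on a non-zero exit code helps keep debugging and setup commands from silently leaving the container in a half-configured state.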
_setup
- Purpose: Executes the setup configuration for the environment.
- Execution: This is normally called automatically by hud.gym.make(task) if the provided task has a setup configuration.
- When to Use Manually:
  - Debugging: To re-run setup steps on an already created environment instance without recreating it entirely.
  - Custom Flow: If you create an environment without an initial task (env = await gym.make("gym-id")) and later want to apply setup steps before starting agent interaction (though env.reset(task=...) might be more idiomatic).
  - Override: To override the task’s default setup by passing a different config.
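The fallback-versus-override behaviour described above can be pictured with a stand-in object. StubEnv is hypothetical and only mimics the described semantics: an async _setup that takes an optional config and otherwise uses the task's own setup. The real method's signature may differ, and the config values "load_fixture" and "clear_cache" are invented for the example.

```python
import asyncio
from typing import Any, Optional

# Hypothetical stand-in for an environment; not the SDK's Environment class.
class StubEnv:
    def __init__(self, task_setup: Optional[Any] = None) -> None:
        self.task_setup = task_setup   # setup config carried by the task
        self.applied: list[Any] = []   # record of configs that were run

    async def _setup(self, config: Optional[Any] = None) -> None:
        # An explicit config wins; otherwise fall back to the task's setup.
        chosen = config if config is not None else self.task_setup
        if chosen is None:
            raise ValueError("no setup config available")
        self.applied.append(chosen)

async def demo() -> list[Any]:
    env = StubEnv(task_setup=("load_fixture", "users.json"))
    await env._setup()                      # re-run the task's own setup
    await env._setup(config="clear_cache")  # override with a different config
    return env.applied

applied = asyncio.run(demo())
```

Recording the applied configs, as the stub does, is also a convenient way to confirm during debugging which setup steps actually ran.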
These advanced methods provide deeper control when the standard step/evaluate/close cycle isn’t sufficient. Use them carefully, especially execute, as direct shell access can make scenarios less reproducible if not managed properly.