Pause a sweep
Pause a sweep so it temporarily stops creating new runs. Runs that are already executing continue until completion. Use the wandb sweep --pause command to pause a sweep. Provide the sweep ID that you want to pause.
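As a minimal sketch, assuming the sweep is identified by its entity/project/sweep_ID path, the call could look like the following. The entity, project, and sweep ID are placeholder values, not real identifiers:

```shell
# Placeholder sweep path (assumption): substitute your own entity/project/sweep_ID.
SWEEP_PATH="my-entity/my-project/abc123"

# Stop scheduling new runs; runs that are already executing keep going.
wandb sweep --pause "$SWEEP_PATH"
```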
Resume a sweep
Resume a paused sweep with the wandb sweep --resume command. The sweep starts creating new runs again according to its search strategy. Provide the sweep ID that you want to resume.
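A sketch of resuming, again with a placeholder sweep path that you would replace with your own entity, project, and sweep ID:

```shell
# Placeholder sweep path (assumption): substitute your own entity/project/sweep_ID.
SWEEP_PATH="my-entity/my-project/abc123"

# Let the paused sweep create new runs again, following its search strategy.
wandb sweep --resume "$SWEEP_PATH"
```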
Stop a sweep
Stop a sweep to prevent the creation of new runs while letting executing runs finish gracefully. Use the wandb sweep --stop command and provide the sweep ID that you want to stop.
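For illustration, stopping a sweep might look like this. The sweep path is a made-up placeholder:

```shell
# Placeholder sweep path (assumption): substitute your own entity/project/sweep_ID.
SWEEP_PATH="my-entity/my-project/abc123"

# Create no further runs; runs that are currently executing finish normally.
wandb sweep --stop "$SWEEP_PATH"
```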
Cancel a sweep
Cancel a sweep to immediately terminate all active runs and stop creating new runs. This is the only sweep command that forcibly terminates existing runs. Runs terminate abruptly, and the running processes have no chance to run user-defined signal handlers. Use the wandb sweep --cancel command to cancel a sweep. Provide the sweep ID that you want to cancel. For more information about signals and sweep runs, see Signal handling and sweep runs.
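A hedged example of cancelling, using a placeholder sweep path. Because active runs are terminated without a chance to clean up, confirm the path before running it:

```shell
# Placeholder sweep path (assumption): substitute your own entity/project/sweep_ID.
SWEEP_PATH="my-entity/my-project/abc123"

# Terminate all active runs immediately and create no new ones.
wandb sweep --cancel "$SWEEP_PATH"
```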
Sweep and run statuses
A sweep orchestrates multiple runs to explore hyperparameter combinations. To manage your hyperparameter optimization effectively, you must understand how sweep status and run status interact. The following sections describe how the two statuses differ, what happens when you stop an individual run, and which lifecycle command to choose.
Key differences
- Sweep status controls whether the agent creates new runs. Possible values: Running, Paused, Stopped, Cancelled, Finished, Failed, Crashed.
- Run status reflects the execution state of an individual run. Possible values: Pending, Running, Finished, Failed, Crashed, Killed.
Stop an individual run
When you stop a run in a sweep, the sweep agent automatically starts the next run in the sweep. You can skip poorly performing configurations without interrupting the sweep’s overall progress.
Best practices
The following recommendations help you choose the right lifecycle command for the situation, so you avoid losing useful work or holding onto unwanted compute.
- Use --pause instead of --cancel when you want to temporarily halt exploration without losing running experiments.
- Monitor individual run statuses to identify systematic failures.
- Use --stop for graceful termination when you’ve found satisfactory hyperparameters.
- Reserve --cancel for emergencies when runs consume excessive resources or produce errors.
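The recommendations above can be condensed into a small decision helper. This is only a sketch: lifecycle_flag and its intent names (temporary, satisfied, emergency) are hypothetical and not part of the wandb CLI, and the sweep path is a placeholder. The script prints the command instead of running it, so you can review the choice first.

```shell
# Hypothetical helper (not part of the wandb CLI): map an intent to a sweep flag.
lifecycle_flag() {
  case "$1" in
    temporary) echo "--pause" ;;   # halt exploration for now, keep running experiments
    satisfied) echo "--stop" ;;    # good hyperparameters found, let active runs finish
    emergency) echo "--cancel" ;;  # excessive resource use or errors, terminate active runs
    *) echo "unknown intent: $1" >&2; return 1 ;;
  esac
}

# Print the command without executing it (placeholder sweep path).
echo "wandb sweep $(lifecycle_flag temporary) my-entity/my-project/abc123"
```

With the temporary intent, the script prints wandb sweep --pause my-entity/my-project/abc123.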