Sweeps troubleshooting - Weights & Biases Documentation

This page helps you diagnose and resolve common error messages you might encounter when you run W&B Sweeps. The following sections describe an error, explain why it occurs, and recommend how to fix it.

`CommError, Run does not exist` and `ERROR Error uploading`

If W&B returns both of these error messages, you might have set the run ID explicitly. For example, your Jupyter notebook or Python script might contain a call similar to the following:

```
wandb.init(id="some-string")
```

You can’t set a run ID for sweeps because W&B automatically generates random, unique IDs for runs that sweeps create. W&B run IDs need to be unique within a project. If you want to set a custom name that appears on tables and graphs, pass a name to the name parameter when you initialize W&B. For example:

```
wandb.init(name="a helpful readable run name")
```

After you remove the id argument from wandb.init(), the sweep can assign its own unique run IDs, and the upload errors stop.
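If the same script also runs outside of sweeps and needs a fixed ID there, one option is to build the wandb.init() arguments conditionally. The following is a minimal sketch, not part of the W&B API: the helper name `sweep_safe_init_kwargs` is hypothetical, and it assumes the sweep agent exports a `WANDB_SWEEP_ID` environment variable to the runs it launches.

```python
import os


def sweep_safe_init_kwargs(name, run_id=None, environ=None):
    """Build keyword arguments for wandb.init() that omit `id` inside a sweep.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    kwargs = {"name": name}
    # Assumption: the sweep agent sets WANDB_SWEEP_ID for the runs it starts.
    in_sweep = bool(env.get("WANDB_SWEEP_ID"))
    # Only pass a custom ID when the run is not managed by a sweep.
    if run_id is not None and not in_sweep:
        kwargs["id"] = run_id
    return kwargs
```

You could then call `wandb.init(**sweep_safe_init_kwargs("a helpful readable run name", run_id="some-string"))`, which passes the ID through only when no sweep is detected.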

`CUDA out of memory`

If you see this error message, refactor your code to use process-based executions. When each trial runs in its own process, the GPU memory that the trial holds is freed when the process exits, so memory doesn't accumulate from one run to the next. To refactor your code, complete the following steps:

Rewrite your code as a Python script called train.py. Add the name of the training script (train.py) to your YAML sweep configuration file (config.yaml in this example):

```
program: train.py
method: bayes
metric:
  name: validation_loss
  goal: minimize
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
  optimizer:
    values: ["adam", "sgd"]
```

Add the following to your train.py Python script:
```
if __name__ == "__main__":
    train()
```
From your CLI, initialize a sweep with wandb sweep:
```
wandb sweep config.yaml
```
Note the sweep ID that W&B returns. Start the sweep job with wandb agent from the CLI instead of the Python SDK (wandb.agent()). Replace [SWEEP-ID] in the following code snippet with the sweep ID that W&B returned in the previous step:
```
wandb agent [SWEEP-ID]
```

With your training code running as a script under the CLI agent, each trial executes in its own process, and W&B releases GPU memory between runs.
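To illustrate why process-based execution helps, the sketch below launches two stand-in "trials" as separate Python processes using only the standard library. It does not show how wandb agent works internally. The helper name `run_trial_in_subprocess` is hypothetical, and the trial body is a placeholder that only prints its process ID.

```python
import os
import subprocess
import sys


def run_trial_in_subprocess(trial_code):
    """Run a snippet of Python in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", trial_code],
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return result.stdout.strip()


parent_pid = os.getpid()
# Each call starts a new process; everything the child allocated is
# reclaimed by the operating system when that process exits.
pids = [run_trial_in_subprocess("import os; print(os.getpid())") for _ in range(2)]
print("parent:", parent_pid, "trials:", pids)
```

Because each trial lives in a process separate from the parent, nothing it allocated survives into the next trial. That is the property the CLI agent gives you when it runs train.py once per trial.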

`anaconda 400 error`

The following error usually occurs when you don’t log the metric you’re optimizing:

```
wandb: ERROR Error while calling W&B API: anaconda 400 error:
{"code": 400, "message": "TypeError: bad operand type for unary -: 'NoneType'"}
```

Within your YAML file or nested dictionary, you specify a key named metric to optimize. Ensure that you log this metric with wandb.Run.log(). Also, ensure you use the exact metric name you defined the sweep to optimize within your Python script or Jupyter Notebook. For more information about configuration files, see Define sweep configuration.
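A quick way to catch a mismatch before you start a sweep is to compare the metric name in the sweep configuration with the keys you log. The following is a minimal sketch. The helper name `missing_sweep_metric` and the sample dictionaries are hypothetical, and the check runs on plain dictionaries rather than calling W&B.

```python
def missing_sweep_metric(sweep_config, logged):
    """Return the sweep's metric name if it is absent from `logged`, else None."""
    metric_name = sweep_config["metric"]["name"]
    # The metric must be present under the exact same key and have a value.
    if metric_name in logged and logged[metric_name] is not None:
        return None
    return metric_name


# Hypothetical sweep configuration expressed as a nested dictionary.
sweep_config = {
    "method": "bayes",
    "metric": {"name": "validation_loss", "goal": "minimize"},
}

# The key below does not match the name in the sweep configuration.
logged = {"val_loss": 0.42}
print(missing_sweep_metric(sweep_config, logged))
```

Here the helper reports `validation_loss` as missing because the script logs `val_loss` instead. Renaming the logged key to match the sweep configuration makes the helper return `None`.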
