CommError, Run does not exist and ERROR Error uploading
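If W&B returns both of these error messages, your code probably sets the W&B run ID explicitly. For example, you might pass an id argument to wandb.init() somewhere in your Jupyter Notebook or Python script.
W&B Sweeps generate a unique ID for each run they create, and run IDs must be unique within a project, so a hard-coded ID conflicts with them. To give a run a custom, readable label, pass it to the name parameter when you initialize W&B instead.
After you remove the id argument from wandb.init(), the sweep can assign its own unique run IDs, and the upload errors stop.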
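A minimal sketch of the fix, assuming a sweep function called train. The helper without_run_id, the ID string, the run name, and the logged key are all hypothetical values chosen for illustration and are not part of the W&B API.

```python
def without_run_id(init_kwargs):
    """Return a copy of the wandb.init() keyword arguments with `id` removed.

    Hypothetical helper for illustration; it is not part of the W&B API.
    """
    return {key: value for key, value in init_kwargs.items() if key != "id"}


def train():
    import wandb  # imported here so the helper above has no dependencies

    # Illustrative arguments: `id` is what triggers the errors inside a sweep.
    requested = {"id": "my-fixed-id", "name": "a-readable-run-name"}

    # Only `name` reaches wandb.init(), so the sweep assigns the run ID.
    with wandb.init(**without_run_id(requested)) as run:
        run.log({"loss": 0.0})  # placeholder metric value
```

In most scripts it is simpler to delete the id argument directly. The helper only makes the change explicit.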
CUDA out of memory
If you see this error message, refactor your code to use process-based executions. When each trial runs in its own process, the GPU memory it used is freed when that process exits, before the next run starts.
To refactor your code, complete the following steps:
- Rewrite your code as a Python script called train.py. Add the name of the training script (train.py) under the program key of your YAML sweep configuration file (config.yaml in this example).
- Add an entry point to your train.py Python script so that it runs your training function when the script is executed.
- From your CLI, initialize a sweep with wandb sweep, passing the path to your configuration file (config.yaml).
- Note the sweep ID that W&B returns. Start the sweep job with wandb agent [SWEEP-ID] from the CLI instead of the Python SDK (wandb.agent()), replacing [SWEEP-ID] with the sweep ID that W&B returned in the previous step.
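The steps above can be sketched as a small scaffold script that writes both files. This is an illustration under stated assumptions, not the official example: the search method, the metric name val_loss, the learning_rate parameter, the placeholder training loop, and the helper scaffold_sweep_files are all hypothetical.

```python
from pathlib import Path

# Illustrative sweep configuration; `program` names the script the agent runs.
CONFIG_YAML = """\
program: train.py
method: random
metric:
  name: val_loss
  goal: minimize
parameters:
  learning_rate:
    min: 0.0001
    max: 0.1
"""

# Illustrative training script with an entry point, so that each trial
# started by `wandb agent` runs in its own Python process.
TRAIN_PY = '''\
import wandb


def train():
    # The default applies outside a sweep; sweep values override it.
    with wandb.init(config={"learning_rate": 0.01}) as run:
        learning_rate = run.config["learning_rate"]
        for epoch in range(3):
            # Placeholder computation standing in for real training.
            val_loss = 1.0 / (1.0 + learning_rate * (epoch + 1))
            run.log({"val_loss": val_loss})


if __name__ == "__main__":
    train()
'''


def scaffold_sweep_files(directory):
    """Write config.yaml and train.py into `directory` and return their paths."""
    target = Path(directory)
    config_path = target / "config.yaml"
    script_path = target / "train.py"
    config_path.write_text(CONFIG_YAML)
    script_path.write_text(TRAIN_PY)
    return {"config": config_path, "script": script_path}
```

After calling scaffold_sweep_files("."), run wandb sweep config.yaml, note the sweep ID it prints, and then run wandb agent [SWEEP-ID] from the same directory.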
anaconda 400 error
This error usually occurs when you don’t log the metric you’re optimizing.
In your sweep configuration (a YAML file or a nested dictionary), you specify a metric to optimize. Ensure that you log this metric with wandb.Run.log(). Also, ensure that the metric name you log in your Python script or Jupyter Notebook exactly matches the name you defined the sweep to optimize. For more information about configuration files, see Define sweep configuration.
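One way to keep the two names in sync is to define the metric name once and reuse it in both places. The sketch below assumes a dictionary-based sweep configuration. The metric name val_loss, the learning_rate parameter, and the placeholder loss are hypothetical.

```python
METRIC_NAME = "val_loss"  # illustrative metric name, defined once

sweep_config = {
    "method": "bayes",
    "metric": {"name": METRIC_NAME, "goal": "minimize"},
    "parameters": {"learning_rate": {"min": 0.0001, "max": 0.1}},
}


def train():
    import wandb  # imported here so the config above can be inspected on its own

    with wandb.init() as run:
        learning_rate = run.config["learning_rate"]
        val_loss = 1.0 / (1.0 + learning_rate)  # placeholder for a real validation loss
        # The logged key must match sweep_config["metric"]["name"] exactly.
        run.log({METRIC_NAME: val_loss})
```

If you use a YAML configuration file instead, the same rule applies: the key passed to run.log() must be spelled exactly like the name under metric in the file.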