Troubleshooting
Common issues, diagnostics, and debugging techniques for merobox
Node Startup Problems
Docker not running
Symptom: Cannot connect to the Docker daemon or docker.errors.DockerException
- Verify Docker Desktop (or dockerd) is running: docker info
- On Linux, check the daemon: sudo systemctl status docker
- Ensure the Docker socket is accessible: ls -la /var/run/docker.sock
- If using a remote Docker host, verify DOCKER_HOST is set correctly
Port conflicts
Symptom: Bind for 0.0.0.0:2428 failed: port is already allocated
- Check what’s using the port: lsof -i :2428 (macOS/Linux) or netstat -ano | findstr 2428 (Windows)
- Stop conflicting processes or containers: docker ps to find running containers
- Run merobox nuke -y to clean up stale merobox containers and free ports
- Use a different port range in your workflow YAML via node configuration
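Before starting a workflow, it can help to confirm from a script that the port is actually free. The following is a minimal sketch, not part of merobox: the helper name `is_port_free` is hypothetical, and it checks only whether a TCP bind succeeds on the local machine at that moment.

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use", or a permission error
            # when the port is privileged.
            return False
        return True


if __name__ == "__main__":
    # 2428 is the port from the error message above; adjust to your setup.
    port = 2428
    print(f"port {port}: {'free' if is_port_free(port) else 'in use'}")
```

A port reported as free can still be taken by another process before the node starts, so treat the result as a hint rather than a guarantee.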
Permission issues
Symptom: Permission denied when creating containers, volumes, or binding ports
- On Linux, add your user to the docker group: sudo usermod -aG docker $USER (requires logout/login)
- Verify group membership: groups
- For the binary backend, ensure the merod binary is executable: chmod +x /path/to/merod
- Check that the data directory is writable: ls -la ~/.merobox/
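The last two checks can be scripted. This sketch assumes you pass in the merod binary path and the data directory yourself; the function name `permission_problems` is hypothetical and nothing here calls merobox.

```python
import os
from pathlib import Path


def permission_problems(merod_path: str, data_dir: str) -> list[str]:
    """Return human-readable problems; an empty list means both checks passed."""
    problems = []
    binary = Path(merod_path).expanduser()
    data = Path(data_dir).expanduser()

    if not binary.is_file():
        problems.append(f"{binary}: not found")
    elif not os.access(binary, os.X_OK):
        problems.append(f"{binary}: not executable (try chmod +x)")

    if not data.is_dir():
        problems.append(f"{data}: directory does not exist")
    elif not os.access(data, os.W_OK):
        problems.append(f"{data}: not writable by the current user")

    return problems
```

For example, `permission_problems("/path/to/merod", "~/.merobox")` returns an empty list when both checks pass.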
Workflow Execution Issues
Variable resolution fails
Symptom: KeyError or Variable 'XXX' not found during step execution
- Environment variables (${ENV_VAR}): ensure the variable is exported in the shell before running merobox. Use echo $ENV_VAR to verify.
- Dynamic values (${results.step_name.field}): verify the referenced step has a name field and has executed successfully before the current step.
- Check for typos in variable names — both the reference and the definition must match exactly.
- Use ${VAR:-default} syntax for optional environment variables.
- Run with --verbose to see which variables are being resolved and their values.
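To see how `${ENV_VAR}` and `${VAR:-default}` placeholders behave, the sketch below resolves them against a plain dictionary. It illustrates the shell-style semantics only and is not merobox's resolver; `resolve_env` is a hypothetical name, and dotted references such as `${results.step_name.field}` are deliberately left untouched.

```python
import re

# Matches ${NAME} and ${NAME:-default}; NAME follows shell identifier rules,
# so dotted references like ${results.step.field} do not match.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def resolve_env(text: str, env: dict) -> str:
    """Substitute environment-style placeholders in text using env."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if value not in (None, ""):
            return value                  # set and non-empty
        if default is not None:
            return default                # ${NAME:-default} fallback
        if value is None:
            raise KeyError(f"Variable '{name}' not found")
        return ""                         # set but empty, no default given

    return _PLACEHOLDER.sub(substitute, text)
```

Passing `dict(os.environ)` as `env` mirrors what the shell would export to the process, which makes it easy to check a value before running a workflow.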
Step validation fails
Symptom: StepValidationError: Missing required field 'xxx'
- Check the step type’s required fields in the Workflow YAML reference.
- Ensure YAML indentation is correct — misaligned fields may not be parsed as part of the step.
- Verify field types: some fields expect strings, others expect lists or integers.
- Run merobox run --dry-run workflow.yaml to validate without executing.
API calls fail (JSON-RPC errors)
Symptom: ClientError, ConnectionRefused, or JSON-RPC error responses in call steps
- Verify the target node is running and healthy: merobox health --name <node>
- Check that the application is installed and the context exists on the target node.
- Ensure the method name and arguments match the application’s expected interface.
- Inspect node logs for server-side errors: merobox logs --name <node> --tail 50
- For auth-related failures, verify JWT token validity or re-authenticate.
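When a call step returns a JSON-RPC error, the error code often points to the relevant check in the list above. The sketch below maps the codes reserved by the JSON-RPC 2.0 specification to hints. It assumes the node replies with a standard `error` object; whether merod uses these reserved codes or application-defined ones is not confirmed here, and `explain_jsonrpc` is a hypothetical helper.

```python
import json

# Error codes reserved by the JSON-RPC 2.0 specification.
STANDARD_CODES = {
    -32700: "parse error: the request body was not valid JSON",
    -32600: "invalid request: the payload is not a valid request object",
    -32601: "method not found: check the method name",
    -32602: "invalid params: check argument names and types",
    -32603: "internal error: inspect the node logs",
}


def explain_jsonrpc(raw: str) -> str:
    """Summarize a raw JSON-RPC response body as 'ok' or an error description."""
    reply = json.loads(raw)
    error = reply.get("error")
    if error is None:
        return "ok"
    if not isinstance(error, dict):
        # Non-standard servers may return a bare string or number.
        return f"error: {error}"
    code = error.get("code")
    hint = STANDARD_CODES.get(code, "application-defined error")
    return f"{code}: {error.get('message', '')} ({hint})"
```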
Auth Service Issues
nip.io URL not resolving
Symptom: DNS resolution failure for *.nip.io addresses used by the Traefik auth stack
- Check DNS resolution: nslookup 127.0.0.1.nip.io
- Some corporate networks or VPNs block wildcard DNS services — try disconnecting from VPN.
- As a workaround, add manual entries to /etc/hosts:
127.0.0.1 auth.127.0.0.1.nip.io
- Consider using the binary backend which does not require Traefik or nip.io.
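A nip.io hostname in dotted form encodes the address it should resolve to, so a lookup can be compared against the expected value. The sketch below does that and prints the matching /etc/hosts line on failure. The function names are hypothetical, and only the dotted form (`name.a.b.c.d.nip.io`) is handled.

```python
import ipaddress
import socket


def embedded_ip(hostname: str) -> str:
    """Return the IPv4 address encoded in a dotted '<name>.<a.b.c.d>.nip.io' host."""
    labels = hostname.rstrip(".").split(".")
    if labels[-2:] != ["nip", "io"] or len(labels) < 6:
        raise ValueError(f"not a dotted nip.io name: {hostname}")
    candidate = ".".join(labels[-6:-2])
    ipaddress.IPv4Address(candidate)  # raises ValueError if malformed
    return candidate


def check_nip_io(hostname: str) -> str:
    """Resolve hostname and compare the answer with the embedded address."""
    expected = embedded_ip(hostname)
    try:
        resolved = socket.gethostbyname(hostname)
    except socket.gaierror as exc:
        return (f"{hostname}: lookup failed ({exc}); "
                f"/etc/hosts workaround: {expected} {hostname}")
    if resolved != expected:
        return f"{hostname}: resolved to {resolved}, expected {expected}"
    return f"{hostname}: ok ({resolved})"


if __name__ == "__main__":
    print(check_nip_io("auth.127.0.0.1.nip.io"))
```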
404 on auth URLs
Symptom: 404 Not Found when accessing /auth/token or /auth/refresh
- Verify the Traefik container is running: docker ps | grep traefik
- Check Traefik routing rules: docker logs <traefik-container>
- Ensure the auth service container is healthy and registered with Traefik.
- Verify the correct host header is being sent — auth routing depends on Host-based rules.
- Try a direct request bypassing Traefik to isolate the issue.
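Because routing is Host-based, sending the same path with an explicit Host header, first through Traefik and then directly to the auth service, helps show which layer returns the 404. This is a rough sketch using only the standard library. The base URLs in the usage note are assumptions about your setup, the helper names are hypothetical, and the sketch sends a GET, so an endpoint that expects POST may answer with a different non-404 status.

```python
import urllib.error
import urllib.request


def build_probe(base_url: str, host: str,
                path: str = "/auth/token") -> urllib.request.Request:
    """Build a GET request for base_url + path with an explicit Host header."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"Host": host})


def probe_status(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want here.
        return err.code
```

For example, `probe_status(build_probe("http://localhost", "auth.127.0.0.1.nip.io"))` tests the route through Traefik, assuming it listens on port 80. Repeating the call with the auth container's own address and port, which depend on your setup, bypasses Traefik. A 404 only on the first call points at the routing rules, not the auth service.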
Network connection problems
Symptom: ConnectionError or timeouts when authenticating with remote nodes
- Test basic connectivity: curl -v <node-url>/health
- For remote nodes, check firewall rules and security group settings.
- Verify TLS certificates if using HTTPS — self-signed certs may need to be trusted.
- Run merobox remote test <name> for a structured connectivity diagnostic.
- Check if the node’s auth endpoint is on a different port than the main API.
Docker Issues
Container creation fails
Symptom: docker.errors.APIError during container creation
- Check Docker disk space: docker system df — prune unused resources with docker system prune
- Verify the image exists locally or can be pulled: docker pull <image>
- Check Docker resource limits (memory, CPU) in Docker Desktop settings.
- For image pull failures, check network connectivity to the container registry.
- Review the full error message — Docker API errors usually include the root cause.
Docker networking problems
Symptom: Nodes can’t discover each other, peer connections fail, or bootstrap nodes are unreachable
- Verify the merobox Docker network exists: docker network ls | grep merobox
- Check that all containers are on the same network: docker network inspect merobox-net
- Ensure bootstrap multiaddresses use container names (not localhost) for intra-container communication.
- Run merobox nuke -y and re-run to recreate the network from scratch.
- On macOS, Docker Desktop runs containers inside a Linux VM, so host networking behaves differently than on Linux; containers can reach the host via host.docker.internal.
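Bootstrap addresses that point at localhost are a common cause of failed peer discovery between containers, and they are easy to detect mechanically. The sketch below inspects the host component of a libp2p-style multiaddress. The helper names, the example port, and the node name in the usage note are hypothetical.

```python
import ipaddress

_HOST_PROTOCOLS = {"ip4", "ip6", "dns", "dns4", "dns6"}


def multiaddr_host(multiaddr: str) -> str:
    """Return the host part of a multiaddress such as /dns4/node-1/tcp/2528."""
    parts = multiaddr.strip("/").split("/")
    if len(parts) < 2 or parts[0] not in _HOST_PROTOCOLS:
        raise ValueError(f"unsupported multiaddress: {multiaddr}")
    return parts[1]


def points_at_loopback(multiaddr: str) -> bool:
    """True if the multiaddress targets localhost or a loopback IP."""
    host = multiaddr_host(multiaddr)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is a DNS name such as a container name.
        return False
```

An address such as `/ip4/127.0.0.1/tcp/2528` is flagged, while `/dns4/node-1/tcp/2528` is not, because inside a container a loopback address refers to that container itself.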
Performance Issues
Slow workflow execution
Symptom: Workflows take significantly longer than expected to complete
- Enable debug logging to identify which steps are slow: LOG_LEVEL=DEBUG merobox run workflow.yaml
- Check for excessive retries in step configuration — each retry adds delay with exponential backoff.
- Verify node health — unhealthy nodes cause timeouts and retries: merobox health
- For wait_for_sync steps, ensure the timeout and poll interval are appropriate for the data volume.
- Consider using parallel steps to execute independent operations concurrently.
- Check Docker resource allocation — insufficient CPU or memory causes slow container performance.
High memory usage
Symptom: System becomes unresponsive or Docker reports OOM (out of memory) kills
- Check container memory usage: docker stats
- Reduce the number of concurrent nodes in the workflow.
- Increase Docker Desktop memory allocation in Settings → Resources.
- For the binary backend, check merod process memory: ps aux | grep merod
- If running a NEAR sandbox, it consumes additional memory — ensure at least 4 GB total is available.
- Use merobox nuke -y between test runs to clean up orphaned containers.
Debugging
Systematic techniques for diagnosing issues in merobox workflows and node management.
Enable Debug Logging
Set the LOG_LEVEL environment variable to get detailed output from all merobox components.
LOG_LEVEL=DEBUG merobox run workflow.yaml
# Log levels: DEBUG, INFO (default), WARNING, ERROR, CRITICAL
Verbose CLI Output
The --verbose flag increases output detail for any merobox command.
merobox health --verbose
merobox remote test my-server --verbose
Check Node Logs
View merod output for a specific node to diagnose server-side issues.
merobox logs --name node-1 --tail 100
# Follow live output
merobox logs --name node-1 --follow
# Logs since a time
merobox logs --name node-1 --since "5m"
Inspect Containers
Access a running container’s shell for direct inspection.
docker ps --filter "label=merobox"
# Shell into a container
docker exec -it <container-id> /bin/sh
# View container details
docker inspect <container-id>
Network Diagnostics
Diagnose connectivity between nodes, the sandbox, and external services.
curl -s http://localhost:2428/health | python -m json.tool
# Check NEAR sandbox
curl -s http://localhost:3030/status | python -m json.tool
# Inspect Docker network
docker network inspect merobox-net
# DNS resolution test
nslookup node-1.127.0.0.1.nip.io