Related Alerts

Alert Severity

Depends on user impact. See ‣ for the appropriate severity.

Alert Summary

This alert means that errors have occurred in the block derivation pipeline while communicating with op-geth over the engine API. Typically, this means that op-geth is offline or unreachable.

The derivation pipeline will continue to retry op-geth until it comes back online.

Investigation

  1. Query the op-node logs for errors. Use the error and warn level selectors. The logs will tell you why the op-node can’t connect to op-geth.
    1. If it’s a network error, you’ll see something like connection refused or no such host.
    2. If it’s an RPC error coming from op-geth, you’ll see the RPC error op-geth returns.
  2. Query the op-geth logs for errors. Use the |~ "err" line filter.
  3. Look at the Geth and Replica Healthcheck dashboards to make sure that Geth is healthy.
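The network-versus-RPC triage in step 1 can be sketched as a small helper for sorting exported op-node log lines. This is a rough illustration, not part of op-node or any official tooling: the function name and return labels are hypothetical, and the only markers it knows are the two network messages named above. Any other line containing "err" is bucketed as a probable RPC error, which still needs to be read by hand.

```python
# Substrings this runbook names as typical network failures.
NETWORK_ERROR_MARKERS = ("connection refused", "no such host")


def classify_engine_error(log_line: str) -> str:
    """Classify a single op-node log line (hypothetical helper).

    Returns:
      "network"      - the line contains one of the known network markers
      "rpc_or_other" - the line mentions an error but no network marker;
                       assumed (not verified) to be an RPC error from op-geth
      "none"         - the line does not look like an error at all
    """
    lowered = log_line.lower()
    if any(marker in lowered for marker in NETWORK_ERROR_MARKERS):
        return "network"
    if "err" in lowered:
        return "rpc_or_other"
    return "none"


def summarize(log_lines):
    """Count how many lines fall into each bucket."""
    counts = {"network": 0, "rpc_or_other": 0, "none": 0}
    for line in log_lines:
        counts[classify_engine_error(line)] += 1
    return counts
```

If most lines land in the "network" bucket, start with connectivity and DNS. If they land in "rpc_or_other", read the error text that op-geth returned.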

Mitigation

  1. Ensure that op-geth is up.
    1. If op-geth was just restarted, wait for the logs to indicate that the authenticated HTTP server is up. If it comes up and you start seeing chain head was updated logs, there’s nothing else to do and the alert will close on its own.
  2. Ensure that op-geth is healthy.
    1. If it is memory-constrained, try restarting the node.
  3. Otherwise, escalate the issue for further remediation.
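As a quick first check for step 1, the sketch below tests whether op-geth's authenticated endpoint accepts TCP connections at all. The host name and port in the usage note are assumptions about your deployment (8551 is geth's default authenticated RPC port, but yours may differ). A successful connection only shows that the port is open. It does not prove that JWT authentication works or that op-geth is healthy.

```python
import socket


def engine_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Covers the network failures mentioned in this runbook:
    "connection refused", "no such host" (DNS failure), and timeouts
    all surface as OSError subclasses and yield False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `engine_port_reachable("op-geth", 8551)` (hypothetical host name) returning False points at a network or process-level problem rather than an RPC error.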