Runbook: Global Event Block Stalled

This runbook describes how to investigate and respond when the global event block stalls, so that downtime is minimized and monitoring is restored quickly.

1. Investigation

Initial Checks

  1. Verify RPC Responsiveness. The RPCs listed below are most likely the ones in use; to confirm, check the op-monitorism configuration in Argo CD.

<aside> 💡

Mainnet RPC: https://proxyd-l1-consensus.primary.mainnet.prod.oplabs.cloud

Sepolia RPC: https://proxyd-l1-consensus.primary.sepolia.prod.oplabs.cloud

</aside>

  2. Command to Check:

<aside> ℹ️

Make sure Tailscale is connected before running this command, since these are internal Labs RPCs.

</aside>

cast block-number --rpc-url https://proxyd-l1-consensus.primary.mainnet.prod.oplabs.cloud

Replace the RPC URL according to the network being checked (Mainnet or Sepolia).

  3. Assess Monitoring Impact:

    ▪ If the RPC is unresponsive, monitoring is blind, and immediate action is required.
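The responsiveness check above can be scripted so that a hung RPC does not block the terminal. The following is a minimal sketch, not part of the existing tooling: the function names, the 10-second timeout, and the 30-second wait between reads are assumptions, and it relies on `cast` (Foundry) and `timeout` (GNU coreutils) being installed. A single "stalled" result may simply be a missed slot, so re-run before drawing conclusions.

```shell
#!/usr/bin/env bash
# Sketch only: function names and default durations are illustrative assumptions.

# Print the current block number, or return non-zero if the RPC does not
# answer within TIMEOUT_SECS seconds (default 10).
rpc_block_number() {
  local url="$1"
  timeout "${TIMEOUT_SECS:-10}" cast block-number --rpc-url "$url"
}

# Read the block number twice, WAIT_SECS apart (default 30), and report
# whether the RPC is unresponsive, stalled, or healthy.
check_rpc() {
  local url="$1" first second
  if ! first="$(rpc_block_number "$url")"; then
    echo "unresponsive"
    return 1
  fi
  sleep "${WAIT_SECS:-30}"
  if ! second="$(rpc_block_number "$url")"; then
    echo "unresponsive"
    return 1
  fi
  if [ "$second" -gt "$first" ]; then
    echo "healthy (block $first -> $second)"
  else
    echo "stalled at block $second"
    return 2
  fi
}
```

For example, `check_rpc https://proxyd-l1-consensus.primary.mainnet.prod.oplabs.cloud` prints one of the three states; swap in the Sepolia URL as needed. Tailscale must be connected for either URL to be reachable.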

2. Actions

  1. Open a SEV2 incident, titled for example: [May 9, 2025] Blind Monitoring on Mainnet|Sepolia

  2. Notify Teams:

    ▪ Ping #eng-oncall and @platform-oncall on Slack to inform them of the issue.

    You can add more detail, such as the last block read by global event, which is shown on the global-event Grafana dashboard at this link.

  3. Await Response:

    ▪ Typically, the platform team can resolve the issue quickly.

    ▪ If the issue remains unresolved and platform-oncall expects the fix to take longer than 20 minutes, temporarily redeploy new RPCs via Kubernetes.
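To keep incident titles consistent with the format shown in step 1 of the actions, the title can be generated rather than typed by hand. This is a small sketch under assumptions: the function name `incident_title` is hypothetical, and the date format simply mirrors the example title in this runbook.

```shell
#!/usr/bin/env bash
# Sketch only: incident_title is a hypothetical helper, not existing tooling.

# Build an incident title such as "[May 9, 2025] Blind Monitoring on Mainnet".
# $1 is the affected network, e.g. Mainnet or Sepolia.
incident_title() {
  local network="$1" today
  # LC_ALL=C keeps the month name in English regardless of the local locale.
  today="$(LC_ALL=C date '+%B %-d, %Y')"
  printf '[%s] Blind Monitoring on %s\n' "$today" "$network"
}
```

Calling `incident_title Sepolia` prints the title for today's date with the Sepolia network filled in.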