Summary

Pyth Entropy experienced a rolling outage across several chains between 29 Apr and 1 May 2025.

Timeline

29 Apr 2:37 AM UTC:

A new deployment was merged, adding 10-second timeouts to the RPC calls. When the new container was spawned, the initial RPC call that fetches the contract info timed out for 4 chains:

2025-04-29 02:39:28.938	2025-04-29T02:39:28.938787Z ERROR ThreadId(01) fortuna::command::run:168: Failed to setup merlin error sending request for url (<http://erpc.erpc:4000/fortuna/evm/4200>): operation timed out
2025-04-29 02:39:28.938	2025-04-29T02:39:28.938783Z ERROR ThreadId(01) fortuna::command::run:168: Failed to setup base error sending request for url (<http://erpc.erpc:4000/fortuna/evm/8453>): operation timed out
2025-04-29 02:39:28.938	2025-04-29T02:39:28.938777Z ERROR ThreadId(01) fortuna::command::run:168: Failed to setup blast error sending request for url (<http://erpc.erpc:4000/fortuna/evm/81457>): operation timed out
2025-04-29 02:39:28.938	2025-04-29T02:39:28.938745Z ERROR ThreadId(01) fortuna::command::run:168: Failed to setup etherlink (code: -32603, message: 

After these timeouts, the keeper services for the chains above did not start, and the outage began.
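
The failure mode above, where a single timed-out setup call leaves a chain's keeper offline until the next deployment, can be illustrated with a retry wrapper around the startup call. This is a minimal sketch under stated assumptions, not Fortuna's actual code: `retry_with_backoff`, its parameters, and the idea of wrapping the per-chain setup step in it are all hypothetical and shown only for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not part of Fortuna): runs `op` until it succeeds
/// or `max_attempts` attempts have been made, doubling the delay after
/// each failure. At least one attempt is always made.
/// Returns the first success, or the last error once attempts run out.
fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    max_attempts: u32,
    initial_delay: Duration,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Wait before the next attempt, then double the delay.
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

With a wrapper like this, a per-chain setup step that times out once would be attempted again instead of failing for the lifetime of the container. If every attempt fails, the final error is still returned, so the caller can log it and raise an alert.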

29 Apr 5:10 AM UTC:

A team member was notified that on-chain transactions were reverting.

29 Apr 5:24 AM UTC:

The cause of the outage was determined to be the earlier deployment.

29 Apr 5:50 AM UTC:

The deployment was reverted.


30 Apr 7:00 PM UTC:

After updating our alerting system, we detected additional downtime on: