TL;DR
- Production incidents do not wait for you to find your laptop, configure a VPN, or remember which SSH key connects to which bastion host. Emergency database access needs to work instantly from any device.
- Device-dependent tooling (MySQL Workbench, local SSH configs, VPN clients) is the single biggest bottleneck during on-call incidents. 72% of MTTR is spent on non-repair activities like gaining access.
- The fix is pre-configured, browser-based database access through a static IP that is already whitelisted in your database firewall — no installation, no configuration, no "which machine has the credentials" scramble.
- Combine this with read-only emergency accounts, saved runbook queries, and documented escalation paths so any on-call engineer can diagnose issues in minutes, not hours.
- Planning for emergency access before the incident is the difference between a 15-minute resolution and a 2-hour outage.
Table of Contents
- Emergency Database Access: How SREs Connect During Incidents
- The 3 AM Problem Every On-Call Engineer Knows
- Why Device-Dependent Tools Fail During Incidents
- Pre-Whitelisted Static IP: Always Ready, Zero Setup
- Browser-Based Access: Any Device, Anywhere
- Read-Only Emergency Accounts: Diagnose Without Risk
- Runbook Queries: What to Check First
- Building Your Emergency Access Runbook
- FAQ
- Conclusion
Emergency Database Access: How SREs Connect During Incidents
It is 3 AM. PagerDuty fires. The application is returning 500 errors and users are filing support tickets. You need emergency database access right now — not in 20 minutes after you boot your work laptop, connect to the VPN, find the right SSH key, and remember which bastion host routes to production. Right now.
This scenario plays out thousands of times every night across engineering teams worldwide. According to the 2025 Datadog State of On-Call Report, the average SRE is paged 4.3 times per week, with 31% of pages occurring outside business hours. And according to PagerDuty's 2025 State of Digital Operations, the mean time to resolve (MTTR) for critical incidents averaged 2.1 hours — with the majority of that time spent not on the actual fix, but on getting access to the systems needed to diagnose the problem.
SRE database access during incidents should not depend on which device you have, which network you are on, or whether you remembered to update your VPN certificate last month. It should just work.
The 3 AM Problem Every On-Call Engineer Knows
Here is how most on-call database access actually plays out:
- PagerDuty alert fires on your phone.
- You open your phone, read the alert — looks like a database issue. Slow queries, connection pool exhaustion, maybe a deadlock.
- You need to check the database. Your work laptop is at the office. Or in your bag downstairs. Or the battery is dead.
- You grab whatever device is nearby — your personal laptop, a tablet, your partner's computer.
- You try to connect. You cannot. The VPN client is not installed. The SSH keys are on your work machine. The database credentials are in a .env file on a device you do not have.
- You spend 15 to 30 minutes getting access before you can even look at the database.
A 2024 survey by Rootly found that 72% of incident response time is spent on non-repair activities — communication, escalation, access provisioning, and context-gathering. The actual fix is often the shortest part. When on-call database access requires a specific device with specific software and specific credentials, you are building a system that fails at the exact moment it needs to work.
The problem compounds in distributed teams. If the on-call engineer is traveling, or at a family event, or anywhere that is not their normal work setup, the access chain breaks. Google's SRE book documents this pattern explicitly: systems that require specific physical resources for emergency access create single points of failure in the incident response process.
Why Device-Dependent Tools Fail During Incidents
Most database access workflows were designed for day-to-day development, not emergencies. They assume you are sitting at your configured workstation with all your tools available.
MySQL Workbench, DBeaver, TablePlus, DataGrip — all excellent tools for daily use. All useless if they are not installed and configured on the device you have at 3 AM. They require local installation, saved connection profiles, and often SSH key files or VPN connectivity to reach production databases.
SSH tunnels are the gold standard for secure database access, but they depend on having the right private key on the right machine. According to the 2025 GitGuardian State of Secrets Sprawl, 12.8 million secrets were exposed in public GitHub repositories in 2024, partly because developers copy SSH keys and credentials across machines to solve exactly this problem — needing access from a device that does not have the right keys.
VPN clients add another dependency. They need to be installed, configured, and current. Certificate-based VPNs require periodic renewal. Split-tunneling configurations vary by device. According to Zscaler's 2025 VPN Risk Report, 92% of organizations acknowledge that VPNs introduce security vulnerabilities, yet teams continue to rely on them because they do not have an alternative path to internal resources.
The pattern is clear: every tool that must be pre-installed and pre-configured on a specific device is a tool that will fail during an emergency on a different device.
Pre-Whitelisted Static IP: Always Ready, Zero Setup
The biggest friction point during incident response database access is not authentication — it is the network. Most production databases are behind firewalls that only accept connections from whitelisted IP addresses. This is correct security practice, as we have covered in our guide on database IP whitelisting. But it means that any new access path — a different device, a different network — requires a firewall change before you can even attempt to connect.
During an incident, nobody should be editing firewall rules.
A static IP gateway solves this by routing all database connections through a single, known IP address. You whitelist that IP once during setup, and it remains whitelisted permanently. When the 3 AM page fires, the IP is already in your firewall rules. There is no provisioning step, no firewall change, no waiting for propagation.
DBEverywhere operates on this model — a single static IP that you whitelist in your database provider (AWS RDS security groups, DigitalOcean trusted sources, Google Cloud SQL authorized networks, or any other provider). Once whitelisted, every engineer on your team can reach the database through that gateway from any device, any network, at any time. The IP never changes.
This approach is architecturally identical to how bastion hosts work, but without the requirement to SSH into the bastion first. The gateway is the bastion, and the access method is a web browser.
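As an illustration of how small the one-time setup is, here is what the whitelisting step might look like for an AWS RDS security group. This is a hedged sketch, not DBEverywhere's documented setup: the security group ID (sg-0123456789abcdef0) and gateway IP (203.0.113.10) are placeholder values you would replace with your own security group and your gateway's published static IP.

```
# One-time firewall change: allow the gateway's static IP to reach MySQL.
# sg-0123456789abcdef0 and 203.0.113.10 are placeholders -- substitute your
# RDS security group ID and your gateway's actual static IP.
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 3306 \
  --cidr 203.0.113.10/32
```

Because the gateway IP never changes, this rule is created once and never touched during an incident.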
Browser-Based Access: Any Device, Anywhere
If the access tool runs in a browser, the device problem disappears. Any laptop, tablet, or even phone with a web browser becomes a database management station.
Browser-based database tools are not new — phpMyAdmin has been the standard MySQL web interface since 1998, and Adminer supports MySQL, PostgreSQL, SQLite, MS SQL, and Oracle in a single PHP file. What is new is running these tools as a hosted service with authentication, session management, and that pre-whitelisted static IP.
The advantage during incidents is immediate:
- No installation. Open a browser, log in, connect. Works on your personal laptop, a borrowed computer, or a hotel business center PC.
- No local credentials. You authenticate to the gateway, then provide database credentials (or use saved connections if your team has configured them). No .env files, no SSH keys on the device.
- No VPN. The gateway handles the network routing. Your browser connects to the gateway over HTTPS. The gateway connects to your database from the whitelisted IP.
- No port forwarding. No ssh -L 3306:db-host:3306 commands to remember or misconfigure at 3 AM.
For teams that keep database credentials in a secrets manager like HashiCorp Vault or AWS Secrets Manager, the workflow becomes: open browser, authenticate to the gateway, retrieve credentials from the secrets manager on your phone, paste them in, and you are connected. Total time from PagerDuty alert to database access: under two minutes.
Read-Only Emergency Accounts: Diagnose Without Risk
During an incident, the first priority is diagnosis, not remediation. You need to understand what is happening before you change anything. A read-only database account designed specifically for emergency access removes the risk of accidental damage during the highest-stress moments.
Here is how to set it up for MySQL:
-- Emergency read-only account for incident response
CREATE USER 'oncall_readonly'@'%' IDENTIFIED BY 'strong-random-password';
GRANT SELECT, SHOW DATABASES, SHOW VIEW, PROCESS ON *.* TO 'oncall_readonly'@'%';
GRANT REPLICATION CLIENT ON *.* TO 'oncall_readonly'@'%';
-- PROCESS lets you run SHOW PROCESSLIST to see active queries
-- REPLICATION CLIENT lets you check replication lag
For PostgreSQL:
CREATE ROLE oncall_readonly;
GRANT CONNECT ON DATABASE production TO oncall_readonly;
GRANT USAGE ON SCHEMA public TO oncall_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO oncall_readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO oncall_readonly;
CREATE USER oncall_engineer WITH PASSWORD 'strong-random-password';
GRANT oncall_readonly TO oncall_engineer;
The key grants beyond basic SELECT are PROCESS (MySQL) or pg_stat_activity access (PostgreSQL), which let you see currently running queries — often the first thing you check during a performance incident. The 2025 Percona Database Performance Survey found that slow or blocked queries were the root cause in 47% of database-related incidents, making SHOW PROCESSLIST or pg_stat_activity the single most important diagnostic tool available.
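One caveat for the PostgreSQL role above: by default, a non-superuser sees the query text in pg_stat_activity only for its own sessions — other sessions show "<insufficient privilege>". Granting the built-in pg_monitor role (available since PostgreSQL 10) gives the on-call account read access to everyone's activity:

```sql
-- Without this grant, pg_stat_activity hides other users' query text
-- from oncall_readonly. pg_monitor is a predefined role in PostgreSQL 10+.
GRANT pg_monitor TO oncall_readonly;
```

This keeps the account read-only while making it actually useful for the "what is running right now?" check.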
Store this emergency credential in your team's secrets manager with a clear label: "On-Call Emergency Read-Only — Production". Make sure every on-call engineer knows where to find it.
Runbook Queries: What to Check First
When you finally have emergency database access, you need to know what to run. At 3 AM, you are not at your cognitive best. Pre-written runbook queries eliminate the need to think through diagnostic steps under pressure.
MySQL — Check active connections and running queries:
-- What is running right now?
SHOW PROCESSLIST;
-- How many connections are open, and what is the limit?
SHOW STATUS LIKE 'Threads_connected';
SHOW VARIABLES LIKE 'max_connections';
-- Is replication lagging? (On MySQL 8.0.22 and later, use SHOW REPLICA STATUS)
SHOW SLAVE STATUS\G
-- Which tables are locked?
SHOW OPEN TABLES WHERE In_use > 0;
-- InnoDB engine status (deadlocks, buffer pool, I/O)
SHOW ENGINE INNODB STATUS\G
PostgreSQL — Equivalent diagnostics:
-- Active queries, sorted by duration
SELECT pid, now() - pg_stat_activity.query_start AS duration,
query, state
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
-- Connection count vs. limit
SELECT count(*) FROM pg_stat_activity;
SHOW max_connections;
-- Replication lag
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
-- Queries blocked waiting on a lock, and which session is blocking them
SELECT blocked_locks.pid AS blocked_pid,
       blocking_locks.pid AS blocking_pid,
       blocked_activity.query AS blocked_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
 AND blocking_locks.relation = blocked_locks.relation
 AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocked_activity
  ON blocked_activity.pid = blocked_locks.pid
WHERE NOT blocked_locks.granted
  AND blocking_locks.granted;
Save these queries somewhere accessible during incidents — your team wiki, a pinned Slack message, or if you use a paid tier on a browser-based tool like DBEverywhere, as saved connections with notes attached. The point is that no one should be writing diagnostic SQL from memory at 3 AM.
Building Your Emergency Access Runbook
A complete SRE database access runbook ties together the access path, credentials, and diagnostic steps into a single, followable document. Here is a template:
1. Access path. "Open browser. Go to [gateway URL]. Log in with your team SSO / magic link. Select the production connection."
2. Credentials. "Emergency read-only credentials are in [secrets manager] under the path production/oncall-readonly. If the secrets manager is unavailable, credentials are in the sealed envelope in [physical location] or the break-glass document in [backup location]."
3. First five checks.
- SHOW PROCESSLIST — look for long-running or blocked queries.
- Connection count vs. max_connections — are we at the limit?
- Replication lag — is the replica falling behind?
- Disk usage — is the database volume full?
- Recent schema changes — did a migration run before the incident started?
4. Escalation. "If the issue requires write access (killing queries, adjusting configuration), escalate to [person/role] who has production write credentials. Do not use write credentials without explicit approval during an incident."
5. Post-incident. "Screenshot or export query results for the post-mortem. Note the exact time of diagnosis and any queries you ran."
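Checks 4 and 5 in the list above are the ones engineers most often improvise under pressure. The queries below are illustrative and run fine from the read-only account; note that they report database size from table statistics, not free space on the underlying volume, so pair them with your infrastructure monitoring for the "is the disk full?" question.

```sql
-- PostgreSQL: total size of the current database
SELECT pg_size_pretty(pg_database_size(current_database()));

-- MySQL: approximate size per schema, in MB (from table statistics)
SELECT table_schema,
       ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
GROUP BY table_schema;
```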
The difference between teams that resolve incidents in 15 minutes and teams that take 2 hours is not skill — it is preparation. According to Jeli's 2025 Incident Analysis Report, teams with documented and rehearsed runbooks resolved incidents 60% faster than teams without them. The access path is the foundation that every other step depends on.
FAQ
What if I cannot reach my VPN during an incident?
This is exactly the scenario that makes device-dependent tooling dangerous during emergencies. If your VPN client is not installed on the device you have, or the VPN server itself is experiencing issues (which happens — VPN infrastructure can fail during the same events that cause database incidents), you are locked out. A browser-based gateway with a pre-whitelisted static IP eliminates this dependency. You only need a web browser and an internet connection. No client software, no certificates, no split-tunnel configuration.
Should on-call engineers have write access to production databases?
Not by default. The initial response to any incident should be diagnosis, and diagnosis only requires read access. Write access — killing queries, modifying data, changing configuration — should require explicit escalation and approval. This protects against accidental damage during high-stress, low-sleep situations. Create a separate write-access account that requires a second person's approval to use, and document this escalation path in your runbook.
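If you want that escalation path to point at a dedicated account rather than shared superuser credentials, a minimal break-glass account for MySQL 8.0 might look like the sketch below. The account name and password are placeholders; CONNECTION_ADMIN is a MySQL 8.0 dynamic privilege.

```sql
-- Break-glass account: can inspect and kill runaway queries, nothing more.
-- CONNECTION_ADMIN permits KILL on other accounts' threads without
-- granting any access to table data.
CREATE USER 'oncall_breakglass'@'%' IDENTIFIED BY 'strong-random-password';
GRANT PROCESS ON *.* TO 'oncall_breakglass'@'%';
GRANT CONNECTION_ADMIN ON *.* TO 'oncall_breakglass'@'%';
```

Keep these credentials behind the second-person approval step described above, and log every use for the post-mortem.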
How do I get emergency database access set up before the next incident?
Start with three steps today. First, create a read-only database account specifically for on-call use and store the credentials in your secrets manager. Second, whitelist a reliable access path — either your VPN's static IP or a browser-based gateway like DBEverywhere (the free tier gives you 5 sessions per month, which covers most on-call needs). Third, write down the five diagnostic queries your team runs most often during incidents and put them somewhere every on-call engineer can access from any device. This takes about 30 minutes and will save hours during your next incident.
Can I use my phone for emergency database access?
With browser-based tools, yes. phpMyAdmin and Adminer both render in mobile browsers, though the experience is better on a tablet or laptop. For true emergencies where a phone is all you have, you can at least run diagnostic queries — SHOW PROCESSLIST, check connection counts, verify replication status. It is not ideal for complex investigation, but it is enough to determine severity and decide whether you need to wake up additional team members or if the issue can wait.
How do we practice emergency database access before an actual incident?
Run a game day. Once a quarter, simulate a 3 AM page: the on-call engineer must connect to a staging database using only a device that is not their normal work machine, follow the runbook, and run the diagnostic queries. Time the entire process from alert to first query result. If it takes more than five minutes, something in your access chain needs fixing. Google, Netflix, and Amazon all run regular disaster recovery exercises — there is no reason your team cannot do the same for database access specifically.
Conclusion
Emergency database access is not a feature you evaluate during business hours at your configured workstation. It is a capability you rely on at 3 AM, groggy, on whatever device is within arm's reach, when production is down and users are impacted.
The pattern that works: a pre-whitelisted static IP that never changes, browser-based access that requires no local installation, read-only emergency accounts that allow safe diagnosis, and runbook queries so you do not have to think through diagnostic steps under pressure.
The pattern that fails: device-dependent tools, VPN clients that may not be installed, SSH keys that live on one specific laptop, and credentials scattered across .env files and Slack DMs.
DBEverywhere was built for exactly this scenario — a static IP you whitelist once, browser-based phpMyAdmin and Adminer access from any device, and session management that keeps your connections secure. The free tier includes 5 sessions per month, which is enough to cover on-call rotations. The paid tier at $5/month adds unlimited sessions, 8-hour timeouts for longer investigations, and saved connections so your team can pre-configure production access paths.
Set up your emergency access path before the next incident. The 3 AM version of you will be grateful.
Try DBEverywhere Free
Access your database from any browser. No installation, no Docker, no SSH tunnels.
Get Started