Database Administrator Interview Questions 2026

Last updated: May 18, 2026 | By Tom Kenaley

Database administrator interview questions in 2026 should target three archetypes: production DBA, cloud platform DBA, and data engineer with DBA duties. Strong loops run eight to fourteen technical questions, depending on the archetype, plus three hands-on scenarios scoped to the actual stack.

The bad loop has fifty questions, no real scenarios, and a debrief built around gut feeling rather than written rubric scores. That is how a search ends up open for sixty days with three rejected candidates and no clear pattern to fix. The loop takes three hours, the candidate leaves frustrated, and the debrief ends with “we like her but we still aren’t sure she can run a real failover.” We see that pattern every quarter on our IT desk, and the fix is almost always to throw out 70% of the warmup questions and add a single real scenario. The shape below is the version that closes inside our 17-day median for IT.

Tom Kenaley, co-founder at KORE1. The screening loop is where most DBA searches go sideways even when the JD is correct. We get pulled into the second round of intake calls when a client has interviewed four candidates and rejected all of them with notes that read “great background but the panel wasn’t impressed by her SQL.” Translation: the panel asked twenty leetcode-style questions and never asked her to read a slow query plan. That is the kind of feedback we hear once a week. KORE1 charges on the placement, not the interview design, so the rubric below is one we share with clients before we send a single candidate over.

This post pairs with our companion piece on the database administrator job description template, and the same three-archetype split runs through both. If the JD is honest about which archetype the team is hiring, the question set practically writes itself. We place this work through our database administration staffing practice and our broader IT staffing services, with a 92% twelve-month retention rate on direct-hire placements across the 30+ U.S. metros we serve.

The Three Archetypes Decide the Question Set

Production DBAs answer questions about operations, recovery, and tuning. Cloud platform DBAs answer questions about Terraform, replication topology, and cost. Data engineers with DBA duties answer questions about Snowflake design, dbt models, and OLTP query patterns. The same warmup question across all three archetypes is a wasted slot, because the production DBA wants to talk about engines and recovery, the platform DBA wants to talk about Terraform and replication topology, and the hybrid wants to talk about warehouses and dbt.

Anchor the loop to the engine the new hire will own in the first 90 days. If the team has decided it is hiring a production Postgres DBA, the loop spends 60% of its weight on Postgres internals, recovery, and replication. The remaining 40% is split across system design and behavioral signal. If the loop instead reads like a SQL trivia quiz, the strong candidates fail it on disinterest, the weaker candidates pass it on rote memorization, and the panel walks out of the debrief unable to tell those two failure modes apart.

Here is the question budget we recommend before any kickoff.

Archetype | Deep Technical (Q count) | Hands-on Scenarios | Behavioral / Ownership | Total Loop Time
Production DBA | 14 (engine internals, recovery, tuning) | 3 (slow query, failed failover, broken backup) | 4 to 5 (on-call history, change management) | 2.5 to 3 hours, ideally split across two days
Cloud Platform DBA | 10 (Terraform, IaC, replication, cost model) | 3 (cross-region failover design, RDS migration, cost-cut deep dive) | 4 to 5 (cross-team work, design review history) | 2.5 to 3 hours across two days
Data Engineer + DBA | 8 (warehouse, dbt, OLTP indexing) | 3 (Snowflake cost spike, dbt model review, transactional index plan) | 4 to 5 (stakeholder management, data quality) | 2 to 2.5 hours across two days

14 Production DBA Interview Questions That Filter Real Senior Operators

These are the questions we run with clients hiring a production DBA on PostgreSQL, MySQL, SQL Server, or Oracle. Each question has a snippet of what a strong answer sounds like, plus notes on what the red-flag version sounds like. The numbers next to each question are roughly the order we run them, not a strict script.

1. Walk me through how you would diagnose a slow query that ran fine yesterday.

Open-ended on purpose. The strong answer goes: pull the plan, compare it to a known-good baseline, check whether statistics are stale, check whether parameter sniffing or a plan cache eviction changed the plan, check whether a new index is being used incorrectly, and look at the workload mix on the host. Bonus if they ask which engine before answering. Red flag: jumps straight to “add an index” without asking what changed.

2. A standby fell behind in WAL replication overnight. What do you check?

This is the question that separates senior from mid-level fastest. Strong answer covers network throughput, disk IO on the standby, autovacuum or maintenance windows, archive command failures, and whether the primary has a long-running transaction holding xmin. Mid-level candidates name two of those. Senior candidates name five and ask whether the standby is for read load or just DR.
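
To make "fell behind" concrete, here is a minimal Python sketch that turns two PostgreSQL LSNs into a byte gap. It assumes the textual hex form Postgres prints for LSNs (upper 32 bits, a slash, lower 32 bits). The function names and sample values are ours for illustration, not from any client environment.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32.
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, standby_replay_lsn: str) -> int:
    """Bytes of WAL the standby still has to replay (clamped at zero)."""
    return max(0, lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_replay_lsn))


# Hypothetical values, as might be read from pg_current_wal_lsn() on the primary
# and pg_last_wal_replay_lsn() on the standby.
lag = replication_lag_bytes("16/B374D848", "16/A0000000")
print(f"standby is {lag / 1024**2:.1f} MiB behind")
```

A candidate who reaches for a number like this before theorizing about causes is usually the one who has debugged a lagging standby for real.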

3. What is your backup strategy for a 4 TB OLTP database with a 5-minute RPO?

Specific numbers in the question force a specific answer. Strong candidates discuss base backups plus WAL streaming, point-in-time recovery, and verification cadence (“we restore to a sandbox every week and run a known checksum”). They will mention the difference between physical and logical backups. They will not handwave the verification step. The candidates who say “we use the daily snapshot from the cloud provider” without naming a verification process or a recovery rehearsal cadence are the ones who blow up during a real disaster while the rest of the team scrambles to find a usable backup somewhere in the chain.
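
The arithmetic a strong candidate does out loud can be sketched roughly as follows. Every throughput figure here is a made-up assumption that a restore rehearsal would replace with measured numbers, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryEstimate:
    base_backup_gb: float       # size of the physical base backup
    restore_mb_per_s: float     # restore throughput from a rehearsal (assumed here)
    wal_to_replay_gb: float     # WAL accumulated since that base backup
    replay_mb_per_s: float      # WAL replay rate from a rehearsal (assumed here)
    wal_ship_interval_s: float  # longest gap before WAL is safely off the primary

    def restore_hours(self) -> float:
        """Time to copy the base backup back plus time to replay WAL on top of it."""
        copy_s = self.base_backup_gb * 1024 / self.restore_mb_per_s
        replay_s = self.wal_to_replay_gb * 1024 / self.replay_mb_per_s
        return (copy_s + replay_s) / 3600

    def meets_rpo(self, rpo_s: float) -> bool:
        """Worst-case loss is roughly the longest window in which WAL exists only on the primary."""
        return self.wal_ship_interval_s <= rpo_s


# 4 TB base backup, 200 GB of WAL to replay, WAL shipped at least once a minute.
plan = RecoveryEstimate(4096, 400, 200, 100, 60)
print(f"estimated restore: {plan.restore_hours():.1f} h, meets 5-minute RPO: {plan.meets_rpo(300)}")
```

The point of the sketch is the shape of the reasoning: RPO is about how often WAL leaves the primary, while restore time is a separate number that only a rehearsal can confirm.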

4. Explain the difference between an index seek and an index scan, and when a scan is the right plan.

Looks elementary. Filters out a lot of candidates who memorized the textbook answer but cannot apply it. Strong answer notes that on a small table, a scan can beat a seek due to cache locality. Notes that when the optimizer expects to read more than 5 to 10 percent of a table, a scan plus filter often beats an index lookup. Names the engine because the threshold is engine-specific.

5. Describe a real on-call incident that you owned end to end.

The story question. Listen for: clear statement of what broke, what they tried first, what worked, what they learned, what changed in the runbook afterward. The candidates who cannot tell you the runbook change are not the ones you want on call. The candidates who tell you the story in past tense with specific timestamps are.

6. How do you decide between adding an index, rewriting the query, or denormalizing a table?

Triangle of trade-offs. Strong candidates say it depends on read-write ratio, working set size, application-layer caching, and how much downtime an index build costs. They will mention CONCURRENTLY in Postgres or ONLINE in SQL Server. They will mention that denormalization is a last resort because the write amplification is rarely worth it for OLTP.

7. Walk me through a parameter sniffing problem you have actually debugged.

SQL Server biased question. Substitute “plan cache pollution” for other engines. Listen for OPTION(RECOMPILE) discussion, plan guides, or the modern fix of using forced parameterization on a specific query family. The vague answer is the wrong answer here. A real candidate has a war story.

8. What do you do the first week if you inherit a database environment with no documentation?

Operations instinct question. The right answer starts with reading the alerting and on-call rotation, then the most recent four weeks of incidents, then the schema and the busiest tables, then the configuration drift between the standbys and primary. Wrong answer: “I would document everything.” That is the answer of someone who has never inherited a real environment.

9. Walk me through how you would size a connection pool for a new application going from 50 to 5,000 requests per second.

This question catches the candidates who only think in queries per second and miss the connection-per-backend cost. Strong answer notes that for Postgres specifically, every connection is its own backend process with its own memory overhead (and each sort or hash inside it can claim up to work_mem), and the kernel has a process cap, so 5,000 RPS does not mean 5,000 connections. PgBouncer or RDS Proxy comes up. So does the math between active connections and effective parallelism. Junior candidates assume the pool size equals the RPS. They are wrong.
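
The math between request rate and active connections is Little's Law: connections in use is roughly arrival rate times the time each request holds a connection. A minimal sketch follows, where the 4 ms database time and the 1.5x headroom factor are illustrative assumptions rather than recommendations.

```python
import math


def pool_size(requests_per_s: float, db_time_ms: float, headroom: float = 1.5) -> int:
    """Estimate connections needed: in-flight work = arrival rate x hold time, plus headroom."""
    in_flight = requests_per_s * db_time_ms / 1000
    return max(1, math.ceil(in_flight * headroom))


# Hypothetical workload: each request holds a connection for about 4 ms.
small = pool_size(50, 4)
large = pool_size(5000, 4)
print(f"50 RPS -> {small} connection(s), 5,000 RPS -> {large} connections")
```

Under those assumptions a hundredfold increase in traffic needs tens of connections, not thousands, which is the gap the question is designed to expose.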

10. What is the trade-off between synchronous and asynchronous replication?

The classic. Strong answer ties it to RPO, application latency, and what the standby is for. Names the protocol-level cost. Mentions semi-synchronous as a middle option. Mentions that quorum commit is a thing on Postgres and that SQL Server Always On has its own model. Weak answer just says “sync is safer, async is faster.”

11. How do you handle a long-running transaction that is bloating your tables?

Postgres bias again. The senior answer involves locating the offending session, understanding why xmin is being held, deciding whether to kill the session or wait, and then thinking about VACUUM strategy after. Bonus points for naming pg_stat_activity, for naming the autovacuum tuning parameters they would adjust, for distinguishing between the cosmetic table bloat that vacuum cleans up and the index bloat that needs a REINDEX CONCURRENTLY, and for knowing the difference matters in production.

12. Describe how you partition a 2 billion row fact table.

This blurs into data engineering. Strong DBA answer covers the partition key choice (time-based vs hash), the maintenance overhead of partition switching, the impact on the query optimizer, and the failure modes when partitions are pruned wrong. Names the engine. Talks about the implementation cost of moving from a non-partitioned to a partitioned model.

13. What is your stance on stored procedures versus application-side query logic?

Open religious war. We do not care which side they land on. We care that they can explain the trade-off without sneering. Strong candidates note that some engines optimize procedures better, that procedures lock you into the engine, that application-side logic ages better with refactors, that the right answer depends on team composition, and that the worst answer is the one that pretends the trade-off is not real. Listen for nuance.

14. Tell me about a time you pushed back on a request from the application team or product.

Closes the loop. A DBA who has never pushed back is a DBA who has not lived through a real production incident. We want a story with a specific request, a specific objection, and a specific outcome. The outcome can be “we did it their way anyway and it broke and we rolled back” and that is still the right kind of answer.

10 Cloud Platform DBA Interview Questions for the IaC-First Role

Different track, different center of mass. Cloud platform DBAs live half in Terraform and half in the cloud provider console. The loop has to test both halves.

15. Walk me through your last RDS, Cloud SQL, or Azure SQL migration.

Open the conversation here. Listen for the cutover plan, the rollback plan, the dry-run cadence, the application-side connection string changes, and how they validated success after cutover. Bonus for naming logical replication for zero-downtime cutover. Bonus for telling you what broke during the dry run that they fixed before production.

16. How would you design cross-region replication for a mission-critical database?

Whiteboard or talk-out-loud. Strong candidates start with the RTO and RPO requirements before drawing anything. They will discuss synchronous within region, asynchronous between regions, the failover automation, the cost of cross-AZ versus cross-region traffic, and the read-replica routing strategy. Mentioning the egress cost is the senior signal.

17. What is the most expensive line in your cloud database bill last year, and how did you reduce it?

Senior cloud platform DBAs own a real cost line item. The answer should be specific: provisioned IOPS on a workload that did not need them, an instance class oversized to paper over a noisy-neighbor problem, cross-AZ replication traffic that could have been a single-AZ setup with a read replica in the other AZ, or the storage cost of point-in-time backups that nobody had reviewed since 2022. A candidate who cannot answer this has never owned the bill.

18. Show me a Terraform module you wrote for provisioning a managed database. Read me through the choices.

The take-home that is not a take-home. Have the candidate pull up an actual file in screen share or paste a redacted version. Look at: the variable boundaries, the use of locals, the lifecycle blocks around password rotation, the backup configuration, and how parameter groups are managed. The candidates who only have a five-line module with hardcoded values are not the seniors you want.

19. How do you handle secrets and password rotation for a fleet of fifty managed databases?

HashiCorp Vault, AWS Secrets Manager, or the equivalent. The right answer talks about rotation cadence, application-side reload patterns, the dual-secret pattern during rotation, and how to avoid the famous outage where a rotation succeeded but no service knew the new password. There is a real story behind that pattern and senior candidates have a version of it.
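
The dual-secret pattern is easier to evaluate with a sketch in hand. The version below is a simplified illustration, not any vendor's API: the connect callable, AuthError, and the password strings are all hypothetical stand-ins for a real driver and secret store.

```python
from typing import Callable, Optional, Sequence


class AuthError(Exception):
    """Raised by the (hypothetical) connect function when a password is rejected."""


def connect_with_fallback(connect: Callable[[str], object],
                          secrets: Sequence[str]) -> object:
    """Try each secret in order (newest first) and return the first connection that works."""
    last_error: Optional[AuthError] = None
    for secret in secrets:
        try:
            return connect(secret)
        except AuthError as exc:
            last_error = exc  # remember the failure, then try the older secret
    raise AuthError("every known secret was rejected") from last_error


# Stand-in for a real driver call: the database still only accepts the old password
# because the rotation has been staged in the secret store but not yet applied.
def fake_connect(password: str) -> str:
    if password != "old-password":
        raise AuthError("password rejected")
    return "connection-ok"


result = connect_with_fallback(fake_connect, ["new-password", "old-password"])
print(result)
```

Because the service accepts either secret during the rotation window, the order in which the database and the applications pick up the new password stops mattering.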

20. What is your strategy for upgrading a managed Postgres or MySQL fleet across major versions?

Strong answer: read the breaking-change notes, build a sandbox, replay representative traffic, run pg_upgrade or its managed equivalent on a non-prod fleet, time the actual downtime window, communicate to dependent teams a week early, and have a documented rollback. Weak answer: “we click the upgrade button in the console.”

21. Explain how connection pooling changes between an on-premises Postgres and RDS Postgres.

Tests both worlds. RDS Proxy is the headline difference. The candidate who has run both setups can talk about why RDS Proxy is sometimes the right answer and sometimes adds latency that an external PgBouncer would not. Specific numbers from real workloads are the senior signal.

22. How do you size IOPS for a write-heavy workload on managed cloud storage?

Provisioned IOPS gets expensive fast. The right answer talks about understanding the actual write pattern (random vs sequential), the queue depth, the burst credit model on gp2 versus the provisioned baselines on gp3 and io2, and the cost cliff at provisioned IOPS. Mentions the relationship between IOPS and throughput. Mentions that EBS-optimized instance types matter.
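
The relationship between IOPS and throughput is simple multiplication, and it decides which limit a workload hits first. A rough sketch, with the caps passed in as parameters. The figures in the examples are placeholders, not any provider's published limits.

```python
from typing import List


def required_throughput_mib_s(iops: float, io_size_kib: float) -> float:
    """Throughput implied by an IOPS rate at a given I/O size."""
    return iops * io_size_kib / 1024


def exceeded_limits(iops: float, io_size_kib: float,
                    iops_cap: float, throughput_cap_mib_s: float) -> List[str]:
    """Return which volume limits the workload would exceed, if any."""
    exceeded = []
    if iops > iops_cap:
        exceeded.append("iops")
    if required_throughput_mib_s(iops, io_size_kib) > throughput_cap_mib_s:
        exceeded.append("throughput")
    return exceeded


# Hypothetical volume capped at 3,000 IOPS and 125 MiB/s.
print(exceeded_limits(8000, 8, 3000, 125))    # many small random writes
print(exceeded_limits(2000, 256, 3000, 125))  # fewer, larger sequential writes
```

Small random writes run out of IOPS long before throughput, while large sequential writes do the opposite, so the write pattern has to be measured before anything is provisioned.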

23. Describe a database-related cost optimization you owned that saved real money.

Specific dollar figure expected. Strong candidate names a number and the mechanism. Right-sizing an oversized instance, moving cold partitions to cheaper storage, reducing the backup retention window after auditing what was actually needed, or migrating a workload from io2 to gp3 after measuring the actual IOPS pattern. The story is what we care about.

24. What is your view on multi-cloud database strategy?

We ask this to surface religion. Most senior platform DBAs have a strong view in one direction or the other, and we care that they can defend it with specifics about portability, cost, the operational cost of maintaining expertise on two providers, and the regulatory pressure that occasionally forces the question whether the team likes it or not. The pragmatic answer (“single cloud for the main workload, second cloud only when a real business reason demands it”) is usually the senior answer. Strong objection to multi-cloud as a default is also fine.

8 Data Engineer + DBA Hybrid Interview Questions

This archetype lives on the data platform team. The loop is shorter because the warehouse side is well-trodden and the OLTP side is the secondary skill. Anchor 4 questions on warehouse design and dbt, 4 on transactional database fundamentals.

25. Explain how Snowflake, BigQuery, or Databricks handles a 100 GB join differently than Postgres.

Engine architecture question. The strong candidate names the columnar storage, the MPP execution model, the difference between scaling compute and scaling storage independently, and the cost model. Notes that what is a $0.20 query on Snowflake can be a $5 query if a partition predicate is missing. Bonus for naming clustering keys and how they affect pruning.

26. Walk me through a dbt model you are proud of and one you would rewrite today.

Story plus self-critique. The candidate who has never written a dbt model they regret is a candidate who has not shipped many dbt models. We want the regret with specifics. A bad incremental strategy, a SCD type 2 that was actually a type 1, a fan-out join that took the warehouse down, all of those are honest answers.

27. How do you index a 500 million row transactional table that is queried by both an OLTP application and an analytics export?

Tests the bridge between the two worlds. Strong candidate splits the workload. Mentions covering indexes for OLTP, partition pruning or row-level partitioning for the analytics export, and the cost of additional indexes on write throughput. Mentions filtered or partial indexes if available on the engine.

28. How do you design a slowly-changing dimension table that the warehouse and the application both read?

Modeling question. Strong candidate covers the SCD type choice (1, 2, 6), the surrogate key generation, the natural key, the effective-dated logic, and the read-time cost. Names the engine because the answer is different on Snowflake than on Postgres.
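
For interviewers who want the effective-dated logic in front of them, here is a minimal in-memory sketch of a type 2 change. The row layout, the half-open validity interval, and the max-plus-one surrogate key are illustrative assumptions, since a real warehouse would use its own sequence or hash key.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    surrogate_key: int
    natural_key: str
    attribute: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_type2_change(rows: List[DimRow], natural_key: str,
                       new_attribute: str, effective: date) -> None:
    """Close the current version (if the value changed) and append a new one."""
    current = next((r for r in rows
                    if r.natural_key == natural_key and r.valid_to is None), None)
    if current is not None:
        if current.attribute == new_attribute:
            return  # no change, so no new version
        current.valid_to = effective
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    rows.append(DimRow(next_key, natural_key, new_attribute, effective))


def as_of(rows: List[DimRow], natural_key: str, when: date) -> Optional[DimRow]:
    """Return the version valid on a given date (valid_from inclusive, valid_to exclusive)."""
    for r in rows:
        if (r.natural_key == natural_key and r.valid_from <= when
                and (r.valid_to is None or when < r.valid_to)):
            return r
    return None


dim: List[DimRow] = []
apply_type2_change(dim, "C-1", "bronze", date(2024, 1, 1))
apply_type2_change(dim, "C-1", "gold", date(2025, 6, 1))
```

The as_of lookup is the read-time cost the question refers to: every consumer has to filter on the validity window, which is cheap in a warehouse and less so on a hot transactional path.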

29. What is your approach to data quality testing inside the data platform?

Open-ended on purpose. Strong candidates name dbt tests, Great Expectations, or a homegrown framework, and they can distinguish between schema tests (data type, nullability) and value tests (referential integrity, distribution drift, freshness). Bonus for talking about which failures should fail the build versus which should warn.
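
The fail-versus-warn distinction can be sketched without any framework. This is a homegrown-style illustration with hypothetical check names and data. It does not reproduce the dbt or Great Expectations APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Check:
    name: str
    predicate: Callable[[List[Row]], bool]  # True means the check passed
    severity: str                           # "error" fails the build, "warn" only reports


def run_checks(rows: List[Row], checks: List[Check]) -> Dict[str, List[str]]:
    """Run every check and group the names of the failed ones by severity."""
    failures: Dict[str, List[str]] = {"error": [], "warn": []}
    for check in checks:
        if not check.predicate(rows):
            failures[check.severity].append(check.name)
    return failures


def build_should_fail(failures: Dict[str, List[str]]) -> bool:
    """Only error-severity failures stop the pipeline."""
    return bool(failures["error"])


orders = [
    {"order_id": 1, "amount": 40.0},
    {"order_id": 2, "amount": 55.0},
    {"order_id": None, "amount": 12.5},
]
checks = [
    # Schema-style test: the key column must never be null.
    Check("order_id_not_null", lambda rows: all(r["order_id"] is not None for r in rows), "error"),
    # Value-style test: an unusually small load is suspicious but not fatal.
    Check("at_least_1000_rows", lambda rows: len(rows) >= 1000, "warn"),
]
failures = run_checks(orders, checks)
```

A candidate who can explain why the null key blocks the build while the low row count only pages someone has usually run a pipeline in production.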

30. Describe how you would investigate a Snowflake or BigQuery cost spike.

Mirror of question 17, on the warehouse side. The right answer is methodical: query history filtered to the spike window, top warehouse by credits, look at the largest single queries, check whether a model schedule changed, check whether an upstream data volume changed. Senior candidates name the SQL.

31. How do you decide what belongs in the warehouse versus what stays in the application database?

The architectural question. Strong answer talks about latency requirements, query patterns (point lookups versus aggregations), and the cost of duplication. Mentions reverse-ETL as the path for getting warehouse data back into the application. Notes that the wrong split shows up as a Slack message at 2 a.m.

32. Tell me about a time you owned a data outage that the business actually noticed.

Closes the loop the same way as question 5 did for the production DBA. We want the story with specifics. The candidates who treat data quality issues as the analytics team’s problem are the candidates who do not last long in this role.

The Three Hands-On Scenarios We Run Every Time

The loop without scenarios is the loop that produces ambiguous debriefs. The loop with three good scenarios produces clean signal. We rotate these three across the panel so each interviewer owns one.

Scenario A: The slow query at 9 a.m.

Hand the candidate a real-ish execution plan from your environment with the schema and table sizes stripped of anything sensitive, the original query lightly anonymized, and the runtime numbers preserved so the gap between yesterday’s plan and today’s plan is visible. Tell them the query was sub-second yesterday and 90 seconds this morning. Give them 25 minutes. Watch how they read the plan. Watch whether they ask for stats, for the schema, for the volume of data that changed overnight, and for the parameter values the plan was compiled against, because the senior candidates always ask at least three of those before they say anything else. The candidates who jump to “add an index” without reading the plan first are the ones whose tuning takes three iterations in production.

Scenario B: The failed failover at 2 a.m.

Verbal scenario. “The primary went down at 2 a.m. The standby was promoted by the automation. Half an hour later the app is throwing 500s on what should be reads. What do you check?” Listen for the order. Strong candidates check replication lag first, then connection routing, then whether the standby has the same configuration as the old primary, then whether a long-running session somewhere in the application tier is holding locks and starving the rest of the read workload. Weak candidates start by restarting things.

Scenario C: The cost spike on the second of the month.

“The cloud database bill came in 38% higher than last month. The team has not pushed a code change in two weeks. Where do you look?” The strong candidate methodically rules out compute, storage, IOPS, and cross-AZ traffic. The mid-level candidate guesses. The junior candidate freezes. This is the scenario we run when the role is cloud-platform-leaning even if the title says production DBA, because cost ownership is non-negotiable in 2026.
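
Methodical here means ranking line items by month-over-month change before forming a theory. A small sketch with invented billing figures, chosen so the total rises 38% as in the prompt. The category names are placeholders for whatever the provider's cost export uses.

```python
from typing import Dict, List, Tuple


def rank_cost_deltas(last_month: Dict[str, float],
                     this_month: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return line items sorted by absolute dollar increase, largest first."""
    items = set(last_month) | set(this_month)
    deltas = {k: this_month.get(k, 0.0) - last_month.get(k, 0.0) for k in items}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


# Invented figures for illustration only.
april = {"compute": 6000.0, "storage": 2000.0, "iops": 1000.0, "cross_az_traffic": 1000.0}
may = {"compute": 6000.0, "storage": 2100.0, "iops": 1200.0, "cross_az_traffic": 4500.0}

ranked = rank_cost_deltas(april, may)
increase = (sum(may.values()) - sum(april.values())) / sum(april.values())
print(f"total up {increase:.0%}; biggest mover: {ranked[0][0]} (+${ranked[0][1]:,.0f})")
```

In this invented example compute is flat and nearly all of the increase sits in one traffic line, which is exactly the kind of finding that rules out "someone resized an instance" in the first five minutes.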

Questions to Cut From Your DBA Loop

The bad question list is shorter than the good one but every entry on it has cost a client a search.

  • Pure SQL trivia: “What is the difference between INNER JOIN and LEFT OUTER JOIN?” Anyone who needs help here is not a DBA. Anyone who is a DBA has answered this question fifteen times and is annoyed.
  • Leetcode-style algorithm puzzles: Production DBAs do not get paid to invert binary trees. Cloud platform DBAs do not either. Skip them.
  • Gotcha syntax questions: “What does this PL/SQL block do?” with intentionally obfuscated code. You are testing patience, not skill.
  • Vendor certification quizzes: “List all the parameters in pg_hba.conf in order.” The senior candidate looks them up like a sane person.
  • “What is your weakness?”: This question has not produced signal in either direction since 2010. Cut it.
  • “Where do you see yourself in five years?”: Replace with “What does the next two years look like if you take this role and the team treats you well?” That gets you a real answer.

How Candidates Score Themselves vs How We Score Them

One pattern worth flagging. Senior DBAs underrate themselves on warmup questions and overrate themselves on platform questions where they have less direct experience. The panel can over-index on the warmup performance and miss the real signal. Run the platform questions before the warmup ones if you can, or weight them differently in the rubric.

The rubric we ship to clients is simple. Each question gets a score of 1 to 4. A 1 is “cannot answer at all.” A 4 is “answered the question and then went deeper than the interviewer expected.” For a senior production DBA hire, we want average scores of 3.0 or higher on the technical questions, at least one 4 on either the recovery or the on-call story, and zero 1s on anything in the first 14. Use a salary band you can defend with our salary benchmark assistant before the offer goes out.
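
The senior production DBA bar above is mechanical enough to write down, which is what keeps the debrief from sliding back into gut feeling. A minimal sketch, assuming scores are keyed by hypothetical question labels of our own choosing.

```python
from statistics import mean
from typing import Dict, Sequence


def passes_senior_bar(scores: Dict[str, int],
                      standout_questions: Sequence[str] = ("recovery", "on_call_story")) -> bool:
    """Average of 3.0 or higher, no 1s, and at least one 4 on a standout question."""
    values = list(scores.values())
    if not values or any(not 1 <= v <= 4 for v in values):
        raise ValueError("every question needs a score from 1 to 4")
    has_standout = any(scores.get(q) == 4 for q in standout_questions)
    return mean(values) >= 3.0 and min(values) > 1 and has_standout


# Hypothetical scorecards covering a handful of the technical questions.
strong = {"slow_query": 3, "replication_lag": 3, "recovery": 4, "on_call_story": 3, "pooling": 3}
uneven = {"slow_query": 4, "replication_lag": 4, "recovery": 4, "on_call_story": 4, "pooling": 1}

print(passes_senior_bar(strong), passes_senior_bar(uneven))
```

The second scorecard has the higher average and still fails, because a single 1 on a technical question is disqualifying under this rubric.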

Things People Ask Before They Run Their Next DBA Loop

How long should the full DBA interview loop run?

2.5 to 3 hours of candidate time across two days for the production and cloud platform archetypes, and closer to 2 to 2.5 hours for the hybrid role. The two-day split matters because a strong candidate who had a rough first session often performs better when they have slept on the technical scenarios and can come back fresh. Single-day loops burn out everyone in the room, the candidate first.

Should we use a live SQL test instead of scenario-based questions?

Live SQL is fine for the data engineer + DBA hybrid role where writing SQL is genuinely a daily activity. Skip it for production DBA hires where the work is operational and the typing is mostly in psql. A scenario about a slow query plan is a higher-signal version of the same skill check, and it tests engine knowledge instead of just syntax recall.

Are vendor certifications a strong hiring signal?

Mid-level signal at best. The Oracle Certified Professional and the Microsoft Certified: Azure Database Administrator Associate help filter out of the resume pile anyone who has never touched the platform. The certifications do not separate good senior DBAs from great ones. Real on-call history and migration scars do.

How do we interview a DBA when nobody on the panel knows the engine deeply?

This is the call we get more often than any other. Two options exist. One is to borrow an engineer from a sister team who has more depth on the engine, even if they sit in a different function. The other is to bring in an external technical screener for the hardest two sessions of the loop, usually a contractor with ten or fifteen years on the engine the team is hiring for, paid hourly for the screen and not involved in any other part of the search. We do this for clients on retained searches and the conversion rate on candidates approved by an external screener runs 30 to 40 percent higher than on candidates screened by a non-specialist panel.

Take-home assignment, or no?

Skip them for DBA roles. The take-home that tests something real (write a Terraform module, design a backup strategy) takes a senior candidate six to ten hours to do correctly, and most senior candidates have multiple offers running. A 25-minute scenario in the loop produces the same signal without burning the candidate’s evenings.

How fast can KORE1 produce qualified DBA candidates?

Our median time to first shortlist on the DBA desk in 2025 to 2026 was 9 days from intake call to a shortlist of three to five candidates. Median time to hire was 17 days for the IT desk overall. Cloud platform DBA roles run slightly longer because the candidate pool is thinner. We work direct hire, contract, and contract-to-hire engagements through our direct hire staffing practice.

What is the worst hiring mistake teams make in 2026?

Hiring an old-school production DBA for a role that is 70% Terraform and IaC. The candidate is competent at the database half and miserable at the platform half. They tend to leave inside a year. The opposite mistake also happens. A platform engineer who has never owned a real production database gets hired for what looks like a cloud-platform role but actually involves 3 a.m. recovery work. Pick the archetype first.

Talk to Us Before You Run the Next Loop

If you are about to open a DBA search, the version of this conversation that saves the most time is the 20-minute intake call where we lock the archetype and the engine before sourcing starts. Reach out to our recruiting team and we will walk the JD and the interview rubric with you. No charge to scope. Fee on the placement if we go forward.

For context on the broader database administrator population, the BLS Occupational Outlook for database administrators projects 8% growth through 2033, a headline figure that understates the shape change inside the category. The 2024 Stack Overflow Developer Survey shows PostgreSQL passing MySQL as the most-used database among professional developers, which is a real shift in the candidate pool. The JetBrains Developer Ecosystem 2024 survey shows similar momentum for managed cloud Postgres specifically.
