14/03/2024

The Nightmare of Database Deadlocks

A database deadlock occurs when two or more transactions in a database system hold locks on resources the others need. Each transaction waits for the others to release their locks, and just like two drivers who are not sure who has the right of way, none can proceed. 

It would be lovely to be able to enumerate the causes of a database deadlock and address all of them. Sadly, it’s way too complex. It can be said, though, that a database deadlock is most likely to stem from poor application design, lack of proper transaction management, or the inevitable complexity of highly concurrent systems. 

Transactions that lock resources in inconsistent orders are particularly prone to deadlocks. For example, if Transaction A locks Resource 1 and then tries to lock Resource 2, while Transaction B locks Resource 2 and then tries to lock Resource 1, a deadlock ensues if both transactions occur simultaneously.

How Does Oracle Solve the Problem of Concurrent Changes?

That’s the neat part – it doesn’t, at least not at the base. Instead of addressing the core of the problem, it allows the execution of one and withdraws the other.

When Oracle detects a database deadlock situation, it doesn’t leave the transactions hanging. Instead, it chooses one of the transactions as the “victim” and rolls it back, freeing up the resources it had locked. This action allows the other transactions to proceed. The decision on which transaction to roll back is typically based on internal criteria designed to minimize work loss and maintain system integrity.

Upon resolving a deadlock, Oracle logs the incident by generating an ORA-00600 error in the alert log, alongside a trace file containing detailed information about the deadlock situation. This trace file includes the SQL statements involved, the objects locked, and the transactions’ identifiers, providing valuable insights for diagnosing the root cause of the deadlock.

The DBA’s Dilemma

Deadlocks are a huge challenge for DBAs, primarily due to their elusive nature and the complexity involved in diagnosing and resolving them. Why? – That’s what we’re going to answer in a second. 

The Invisible Enemy

One of the main reasons deadlocks are a nightmare for DBAs is their invisibility to developers. They occur deep within the database’s infrastructure, consuming server resources and affecting the database’s performance without any apparent cause from the application layer. This makes them a ghostly problem: present and impactful, yet difficult to spot and exorcise.

The invisibility issue is compounded by the fact that deadlocks are typically a symptom of application-level problems—such as poor design or transaction management—manifesting within the database environment. This places DBAs on the front lines. They are often the first to encounter these issues, yet the root cause lies in the application code, outside their direct control.

The Challenge of Replication and Diagnosis

Replicating and diagnosing deadlocks is yet another challenge. It’s mainly because they are hard to predict and reproduce in a non-production environment. Deadlocks often result from very specific timing and sequences of operations that are difficult to mimic deliberately. In the developer’s local environment, where the load and concurrency are significantly lower than in production, deadlocks might never show.

This unpredictability means that in 99.99% of cases (Source: Trust Us), developers do not anticipate deadlocks occurring. As a result, applications are frequently not designed with deadlock handling in mind. That’s why we sometimes have strange, one-off errors that are difficult to trace back to a deadlock, further obscuring the issue.

The Operational Nightmare

When deadlocks occur, they can cause unexpected behavior in applications, leading to errors or performance issues that are hard to diagnose. Since the application wasn’t prepared to handle deadlocks, these issues can seem random, leading to a frustrating experience for both users and the technical teams trying to resolve them.

Moreover, the database often takes the blam for these application-induced problems, putting DBAs in a challenging position. They must not only identify and resolve the symptoms of the deadlock but also communicate with development teams to address the issues in the application code.

Managing Deadlocks

Let’s devote a minute of silence to all of the DBAs who are currently struggling with deadlocks as there’s usually little they can do. 

When deadlocks strike, DBAs find their hands tied by the constraints of database management systems. What they can do involves generating and analyzing extensive logs that record the minutiae of database transactions. These logs, often great in size, are the key to understanding the deadlock. The key however, comes with no lock, or instruction manual for this matter. It offers no direct solution to the problem.

Naturally, experienced DBAs have their own sets of scripts for diagnosing and resolving such cases, but this is almost a secret knowledge.

Session Termination

Among the limited interventions available, one of the most decisive actions available to a DBA is the termination of sessions involved in a deadlock. This drastic measure, as described by database expert Jonathan Lewis, involves choosing a “victim” transaction to roll back, thus breaking the deadlock and allowing other transactions to proceed. 

The choice of which session to terminate is not an easy one. It involves careful consideration of the work lost by rolling back a transaction and the potential impact on the overall system and business operations. Sometimes, the decision is clear-cut, such as when one transaction significantly outweighs another in terms of processing time or importance. Other times, it’s a judgment call, made under the pressure of maintaining system integrity and minimizing disruption.

DBAs Call For Help and DBPLUS Will Answer

Among many functionalities of DBPLUS PERFORMANCE MONITOR, the “Locks” and “Logs” tabs can be quite useful when it comes to dissecting deadlock scenarios.

Core Features of DBPLUS in Deadlock Management:

  • “Locks” Tab: allows users to view a comprehensive history of locks. The list includes table locks, library locks, and specific locked objects. 
  • Visual Presentation: Charts and graphs, such as those showing the number of locked sessions and the total lock durations, help read through the deadlock problem. Although these visuals suggest simplicity in resolution, the true solution often lies in deeper application redesign.
  • “Logs” Tab: Access to Oracle’s alertlog through DBPLUS provides data without the need for direct server access. This feature is particularly useful for diagnosing issues post-incident, focusing on error entries like the ORA-00600, to better understand and prevent future deadlocks.

Key Points in the “Locks” Tab:

Database Deadlock - Key Points in The 'Locks' Tab
  1. Overview: Introduces the tab and its purpose.
  2. Locked Sessions Chart: Displays the count of currently locked sessions.
  3. Total Lock Duration Chart: Shows the cumulative duration of locks across all sessions.
  4. Blocking and Blocked Session Details: Provides information about sessions causing and affected by locks.
  5. SQL of Blocked Session: Lists the SQL commands involved in the blocked session.
  6. Detailed Session Information: Offers comprehensive details about the sessions involved in a deadlock.

Additional Features:

  • Locked Objects Screen (Pic02): Displays which objects are currently locked, enhancing the understanding of deadlock sources.
Database Deadlock - Locked Objects Screen
  • Session Termination Capability: Authorized users can terminate sessions directly within DBPLUS to quickly resolve deadlocks, though this option is typically disabled by default.
  • Oracle Alertlog Access (Pic03): Facilitates the review of Oracle alert logs for the past 30 days with a focus on error entries, aiding in the detailed analysis of deadlock incidents.
Database Deadlock - Alertlog