20/03/2024

‘OR’ and ‘IS NULL’ Might Be Holding You Back

Where “OR” and “IS NULL” condition frolic unchecked, queries often find themselves stumbling down the performance rabbit hole—a place where CPU cycles burn faster than a comet tail and I/O operations multiply with the fervor of rabbits in spring.

Our goal is clear-cut—to guide database professionals toward faster, more efficient query execution. We’re rolling up our sleeves to identify and address the reasons these particular patterns drag down performance.

“OR” == BAD, why?

The “OR” condition, when used excessively in queries, can lead to several inefficiencies. One of the primary issues is that it can cause Oracle to evaluate the query multiple times, once for each condition separated by “OR”. This is because “OR” effectively broadens the search criteria. This could prevent the database from utilizing indexes effectively. As a result, instead of a single, efficient path to retrieve the required data, Oracle may end up scanning larger portions of the table or index. This leads to increased I/O and CPU usage, and consequently, slower query performance.

Moreover, queries “enriched” by multiple “OR” conditions can become harder to read and maintain. Clarity is lost as more conditions are added, making it challenging to understand the query’s intention at a glance.

Optimizing Queries: Moving Away from “OR”

The “IN” condition provides a concise and efficient way to search for multiple values within a column. Compared to chaining several “OR” conditions together, “IN” can significantly reduce the complexity of the query and improve its execution plan. For example, consider the difference between these two approaches:

Using “OR”:

SELECT * FROM employees

WHERE department_id = 10

OR department_id = 20

OR department_id = 30;

Using “IN”:

SELECT * FROM employees

WHERE department_id IN (10, 20, 30);

Both queries aim to retrieve employees from departments 10, 20, and 30. However, the “IN” version is not only more readable but also allows Oracle’s query optimizer to use indexes more effectively and potentially reduce the number of scans required to fetch the desired data.

The Issues with “IS NULL”

“IS NULL” conditions present a different set of challenges. It’s problematic primarily due to how NULL value is treated by Oracle’s indexing mechanisms, specifically B-TREE indexes. In Oracle, a NULL value signifies the absence of any data. While this can be semantically appropriate for certain scenarios, it introduces inefficiencies in query processing.

B-TREE indexes do not store entries for entirely NULL keys, which means that a query condition checking for IS NULL will not be able to leverage these indexes efficiently. As a result, Oracle might resort to full table scans to satisfy such queries, leading to performance degradation, especially in large datasets.

Addressing the “IS NULL” Condition

The basics seem simple. If you know the value then let it be entered. If you don’t know the value then specify that it’s unknown. 

  • Default Values: By assigning a default value, you ensure that every record has a predictable value that can be indexed and searched efficiently. For example, if a column records the date an action was performed, and the action is always expected to occur, setting a default date (like the date of row creation) instead of allowing NULLs can improve index performance.
  • Placeholder Values for Unknown Data: In cases where a column might not have a meaningful default value—for instance, optional user information—it’s great to use a placeholder value to represent unknown or not applicable data. This avoids NULLs and ensures that the column remains index-friendly. For instance, using a specific date far in the past or a designated string value like ‘UNKNOWN’ or ‘N/A’ can be more efficient than NULLs.
  • Function-Based Indexes: Babette Turner highlighted it quite well. Leveraging function-based indexes can be a powerful strategy to optimize queries involving NULL values. By creating an index on a function that transforms NULL values into a searchable key, you can maintain index efficiency. Lets take an example. An index created on NVL(mycolumn, ‘ISNULL’) transforms NULLs into a constant value, ‘ISNULL’, which can be efficiently indexed and searched.