Unexpected Empty Results When IS NOT NULL Queries Conflict With NULL-Based Partial Indexes

Understanding NULL Comparison Semantics in Partial Indexes

The core issue arises from an interaction between SQLite’s three-valued logic system, partial index definitions, and query optimization behaviors. When a partial index is defined using a conditional expression that compares column values against NULL (e.g., WHERE v2 > NULL), this creates an index that remains perpetually empty due to NULL’s special handling in comparison operations. Subsequent queries attempting to retrieve those same rows using explicit NULL checks (WHERE v2 IS NOT NULL) may return empty result sets because the query optimizer preferentially uses the defective partial index, unaware that the index contains no valid entries.

This behavior stems from SQLite’s adherence to SQL-92 NULL handling standards combined with specific optimizer decisions when partial indexes exist. The v2 > NULL expression in the index definition always evaluates to NULL (not false), causing the partial index filter to reject all rows during insertion. When executing SELECT * FROM v0 WHERE v0.v2 IS NOT NULL, the query planner sees an existing index that potentially covers the v2 column and attempts to use it, despite the index containing zero entries. The conflict between the intended NULL check semantics and the optimizer’s index selection strategy leads to incorrect empty results.

Key components contributing to this behavior include:

  1. Three-valued logic implementation: SQL operations involving NULL produce UNKNOWN results rather than boolean true/false
  2. Partial index persistence mechanics: Index maintenance occurs at write-time using the index filter’s truth value
  3. Query optimizer cost analysis: Preferential use of existing indexes over full table scans when statistics suggest potential efficiency gains
  4. Expression normalization: How the parser handles comparison operations against NULL literals in DDL statements

Primary Causes of Empty Result Set Errors

1. Invalid Partial Index Filter Expressions
Comparisons against NULL using standard operators (>, <, =, etc.) always yield NULL in SQLite, making WHERE v2 > NULL a permanent filter that excludes all rows from the index. This creates a non-error-producing empty index that appears valid to the schema parser but contains no queryable data. The query optimizer still considers this index as a potential access path for relevant queries, leading it to scan an empty data structure instead of falling back to the base table.

2. Optimizer Index Selection Heuristics
SQLite’s query planner uses index presence statistics that don’t account for empty indexes. When multiple access paths exist, the optimizer:

  • Prioritizes indexes over full table scans
  • Considers index coverage for WHERE clause predicates
  • Estimates lower I/O costs for indexed access

In this scenario, the partial index’s existence tricks the optimizer into believing it can efficiently resolve v2 IS NOT NULL queries through index scanning. The empty index returns zero rows, overriding the actual table data containing valid records that satisfy the condition.

3. NULL Handling in Index Predicates
Partial indexes evaluate their WHERE clause during row insertion/update operations using exactly the same NULL semantics as normal queries. Unlike typical application logic where developers might expect col > NULL to behave like a range check, SQL standards dictate that any comparison with NULL yields NULL. This means the partial index filter:

WHERE v2 > NULL -- Evaluates to NULL for all rows

Never evaluates to TRUE, resulting in an index that systematically excludes all table rows regardless of their actual v2 values. The index becomes a persistent empty structure that still participates in query planning.

4. IS NOT NULL vs. NULL Comparisons
The v2 IS NOT NULL predicate in the SELECT statement uses different NULL-checking semantics than the index’s v2 > NULL filter. While IS NOT NULL correctly identifies non-null column values, comparison operators like > cannot produce meaningful results when NULL appears on either side. This semantic mismatch between the index definition and query predicate creates a logical trap where valid rows exist in the table but remain inaccessible through the optimizer’s preferred index access path.

Comprehensive Diagnostic and Resolution Protocol

Step 1: Validate Partial Index Membership
Execute an index content inspection query to verify whether the partial index contains any entries:

SELECT count(*) FROM sqlite_stat1 WHERE tbl='v0' AND idx='v4';

A zero count confirms the index contains no data, indicating a flawed filter predicate. For persistent debugging, use EXPLAIN QUERY PLAN to observe index usage patterns:

EXPLAIN QUERY PLAN 
SELECT * FROM v0 WHERE v2 IS NOT NULL;

Look for SCAN TABLE v0 USING INDEX v4 in the output, which confirms the optimizer is attempting to use the empty index.

Step 2: Correct Partial Index Predicates
Replace NULL comparisons with proper NULL-checking syntax in index definitions:

-- Original problematic index
CREATE UNIQUE INDEX v4 ON v0(v1) WHERE v2 > NULL;

-- Corrected version using IS NOT NULL
CREATE UNIQUE INDEX v4 ON v0(v1) WHERE v2 IS NOT NULL;

This adjustment ensures the partial index:

  1. Includes all rows where v2 contains non-null values
  2. Maintains accurate coverage statistics for the optimizer
  3. Properly supports queries filtering on v2 nullability

Step 3: Force Table Scan for Immediate Validation
Override the optimizer’s index selection to verify base table contents:

SELECT * FROM v0 NOT INDEXED WHERE v2 IS NOT NULL;

If this returns the expected row(s), it confirms the index is causing the empty result issue rather than missing data.

Step 4: Analyze Comparison Semantics
Test NULL comparison behavior in isolation to demonstrate why the original index fails:

SELECT 
  10 > NULL AS direct_compare,
  NULL IS NULL AS null_check,
  10 IS NOT NULL AS not_null_check;

This returns NULL, 1, 1 showing:

  • Direct comparisons with NULL yield NULL (not FALSE)
  • Proper NULL checks return boolean values

Step 5: Implement Index Condition Fixes
For existing databases with problematic indexes:

-- Drop defective index
DROP INDEX v4;

-- Recreate with corrected WHERE clause
CREATE UNIQUE INDEX v4 ON v0(v1) WHERE v2 IS NOT NULL;

-- Refresh optimizer statistics
ANALYZE;

Post-correction, verify index population with:

SELECT * FROM v0 WHERE v2 IS NOT NULL;

Which should now correctly return all non-null v2 rows.

Step 6: Optimizer Hint Utilization
When unable to immediately modify indexes, use SQLite’s optimizer hints to bypass defective indexes:

SELECT * FROM v0 WHERE v2 IS NOT NULL NOT INDEXED;

Or force specific index usage if other valid indexes exist:

SELECT * FROM v0 INDEXED BY other_valid_index WHERE v2 IS NOT NULL;

Step 7: Schema Validation Automation
Implement proactive checks to prevent NULL comparison misuse:

-- Query to detect partial indexes with NULL comparisons
SELECT 
  name AS index_name,
  sql 
FROM sqlite_master 
WHERE 
  type = 'index' AND
  sql LIKE '%> NULL%' ESCAPE '\' AND
  sql LIKE '%WHERE%';

Regularly run this query against your schema to identify indexes susceptible to the empty result issue.

Step 8: Query Plan Analysis Enhancement
Integrate EXPLAIN QUERY PLAN into development workflows to detect premature index usage:

EXPLAIN QUERY PLAN
SELECT * FROM v0 WHERE v2 IS NOT NULL;

Validate that the output shows either:

  • SCAN TABLE v0 for correct full table scans
  • SEARCH TABLE v0 USING INDEX v4 only when the index is properly populated

Step 9: NULL Handling Education
Develop team training materials emphasizing:

  • SQL’s three-valued logic (TRUE, FALSE, NULL)
  • Proper use of IS NULL/IS NOT NULL operators
  • Index predicate best practices (avoid direct NULL comparisons)
  • Query optimizer behavior with partial indexes

Step 10: Version-Specific Workarounds
For environments running SQLite versions affected by commit fc98218c (prior to 8cc23931):

  1. Apply the official patch or upgrade to a fixed version
  2. Review all partial indexes created during the affected period
  3. Rebuild indexes using corrected WHERE clauses
  4. Monitor query plans for residual optimizer issues

Long-Term Prevention Strategies

1. Static Analysis Integration
Incorporate SQL linters into CI/CD pipelines to flag:

  • Direct NULL comparisons in index predicates
  • Missing IS NOT NULL checks in query conditions
  • Partial indexes without verifiable coverage

2. Index Usage Monitoring
Regularly audit index utilization statistics:

SELECT * FROM sqlite_stat1 WHERE stat = '0 rows';

Identify and investigate indexes showing zero row usage, which may indicate defective predicates.

3. NULL-Aware Index Design Patterns
Adopt standardized index templates for NULL handling:

-- Non-null value index
CREATE INDEX idx_col_notnull ON table(col) WHERE col IS NOT NULL;

-- NULL-specific index
CREATE INDEX idx_col_null ON table(col) WHERE col IS NULL;

This bifurcated approach ensures proper coverage and optimizer statistics for both null and non-null value queries.

4. Query Plan Regression Testing
Implement automated tests that:

  1. Execute critical queries
  2. Capture EXPLAIN QUERY PLAN outputs
  3. Validate expected access paths (index scans vs table scans)
  4. Alert on optimizer strategy changes

5. Index Predicate Documentation Standards
Require detailed comments for all partial indexes:

CREATE INDEX idx_example ON table(col) 
WHERE condition -- Purpose: Filter non-null values for X queries

This practice forces conscious evaluation of NULL handling during index creation.

By systematically applying these diagnostic procedures, correction methods, and prevention strategies, developers can eliminate empty result errors caused by improper NULL handling in partial indexes while optimizing SQLite’s query planner behavior for reliable data retrieval.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *