Floating-Point Precision and Storage in SQLite

Floating-Point Representation and Storage in SQLite

SQLite, like many other databases, uses the IEEE 754-2008 standard for representing floating-point numbers. This standard defines how floating-point numbers are stored in binary format, which is crucial for understanding why certain values, such as 0.2, cannot be represented exactly. In SQLite, the REAL datatype is used to store floating-point numbers, and it adheres to the IEEE 754 double-precision format, which uses 64 bits to represent a number. This format includes a sign bit, an exponent, and a significand (also known as the mantissa). The precision of the floating-point number is determined by the number of bits allocated to the significand, which in the case of double-precision is 52 bits.

When you insert a value like 0.2 into a column with NUMERIC affinity, SQLite will attempt to store it as a REAL if it cannot be represented as an integer. However, due to the nature of binary floating-point representation, 0.2 cannot be represented exactly. Instead, it is stored as the closest possible approximation, which is why you see 0.20000000298 when you insert 0.2F from your C# program. This approximation is a result of the conversion from the decimal representation (0.2) to the binary representation used by the IEEE 754 standard.

The typeof operator in SQLite returns the storage class of a value, which can be integer, real, text, blob, or null. When you insert 0.2 into a column with NUMERIC affinity, SQLite stores it as a REAL, and thus typeof returns 'real'. This is consistent with the behavior described in the SQLite documentation, where NUMERIC affinity columns will store values as REAL if they cannot be represented as integers.

Accumulation of Floating-Point Errors in Arithmetic Operations

One of the key issues highlighted in the discussion is the accumulation of floating-point errors during arithmetic operations. When you perform repeated additions of a floating-point number, such as adding 0.2 multiple times, the small errors in each addition can accumulate, leading to significant discrepancies over time. This is a well-known phenomenon in floating-point arithmetic and is not specific to SQLite.

For example, consider the following recursive SQL query that adds 0.2 repeatedly:

WITH RECURSIVE seq(x) AS (
  VALUES(0.2) 
  UNION ALL
  SELECT x + 0.2 FROM seq
)
SELECT x FROM seq LIMIT 6 OFFSET 110;

The result of this query shows that after 110 additions, the value of x is no longer exactly 22.0, but rather 22.199999999999953104179439. This discrepancy is due to the accumulation of small errors in each addition. Each time 0.2 is added, the result is an approximation, and these approximations add up over time.

The IEEE 754 standard guarantees that each individual arithmetic operation is as accurate as possible, but it does not guarantee that the result of multiple operations will be exact. This is why, in the example above, the error accumulates with each addition. The error is proportional to the number of operations performed, and in this case, after 110 additions, the error is significant enough to be noticeable.

Mitigating Floating-Point Errors with Guard Bits and Exact Rounding

To mitigate the accumulation of floating-point errors, you can use techniques such as guard bits and exact rounding. Guard bits are additional bits used during arithmetic operations to maintain extra precision, which can help reduce the error in intermediate results. Exact rounding refers to the process of rounding the result of an arithmetic operation to the nearest representable floating-point number, using a specific rounding mode (e.g., round to nearest, ties to even).

In SQLite, you can implement exact rounding by using a custom function that rounds the result of each arithmetic operation to a specified number of decimal digits. For example, the sigdigits() function can be used to round the result of each addition to 14 decimal digits, which is approximately 49 bits of precision. This ensures that the result of each operation is as accurate as possible, reducing the accumulation of errors over multiple operations.

Here is an example of how you can use the sigdigits() function to reduce the accumulation of errors in the recursive addition example:

WITH RECURSIVE seq(n, x, y, z) AS (
  VALUES (1, 0.2, 0.2, 0.2) 
  UNION ALL 
  SELECT n + 1, x + 0.2, sigdigits(x + 0.2), (n + 1) * 0.2 FROM seq
)
SELECT n AS Count,
   format('%!.26f', z) AS "Single Multiplication",
   format('%!.26f', x) AS "Straight Sum",
   ulps(z, x) AS "ULPS Straight Sum",
   format('%!.26f', y) AS "Guarded Sum",
   ulps(z, y) AS "ULPS Guarded Sum"
FROM seq
LIMIT 116;

In this query, the sigdigits() function is used to round the result of each addition to 14 decimal digits, which reduces the accumulation of errors. The ulps() function is used to measure the difference between the exact result (obtained by multiplying the count by 0.2) and the result obtained by repeated addition. The results show that the guarded sum (using sigdigits()) has significantly less error compared to the straight sum (without rounding).

Conclusion

Understanding how floating-point numbers are stored and manipulated in SQLite is crucial for avoiding unexpected results in your queries. The IEEE 754 standard provides a precise way to represent floating-point numbers, but it also introduces limitations, such as the inability to represent certain decimal numbers exactly. This can lead to the accumulation of errors during arithmetic operations, which can be mitigated by using techniques such as guard bits and exact rounding.

By being aware of these issues and applying the appropriate techniques, you can ensure that your SQLite queries produce accurate and reliable results, even when dealing with floating-point numbers. Whether you are inserting values from a C# program or performing complex arithmetic operations, understanding the nuances of floating-point representation will help you avoid common pitfalls and achieve the desired outcomes in your database applications.

Related Guides

Leave a Reply

Your email address will not be published. Required fields are marked *