Data Truncation After Power Loss Despite fsync on MicroSD with SQLite
Understanding File Persistence Guarantees with fsync and Storage Hardware Limitations
Issue Overview: File Truncation to Zero Bytes After Power Loss in Embedded Systems
The core problem revolves around an embedded Linux device using a MicroSD card with vfat filesystem where application settings files occasionally truncate to zero bytes after abrupt power loss. The application follows a strict write-and-sync workflow:
- User modifies device startup settings stored in a binary file.
- Application writes data using fwrite() followed by an explicit fsync() on the file descriptor.
- After ~60 seconds of inactivity, power is cut to simulate real-world power failure scenarios.
- On reboot, the settings file sometimes appears empty (size 0), despite prior synchronization attempts.
The developer augmented their code with an additional fsync() on the file descriptor of the directory containing the settings file, following the recommendation in the Linux manual page. However, uncertainty remains about whether this addresses the root cause, given the peculiar behavior of MicroSD cards and vfat filesystem quirks.
Key technical factors at play:
- SQLite’s fsync Implementation: While SQLite does sync directories during transactional commits (contrary to initial assumptions), custom file I/O routines in this application bypass SQLite’s built-in mechanisms.
- MicroSD Card Write Caching: SD card controllers often employ internal write buffering to optimize performance and wear leveling, decoupling OS-level fsync guarantees from physical NAND persistence.
- vfat Filesystem Limitations: The absence of journaling and metadata redundancy in FAT-based filesystems increases corruption risks during unflushed directory updates.
- Kernel-Level Storage Drivers: Older Linux kernels (e.g., 3.10) may lack robust fsync handling for specific storage controllers, particularly on embedded platforms.
This scenario exposes a critical gap between software-level synchronization and hardware-level data durability. Even with rigorous fsync usage, storage media characteristics and filesystem design can undermine persistence guarantees.
Root Causes: Why fsync Alone Fails to Ensure MicroSD Data Integrity
The truncation issue stems from a multi-layered failure chain involving filesystem semantics, hardware behavior, and OS-driver interactions.
1. Directory Metadata Sync Omission in Custom I/O Workflows
While SQLite’s unixSync implementation (in os_unix.c) explicitly syncs the containing directory by opening it via openDirectory() and calling fsync() on it, applications using raw file I/O (e.g., fwrite + fsync) must manually replicate this behavior. The Linux fsync(2) manpage clarifies that syncing a file alone doesn’t guarantee directory entry persistence. When a file’s size or location changes, the parent directory’s metadata must also be flushed. This matters on vfat in particular, because the file size is stored in the directory entry itself.
In this case, the application’s initial code omitted directory syncing, risking metadata inconsistency. However, even after adding a directory fsync(), deeper issues persist due to:
- VFS Layer Abstraction Gaps: The vfat driver’s translation of directory operations to on-disk structures may delay or batch metadata updates.
- Embedded Kernel Inefficiencies: Kernel 3.10’s vfat driver might not propagate fsync to the block device promptly, especially for removable media.
2. MicroSD Card Controller Caching and Write Reordering
SD cards internally manage data writes via flash translation layers (FTLs), which perform:
- Write Buffering: Accumulate sectors in volatile RAM before programming NAND.
- Wear Leveling: Remap logical blocks to physical pages, delaying writes.
- Error Correction: Retry failed writes transparently.
These processes introduce non-determinism between host-side fsync and physical persistence. A Class 10 MicroSD card’s speed rating (10 MB/s minimum sequential write) says nothing about sync durability, as random small writes (e.g., a 10 KB settings file) are often batched. Power loss during FTL operations can corrupt the card’s logical-to-physical block mapping tables, leading to file truncation.
3. FAT Filesystem Vulnerability to Metadata Corruption
The vfat filesystem’s lack of journaling makes directory entries susceptible to partial writes. Key risk points:
- FAT Table and Directory Entry Coherence: File size updates require modifying both the FAT (file allocation table) and the file’s directory entry. A power failure mid-write can leave the directory entry pointing to an invalid cluster chain.
- Cluster Preallocation: Growing files may allocate new clusters before updating the directory entry. Aborted writes can leave "orphaned" clusters, which vfat’s fsck might misinterpret as a zero-length file.
4. Embedded Linux Storage Stack Latencies
Older kernels (3.10) exhibit suboptimal handling of block device synchronization:
- USB/SD Host Controller Drivers: Some drivers report fsync completion before data reaches the SD card’s buffer, violating POSIX semantics.
- Writeback Cache Policies: Kernel I/O schedulers may defer or reorder writes to optimize throughput, defeating fsync’s purpose.
- Missing Barriers: Without explicit flush requests (e.g., the BLKFLSBUF ioctl, which flushes the kernel’s buffer cache for a block device), cached writes linger in RAM or controller buffers.
Resolving Data Loss: From Filesystem Tuning to Hardware Mitigations
Step 1: Validate Directory Sync in Application Code
Ensure the directory fsync occurs after the file sync, covering all metadata changes:
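#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

/* Flush stdio buffers, sync the file, then sync its parent directory.
   Returns 0 on success, -1 on the first failure. */
int fsync_wrap(FILE *f, const char *dirpath) {
    if (fflush(f) != 0) return -1;         /* push fwrite() data to the kernel first */
    if (fsync(fileno(f)) != 0) return -1;  /* file data and inode */
    DIR *dir = opendir(dirpath);
    if (!dir) return -1;
    int rc = fsync(dirfd(dir));            /* directory entry */
    closedir(dir);
    return rc == 0 ? 0 : -1;
}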
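Testing Sync Effectiveness:
- strace Monitoring: Use strace -e trace=fsync,close ./app to confirm both file and directory fsync invocations.
- On-Disk Inspection: After simulated power loss, examine the SD card from another machine before mounting it read-write. debugfs only understands ext2/3/4, so on vfat use a read-only check such as fsck.vfat -n /dev/sdX1. The command debugfs -R 'stat /path/to/file' /dev/sdX becomes applicable once the card is formatted as ext4 (see Step 5).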
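A complementary pattern to the sync sequence above is to never rewrite the settings file in place: write a temporary file, sync it, rename it over the real file, then sync the directory. The sketch below assumes this pattern suits the application; the function name save_settings and all paths are hypothetical. On vfat the rename is not guaranteed to be crash-atomic, so this narrows the window for a zero-byte file rather than eliminating it.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write len bytes to tmp_path, sync them, rename over final_path, then sync
 * the containing directory dir_path. Returns 0 on success, -1 on failure.
 * All paths are hypothetical and supplied by the caller. */
int save_settings(const char *dir_path, const char *tmp_path,
                  const char *final_path, const void *buf, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    const char *p = buf;
    size_t left = len;
    while (left > 0) {                       /* cope with short writes */
        ssize_t n = write(fd, p, left);
        if (n < 0) { close(fd); unlink(tmp_path); return -1; }
        p += n;
        left -= (size_t)n;
    }
    if (fsync(fd) != 0) { close(fd); unlink(tmp_path); return -1; }
    if (close(fd) != 0) { unlink(tmp_path); return -1; }

    /* The old file stays intact until this point. */
    if (rename(tmp_path, final_path) != 0) { unlink(tmp_path); return -1; }

    int dfd = open(dir_path, O_RDONLY);      /* directory fd for fsync */
    if (dfd < 0) return -1;
    int rc = fsync(dfd);
    close(dfd);
    return rc == 0 ? 0 : -1;
}
```

A caller would invoke something like save_settings("/mnt", "/mnt/settings.tmp", "/mnt/settings.bin", data, size) and treat a nonzero return as "settings not saved". At boot, a leftover temporary file indicates an interrupted save.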
Step 2: Enforce Strict Cache Policies on MicroSD Mount
Override default vfat mount options to minimize caching:
mount -t vfat /dev/mmcblk0p1 /mnt -o sync,noatime,errors=remount-ro
sync
: Disables write caching, forcing immediate block device writes (slower but safer).noatime
: Prevents access-time updates, reducing metadata writes.errors=remount-ro
: Avoids RW mounts after corruption.
Caution: sync mounts degrade throughput significantly. Benchmark with bonnie++ to assess viability.
Step 3: Leverage SQLite’s Built-in Durability Features
Avoid custom file I/O for settings storage. Instead, use SQLite with:
PRAGMA journal_mode = TRUNCATE;
PRAGMA synchronous = EXTRA;
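- synchronous=EXTRA: Like FULL, but also syncs the directory containing the rollback journal after the journal is unlinked at commit. That extra step applies to DELETE journal mode; with journal_mode=TRUNCATE the journal is not unlinked, so EXTRA adds little over FULL.
- Write-Ahead Log (WAL): Not recommended for vfat due to lock file issues.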
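SQLite’s unixSync implementation already handles directory syncing. Roughly, and simplified from os_unix.c:
if( pFile->ctrlFlags & UNIXFILE_DIRSYNC ){
  /* open the containing directory as dirfd, then: */
  full_fsync(dirfd, 0, 0);
}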
Step 4: Mitigate SD Card Controller Caching
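A. Force Cache Flushes via ioctl:
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
int block_fd = open("/dev/mmcblk0", O_RDONLY);
if (block_fd >= 0) {
    /* Flushes the kernel's buffer cache for this device; typically needs root */
    if (ioctl(block_fd, BLKFLSBUF) != 0) { /* handle error */ }
    close(block_fd);
}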
B. Use sync, fsync, and syncfs Redundantly:
#define _GNU_SOURCE  /* syncfs() is Linux-specific */
#include <unistd.h>
sync(); // Global sync
fsync(fd);
syncfs(fd); // Sync filesystem containing fd
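C. Hardware Write Barriers:
If the SD card supports UHS-II or eMMC protocols and the kernel exposes the attribute (older kernels such as 3.10 do not), tell the block layer to treat the device as having a volatile write cache so that it issues cache-flush requests. The attribute takes the strings "write back" or "write through" rather than 0 or 1, and it changes only the kernel's view of the device, not the device itself:
echo "write back" > /sys/block/mmcblk0/queue/write_cache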
Step 5: Switch to a Journaling Filesystem
Replace vfat with a journaling FS like ext4 or f2fs:
mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/mmcblk0p1
mount -o data=journal,sync /dev/mmcblk0p1 /mnt
- data=journal: Journals both data and metadata (safer but slower).
- sync: Combine with ext4’s journal for atomic updates.
Note: Some SD cards perform poorly with journaling due to excessive write amplification.
Step 6: Simulate Power Failures with QEMU
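Model power loss scenarios by stopping a QEMU guest abruptly, without a guest shutdown:
A. Build a Custom ARM VM:
qemu-system-arm -M virt -cpu cortex-a15 -drive if=none,file=sdcard.img,format=raw,id=sd -device sdhci-pci -device sd-card,drive=sd -monitor unix:/tmp/qemu-monitor,server,nowait
The virt machine has no built-in SD controller, so sdhci-pci supplies one. A kernel and root filesystem for the guest are also needed and are omitted here.
B. Inject Power Failures Post-fsync:
Use QEMU’s monitor socket to terminate the VM once the application has returned from fsync:
echo "quit" | nc -U /tmp/qemu-monitor
This discards whatever the guest still holds in its page cache. It does not model a real card’s internal volatile buffer, so it complements physical power-cut testing rather than replacing it.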
C. Automate Testing with Assertions:
Post-reboot, verify file integrity:
[[ $(stat -c %s /mnt/settings.bin) -eq $EXPECTED_SIZE ]] || exit 1
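On targets where the check has to run inside the application at boot rather than from a shell, the same assertion can be sketched in C. The names below (classify_settings and the enum values) are hypothetical; the sketch only distinguishes the failure modes discussed in this guide: a missing file, a zero-byte file, and a file of unexpected size.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Possible outcomes of the post-reboot check (names are illustrative). */
enum settings_state {
    SETTINGS_OK = 0,
    SETTINGS_MISSING,     /* directory entry lost */
    SETTINGS_EMPTY,       /* the zero-byte truncation case */
    SETTINGS_WRONG_SIZE   /* partial write or stale size field */
};

/* Classify the settings file at path against the size the application
 * expects to have written. */
enum settings_state classify_settings(const char *path, off_t expected_size)
{
    struct stat st;
    if (stat(path, &st) != 0) return SETTINGS_MISSING;
    if (st.st_size == 0) return SETTINGS_EMPTY;
    if (st.st_size != expected_size) return SETTINGS_WRONG_SIZE;
    return SETTINGS_OK;
}
```

Logging which state occurred on each power-cut iteration helps separate directory-entry loss from data loss. A size match alone does not prove the contents are intact, so a checksum over the payload is a natural extension.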
Step 7: Hardware-Level Safeguards
- Supercapacitors: Add a 5V 1F supercap to sustain power during SD card flush (requires PCB modifications).
- Write Completion Pins: Some industrial SD cards (e.g., Swissbit) expose a GPIO pin indicating flush completion.
- MMC vs SD Cards: Consider eMMC modules with built-in power-fail protection.
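To judge whether a supercapacitor of the size mentioned above is adequate, a first-order estimate of hold-up time is t = C * (V_start - V_min) / I, assuming a roughly constant load current. The sketch below is only that estimate: it ignores capacitor ESR, regulator efficiency, and leakage, and the minimum voltage and load current are assumptions that must come from the actual board.

```c
/* First-order hold-up time in seconds for a capacitor discharging at a
 * roughly constant current: t = C * (V_start - V_min) / I.
 * Returns 0 for inputs that make no physical sense. */
double holdup_seconds(double cap_farads, double v_start,
                      double v_min, double load_amps)
{
    if (cap_farads <= 0.0 || load_amps <= 0.0 || v_start <= v_min)
        return 0.0;
    return cap_farads * (v_start - v_min) / load_amps;
}
```

For example, holdup_seconds(1.0, 5.0, 3.0, 0.2) models a 1 F capacitor charged to 5 V, an assumed 3.0 V brown-out threshold, and an assumed 200 mA load. The result should be compared against the worst-case flush time measured on the specific card.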
Step 8: Kernel and Driver Updates
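Upgrade to a modern kernel (≥5.10) with improved fsync handling:
- MMC Driver Fixes: Patches for mmc_blk ensuring REQ_FUA (Force Unit Access) compliance.
- Filesystem Barriers: Current kernels implement barriers as cache-flush and FUA requests inside the block layer, so no separate option is needed. CONFIG_BLK_DEV_INTEGRITY concerns integrity metadata, not flush ordering.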
Step 9: Forensic Analysis of Corrupted Files
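Use fatcat to inspect SD card images post-corruption:
fatcat sdcard.img -l /
fatcat sdcard.img -2
The first command lists the root directory, including each entry’s start cluster and size; the second compares the two FAT copies.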
Check for:
- FAT Chain Mismatches: Cluster pointers leading to free space.
- Directory Entry Flags: Invalid attributes or start clusters.
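The two checks above can be sketched as a small parser for a raw 32-byte FAT short directory entry, for use on bytes dumped from a card image. The field offsets follow the standard FAT on-disk layout (attributes at byte 11, start cluster split across bytes 20 and 26, size at byte 28); the struct and function names are hypothetical. On FAT12/16 the high cluster word is reserved and normally zero.

```c
#include <stdint.h>

/* Fields of a 32-byte FAT short directory entry relevant to truncation. */
struct fat_entry_info {
    uint8_t  attr;           /* offset 11 */
    uint32_t first_cluster;  /* high word at offset 20, low word at 26 */
    uint32_t size;           /* offset 28, little-endian */
};

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct fat_entry_info parse_fat_entry(const uint8_t raw[32])
{
    struct fat_entry_info e;
    e.attr = raw[11];
    e.first_cluster = ((uint32_t)le16(raw + 20) << 16) | le16(raw + 26);
    e.size = le32(raw + 28);
    return e;
}

/* Returns 0 if the entry looks consistent (or is not a regular file),
 * 1 if size is 0 but a cluster chain is assigned (data allocated, size
 *   never updated),
 * 2 if size is nonzero but the start cluster is invalid (below 2). */
int fat_entry_suspicious(const struct fat_entry_info *e)
{
    if ((e->attr & 0x0F) == 0x0F) return 0;    /* long-name fragment */
    if (e->attr & (0x10 | 0x08)) return 0;     /* directory or volume label */
    if (e->size == 0 && e->first_cluster != 0) return 1;
    if (e->size != 0 && e->first_cluster < 2) return 2;
    return 0;
}
```

An entry with size 0 and no start cluster is a legitimately empty file as far as the filesystem is concerned, so this check cannot distinguish it from an intentional empty file. It only flags entries whose size and cluster fields disagree.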
Step 10: Alternative Storage Architectures
If SD card reliability remains inadequate:
- NOR Flash + UBI/UBIFS: Wear-leveling-aware filesystems for raw NAND/NOR.
- Battery-Backed RAM Disks: Store critical data in tmpfs with periodic SD sync.
- Replicated Write Logs: Append-only logs on separate media (e.g., EEPROM).
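The append-only log idea can be illustrated with a minimal in-memory sketch. The record format here (4-byte length, payload, CRC-32 over length and payload) is an assumption chosen for illustration, not a format taken from any particular product, and the function names are hypothetical. The reader walks the log from the start and keeps the last record that verifies, so a record torn by power loss is simply ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (IEEE 802.3 polynomial, reflected). */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Append [len][payload][crc] at *used. Returns 0, or -1 if it does not fit. */
int log_append(uint8_t *log, size_t cap, size_t *used,
               const void *payload, uint32_t len)
{
    if (*used > cap || len > cap - *used || cap - *used - len < 8) return -1;
    uint8_t *rec = log + *used;
    put_le32(rec, len);
    memcpy(rec + 4, payload, len);
    put_le32(rec + 4 + len, crc32_ieee(rec, 4 + (size_t)len));
    *used += 8 + (size_t)len;
    return 0;
}

/* Scan from the start and stop at the first truncated or corrupt record.
 * Returns the payload of the last valid record (length in *len_out),
 * or NULL if no record verifies. */
const uint8_t *log_last_valid(const uint8_t *log, size_t used,
                              uint32_t *len_out)
{
    const uint8_t *best = NULL;
    size_t off = 0;
    while (used - off >= 8) {
        uint32_t len = get_le32(log + off);
        if (len > used - off - 8) break;               /* torn tail */
        if (crc32_ieee(log + off, 4 + (size_t)len) !=
            get_le32(log + off + 4 + len)) break;      /* corrupt record */
        best = log + off + 4;
        *len_out = len;
        off += 8 + (size_t)len;
    }
    return best;
}
```

In a real device the buffer would be backed by the secondary medium, and each append would be followed by whatever flush that medium requires. The design choice is that old records are never overwritten, so the previous settings remain recoverable if the newest record is incomplete.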
This guide systematically addresses the interplay between application-level synchronization, filesystem design, and hardware limitations. By combining rigorous fsync practices, storage media hardening, and thorough power-fail testing, developers can achieve robust data integrity even in unreliable embedded environments.