What's wrong with synchronous metadata updates
What are synchronous metadata updates?
It is a term from filesystem design. The file system resides partly in
volatile buffers and partly in persistent storage like disks. The file
system could write everything to disk as soon as the write happens.
However, such a completely synchronous update results in a very slow
file system.
Therefore, most file systems don't update everything
synchronously. In particular, the Berkeley FFS (also known as UFS) by
default delays writing file data to disk for some time, but writes
metadata (i-nodes, directories, etc.) synchronously. This policy is
known as synchronous metadata updates.
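To make the distinction concrete, here is a minimal sketch in C using only
standard POSIX calls (the file name is invented for illustration): a plain
write() merely copies the data into the kernel's volatile buffers, and only
fsync() (or opening the file with O_SYNC) forces the data and the file's
metadata to persistent storage.

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
    const char *text = "important data\n";
    int fd = open("example.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); exit(1); }

    /* returns as soon as the data is in the buffer cache */
    if (write(fd, text, strlen(text)) < 0) { perror("write"); exit(1); }

    /* only now are the data and the file's metadata forced to disk;
       without this call a crash may lose what write() "wrote" */
    if (fsync(fd) < 0) { perror("fsync"); exit(1); }

    close(fd);
    return 0;
  }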
What's wrong with them?
Many people (especially BSD users) mistakenly believe that file systems
with synchronous metadata updates are safer than systems that
don't write their metadata synchronously (e.g., Linux's ext2fs by
default). Here's an anecdote that demonstrates the problem:
I was working on an Alpha under Digital OSF/1, with the file system
being UFS (alas!). I had been writing to a new file in Emacs for an hour
and had just done a save-buffer (C-x C-s) when the machine crashed. When
it came up again after the file system check, my work was gone. The
file I was working on, the autosave file, the backup file, all of them
either did not exist or were empty.
What had happened? It can be explained very nicely with
synchronous metadata updates: when I did the save-buffer, the file I
was working on was written to the file system; the metadata went to
disk, but the data was still in memory and was lost, so the file
system check left the file empty. The autosave file had been deleted;
its data was still on disk, but without associated metadata. The
backup file had never been written, because this was a newly created
file and the first save.
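Independent of the file system's update policy, an application can protect
itself against exactly this failure with the common write-then-rename
pattern: write the new contents to a temporary file, fsync it, and only then
rename it over the old file. The following C sketch uses only standard POSIX
calls; the file names are invented, and this is not how Emacs itself
implements saving.

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  static void die(const char *msg) { perror(msg); exit(1); }

  int main(void)
  {
    const char *contents = "an hour of work\n";

    /* 1. write the new contents to a temporary file */
    int fd = open("file.txt.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) die("open");
    if (write(fd, contents, strlen(contents)) < 0) die("write");

    /* 2. force the data to disk before making the new version visible */
    if (fsync(fd) < 0) die("fsync");
    if (close(fd) < 0) die("close");

    /* 3. atomically replace the old file: after a crash one sees either
          the complete old version or the complete new one */
    if (rename("file.txt.tmp", "file.txt") < 0) die("rename");

    /* 4. optionally fsync the containing directory so that the rename
          itself is persistent */
    int dfd = open(".", O_RDONLY);
    if (dfd >= 0) { fsync(dfd); close(dfd); }
    return 0;
  }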
But aren't asynchronous metadata updates less safe against
file system corruption?
I don't see why it should be. Any file system will try to keep the
state on disk consistent for most of the time. During metadata
updates, an inconsistent state can arise, and if a crash happens at
that time, we have to hope that the file system check does a good
job. Note that this inconsistency on disk does not depend on whether
the metadata is written synchronously or asynchronously. It can arise
for both systems.
On the practical side: due to hardware problems, I had about
50 crashes under Linux some time ago. I never lost more than
about half a minute of work (plus several minutes of waiting
until the system came up again).
(Several years later:) In the meantime I have learned that FFS writes
metadata in such an order that the damage is limited (IIRC to one
wrong block). In contrast, ext2 relies on more sophistication in
fsck. In any case, synchronous updates are not the only way to ensure
that writes are ordered in a specific way, as demonstrated by soft
updates (see below), which don't even require an fsck after a crash.
What would be the correct way?
You had better read some papers on operating systems. There are several
approaches:
- Log-structured file systems - write everything to a log and write
checkpoints when the file system is consistent. When recovering from a
crash, everything written after the last checkpoint is discarded.
- Journaling file systems - write the intended changes to a log to make
them persistent, and only later to the home locations. When recovering
from a crash, the log is replayed up to the last checkpoint (a reduced
sketch of this write-ahead idea follows this list). Note that while
it is possible to do the right thing in a JFS, many JFSs (e.g., IBM's
JFS) just ensure metadata consistency and suffer from all the
drawbacks discussed above.
- Soft updates - allocate the blocks for the new data first, then write
the data, then the metadata, and then free the old data. One has to
build a dependence graph of the blocks to be written and write them
out according to a topological ordering. Note that soft updates
usually don't perform all changes in the same order as the application
does them, and have no committing checkpoints, so the consistency
guarantees are not as nice as with good log-structured or journaling
file systems.
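To illustrate the write-ahead idea behind journaling, here is a heavily
reduced user-level sketch in C: the intended change is appended to a log and
forced to disk before the home location is touched, so a crash in between
can be repaired by replaying the log. The file names and the record format
are invented; a real journaling file system does this on blocks inside the
kernel and batches many changes per transaction.

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  struct log_record {
    long offset;          /* where in the home file the change goes */
    char data[64];        /* the new contents for that location */
  };

  static void die(const char *msg) { perror(msg); exit(1); }

  int main(void)
  {
    struct log_record rec = { 0, "new metadata contents" };

    /* 1. append the intended change to the journal and force it out */
    int log = open("journal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (log < 0) die("open journal");
    if (write(log, &rec, sizeof rec) < 0) die("write journal");
    if (fsync(log) < 0) die("fsync journal");  /* the change is now durable */

    /* 2. only now apply the change to the home location */
    int home = open("home.dat", O_WRONLY | O_CREAT, 0644);
    if (home < 0) die("open home");
    if (lseek(home, rec.offset, SEEK_SET) < 0) die("lseek");
    if (write(home, rec.data, sizeof rec.data) < 0) die("write home");
    if (fsync(home) < 0) die("fsync home");

    /* 3. checkpoint: the logged change has reached its home location,
          so the journal entry can be discarded (or its space reused) */
    if (ftruncate(log, 0) < 0) die("truncate journal");

    close(home);
    close(log);
    return 0;
  }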
- Bibliography, mainly about Distributed and Log-structured File Systems
- In-order semantics (a desirable file system feature wrt data consistency)
- Ideas for a log-structured file system
Anton Ertl