fsstress: don't abort when stat(".") returns EIO
First, start with the premise that fstests is run with a nonzero limit
on the size of core dumps so that we can capture the state of
misbehaving fs utilities like fsck and scrub if they crash.
When fsstress is compiled with DEBUG defined (which is the default), it
will periodically call check_cwd to ensure that the current working
directory hasn't changed out from underneath it.
If the filesystem is XFS and it shuts down, the stat64() calls will
start returning EIO. In this case, we follow the out: label and call
abort() to exit the program. Historically this did not produce any core
dumps because $PWD is on the dead filesystem and the write fails.
However, modern systems are often configured to capture coredumps using
some external mechanism, e.g. abrt/systemd-coredump. In this case, the
capture tool will succeed in capturing every crashed process, which
fills the crash dump directory with a lot of useless junk. Worse, if
the capture tool is configured to pass the dumps to fstests, it will
flag the test as failed because something dumped core.
This is really silly, because basic stat requests for the current
working directory can be satisfied from the inode cache without a disk
access. In this narrow situation, EIO only happens when the fs has shut
down, so just exit the program. Apply the same exit-on-EIO logic to
post-operation cleanup if we fail to go up one directory.
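As a rough illustration of the exit-on-EIO policy described above, here is a
minimal sketch, not the actual patch. The names cwd_verdict, classify_cwd and
check_cwd_sketch are hypothetical, the exit status of 1 is an assumption, and
plain stat() stands in for stat64(). The real check_cwd does more work before
it gives up on an inode mismatch; this sketch collapses that into abort().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical outcome codes for this sketch; not taken from fsstress. */
enum cwd_verdict {
	CWD_OK,		/* still in the expected directory */
	CWD_FS_DEAD,	/* stat failed with EIO: assume shutdown, exit quietly */
	CWD_BUG		/* anything else: real inconsistency, dump core */
};

/* Decide what to do from the stat result alone, so the policy is testable. */
enum cwd_verdict classify_cwd(int stat_ret, int stat_errno,
			      ino_t seen_ino, ino_t expected_ino)
{
	if (stat_ret != 0)
		return stat_errno == EIO ? CWD_FS_DEAD : CWD_BUG;
	return seen_ino == expected_ino ? CWD_OK : CWD_BUG;
}

/*
 * Hypothetical stand-in for check_cwd(). expected_ino is the inode number
 * of the working directory, recorded by the caller at startup.
 */
void check_cwd_sketch(ino_t expected_ino)
{
	struct stat sb;
	int ret, err;

	assert(expected_ino != 0);	/* caller must have recorded an inode */

	memset(&sb, 0, sizeof(sb));
	ret = stat(".", &sb);
	err = errno;	/* save before another call can clobber it */

	switch (classify_cwd(ret, err, sb.st_ino, expected_ino)) {
	case CWD_OK:
		return;
	case CWD_FS_DEAD:
		/* No core dump: the fs is gone, there is nothing to debug. */
		fprintf(stderr, "stat(\".\") returned EIO, assuming fs shutdown\n");
		exit(1);	/* assumed exit status */
	case CWD_BUG:
		abort();
	}
}
```

Keeping the decision in a pure function separates the policy (EIO means
"fs is dead, leave quietly") from the side effects, which is also the shape
the post-operation cleanup path could reuse when going up one directory fails.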
We really should have a way to query if a filesystem is shut down that
isn't conflated with (possibly transient) EIO errors. But for now this
is what we have to do. :(
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Zorro Lang <[email protected]>
1 file changed