generic/778: fix severe performance problems

This test takes 4800s to run, which is horrible. AFAICT it starts out
by timing how much can be written atomically to a new file in 0.2
seconds, then scales up the file size by 3x. On storage that isn't very
fast, this can result in file_size being set to ~250MB on a filesystem
with 4k fsblocks. That's about 64,000 blocks.
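
As a quick sanity check of that block count, here is the arithmetic as a
small sketch. The sizes are the illustrative values from the description
above, not numbers read from a real filesystem:

```shell
# Illustrative values only: a ~250MB file on a filesystem with 4k blocks.
file_size=$((250 * 1024 * 1024))
blksz=4096

# Number of filesystem blocks needed to cover the file.
nr_blocks=$((file_size / blksz))
echo "$nr_blocks blocks"
```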

The next thing this test does is try to create a file of that size
(250MB) consisting of alternating written and unwritten blocks. For some
reason, it sets up this file by invoking xfs_io 64,000 times to write
small amounts of data, which takes 3+ minutes on my system because the
exec overhead of that many invocations is pretty high.

As a result, one loop through the test takes almost 4 minutes. The test
loops 20 times, so it runs for 80 minutes(!!), which is where the 4800s
comes from.

So the first thing we do is observe that the giant slow loop runs as a
single thread on an empty filesystem, so most of the time the allocator
produces a mostly physically contiguous file anyway. We can therefore
fallocate the whole file up front instead of fallocating one block every
other time through the loop. This halves the setup time.
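
To see where the halving comes from, here is a back-of-the-envelope
model. It assumes the old loop made one xfs_io call per block (alternating
falloc and pwrite) and that the new setup makes a single falloc call plus
one pwrite per written block. These are modelled counts, not measurements:

```shell
# Illustrative block count from the description above.
nr_blocks=64000

# Assumed old behaviour: one xfs_io invocation per block.
old_calls=$nr_blocks

# Assumed new behaviour: one falloc for the whole file, then one pwrite
# for every other block.
new_calls=$((1 + nr_blocks / 2))

echo "old: $old_calls calls, new: $new_calls calls"
```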

Next, we can stuff the remaining pwrite commands into a bash array and
only invoke xfs_io once every 128 iterations of the loop. This amortizes
the xfs_io startup time, which reduces the test loop runtime to about 20
seconds.
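
The batching idea can be sketched roughly as follows. This is not the
code from the patch: run_io is a hypothetical stand-in that only counts
how often xfs_io would be invoked, and the block size, block count and
variable names are assumptions chosen to match the numbers above:

```shell
# Hypothetical stand-in for the xfs_io invocation. A real test would run
# something like: xfs_io "$@" "$testfile"
nr_calls=0
last_nargs=0
run_io() {
	nr_calls=$((nr_calls + 1))
	last_nargs=$#
}

blksz=4096       # assumed 4k fsblock
nr_blocks=64000  # assumed file size in blocks
batch=128        # flush once every 128 loop iterations

cmds=()
for ((i = 0; i < nr_blocks; i += 2)); do
	# Queue a write for every other block; the rest stay unwritten.
	cmds+=(-c "pwrite $((i * blksz)) $blksz")

	# Each queued command adds two array elements (-c and its argument).
	if (( ${#cmds[@]} >= batch * 2 )); then
		run_io "${cmds[@]}"
		cmds=()
	fi
done

# Flush whatever is left over from a partial final batch.
if (( ${#cmds[@]} > 0 )); then
	run_io "${cmds[@]}"
fi

echo "$nr_calls invocations instead of $((nr_blocks / 2))"
```

Keeping the commands in an array rather than a flat string preserves the
quoting of each "pwrite off len" argument when the batch is expanded.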

Finally, replace the 20x loop with a _soak_loop_running 5x loop because
5 iterations seems like enough. Anyone who wants more intensive testing
can set TIME_FACTOR or SOAK_DURATION. On my system this cuts the
runtime to 75 seconds.
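
For readers unfamiliar with this kind of loop control, here is a
simplified stand-in. soak_loop_sketch is hypothetical and is not the
fstests _soak_loop_running helper; it only illustrates the idea of a loop
that runs a fixed number of iterations unless a duration is requested.
Scaling the iteration count by TIME_FACTOR is an assumption here:

```shell
# Hypothetical stand-in, NOT the real fstests helper. Succeeds while the
# loop should keep going: until SOAK_DURATION seconds have elapsed if that
# is set, otherwise for at most $1 iterations.
soak_iters=0
soak_start=$SECONDS
soak_loop_sketch() {
	local max_iters="$1"

	if [ -n "$SOAK_DURATION" ]; then
		(( SECONDS - soak_start < SOAK_DURATION ))
		return
	fi
	soak_iters=$((soak_iters + 1))
	(( soak_iters <= max_iters ))
}

TIME_FACTOR=${TIME_FACTOR:-1}
loops=0
while soak_loop_sketch $((5 * TIME_FACTOR)); do
	# One pass of the (now much faster) test body would go here.
	loops=$((loops + 1))
done
echo "ran $loops loops"
```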

Cc: [email protected] # v2025.10.20
Fixes: ca954527ff9d97 ("generic: Add sudden shutdown tests for multi block atomic writes")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Zorro Lang <[email protected]>
1 file changed