erofs: fix infinite loop due to a race of filling compressed_bvecs
author Gao Xiang <hsiangkao@linux.alibaba.com>
Thu, 25 Jan 2024 12:00:39 +0000 (20:00 +0800)
committer Gao Xiang <hsiangkao@linux.alibaba.com>
Fri, 26 Jan 2024 10:07:36 +0000 (18:07 +0800)
commit cc4b2dd95f0d1eba8c691b36e8f4d1795582f1ff
tree 2b1abf87898fb27fca42625994a8dd11c6a4e40b
parent 97cf5d53b4812dcb52c13fda700dad5aa8d3446c

I encountered a race after lengthy (~594647 secs) stress tests on
a 64k-page arm64 VM with several 4k-block EROFS images.  The
problematic interleaving is as follows:

z_erofs_try_inplace_io                  z_erofs_fill_bio_vec
  cmpxchg(&compressed_bvecs[].page,
          NULL, ..)
                                        [access bufvec]
  compressed_bvecs[] = *bvec;

Previously, z_erofs_submit_queue() accessed only bufvec->page, so the
other fields in bufvec didn't matter.  Since subpage block support
landed, .offset and .end can be used too, but filling bufvec isn't an
atomic operation, so a concurrent reader can observe a half-filled,
inconsistent bufvec.
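
The window can be replayed deterministically in user space. The sketch
below is only an illustration, not kernel code: `struct demo_bvec`,
`struct demo_snapshot` and all `demo_*` helpers are hypothetical names,
and the interleaving from the diagram above is replayed on a single
thread so the outcome does not depend on scheduling.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one compressed_bvecs[] slot. */
struct demo_bvec {
	_Atomic(void *) page;
	unsigned int offset;
	unsigned int end;
};

/* Plain copy of the fields, as a reader would see them. */
struct demo_snapshot {
	void *page;
	unsigned int offset;
	unsigned int end;
};

/* Writer, step 1: claim the slot by publishing only .page. */
static bool demo_claim_slot(struct demo_bvec *slot, void *page)
{
	void *expected = NULL;

	return atomic_compare_exchange_strong(&slot->page, &expected, page);
}

/* Writer, step 2: fill in the remaining fields afterwards. */
static void demo_fill_rest(struct demo_bvec *slot,
			   const struct demo_snapshot *src)
{
	slot->offset = src->offset;
	slot->end = src->end;
}

/* Reader: looks at all three fields, not just .page. */
static struct demo_snapshot demo_read(struct demo_bvec *slot)
{
	struct demo_snapshot snap = {
		.page = atomic_load(&slot->page),
		.offset = slot->offset,
		.end = slot->end,
	};

	return snap;
}

/* Replays the interleaving; returns true if the reader saw a torn slot. */
static bool demo_replay_race(void)
{
	static char page[1];
	struct demo_bvec slot = { .page = NULL, .offset = 0, .end = 0 };
	const struct demo_snapshot src = { page, 4096, 8192 };
	struct demo_snapshot seen;

	demo_claim_slot(&slot, src.page);	/* cmpxchg(.page, NULL, ..) */
	seen = demo_read(&slot);		/* [access bufvec]          */
	demo_fill_rest(&slot, &src);		/* compressed_bvecs[] = ..  */

	return seen.page == src.page &&
	       (seen.offset != src.offset || seen.end != src.end);
}
```

In this replay the reader sees the new page together with the stale
.offset and .end, which is the inconsistency described above.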

Let's use a spinlock to keep each bufvec update atomic.  More
specifically, just reuse the existing spinlock `pcl->obj.lockref.lock`:
as long as the pcluster has a reference, it's rarely taken, and it's
held only briefly even when it is.
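
The locking pattern can be sketched in user space as follows. This is
not the actual patch: a pthread mutex stands in for
`pcl->obj.lockref.lock`, and `struct locked_bvec`,
`struct locked_pcluster` and the `locked_*` helpers are hypothetical
names chosen for illustration. The point is only that the NULL check
and the full copy happen under one lock, and that the reader takes the
same lock before looking at any field.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one compressed_bvecs[] slot. */
struct locked_bvec {
	void *page;
	unsigned int offset;
	unsigned int end;
};

/* Hypothetical stand-in for a pcluster with a single slot. */
struct locked_pcluster {
	pthread_mutex_t lock;	/* plays the role of the reused spinlock */
	struct locked_bvec slot;
};

static void locked_pcluster_init(struct locked_pcluster *pcl)
{
	pthread_mutex_init(&pcl->lock, NULL);
	pcl->slot.page = NULL;
	pcl->slot.offset = 0;
	pcl->slot.end = 0;
}

/* Writer: check for an empty slot and fill every field under the lock. */
static bool locked_try_fill(struct locked_pcluster *pcl,
			    const struct locked_bvec *src)
{
	bool filled = false;

	pthread_mutex_lock(&pcl->lock);
	if (!pcl->slot.page) {
		pcl->slot = *src;	/* all fields become visible together */
		filled = true;
	}
	pthread_mutex_unlock(&pcl->lock);
	return filled;
}

/* Reader: copy the whole slot under the same lock. */
static struct locked_bvec locked_read(struct locked_pcluster *pcl)
{
	struct locked_bvec snap;

	pthread_mutex_lock(&pcl->lock);
	snap = pcl->slot;
	pthread_mutex_unlock(&pcl->lock);
	return snap;
}
```

With this shape a reader sees either an empty slot or a completely
filled one, never a new .page paired with stale .offset and .end.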

Fixes: 192351616a9d ("erofs: support I/O submission for sub-page compressed blocks")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Sandeep Dhavale <dhavale@google.com>
Link: https://lore.kernel.org/r/20240125120039.3228103-1-hsiangkao@linux.alibaba.com
fs/erofs/zdata.c