To quote the author:
Ensure that every write is flushed to memory and that subsequent reads
come from memory.
Since the algorithm relies on the fact that accessing non-existent
memory leads to a write at addr / 2, without this modification accesses
to aliased (not physically present) addresses are cached and the wrong
size is returned.
This was discovered while working on a TI AM625-based board where the
cache is normally enabled; see commit c02712a74849 ("arm: mach-k3:
Enable dcache in SPL").
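
The failure mode described in the quote can be reproduced with a small
host-side toy model. This is only a sketch under stated assumptions, not
the actual U-Boot code: REAL_WORDS, MAX_WORDS, alias(), mem_write(),
mem_read(), flush_line() and probe_words() are all hypothetical names,
the "cache" is a trivial one-word-per-line write-back model, and absent
addresses are assumed to alias by repeated halving, following the
"addr / 2" wording above.

```c
#include <stdint.h>
#include <string.h>

#define REAL_WORDS 1024u  /* words physically present (hypothetical) */
#define MAX_WORDS  4096u  /* largest size the probe tries (hypothetical) */

static uint32_t dram[REAL_WORDS];       /* simulated physical memory */
static uint32_t cache_val[MAX_WORDS];   /* toy write-back cache */
static uint8_t cache_valid[MAX_WORDS];
static int cache_enabled;

/* Assumption: an absent address aliases onto addr / 2 until in range. */
static uint32_t alias(uint32_t idx)
{
	while (idx >= REAL_WORDS)
		idx /= 2;
	return idx;
}

static void mem_write(uint32_t idx, uint32_t val)
{
	if (cache_enabled) {
		/* Line is dirty; DRAM has not seen the write yet. */
		cache_val[idx] = val;
		cache_valid[idx] = 1;
	} else {
		dram[alias(idx)] = val;
	}
}

static uint32_t mem_read(uint32_t idx)
{
	if (cache_enabled && cache_valid[idx])
		return cache_val[idx];  /* hit: aliasing stays hidden */
	return dram[alias(idx)];
}

/* Write one line back to DRAM and invalidate it. */
static void flush_line(uint32_t idx)
{
	if (cache_valid[idx]) {
		dram[alias(idx)] = cache_val[idx];
		cache_valid[idx] = 0;
	}
}

/* Returns the detected size in words. */
static uint32_t probe_words(int cache_on, int flush_after_write)
{
	uint32_t cnt;

	cache_enabled = cache_on;
	memset(dram, 0, sizeof(dram));
	memset(cache_valid, 0, sizeof(cache_valid));

	/* Write a distinct pattern at each power-of-two offset, top down. */
	for (cnt = MAX_WORDS / 2; cnt > 0; cnt >>= 1) {
		mem_write(cnt, ~cnt);
		if (flush_after_write)
			flush_line(cnt);
	}
	mem_write(0, 0);
	if (flush_after_write)
		flush_line(0);

	/* Read back bottom up; the first mismatch marks the real size. */
	for (cnt = 1; cnt < MAX_WORDS; cnt <<= 1)
		if (mem_read(cnt) != ~cnt)
			return cnt;
	return MAX_WORDS;
}
```

In this model, probe_words(1, 0) (cache on, no flush) reports MAX_WORDS
because every read hits the cached pattern, while probe_words(1, 1) and
probe_words(0, 0) both report REAL_WORDS because the aliased write at
offset REAL_WORDS has been overwritten in memory by a lower offset.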