arm: stm32mp: activate data cache in SPL and before relocation
Activate the data cache in SPL and in U-Boot before relocation.
In arch_cpu_init(), the function early_enable_caches() sets the early
TLB, early_tlb[] located .init section, and set cacheable:
- for SPL, all the SYSRAM
- for U-Boot, all the DDR
After relocation, the function enable_caches() (called by board_r)
reconfigures the MMU with new TLB location (reserved in
board_f.c::reserve_mmu) and re-enable the data cache.
This patch allows to reduce the execution time, particularly
- for the device tree parsing in U-Boot pre-reloc stage
(dm_extended_scan_fd =>dm_scan_fdt)
- in I2C timing computation in SPL (stm32_i2c_choose_solution())
For example, the result on STM32MP157C-DK2 board is:
1,6s gain for trusted boot chain with TF-A
2,2s gain for basic boot chain with SPL
For information, as TLB is added in .data section, the binary size
increased and the SPL load time by ROM code increased (30ms on DK2).
But early malloc can't be used for TLB because arch_cpu_init()
is executed before the early poll initialization done in spl_common_init()
called by spl_early_init() So it too late for this use case.
And if I initialize the MMU and the cache after this function it is
too late, as dm_init_and_scan and fdt parsing is also called in
spl_common_init().
And .BSS can be used in board_init_f(): only stack and global can use
before BSS init done in board_init_r().
So .data is the better solution without hardcoded location but if you
have size issue for SPL you can deactivate cache for SPL only
(with CONFIG_SPL_SYS_DCACHE_OFF).
Reviewed-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Patrick Delaunay <patrick.delaunay@st.com>