Are the M4 system and code bus caches enabled by default? There are two functions in lmem.c (LMEM_EnableSystemCache and LMEM_EnableCodeCache), but I am not sure how to use them. Could you please give an example? Are there any implications or restrictions when the cache is enabled? For example, I am using both TCML and TCMU for code. Could this be an issue when the cache is enabled?
Thanks in advance.
You don’t need caching for the tightly coupled memory. TCML & TCMU are designed for the Cortex-M4 pipeline memory access, meaning they can provide instructions and data to the M4 core at clock speed without wait states.
Caching is needed when you access OCRAM or DDR.
Without caching, OCRAM is roughly 8 times slower than TCM, while DDR is approximately 4 times slower than OCRAM.
Dear @qojote
The caches are enabled by default (see function SystemInit() in system_MCIMX7D_M4.c). The functions in lmem.c can be called by your application if you want to modify the cache settings at runtime.
Caching is only supported for particular memory regions. The i.MX7 reference manual explains the details:
4.2.9.3.5 Cache Function
Only the memories below are supported:
- address[31] - DDR space, first 2M (0x8000_0000 - 0x801F_FFFF)
- address[30:29] - FlexSPI channel A, first 2M (0x6000_0000 - 0x601F_FFFF)
- address[30:29]+[21] - FlexSPI channel A, second 2M (0x6020_0000 - 0x603F_FFFF)
- address[29]+[21] - OCRAM (0x2020_0000 - 0x203F_FFFF)
Code accesses are optimized if they are loaded from TCM_L. Code accesses in TCM_U happen at reduced performance.
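As a rough illustration of the address windows listed above, the following sketch checks whether an address falls inside one of them. The helper name `is_cacheable_address` and the table type are hypothetical and not part of the BSP; the ranges are copied from the list as quoted, so please verify them against the reference manual for your part.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One cacheable window; both bounds are inclusive. */
typedef struct {
    uint32_t start;
    uint32_t end;
} cacheable_window_t;

/* Windows taken from the list quoted above (assumed to be accurate). */
static const cacheable_window_t k_cacheable_windows[] = {
    { 0x80000000u, 0x801FFFFFu }, /* DDR space, first 2M */
    { 0x60000000u, 0x601FFFFFu }, /* FlexSPI channel A, first 2M */
    { 0x60200000u, 0x603FFFFFu }, /* FlexSPI channel A, second 2M */
    { 0x20200000u, 0x203FFFFFu }, /* OCRAM */
};

/* Hypothetical helper: returns true if addr lies in any listed window. */
static bool is_cacheable_address(uint32_t addr)
{
    const size_t count =
        sizeof(k_cacheable_windows) / sizeof(k_cacheable_windows[0]);

    for (size_t i = 0; i < count; ++i) {
        if (addr >= k_cacheable_windows[i].start &&
            addr <= k_cacheable_windows[i].end) {
            return true;
        }
    }
    return false;
}
```

A check like this can be handy in a debug build to confirm that a buffer or code section placed by the linker actually lands in a window the cache covers.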
The following post in the NXP community discusses some details about performance impacts of different memory types:
Regards, Andy
Thanks for the information. I have found the code that initializes the system cache (line 178 in system_MCIMX7D_M4.c), but I cannot find the equivalent call to enable the code cache (e.g., writing to LMEM_PCCCR). Maybe it's enabled by default? Let me clarify my second question: is it allowed (and sensible) to use both TCMU and TCML for code and OCRAM for data? Thanks in advance.
Dear @qojote
We adopted NXP's basic implementation. I looked at the code again and can confirm that the code cache is not enabled by default, so you need to call LMEM_EnableCodeCache() to optimize code execution if the code is located outside the TCM_U / TCM_L area.
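A minimal sketch of such a call from application startup might look like the following. To keep it compilable off-target, `LMEM_Type` and the two `LMEM_Enable...` functions are replaced by stand-ins here; the signature taking a pointer to the LMEM block is an assumption, and `app_enable_caches` is a hypothetical name. On the real target you would include the BSP's lmem.h and use its definitions instead.

```c
#include <stdbool.h>

/* Off-target stand-in. The real LMEM_Type is the register block
 * defined in the device header, not a pair of flags. */
typedef struct {
    bool code_cache_on;
    bool system_cache_on;
} LMEM_Type;

/* Stand-in stubs using the function names from lmem.c.
 * Assumption: the BSP versions take a pointer to the LMEM block. */
static void LMEM_EnableCodeCache(LMEM_Type *base)
{
    base->code_cache_on = true;
}

static void LMEM_EnableSystemCache(LMEM_Type *base)
{
    base->system_cache_on = true;
}

/* Hypothetical application hook, intended to be called once early in
 * main() before code in OCRAM or DDR is executed. */
static void app_enable_caches(LMEM_Type *lmem)
{
    LMEM_EnableSystemCache(lmem); /* already done in SystemInit() per this thread */
    LMEM_EnableCodeCache(lmem);   /* not enabled by default, so enable it here */
}
```

On hardware, remove the stand-ins and pass the LMEM base pointer from the device header to `app_enable_caches`. The system cache call is then redundant if SystemInit() has already enabled it, and is shown only for completeness.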
You reach the best performance if you place code in TCM_L and data in TCM_U. For other configurations, I'm afraid you will have to test which setup works best. The memory performance is …
- for code: TCM_L > TCM_U > OCRAM > DRAM
and
- for data: TCM_U > TCM_L > OCRAM > DRAM
It is certainly allowed to use both TCM_L and TCM_U for code, and it is not a bad idea either. However, I cannot say whether you would get more performance by, for example, using TCM_L and OCRAM for code while leaving data in TCM_U.
Regards, Andy