A customer is asking very thorough questions about “the measures against bit corruption (caused by Read Disturb)”. The discussion started with “What measures Toradex takes to prevent memory corruption”, to which I answered the following:
- In the file system layer, UBIFS takes some precautions against continuous usage of the flash through wear leveling, which avoids repeatedly writing the same blocks again and again.
- The MTD drivers maintain the Bad Block Table and update it accordingly if an erase or a write operation cannot be performed (correct me if I’m wrong).
However, the next step has been to check what happens when the error is already there (where I assume ECC should come into play), and they want proof that some kind of correction method is applied.
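One way to gather such proof at runtime might be to read the ECC counters that the MTD layer exports through sysfs. The following is a minimal sketch, assuming the BSP kernel exposes the attributes listed in the kernel's sysfs-class-mtd ABI documentation (ecc_strength, bitflip_threshold, corrected_bits, ecc_failures, bad_blocks). I have not verified this on our kernel version, and the helper names read_ecc_stats and collect are only illustrative:

```python
from pathlib import Path

# Attribute names as documented in the kernel's sysfs-class-mtd ABI file.
# Assumption: the running kernel is recent enough to provide them all.
ECC_ATTRS = ("ecc_strength", "bitflip_threshold", "corrected_bits",
             "ecc_failures", "bad_blocks")


def read_ecc_stats(mtd_dir):
    """Return {attribute: int or None} for one MTD device directory."""
    stats = {}
    for name in ECC_ATTRS:
        path = Path(mtd_dir) / name
        try:
            stats[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            stats[name] = None  # attribute missing or not a plain integer
    return stats


def collect(sysfs_root="/sys/class/mtd"):
    """Collect ECC stats for every mtdN entry, skipping the mtdNro nodes."""
    root = Path(sysfs_root)
    result = {}
    if not root.is_dir():
        return result
    for entry in sorted(root.iterdir()):
        if entry.name.startswith("mtd") and entry.name[3:].isdigit():
            result[entry.name] = read_ecc_stats(entry)
    return result


if __name__ == "__main__":
    for dev, stats in collect().items():
        print(dev, stats)
```

If the attributes are present, a non-zero corrected_bits value would indicate that bitflips were detected and corrected on read, while ecc_failures counts the failures reported by the device's ECC.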
I saw that the following are set in the iMX6ULL config:
CONFIG_MTD_NAND_ECC=y
CONFIG_MTD_NAND=y
CONFIG_MTD_NAND_IDS=y
CONFIG_MTD_NAND_GPMI_NAND=y
CONFIG_MTD_NAND_MXC=y
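To confirm these options on a running module rather than in the defconfig, something like the following could be used. It is only a sketch, assuming the kernel exposes its configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC, something I have not checked for our BSP). The function names parse_kconfig and report are hypothetical:

```python
import gzip

# The options listed above.
WANTED = (
    "CONFIG_MTD_NAND",
    "CONFIG_MTD_NAND_ECC",
    "CONFIG_MTD_NAND_IDS",
    "CONFIG_MTD_NAND_GPMI_NAND",
    "CONFIG_MTD_NAND_MXC",
)


def parse_kconfig(text):
    """Map option name -> value ('y', 'm', ...) from kernel .config text."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" start with '#' and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name.strip()] = value.strip()
    return options


def report(text, wanted=WANTED):
    """Return the value of each wanted option, or 'not set' if absent."""
    opts = parse_kconfig(text)
    return {name: opts.get(name, "not set") for name in wanted}


if __name__ == "__main__":
    # Assumption: /proc/config.gz exists on the target.
    with gzip.open("/proc/config.gz", "rt") as f:
        for name, value in report(f.read()).items():
            print(f"{name}={value}")
```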
I’ve been checking our code in drivers/mtd/nand, where I believe all of these are covered.
I found what I believe is the generic ECC code called whenever a failure is detected (nand_ecc.c), but I couldn’t find any further reference for when it is called:
- In imx6ull-colibri.dtsi, in the GPMI node, I could only see the nand-ecc-mode = "hw" property
- The nand_correct_data reference can be found only under the
- Under gpmi-nand.c, I could find BCH references in the device data (bch_max_ecc_strength = 40), but I don’t believe they are used.
Sorry to ask you this, but at this point a detailed answer is required, with the exact points where this is applied and used. Do we actually offer SW ECC on the iMX6ULL?
- Do we have any error correcting measures in case an error is found while reading memory?
- Do we have any kind of check at read time confirming that an error has at least been detected (maybe with the help of the OOB or any parity bits)?
- Any additional memory error counter-measures that we are implementing in our BSP worth mentioning?
Many thanks and regards,