Another FlashDisk corruption on VF61 with CE6

This issue is identical to the one I reported some months ago with CE6 1.6b4.

Based on the suggestions I received there, I upgraded CE6 to 1.7b6, since it includes some bug fixes related to possible FlashDisk corruption.

I hoped this would be an effective solution, but unfortunately it is not.

One of the devices, delivered to our customer in July (and so programmed with CE6 1.7b6, not upgraded from 1.6b4), stopped working some days ago.

The customer sent the device back to me, so I investigated.

The corrupted file is again QtGui4.dll (as in the other issue), which seems strange to me.

The only thing I can say is that this file is larger than 7 MB (I think it’s the largest file I put on FlashDisk).

I think you can imagine that this issue is critical for my company, so I need all the necessary support from Toradex to investigate and find a quick and effective solution.

Hello everyone, and sorry for pushing, but this is a high-priority task on my side.

Hi,
We are looking into it.

Hi, do you have news?
I think you understand that this is a major issue for my company.

Hi @vix,

Did you also try the other points mentioned in the previous answer, in addition to updating your image?

Sorry, but I don’t understand which points.
Can you clarify, please?

Hello @sahil.tx
can you write some updates here, please?

Hi @vix,

Could you please wait for some more time as we are still looking into it?

Hi @vix ,

I received the module that you sent us. I did some analysis and I think I found the issue:

It looks like there is one sector with corrupted metadata (a single bit flip). This sector is an old one that was written before the image update to 1.7b6. For compatibility reasons, the new flash driver (with metadata ECC) that we introduced in 1.7b4 is still able to read metadata without ECC, but it always writes new metadata with ECC.

The issue is that if you still have data that was written without metadata ECC and you never overwrite it, a bit can still flip there and corrupt the whole sector.
To avoid this, we recommend rewriting all data. The best method is to do a full format of the NAND Flash Store from the Control Panel, then repartition and recopy all files.
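
To illustrate why metadata written with ECC can survive a single bit flip while metadata without it cannot, here is a toy Hamming(7,4) sketch in Python. It only shows single-bit error correction in general: the actual ECC scheme used by the CE6 flash driver is not described in this thread, and the function names `encode` and `decode` are made up for this example.

```python
from typing import Tuple


def encode(nibble: int) -> int:
    """Encode 4 data bits as a 7-bit Hamming(7,4) codeword.

    Bit i of the result is codeword position i + 1 (positions 1..7).
    Parity bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(bit << i for i, bit in enumerate(bits))


def decode(codeword: int) -> Tuple[int, int]:
    """Return (data nibble, corrected position); position 0 means no error."""
    bits = [(codeword >> i) & 1 for i in range(7)]
    # The XOR of the positions of all set bits is 0 for a valid codeword
    # and equals the position of the flipped bit after a single bit flip.
    syndrome = 0
    for position in range(1, 8):
        if bits[position - 1]:
            syndrome ^= position
    if syndrome:
        bits[syndrome - 1] ^= 1  # repair the flipped bit
    nibble = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return nibble, syndrome


stored = encode(0b1011)
corrupted = stored ^ (1 << 4)  # simulate one bit flipping in flash (position 5)
print(decode(corrupted))       # the original nibble is recovered
```

Without the three parity bits there is nothing to compute a syndrome from, so the same flip would silently change the stored value. That is the situation of sectors written before metadata ECC was introduced.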

Hi @germano.tx,
thank you very much for your analysis.

Now I need to be 100% sure how to proceed to avoid this kind of issue on other modules.

In the issue I posted some months ago, I asked whether it is OK to restore the whole filesystem from a backup using UpdateTool, and the answer was yes.

This is what I did.
Is it enough, or not?

The user of my device doesn’t have access to the Control Panel.
How can I do a full format of the NAND from a script?

Is it safe if I use a filesystem backup taken when the OS was 1.7b2?
Or should I install 1.7b6 from scratch, reconfigure the OS for my needs, and take a new backup?

I’ve just uploaded QtGui4.dll: https://share.toradex.com/rfn4jprnmtdg69v

The sector with the bit flip is definitely part of QtGui4.dll (at offset 1140736, about 14% of the file), so I’m now sure that this was the error we were looking for.
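
For anyone who wants to locate such a flip themselves, here is a minimal sketch: it compares a known-good copy of a file with the corrupted copy byte by byte and reports the offset, the sector index, and the number of flipped bits. The 2048-byte sector size and the function name `find_bit_flips` are assumptions for illustration, not values confirmed for this module, and the demo uses synthetic data rather than the real QtGui4.dll.

```python
from typing import List, Tuple

SECTOR_SIZE = 2048  # assumption: sector size in bytes; adjust to the real flash geometry


def find_bit_flips(good: bytes, bad: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, sector_index, flipped_bits) for every byte that differs."""
    flips = []
    for offset, (g, b) in enumerate(zip(good, bad)):
        diff = g ^ b
        if diff:
            flips.append((offset, offset // SECTOR_SIZE, bin(diff).count("1")))
    return flips


# Synthetic demo: flip a single bit at the offset reported in this thread.
good = bytes(1_200_000)
bad = bytearray(good)
bad[1140736] ^= 0x04
report = find_bit_flips(good, bytes(bad))
print(report)
```

With real files, read both copies with `open(path, "rb").read()` and pass them in. `zip` stops at the shorter input, so a length mismatch between the two copies should be checked separately.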

Here is what I learned: a CE6 image 1.7b4 (or later) adds ECC when it writes the filesystem.

ECC is only added to data that is written while this image is running; data written by an older image stays without ECC until it is rewritten.

Follow these steps to upgrade from an old image (1.7b2 or earlier) and to add ECC:

  • upgrade the bootloader and the CE6 image
  • reboot, so that the new image is running
  • write the filesystem - ECC is added

Hello @germano.tx

One more thing came to my mind.

I’ve just changed my update script, but is there a way to check whether ECC is enabled on the filesystem?

Or can I send you an update package so that you can check it on the module I sent back to you?