Debug text "WARNING: FreeBlocks(xx)" and ECC.Corrected

I am trying to analyze some debug print of a Colibri T20 module running WinCE7 (custom built), when troubleshooting some strange behavior.

When running some applications, the text

WARNING: FreeBlocks(xx)

starts appearing, where xx is a number starting at 40 and then counting down for every print.
I have not yet seen it come all the way down to zero, but I want to know exactly what this is referring to?

My initial thought was flash memory, but there is plenty of space left in the USR partition. Also, the countdown resets at reboot.

Also, every once in a while when doing disk operations I get the text

Ecc.Corrected(1) Blk xxxx PgOffset yy, status: 0xzzzzzz

What does the status bits mean? Can I find a table somewhere?
Seems to be quite different for different addresses.

Dear @jran

In Image V1.4 we updated some algorithms of the flash wear leveling. For this reason we built in some verbose debug messages mainly for us, which you are seeing now. Both of them are normal and no reason to worry about:

  1. The FreeBlocks warning starts appearing after you have written a sufficient total amount of data to the flash to fill it once. if the number shown is small enough we are triggering a cleanup task. Every number above 0 indicates the system is still fine.

  2. The Ecc.Corrected(N) indicates that the flash controller detected N flipped bits which had to be fixed through error correction code. The status displayed is a direct readout of the BCH16 status register, which is documented in Nvidia’s T20 reference manual. However, there’s probably no information that is really relevant to you.
    Flipped bits are normal in any flash memory, up to 16 bits can be corrected with the BCH16 ECC.

Before Image V1.4 we used a weaker error correction algorithm (RS4), which we would still support in image V1.4. To be on the safe side, please verify in the debug messages while booting that your system is using BCH16, especially if you updated your module from an earlier OS release.

Regards, Andy

Thank you @andy.tx for your response.

Unfortunately it did not help us get any further in our troubleshooting.
The issue we are facing is that we have had a number of faulty units returned from customers after some time in use. The problem with these units is that they don’t boot.
Reflashing the file system solves the problem.

On these devices, we are running an old BSP 1.2.
After enabling debug print, we can see the following print during boot:

Toradex Bootloader 1.1 for Colibri Built Jul 12 2013

Press [SPACE] to enter Bootloader Menu

Unknown Product ID Version not set.  SerialNo not set.
RAM: 512 MB, CarveOut: 64 MB
Locating kernel image in flash...Done(5224)
Jumping to image at 0x00016000...




Toradex Windows CE 7.0 IMAGE 1.2 for Colibri Built Jan 29 2014
L2 cache enabled
MainMemoryEndAddress adjusted from 0x86800000 to 0x9C000000
Main Phys Mem: 0x00000000:0x1BFFFFFF
Carveout Phys: 0x1C000000:0x1FFFFFFF
Cold boot selected
SMP: Active CPUs = 2
Chip Id: 0x20 (Handheld SOC) Major: 0x1 Minor: 0x4 SKU: 0x8
NVRM Initialized shmoo database
PllClocks(Mhz): X=1000, M=666, C=581, P=216, A=24.576
SysClocks(Mhz): CPU=1000, AVP=232, SysBus=232, Mem=333, EMem=666
GraphicClocks(Mhz): Host=108, 3D=111, 2D=111, Epp=111, Mpe=111, Vde=232
Loading FlashFileSystem(NAND)...WARNING: Saved FreeBlock count in TAT (180) doesn't match with TT Table (0)!!
Done(137)
No free blocks Available-find out the reason, bank = 0
ERROR: FreeBlocks = 0 (FreeBeforeFlush = -1)

Error: No free Blk,  Region[0]=0 NvNandHandle: FtlStartLba=431, FtlEndLba=3844 FtlStartPba=468, FtlEndPba=4092
Misc start
NumOfBanksOnBoard = 1
NoOfILBanks = 1
PhysBlksPerBank = 4096
ZonesPerBank = 2
PhysBlksPerZone = 2048
PhysBlksPerLogicalBlock = 1
TotalLogicalBlocks = 3413
TotEraseBlks = 4096
NumOfBlksForTT = 16
PgsRegForTT = 4
TtPagesRequiredPerZone = 2
NumOfBlksForTAT = 8
BlksRequiredForTT = 1
PgsAlloctdForTT = 64
ExtraPagesForTTMgmt = 60
LastTTPageUsed = 50
CurrentTatLBInUse = 2
bsc4PgsPerBlk = 6
Misc end
TAT Handler start
tatBlocks[0] bank = 0, block = 4068
tatBlocks[1] bank = 0, block = 4069
tatBlocks[2] bank = 0, block = 4070
tatBlocks[3] bank = 0, block = 4071
tatBlocks[4] bank = 0, block = 4072
tatBlocks[5] bank = 0, block = 4073
tatBlocks[6] bank = 0, block = 4074
tatBlocks[7] bank = 0, block = 4075
ttBlocks[0] bank = 0, block = 4076
ttBlocks[1] bank = 0, block = 4077
ttBlocks[2] bank = 0, block = 4078
ttBlocks[3] bank = 0, block = 4079
ttBlocks[4] bank = 0, block = 4080
ttBlocks[5] bank = 0, block = 4081
ttBlocks[6] bank = 0, block = 4082
ttBlocks[7] bank = 0, block = 4083
ttBlocks[8] bank = 0, block = 4084
ttBlocks[9] bank = 0, block = 4085
ttBlocks[10] bank = 0, block = 4086
ttBlocks[11] bank = 0, block = 4087
ttBlocks[12] bank = 0, block = 4088
ttBlocks[13] bank = 0, block = 4089
ttBlocks[14] bank = 0, block = 4090
ttBlocks[15] bank = 0, block = 4091
tat Block bank = 0, block = 4070
TtAllocBlk[0] bank = 0, block = 4077
lastUsedTTBlock bank = 0, block = 4077
TAT Handler end
++++++++++++++++++
TT 32-bit entry format in dump :
=============
Region: b31-b30
BlockNotUsed: b29
BlockGood: b28
DataReserved: b27
SystemReserved: b26
TatReserved: b25
TtReserved: b24
PhysBlkNum: b23-b0
============
Dumping page 0
 **SuperBlock    0 : *0xFFFFFFFF [  -1] [TT  RSVD] [TAT RSVD] [SYS-RSVD] [DATA-RES] [GOOD BLK] [FREE BLK] Region3
Dumping page 1
Dumping page 2
Dumping page 3
 **SuperBlock 4095 : *0xFFFFFFFF [  -1] [TT  RSVD] [TAT RSVD] [SYS-RSVD] [DATA-RES] [GOOD BLK] [FREE BLK] Region3
Total=4096,Free=0,Bad=0,Reserve Data=0,System=4096,Tat=0,Tt=0,Illegal=0,Region0=0,Region1=0,Region2=0,Region3=4096
Exception 'Raised Exception' (0xc0000094): Thread-Id=0100000a(pth=c0819358), Proc-Id=00400002(pprc=862c25e0) 'NK.EXE', VM-active=00400002(pprc=862c25e0                                                                  ) 'NK.EXE'
PC=eff2e754(k.coredll.dll+0x0003e754) RA=8009462c(kernel.dll+0x0000e62c) SP=d1aff710, BVA=00000000
Exception 'Raised Exception' (0x116): Thread-Id=0100000a(pth=c0819358), Proc-Id=00400002(pprc=862c25e0) 'NK.EXE', VM-active=00400002(pprc=862c25e0) 'NK                                                                  .EXE'
PC=eff0eef8(k.coredll.dll+0x0001eef8) RA=8009462c(kernel.dll+0x0000e62c) SP=d1afee28, BVA=ffffffff
No free blocks Available-find out the reason, bank = 0
ERROR: FreeBlocks = 0 (FreeBeforeFlush = -1)

Error: No free Blk,  Region[0]=0 NvNandHandle: FtlStartLba=431, FtlEndLba=3844 FtlStartPba=468, FtlEndPba=4092
Misc start
NumOfBanksOnBoard = 1
NoOfILBanks = 1
PhysBlksPerBank = 4096
ZonesPerBank = 2
PhysBlksPerZone = 2048
PhysBlksPerLogicalBlock = 1
TotalLogicalBlocks = 3413
TotEraseBlks = 4096
NumOfBlksForTT = 16
PgsRegForTT = 4
TtPagesRequiredPerZone = 2
NumOfBlksForTAT = 8
BlksRequiredForTT = 1
PgsAlloctdForTT = 64
ExtraPagesForTTMgmt = 60
LastTTPageUsed = 50
CurrentTatLBInUse = 2
bsc4PgsPerBlk = 6
Misc end
TAT Handler start
tatBlocks[0] bank = 0, block = 4068
tatBlocks[1] bank = 0, block = 4069
tatBlocks[2] bank = 0, block = 4070
tatBlocks[3] bank = 0, block = 4071
tatBlocks[4] bank = 0, block = 4072
tatBlocks[5] bank = 0, block = 4073
tatBlocks[6] bank = 0, block = 4074
tatBlocks[7] bank = 0, block = 4075
ttBlocks[0] bank = 0, block = 4076
ttBlocks[1] bank = 0, block = 4077
ttBlocks[2] bank = 0, block = 4078
ttBlocks[3] bank = 0, block = 4079
ttBlocks[4] bank = 0, block = 4080
ttBlocks[5] bank = 0, block = 4081
ttBlocks[6] bank = 0, block = 4082
ttBlocks[7] bank = 0, block = 4083
ttBlocks[8] bank = 0, block = 4084
ttBlocks[9] bank = 0, block = 4085
ttBlocks[10] bank = 0, block = 4086
ttBlocks[11] bank = 0, block = 4087
ttBlocks[12] bank = 0, block = 4088
ttBlocks[13] bank = 0, block = 4089
ttBlocks[14] bank = 0, block = 4090
ttBlocks[15] bank = 0, block = 4091
tat Block bank = 0, block = 4070
TtAllocBlk[0] bank = 0, block = 4077
lastUsedTTBlock bank = 0, block = 4077
TAT Handler end
++++++++++++++++++
TT 32-bit entry format in dump :
=============
Region: b31-b30
BlockNotUsed: b29
BlockGood: b28
DataReserved: b27
SystemReserved: b26
TatReserved: b25
TtReserved: b24
PhysBlkNum: b23-b0
============
Dumping page 0
 **SuperBlock    0 : *0xFFFFFFFF [  -1] [TT  RSVD] [TAT RSVD] [SYS-RSVD] [DATA-RES] [GOOD BLK] [FREE BLK] Region3
Dumping page 1
Dumping page 2
Dumping page 3
 **SuperBlock 4095 : *0xFFFFFFFF [  -1] [TT  RSVD] [TAT RSVD] [SYS-RSVD] [DATA-RES] [GOOD BLK] [FREE BLK] Region3
Total=4096,Free=0,Bad=0,Reserve Data=0,System=4096,Tat=0,Tt=0,Illegal=0,Region0=0,Region1=0,Region2=0,Region3=4096
Exception 'Raised Exception' (0xc0000094): Thread-Id=0100000a(pth=c0819358), Proc-Id=00400002(pprc=862c25e0) 'NK.EXE', VM-active=00400002(pprc=862c25e0                                                                  ) 'NK.EXE'
PC=eff2e754(k.coredll.dll+0x0003e754) RA=8009462c(kernel.dll+0x0000e62c) SP=d1aff710, BVA=00000000
Exception 'Raised Exception' (0x116): Thread-Id=0100000a(pth=c0819358), Proc-Id=00400002(pprc=862c25e0) 'NK.EXE', VM-active=00400002(pprc=862c25e0) 'NK                                                                  .EXE'
PC=eff0eef8(k.coredll.dll+0x0001eef8) RA=8009462c(kernel.dll+0x0000e62c) SP=d1afee28, BVA=ffffffff

We have done extensive testing to try to recreate the issue on a working device (for example heavy repetitive disk writing, filling the flash memory over and over, removing power during disk write etc.) but have not had any luck in recreating the issue.

Do you have any ideas what might be causing this, and how to solve it?
Or anything else we could try.

Dear @jran

The error messages you are seeing are all related to the NAND flash. There’s two independent issues I will cover separately below:

General Flash Errors

Every NAND flash has a limited number of allowable write cycles, and it is normal that some bits show wrong values over time. Single bit errors are corrected using error correction codes (ECC), and proper ECC usage is required to achieve the specified number of write cycles.
As mentioned before we used a weak ECC in images before V1.4 (your log shows V1.2), so I expect that you see early flash errors. It is possible that single bit errors disappear if the data is rewritten, or if the temperature is different. I would expect theser errors to reappear soon in a real usecase.

I strongly recommend you update to a newer OS version to increase the flash reliability.

Product ID / SerialNo not set

The module has lost the factory-programmed information, which is stored in the same physical NAND flash as the OS and your data. You will need to restore this information.

The latest versions of Nvflash and the UpdateTool will automatically ask you to enter the information.
Nvflash will ask you for a 16-digit number. You can either scan the 2D barcode, or enter it manually. You will be asked to enter a 16-digit number. The first 8 digits are your product identification, which is (Colibri T20 512MB V1.1A => 00211100), the second 8 digits are the serial number which is printed on the 2D barcode label.

Regards, Andy