.NET CF Init error

We have seen this .NET CF Init Error on our Colibri T30. We are using BSP 2.3 with Windows CE 7.

Looking through earlier posts on this (for instance .NET CF Initialization Error With VF61, WEC2013 and CF 3.9 - #2 by Troubadix75), we are a bit confused. We checked for differences in the filesystem and in some of the registry keys, but nothing in the filesystem stands out as different. We weren’t sure which registry keys to check, so we compared HKLM\SOFTWARE\Microsoft and couldn’t see any missing keys.

We don’t normally have access to the debug COM port on our instruments. So if you want me to run the devfshealth command, I will need to take the T30 out of our instrument and put it in our Colibri carrier board.

We have also seen this post, .NET CF Initialization Error With VF61, Win EC7 and CF 3.5, and found it very enlightening. It might explain why our units in the field sometimes show the same error directly after bootup. (We have a device plugged into our UARTA that sends human-readable info, and we theorize that it sometimes triggers the bootloader on an instrument that is being restarted.)

But in this case, all of our Compact Framework apps were already loaded. They seemed to just quit, and when we tried to restart them, the .NET CF Init error was reported.

Please help.

Brad

Could you specify which exactly .NET framework did you install? Have you tried to re-install it?

We are using .NET CF version 3.5. I will try and re-install it this morning and see if that repairs it.

We have three test units running right now and we compared the GAC_mscorlib_v3_5_0_0_cneutral_1.dll files but they all seem to be the same.

I tried to install .NET CF 3.5 using the Toradex .NET CF installer and it didn’t work.

The message shown was approximately
“Toradex NETCF was not install succesfully. Try to remove it using remove programs in the control panel and install again”

So we took the T30 out of the instrument. The instrument was in a temperature test chamber at 50 degrees Celsius. I put the module in the Colibri evaluation board and am able to get into the bootloader and try and boot it.

Below is the output from the image when it tries to boot. I suppose that because the zoneMask is set to error, we see less information. But I don’t even see any error messages or indication that it is trying to load the flash file system. There are no exceptions or errors here.

Toradex Windows CE 7.0 2.3 for Tegra Built Jul  1 2019 15:59:58
INFO:OALLogSetZones: dpCurSettings.ulZoneMask: 0xb
L2 cache enabled
MainMemoryEndAddress adjusted from 0x9F000000 to 0x9FE00000
Main Phys Mem: 0x80000000:0x9FDFFFFF
Carveout Phys: 0x9FE00000:0x9FFFFFFF
Cold boot selected
SMP: Active CPUs = 4
Extended Mem : 0xA0000000:0xBFFFFFFF
Chip Id: 0x30 (Handheld SOC) Major: 0x1 Minor: 0x3 SKU: 0xb0
ATE prog ver 4.0
Speedo: CPU: 327 (Corner: 1), Core: 212 (Corner: 0)
NVRM Initialized shmoo database
PllClocks(Mhz): X=1300, M=800, C=600, P=408, A=24.576
SysClocks(Mhz): CPU=1300, AVP=240, SysBus=240, Mem=400, EMem=800
GraphicClocks(Mhz): Host=133, 3D=133, 2D=133, Epp=133, Mpe=133, Vde=408

Should I try to use Platform Builder to load and boot an image from RAM? Or should I just re-flash the module?

Also, I just saw that BSP 2.4 has been released. Should we try that?

Brad

I tried to boot the T30 by going into the bootloader and selecting D to download the image into RAM and boot under target control from Platform Builder.

I get several stack traces and one assertion failure, and then the boot fails with a data abort.

FSDMGR!StoreDisk_t::GetStoreInfo(tagSTOREINFO * 0x00000000)  line 180 + 32 bytes
FSDMGR!PartitionDisk_t::Mount(void * 0xb0667ed0)  line 519
FSDMGR!StoreDisk_t::LoadPartition(const wchar_t * 0xb082fd44, int 0x00610050, int 0x00740072)  line 722 + 12 bytes
FSDMGR!StoreDisk_t::LoadExistingPartitions(int 0x00000042, int 0x03f00007)  line 750 + 20 bytes
FSDMGR!StoreDisk_t::MountPartitions(void * 0xb082fe54, int 0x00000000)  line 842
FSDMGR!StoreDisk_t::Mount(void * 0x03e60007, int 0xb0667a40)  line 849 + 16 bytes
FSDMGR!MountStore(const wchar_t * 0xb0667a40, const _GUID * 0xefece418, const wchar_t * 0x00004444, StoreDisk_t * * 0x00005555)  line 776 + 16 bytes
FSDMGR!PNPMountThread(void * 0x00004444)  line 1107
K.COREDLL!ThreadBaseFunc(unsigned long (void *)* 0x00000000, void * 0x00000000)  line 1240
00000004()

ASSERT ( !IsDetached( ) );

I am not sure what this really means, but hopefully it gives you some useful information.

Sorry, but I don’t understand your current situation. The screenshot in your first post indicates that the module booted successfully into WinCE and that you were able to start a lot of applications, but then got an error related to the .NET framework. Your latest posts give the impression that the module is not able to boot into WinCE at all. Is that correct?

Could you please try to flash the latest BSP release (2.4)?
Then enable debug messages over serial and collect a full log from the debug UART, right from module power-on.

The problem started after the unit had booted and been running for some days. That is what the screenshot was showing: the error message AFTER the unit was already started.

A couple of days later we rebooted the device to see whether a reboot would reload the Compact Framework. But the reboot did not work; the device would not boot up.

So something happened to the device while the device was running that caused it to be unbootable.

Do you really need us to flash the device with BSP 2.4? I already gave you a trace from BSP 2.3 from the debug UART, right from power-on. I even tried to download the OS to RAM from Platform Builder and boot the device that way, and gave you a stack trace from where that failed.

Brad

Sorry I forgot to say something.

We actually did reflash the device with an image based on BSP 2.4. AFTER we reflashed the device, it booted normally and we didn’t see the error again. I also ran a chkdisk on the internal eMMC flash disk and it reported no corrupted sectors at all.

So there is really no useful information beyond what I already gave you… the device was recovered after I reflashed it.

Recovering the device by reflashing it is not a solution.

Please do not mark this closed. We need more help from you (or someone else there).

We cannot have devices in the field that just seem to lose their Compact Framework. Unfortunately, our company has spent a lot of time and money on this Compact Framework business and we cannot have it not working.

What good is that going to do on a recovered device that doesn’t have the problem anymore? This problem goes away after we reflash the module.

Most likely, one or several files were corrupted or deleted. That’s why .NET CF wasn’t able to initialize properly and WinCE didn’t boot after the restart. If you had enabled debug messages over serial and collected a log from the debug UART, it would have been possible to provide more specific information.

Could you please describe your system in detail, focusing on possible write operations to internal storage?

Toradex Windows CE 7.0 2.3 for Tegra Built Jul 1 2019 15:59:58
INFO:OALLogSetZones: dpCurSettings.ulZoneMask: 0xb
L2 cache enabled
MainMemoryEndAddress adjusted from 0x9F000000 to 0x9FE00000
Main Phys Mem: 0x80000000:0x9FDFFFFF
Carveout Phys: 0x9FE00000:0x9FFFFFFF
Cold boot selected
SMP: Active CPUs = 4
Extended Mem : 0xA0000000:0xBFFFFFFF
Chip Id: 0x30 (Handheld SOC) Major: 0x1 Minor: 0x3 SKU: 0xb0
ATE prog ver 4.0
Speedo: CPU: 327 (Corner: 1), Core: 212 (Corner: 0)
NVRM Initialized shmoo database
PllClocks(Mhz): X=1300, M=800, C=600, P=408, A=24.576
SysClocks(Mhz): CPU=1300, AVP=240, SysBus=240, Mem=400, EMem=800
GraphicClocks(Mhz): Host=133, 3D=133, 2D=133, Epp=133, Mpe=133, Vde=408

That was the output of the debug UART when we had the problem.

After flashing 2.4 it booted normally, do you want the output of that?

*Most likely, one or several files were corrupted or deleted.*

That is what we were thinking as well, but which files? Is the situation recoverable without reflashing the whole operating system? Can we check the health of the volumes when the system is newly started?
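One way the "which files?" question might be approached, assuming the file system of a known-good unit and of the failing unit can each be copied to a PC, is to compare hash manifests of the two copies. Below is a minimal Python sketch of that idea; the function names and the directory layout are hypothetical, and it runs on the host PC, not on the T30.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(good: dict, suspect: dict) -> dict:
    """Report files that are missing, changed, or extra in the suspect copy."""
    common = good.keys() & suspect.keys()
    return {
        "missing": sorted(set(good) - set(suspect)),
        "changed": sorted(k for k in common if good[k] != suspect[k]),
        "extra": sorted(set(suspect) - set(good)),
    }
```

Calling `diff_manifests(build_manifest(Path("good_unit")), build_manifest(Path("failing_unit")))` (both directory names are placeholders) would list the files that differ. Files that legitimately change at runtime, such as logs, would show up as "changed" and need to be filtered out by hand.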

The concerning aspect of this bug is that the corruption happened AFTER boot. That is new for us. We had seen these CF Init errors on bootup, and we are in the process of securing our bootloader to prevent the issue described in the quote below:

Do you have RX pin of UART connected to pullup? If not than you can have an issue of registry erasing. .NET CF Initialization Error With VF61, Win EC7 and CF 3.5 - #5 by luka.tx

But that doesn’t seem to be the problem here.

Could you please describe your system in detail, focusing on possible write operations to internal storage?

Sure. We have a custom carrier board that holds one Colibri T30 IT with 4GB of eMMC storage, and the carrier board has a receptacle for a micro-SD card. We use a 16GB class 10 card made by Kingston.
We write quite a few small files every minute to the SD card. We also write a large amount of logging information to this card every 10-30 seconds. Even though we buffer this info, we are slowly seeing that this could corrupt the card.

Our image and the Compact Framework are installed on the eMMC. We use the default Toradex hive-based registry. We only really write to the eMMC when we can’t write to the SD card and the system is about to reboot. The rest of the writes go to volumes in RAM, and those are mostly signal files that our processes are looking for. So we are rather surprised that the eMMC was corrupted. As far as we know, the number of writes should be relatively small.

Is that enough detail? Or is there something I might have missed?

Although the internal eMMC controller should take care of data integrity, in some cases, disk writing just before a reboot may lead to data corruption.

Please note that exFAT is not a transactional file system, which means that it does not guarantee transactional safety in the same way that some other file systems, such as NTFS or ZFS, do.

While exFAT does have some mechanisms for recovering from errors and ensuring data integrity, it does not provide the same level of transactional safety as some other file systems. Therefore, if transactional safety is a critical requirement for your application or use case, you may want to consider using a different file system that provides stronger guarantees in this area.

Please also double-check your settings to rule out write-back caching at the disk and/or file level.
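To illustrate the kind of write pattern that tends to narrow the corruption window on a non-transactional file system, here is a minimal sketch, written in Python purely for illustration. It writes to a temporary file, flushes it to storage, and only then replaces the target. The function name `durable_write` is hypothetical. On the device itself the corresponding calls would differ (for example, Win32 `FlushFileBuffers`), and whether a rename is atomic on exFAT is an assumption this sketch cannot guarantee.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to a temp file, flush it to storage, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()             # push Python's buffer to the OS
            os.fsync(fh.fileno())  # ask the OS to push its cache to the medium
        os.replace(tmp_path, path)  # the old file stays intact until this point
    except BaseException:
        # Clean up the partial temp file and re-raise the original error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The design intent is that an interruption before the final replace leaves the previous version of the file untouched, so a write issued shortly before a reboot is less likely to leave a half-written file behind.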