We’re seeing an exception in the field which causes a system reset, but it only happens once every 8 to 10 days. It appears to be related to the i2c driver. We are using Win CE 7 image 2.3 on a T20 with Toradex library version 2.4.
The product which shows this problem uses only one i2c port in order to conform to the standard Colibri pin-out and provide an upgrade path to the iMX6/7 family of modules. Whenever we need to access the RTC on the board to update the time, or to call RTCSYNC, we close the i2c port in our software, let the OS access the i2c port to talk to the RTC, and then re-open the i2c port. We do this every 6 hours to call RTCSYNC to prevent the clock in the T20 from drifting too far. We’ve been doing this with other products for years without a problem, but those other products use a dedicated i2c port for the RTC so we don’t need to close and reopen it.
The exception occurs sometimes when attempting to reopen the i2c port after calling RTCSYNC. The basic steps we follow are:
- Get unnamed mutex
- Call I2c_Close()
- Call I2c_Deinit()
- Use CreateProcess() to spawn RTCSYNC.exe
- Wait for the RTCSYNC process to complete
- Release resources from CreateProcess()
- Call I2c_Init()
- Call I2c_Open()
- Release unnamed mutex
- Wait for 6 hours
- Go back to step 1 and repeat
The release build of the code crashes with exception code 0xc0000005.
We were able to catch the problem while running the debugger after 8 days and saw these messages during the call to I2c_Init():
I2C Error: .\src\NvI2C.c, 36: Failed to get address of NvRmOpen
I2C Error: .\src\NvI2C.c, 36: Failed to get address of NvRmI2cClose
I2C Error: .\src\NvI2C.c, 36: Failed to get address of NvRmI2cTransaction
Unhandled exception at 0x00000000 in C18Prime.exe: 0xC0000005: Access violation reading location 0x00000000
We tried to accelerate the problem by going through the steps listed above every couple of seconds instead of once every 6 hours, but that did not actually make the problem happen more often.
Some questions to get the conversation started:
- Is it necessary to close and reopen the i2c port before calling RTCSYNC? I think it is, based on some experiments.
- Is it worth upgrading the library to version 2.5? There is nothing relevant in the change log
- Should we use the mutex named “I2C” to synchronize our code with RTCSYNC? We used to use that mutex with the old style Toradex library, but stopped using it when we moved to the new library. The whole procedure listed above is protected by an unnamed mutex.
Here is a copy of the call stack after the exception occurs. I think this shows that the exception is happening inside the Toradex library as part of the call to I2c_Init().
If your app is using the same I2C channel as the RTC, you should use the “I2C” mutex.
It’s enough to call I2c_Close() to release the mutex. You should call I2c_Init() only once and call I2c_Deinit() only when you close your app.
It’s highly recommended to update your libs to version 2.5.
Thanks for your quick reply @alex.tx. I have upgraded my library to version 2.5, I have added a named mutex called “I2C”, and I now only call I2c_Init() and I2c_Deinit() once in my application. This code does work, but I need to let it run for 10 days or more to know if this fixes the problem reported in the field.
I do have one question for you about the sentence “It’s enough to do the I2c_Close() to release a mutex” which you wrote. Does that mean that the i2c library from Toradex uses the global, named mutex with the name “I2C” and that I2c_Open() will get that named mutex and I2c_Close() will release that named mutex?
If I don’t use the named mutex in my code and assume that I2c_Open() and I2c_Close() do use that named mutex, then I get this error when I try to open the port again after calling RTCSYNC without calling I2c_Deinit():
I2C Error: .\src\i2c_teg.c, 887: TimeOut waiting for Mutex
I remember now that I saw that error message when I first moved to the new Toradex library, and that I found that if I called I2c_Deinit() before spawning RTCSYNC then I could avoid that error message. So it seems that, to avoid having to call I2c_Deinit() every time RTCSYNC is spawned, it is actually necessary to create the named mutex “I2C” in my code and then get and release it when opening and closing the i2c port.
Does that make sense? i.e. when you wrote “you should use the ‘I2C’ mutex” what you really meant was “you must use the ‘I2C’ mutex”?
@alex.tx I realized that my first attempt to fix this didn’t actually work and I learned something important while figuring it out. I thought I’d share that learning with the community in case anyone else falls into this trap. The root cause of my issue was that I was calling I2c_Close() from a different thread than the thread which had called I2c_Open(). Your comment about I2c_Open() getting the global I2C mutex and I2c_Close() releasing it made me realize that I2c_Close() must be called from the same thread because you can’t release a mutex from a different thread than the one that owns it.
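The ownership rule above can be reproduced off-target. The following is only a minimal sketch using a POSIX error-checking mutex as a stand-in for the Win32 mutex (on Windows, ReleaseMutex() fails with ERROR_NOT_OWNER in the same situation). The names port_mutex, release_from_other_thread() and unlock_from_wrong_thread() are made up for this illustration and are not part of the Toradex library.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_ERRORCHECK on strict compilers */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the global "I2C" mutex. */
static pthread_mutex_t port_mutex;

/* Runs on a second thread and tries to release a mutex it does not own. */
static void *release_from_other_thread(void *result)
{
    *(int *)result = pthread_mutex_unlock(&port_mutex);
    return NULL;
}

/* Returns the error code the non-owning thread got back from unlock. */
int unlock_from_wrong_thread(void)
{
    pthread_mutexattr_t attr;
    pthread_t other;
    int other_result = -1;
    int rc;

    /* An error-checking mutex reports misuse instead of leaving it undefined. */
    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&port_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_mutex_lock(&port_mutex);      /* this thread now owns the mutex */
    assert(rc == 0);
    rc = pthread_create(&other, NULL, release_from_other_thread, &other_result);
    assert(rc == 0);
    pthread_join(other, NULL);

    rc = pthread_mutex_unlock(&port_mutex);    /* the owner can still release it */
    assert(rc == 0);
    pthread_mutex_destroy(&port_mutex);
    (void)rc;
    return other_result;
}
```

Calling unlock_from_wrong_thread() returns EPERM: the second thread's release is refused and the mutex stays held until the owning thread releases it.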
What was happening to me was that I called I2c_Close() in order to release the i2c port for RTCSYNC to use it. But the call to I2c_Close() was failing because the previous call to WaitForSingleObject() to acquire that mutex came from a different thread. I then found that I had to call I2c_Deinit() after calling CloseHandle() for the global I2C mutex, which does release the mutex even though the call comes from the wrong thread. I then had to call I2c_Init() after RTCSYNC was finished and then call I2c_Open().
After I realized what was going on I was able to simplify my code so that one thread owns the i2c port, and if another thread needs to close it then it signals to the main thread to ask it to close the port. That way I only ever call I2c_Deinit() from one thread. Now I can successfully call RTCSYNC by simply closing the port, calling RTCSYNC and then re-opening the port. This also means that I don’t need to explicitly use the global I2C mutex in my code, I just let I2c_Close() handle that for me.
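For anyone who wants to see the "one thread owns the port" arrangement in code, here is a rough sketch of the idea. It is not my production code. It uses POSIX threads so it can be tried off-target, and port_open_stub() / port_close_stub() are hypothetical stand-ins for I2c_Open() / I2c_Close(). All other names (port_owner, run_with_port_released(), fake_rtcsync(), and so on) are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;      /* protects the fields below */
    pthread_cond_t  changed;   /* broadcast whenever a field changes */
    pthread_mutex_t gate;      /* admits one requester at a time */
    pthread_t       thread;    /* the only thread that opens/closes the port */
    bool port_open, release_requested, shutdown;
    int  open_calls, close_calls;
} port_owner;

/* Hypothetical stand-ins for I2c_Open()/I2c_Close(): they only keep score. */
static void port_open_stub(port_owner *o)  { o->open_calls++;  o->port_open = true;  }
static void port_close_stub(port_owner *o) { o->close_calls++; o->port_open = false; }

static void *owner_main(void *arg)
{
    port_owner *o = arg;
    pthread_mutex_lock(&o->lock);
    port_open_stub(o);
    pthread_cond_broadcast(&o->changed);
    while (!o->shutdown) {
        if (!o->release_requested) {
            pthread_cond_wait(&o->changed, &o->lock);
            continue;
        }
        port_close_stub(o);                    /* close on the owning thread */
        pthread_cond_broadcast(&o->changed);
        while (o->release_requested)           /* requester is using the bus */
            pthread_cond_wait(&o->changed, &o->lock);
        port_open_stub(o);                     /* reopen on the owning thread */
        pthread_cond_broadcast(&o->changed);
    }
    port_close_stub(o);
    pthread_mutex_unlock(&o->lock);
    return NULL;
}

/* Starts the owner thread and waits until it has opened the port. */
int port_owner_start(port_owner *o)
{
    pthread_mutex_init(&o->lock, NULL);
    pthread_cond_init(&o->changed, NULL);
    pthread_mutex_init(&o->gate, NULL);
    o->port_open = o->release_requested = o->shutdown = false;
    o->open_calls = o->close_calls = 0;
    if (pthread_create(&o->thread, NULL, owner_main, o) != 0)
        return -1;
    pthread_mutex_lock(&o->lock);
    while (!o->port_open)
        pthread_cond_wait(&o->changed, &o->lock);
    pthread_mutex_unlock(&o->lock);
    return 0;
}

/* Asks the owner thread to close the port for good and waits for it to exit. */
void port_owner_stop(port_owner *o)
{
    pthread_mutex_lock(&o->lock);
    o->shutdown = true;
    pthread_cond_broadcast(&o->changed);
    pthread_mutex_unlock(&o->lock);
    pthread_join(o->thread, NULL);
}

/* Callable from any thread: asks the owner to close the port, runs job,
   then lets the owner reopen the port before returning. */
void run_with_port_released(port_owner *o, void (*job)(void *), void *arg)
{
    assert(job != NULL);
    pthread_mutex_lock(&o->gate);
    pthread_mutex_lock(&o->lock);
    o->release_requested = true;
    pthread_cond_broadcast(&o->changed);
    while (o->port_open)
        pthread_cond_wait(&o->changed, &o->lock);
    pthread_mutex_unlock(&o->lock);

    job(arg);                                  /* e.g. spawn RTCSYNC and wait */

    pthread_mutex_lock(&o->lock);
    o->release_requested = false;
    pthread_cond_broadcast(&o->changed);
    while (!o->port_open)
        pthread_cond_wait(&o->changed, &o->lock);
    pthread_mutex_unlock(&o->lock);
    pthread_mutex_unlock(&o->gate);
}

typedef struct { port_owner *owner; bool port_was_open; int runs; } job_ctx;

/* Example job standing in for "spawn RTCSYNC.exe and wait for it". */
void fake_rtcsync(void *arg)
{
    job_ctx *c = arg;
    pthread_mutex_lock(&c->owner->lock);
    c->port_was_open = c->owner->port_open;
    pthread_mutex_unlock(&c->owner->lock);
    c->runs++;
}
```

A caller would start the owner with port_owner_start(), call run_with_port_released(&owner, fake_rtcsync, &ctx) whenever the bus must be handed over, and call port_owner_stop() at shutdown. The requesting thread never touches the open/close calls itself, which is the point of the design.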
It seems that the library functions to read and write to the i2c port are thread-safe (I use a mutex to limit access to one thread at a time), but I2c_Open() and I2c_Close() are not thread-safe and can only be called from one thread. Are there any other parts of the Toradex library which are not thread-safe? I already call the init and open functions for the other library features only from my main thread and I keep them open until my application shuts down; it’s only the i2c port that I need to close while my application is running. So I think the rest of my library usage is OK.
In all our WinCE libraries, xxx_Init() (I2c_Init(), Gpio_Init(), etc.) does not modify any hardware register but loads all required components, such as the driver DLL. This is an expensive operation, so you should call xxx_Deinit() only if you don’t need to access this hardware block anymore. Usually this is done when the application terminates.
Once the HW is initialized, whenever you need to actually access it you should call xxx_Open(), use the HW, and then call xxx_Close(). With single-threaded access you can call xxx_Close() just before xxx_Deinit(). In a multithreaded application each thread should follow this pattern:
- Call xxx_Open()
- Do an atomic HW access (atomic from a hardware point of view), such as reading or writing a device register
- Call xxx_Close()
Open and close operations are lightweight, so it’s OK to invoke them often. Used this way, the library is completely “thread-safe”.
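The per-thread pattern can be sketched roughly as follows. This is an illustration only, written with POSIX threads so it runs off-target. hw_init(), hw_open(), hw_close() and hw_deinit() are hypothetical stand-ins for the xxx_Init(), xxx_Open(), xxx_Close() and xxx_Deinit() calls, and the assumption that open takes a lock which close releases is drawn from the discussion in this thread, not from the library source.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a hardware block guarded by the library. */
typedef struct {
    pthread_mutex_t bus_lock;  /* taken by hw_open(), released by hw_close() */
    unsigned        reg;       /* pretend device register */
} hw_block;

/* Expensive setup and teardown: done once per application run. */
void hw_init(hw_block *hw)   { pthread_mutex_init(&hw->bus_lock, NULL); hw->reg = 0; }
void hw_deinit(hw_block *hw) { pthread_mutex_destroy(&hw->bus_lock); }

/* Lightweight: fine to call around every single access. */
void hw_open(hw_block *hw)   { pthread_mutex_lock(&hw->bus_lock); }
void hw_close(hw_block *hw)  { pthread_mutex_unlock(&hw->bus_lock); }

enum { WORKERS = 4, ACCESSES_PER_WORKER = 1000 };

/* Each thread opens, does one read-modify-write, and closes again,
   always on the same thread that did the open. */
static void *worker(void *arg)
{
    hw_block *hw = arg;
    for (int i = 0; i < ACCESSES_PER_WORKER; i++) {
        hw_open(hw);
        unsigned value = hw->reg;   /* read the register */
        hw->reg = value + 1;        /* write it back */
        hw_close(hw);
    }
    return NULL;
}

/* Runs all workers against one block and returns the final register value. */
unsigned run_workers(void)
{
    hw_block hw;
    pthread_t threads[WORKERS];
    int started = 0;

    hw_init(&hw);                                  /* once, at startup */
    for (int i = 0; i < WORKERS; i++)
        if (pthread_create(&threads[started], NULL, worker, &hw) == 0)
            started++;
    assert(started > 0);
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    unsigned final_value = hw.reg;
    hw_deinit(&hw);                                /* once, at shutdown */
    return final_value;
}
```

With every access bracketed by open and close on the same thread, no increment is lost, so run_workers() returns WORKERS * ACCESSES_PER_WORKER when all workers start.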