RTC not working on Viola board

Dear Toradex:

We have noticed that the RTC on the Viola Plus 1.2A is not working: it does not retain the time and date.
The effect is reproducible.

With the battery inserted for the first time, some boards give us this message:

[    1.101197] rtc-ds1307 0-0068: rtc core: registered m41t00 as rtc0
[    1.111920] snvs_rtc 400a7034.snvs-rtc-lp: rtc core: registered 400a7034.snvs-rtc-l as rtc1
[    2.299854] rtc-ds1307 0-0068: setting system clock to 2001-01-01 00:01:07 UTC (978307267)

Other boards give us the following:

[    1.101118] rtc-ds1307 0-0068: rtc core: registered m41t00 as rtc0
[    1.111837] snvs_rtc 400a7034.snvs-rtc-lp: rtc core: registered 400a7034.snvs-rtc-l as rtc1
[    2.217886] rtc-ds1307 0-0068: hctosys: unable to read the hardware clock

With the system date synchronized, we write it to the hardware clock with hwclock -w. The output of hwclock -c shows the change:

root@zic-5110:~# hwclock -c
hw-time      system-time         freq-offset-ppm   tick
 978308238   1502459601.401509
 978308250   1502459613.402800               108      1
 978308261   1502459624.406764               228      2
 978308272   1502459635.404667                93      1
 978308283   1502459646.405208                82      1
1502459656   1502459656.496427          -1000000   -10000
1502459667   1502459667.496238          -1000000   -10000
1502459678   1502459678.496018          -1000000   -10000
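
As a rough illustration of how to read the table above, the following Python sketch parses the first two columns and prints the difference between hardware time and system time for each sample. The function name parse_hwclock_compare and the SAMPLE string are made up for this example; the numbers are copied from the output above.

```python
# Sample rows copied from the hwclock -c output above.
SAMPLE = """\
hw-time      system-time         freq-offset-ppm   tick
 978308238   1502459601.401509
 978308250   1502459613.402800               108      1
 978308261   1502459624.406764               228      2
 978308272   1502459635.404667                93      1
 978308283   1502459646.405208                82      1
1502459656   1502459656.496427          -1000000   -10000
1502459667   1502459667.496238          -1000000   -10000
1502459678   1502459678.496018          -1000000   -10000
"""


def parse_hwclock_compare(text):
    """Return a list of (hw_time, system_time, hw_time - system_time)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            hw_time = int(fields[0])
            system_time = float(fields[1])
        except ValueError:
            continue  # header line or shell prompt, not a sample
        rows.append((hw_time, system_time, hw_time - system_time))
    return rows


for hw_time, system_time, offset in parse_hwclock_compare(SAMPLE):
    print(f"{hw_time:>11}  offset {offset:+.3f} s")
```

The first five samples are more than 500 million seconds behind the system time (the RTC is still counting from 2001); from the sixth sample on, the offset is below one second, which is where hwclock -w took effect.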

Now the RTC has the correct time. If we unplug the carrier board from the power supply, wait ten minutes (or less) and then power the board on again, dmesg shows the following:

[    1.101457] rtc-ds1307 0-0068: rtc core: registered m41t00 as rtc0
[    1.112199] snvs_rtc 400a7034.snvs-rtc-lp: rtc core: registered 400a7034.snvs-rtc-l as rtc1
[    2.306225] rtc-ds1307 0-0068: setting system clock to 2001-01-01 00:02:09 UTC (978307329)

This shows that the RTC has reset to its default value. We tested this with several types of batteries:
Panasonic BR1225
Maxell CR1216
Maxell CR1220

with identical results.
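
To check many boards or battery types for this reset, the boot log can be tested automatically. The sketch below is only an illustration: it assumes the RTC restarts counting from about 2001-01-01 00:00:00 UTC (as the logs above suggest), and the names rtc_boot_epoch, looks_reset and the one-hour window are invented for this example.

```python
import re
from datetime import datetime, timezone

# Assumption: after losing power the RTC restarts from 2001-01-01 00:00:00 UTC.
DEFAULT_EPOCH = int(datetime(2001, 1, 1, tzinfo=timezone.utc).timestamp())

# Matches the kernel line "... setting system clock to <date> UTC (<epoch>)".
SET_CLOCK = re.compile(r"setting system clock to .* UTC \((\d+)\)")


def rtc_boot_epoch(dmesg_text):
    """Return the epoch the kernel read from the RTC at boot, or None."""
    match = SET_CLOCK.search(dmesg_text)
    return int(match.group(1)) if match else None


def looks_reset(epoch, window_s=3600):
    """True if the epoch lies within window_s seconds after the assumed default."""
    return 0 <= epoch - DEFAULT_EPOCH < window_s


line = ("[    2.306225] rtc-ds1307 0-0068: setting system clock to "
        "2001-01-01 00:02:09 UTC (978307329)")
epoch = rtc_boot_epoch(line)
print(epoch, datetime.fromtimestamp(epoch, tz=timezone.utc), looks_reset(epoch))
```

A boot that logs "hctosys: unable to read the hardware clock" instead has no such line, so rtc_boot_epoch returns None and has to be handled separately.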

Worse things happen on some boards: when we disconnect the power and reconnect it within five seconds, the boot process gets stuck with these systemd messages:

[    2.642635] random: systemd urandom read with 42 bits of entropy available
[    2.665832] systemd[1]: systemd 215 running in system mode. (+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR)
[    2.690365] systemd[1]: Detected architecture 'arm'.
[    2.727011] systemd[1]: Set hostname to <RTT74>.
[    3.268849] systemd[1]: Cannot add dependency job for unit display-manager.service, ignoring: Unit display-manager.service failed to load: No such file or directory.
[    3.305644] systemd[1]: Time has been changed
[    3.314881] systemd[1]: Time has been changed
[    3.323865] systemd[1]: Time has been changed
[    3.332812] systemd[1]: Time has been changed
[    3.341702] systemd[1]: Time has been changed
[    3.350463] systemd[1]: Time has been changed
[    3.359054] systemd[1]: Time has been changed
[    3.367431] systemd[1]: Time has been changed
[    3.375633] systemd[1]: Time has been changed
[    3.383597] systemd[1]: Time has been changed
[    3.391403] systemd[1]: Time has been changed
[    3.399144] systemd[1]: Time has been changed
[    3.406672] systemd[1]: Time has been changed

If we disconnect the power for ten minutes it boots again…

It appears to be a kernel bug. Is there an update of the Linux kernel? Any workaround?

Thanks

The RTC sometimes provides a random but valid (in the sense that it could be real) date/time when powering up for the first time. This was a bug in the RTC driver and is fixed in our newer BSP, see:
http://developer.toradex.com/software/linux/linux-software/release-details?view=all&issue=25933

These are the two changes, currently only in our Linux 4.4 tree, but they should be fairly easy to backport:
http://git.toradex.com/cgit/linux-toradex.git/commit/?h=toradex_vf_4.4&id=41e7fd58038ff9110b0ccf39f46de927d36401d7
http://git.toradex.com/cgit/linux-toradex.git/commit/?h=toradex_vf_4.4&id=f9fc0f2f0ab206a613230f81c751f683072cced1

The "Time has been changed" message is a follow-on error which happens when the random time read from the RTC is beyond the current UNIX time epoch. The issue should not appear once the RTC readout of an invalid time is prevented with the above fix.
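
Until the fix is backported, a user-space plausibility check on the RTC value could serve as a stopgap. The following sketch is not the driver fix from the commits above. It assumes that "beyond the current UNIX time epoch" means a value a signed 32-bit time_t cannot hold, and that anything older than a chosen reference date (here 2017-01-01, picked arbitrarily for this example) cannot be a real reading. The name is_plausible_rtc_time is hypothetical.

```python
from datetime import datetime, timezone

# Assumption: largest value a signed 32-bit time_t can hold (2038-01-19 03:14:07 UTC).
TIME_T_MAX_32 = 2**31 - 1

# Assumption: no valid reading can predate this reference date (e.g. the image build date).
EARLIEST = int(datetime(2017, 1, 1, tzinfo=timezone.utc).timestamp())


def is_plausible_rtc_time(epoch, earliest=EARLIEST, latest=TIME_T_MAX_32):
    """True if an RTC reading (seconds since 1970) lies in the accepted range."""
    return earliest <= epoch <= latest


for epoch in (978307329, 1502459656, 2**31):
    print(epoch, is_plausible_rtc_time(epoch))
```

With these bounds, the reset value 978307329 from the logs above and anything past 2038 are rejected, while the synchronized time 1502459656 is accepted.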