PREEMPT_RT with Linux 4.4.21 on VF61 with BSP 2.6.1 - Problems with M4 using Remoteproc

Hi there,

I applied the PREEMPT_RT patch without problems to the linux-toradex kernel (4.4.21) supplied as part of BSP 2.6.1beta, the same BSP that contains the modules, device trees and sources for FreeRTOS support on the M4 core.

As for that, no problems at all. The system runs well. But I have problems autoloading the vf610_cm4_rproc module in this PREEMPT_RT kernel on the VF61.

With the module enabled at startup, most of the time the boot freezes, either while loading the systemv / random seed services/driver or while loading the M4 core itself. Only a few times does the system boot without problems.

Another problem is that, even when loading the modules after Linux has booted, I cannot get the imx_rproc_tty module fully working. In other words, after a “modprobe imx_rproc_tty”, I don’t get a /dev/ttyRPMSG to use.

Could anyone give me a hand on this?
I think the source of the problem could be how the vf610_cm4_rproc or the random modules deal with the PREEMPT_RT “way”.

Thanks.
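
For the missing /dev/ttyRPMSG node described above, a quick check after the modprobe can at least show whether the module is loaded and whether any device node was created. This is only a sketch: the helper name list_rpmsg_ttys is made up for illustration, and the ttyRPMSG* pattern assumes the node name may carry a suffix, which the thread does not confirm.

```shell
#!/bin/sh
# list_rpmsg_ttys [dir]: print every ttyRPMSG* entry found in dir (default /dev).
# Hypothetical helper; the "ttyRPMSG*" pattern is an assumption about the node name.
list_rpmsg_ttys() {
    dir="${1:-/dev}"
    for node in "$dir"/ttyRPMSG*; do
        # An unmatched glob stays literal, so test that the entry really exists.
        if [ -e "$node" ]; then
            echo "$node"
        fi
    done
    return 0
}

# /proc/modules lists the loaded modules, one per line, name first.
if grep -q '^imx_rproc_tty ' /proc/modules 2>/dev/null; then
    echo "imx_rproc_tty is loaded"
else
    echo "imx_rproc_tty is not loaded"
fi

found=$(list_rpmsg_ttys /dev)
echo "rpmsg tty nodes: ${found:-none}"
```

If the module shows as loaded but no node appears, the kernel log right after the modprobe is probably the next place to look.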

Sorry, but to confirm, do you see problems even without the PREEMPT_RT patch or is this only with the PREEMPT_RT? We never tested remoteproc/FreeRTOS with a PREEMPT_RT enabled kernel.

Hi sanchayan.tx,
The problems appear only with the PREEMPT_RT kernel.

I raised a ticket internally for tracking this (however, we have not scheduled it yet) and also tested the -rt31 patch at my end. Enabling kernel debug options shows issues that are not necessarily in remoteproc, but in quite a few other places as well.

Thanks for your attention.

Well… That’s odd.
I’ll activate Kernel Debug options here too.

Based on your reproduction of this issue, does it only occur when remoteproc is enabled? Just to confirm.

Our plan is to use the FreeRTOS on M4 and Linux+PREEMPT_RT.

I also got stack traces from the VF61 AC97 sound driver, tty and virtual console, even when remoteproc was not configured to load through the configuration file in /etc/modules-load.d.

Ok!

Thanks.

Hi @sanchayan.tx,

Did you get any news about this topic/ticket?

Now I’m using the 4.4.39 kernel with the rt50 patch. I’ve disabled the remoteproc, USB and audio drivers/modules, and even so I’m still having a “random” boot stuck issue.

In other words, I could say that in roughly 5 out of 10 boots my 4.4.39-rt kernel gets stuck.

Thanks for your attention.

And another curious thing: the watchdog is not rebooting the system, even after 60 seconds or more.
How could I activate it?

@sanchayan.tx, what are the kernel debug options that you’ve enabled?

Do you have a stacktrace when the kernel gets stuck? Is it always at the same line?

Do you have the lockup detector enabled (CONFIG_LOCKUP_DETECTOR), and if yes do you get a trace from it after a while?

There is a possible deadlock in the serial driver. We haven’t seen it happen in our BSP so far, but I guess the RT patch could make a difference. Unfortunately, the patch does not apply directly on our tree.
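
To answer the debug-option questions above, it can help to check what the kernel build actually has enabled. The following is a minimal sketch, assuming a kernel .config file is passed as the first argument. CONFIG_LOCKUP_DETECTOR is the option named in this thread; the other three are real Kconfig symbols picked here as plausible candidates, not the list that was actually used, and the function names are made up for illustration.

```shell
#!/bin/sh
# config_enabled <config-file> <option>: succeed if the option is built in (=y).
config_enabled() {
    grep -q "^$2=y\$" "$1"
}

# report_debug_options <config-file>: print the state of a few debug options.
# CONFIG_LOCKUP_DETECTOR is the one named in the thread; the others are
# assumptions about what may be useful, not a confirmed list.
report_debug_options() {
    for opt in CONFIG_LOCKUP_DETECTOR CONFIG_DETECT_HUNG_TASK \
               CONFIG_PROVE_LOCKING CONFIG_DEBUG_ATOMIC_SLEEP; do
        if config_enabled "$1" "$opt"; then
            echo "$opt=y"
        else
            echo "$opt is not set to y"
        fi
    done
}

# Run against the kernel build tree's .config when one is given.
if [ -n "${1:-}" ] && [ -r "$1" ]; then
    report_debug_options "$1"
fi
```

Saved under a name of your choice, it would be run from the kernel build directory with .config as its argument.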

Hi @stefan.tx,

Sometimes the boot process goes well, nothing happens.

But most of the freezes happened during the systemd random generation step, as in this part of the boot log:

    [    1.929358] systemd[1]: systemd 226 running in system mode. (+PAM -AUDIT -SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP -LIBCRYPTSETUP -GCRYPT +GNUTLS +ACL +XZ -LZ4 -SECCOMP +BLKID -ELFUTILS +KMOD +IDN)
    [    1.949113] systemd[1]: Detected architecture arm.
    
    Welcome to The Ångström Distribution v2015.12!
    
    [    1.978135] systemd[1]: Set hostname to <colibri-vf>.
    [    2.033913] random: systemd: uninitialized urandom read (16 bytes read, 50 bits of entropy available)
    [    2.189426] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read,

Nothing more is printed after this line, or the freeze comes just a few steps after it.

A few other times the boot got stuck during the Ethernet setup, as in this part:

             Starting WPA supplicant...
             Starting Hostname Service...
    [  OK  ] Started Network Name Resolution.
    [   11.256413] fec 400d1000.ethernet eth0: Freescale FEC PHY driver [Micrel KSZ8041] (mii_bus:phy_addr=400d1000.etherne:00, irq=-1)
    [   11.281984] IPv6: ADDRCONF(NETDEV_UP):

Here too, nothing more is printed, and I have to reset the module manually.

CONFIG_LOCKUP_DETECTOR was already set to =y in my defconfig.

Unfortunately, even with this option enabled, I don’t get any trace from it after the point at which the boot freezes.
The machine literally stops responding and a “manual” reset is necessary.

Hm, do you have a display connected to it and a fbdev console?

In case the lockup happens due to the UART, it would be expected that the UART does not show anything anymore. But if that is the case, the fbdev console should still show something, at least the lockup detection stuck trace…

@andrecurvello, I could reproduce your issue. It is clearly serial console related: when using setenv setup setenv setupargs console=tty1 to disable the serial console, the kernel did not freeze anymore.

I could fix the issue by using a newer version of the driver (copy the file drivers/tty/serial/fsl_lpuart.c from Linux v4.9). We will update the v4.4 version of that driver to reflect all changes we pushed to v4.9 soon.

@andrecurvello, there is now a patch on our -next branch which resynchronizes the lpuart driver to the latest upstream version.

Hi @stefan.tx,

I’ve tested it here.

With the Preemption Model set to "Preemptible Kernel (Basic RT)", it runs OK, with no problems during boot.

But with the Preemption Model set to "Fully Preemptible Kernel (RT)", it still freezes sometimes during boot, at more or less the same point I showed before (systemd setup, etc.).

Is there any way I could record a log or other kind of information that could help with this problem?

I guess that is with the latest patch from -next? On what percentage of boots does it happen for you now?

Can you verify that it does not happen when using setenv setup setenv setupargs console=tty1?
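
To put a number on how often the freeze occurs, one option is to capture the serial console output of each boot into its own file on a host PC and count how many logs reach the login prompt. This is a rough sketch under several assumptions: one *.log file per boot, all in one directory, and "login:" as the marker of a completed boot. The function name boot_stats and the file layout are hypothetical. It only makes sense while the serial console is enabled.

```shell
#!/bin/sh
# boot_stats <log-dir> [marker]: count captured boot logs that contain the marker.
# Assumes one *.log file per boot; "login:" as the default marker is a guess at
# what a completed boot prints last on the serial console.
boot_stats() {
    dir="$1"
    marker="${2:-login:}"
    total=0
    ok=0
    for log in "$dir"/*.log; do
        # Skip the literal pattern when the directory holds no logs.
        [ -f "$log" ] || continue
        total=$((total + 1))
        if grep -q -- "$marker" "$log"; then
            ok=$((ok + 1))
        fi
    done
    echo "booted=$ok frozen=$((total - ok)) total=$total"
}

# Summarise the directory given on the command line, if any.
if [ -n "${1:-}" ] && [ -d "$1" ]; then
    boot_stats "$1"
fi
```

The last lines of each log counted as frozen also show where that boot stopped, which helps answer whether it is always at the same point.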

Hi @stefan.tx,

I’ve tested it here with the command you suggested in U-Boot:
setenv setup setenv setupargs console=tty1

It worked well.

I’ve tested here with repeated boot sequences, and all of them completed OK.

What could it be?

If you can live without the serial console, I guess this would be a viable workaround…

However, I booted the system several times here, and could not reproduce the issue with the latest patches on -next. Are you absolutely sure you integrated the patch mentioned above? E.g. when running dmesg | head, does the Linux kernel banner show the git hash of that commit (fa3f45c2)? Does it happen on every boot for you?
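
The banner check asked for above can be scripted roughly as follows. This is a sketch with assumptions: it reads the banner from /proc/version rather than dmesg, and it compares only the first 7 characters of the hash, because the abbreviated hash that the kernel build appends to the version string may be shorter than the 8 characters quoted here. The function name banner_has_commit is made up for illustration. The hash only appears if the kernel was built from a git tree with that commit checked out, so extra local commits on top would change it.

```shell
#!/bin/sh
# banner_has_commit <banner-text> <hash>: succeed if the banner mentions the hash.
# Only the first 7 characters are compared, since the abbreviation length in the
# kernel version string can differ from the hash quoted in the thread.
banner_has_commit() {
    short=$(printf '%s' "$2" | cut -c1-7)
    printf '%s\n' "$1" | grep -q -- "$short"
}

# Check the running kernel when /proc/version is readable.
if [ -r /proc/version ]; then
    if banner_has_commit "$(cat /proc/version)" "fa3f45c2"; then
        echo "kernel banner mentions fa3f45c2"
    else
        echo "fa3f45c2 not found in kernel banner"
    fi
fi
```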