Apalis imx6 mainline performance

I am having realtime latency issues on an imx6q toradex apalis som using the linux-yocto-rt 4.8.12 kernel.

I have the cpu-freq governer set to performance.

Worst latency:

  • ondemand governer : >5000us
  • performance governer: >3000us

But if I continuously toggle the hotpluggable cpu cores on and off
then I sometimes get one of the warning messages shown below.

After this my realtime performance is much improved.

Worst latency:

  • ondemand governer : ~250us
  • performance governer: ~130us

Does anyone know what’s going on here?

Thanks,
Mark

#
# WARNING 1 - was getting this with tickless kernel
#
root@atlas:~# sh toggle_one.sh 1
[   27.047710] ===============================
[   27.047712] [ INFO: suspicious RCU usage. ]
[   27.047719] 4.8.12-rt7-yocto-preempt-rt #17 Tainted: G        W      
[   27.047721] -------------------------------
[   27.047727] /home/myuser/poky-morty-toradex/build/tmp/work-shared/atlas/kernel-source/kernel/sched/fair.c:5412 suspicious rcu_dereference_check() usage!
[   27.047730] 
[   27.047730] other info that might help us debug this:
[   27.047730] 
[   27.047735] 
[   27.047735] RCU used illegally from offline CPU!
[   27.047735] rcu_scheduler_active = 1, debug_locks = 0
[   27.047741] 3 locks held by swapper/1/0:
[   27.047771]  #0:  ((cpu_died).wait.lock){......}, at: [<80165994>] complete+0x1c/0x44
[   27.047795]  #1:  (&p->pi_lock){-...-.}, at: [<80149cb0>] try_to_wake_up+0x34/0x320
[   27.047813]  #2:  (rcu_read_lock){......}, at: [<80152808>] select_task_rq_fair+0x100/0xb50
[   27.047816] 
[   27.047816] stack backtrace:
[   27.047823] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G        W       4.8.12-rt7-yocto-preempt-rt #17
[   27.047829] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   27.047834] Backtrace: 
[   27.047860] [<8010c518>] (dump_backtrace) from [<8010c6bc>] (show_stack+0x18/0x1c)
[   27.047873]  r7:00000000 r6:200f0093 r5:00000000 r4:80b2e008
[   27.047883] [<8010c6a4>] (show_stack) from [<8037a280>] (dump_stack+0xb4/0xe8)
[   27.047898] [<8037a1cc>] (dump_stack) from [<8016d758>] (lockdep_rcu_suspicious+0xbc/0x11c)
[   27.047913]  r9:80b13c84 r8:00000003 r7:809382e4 r6:00001524 r5:8093bdd8 r4:e48ab700
[   27.047925] [<8016d69c>] (lockdep_rcu_suspicious) from [<80153114>] (select_task_rq_fair+0xa0c/0xb50)
[   27.047936]  r7:80a43d40 r6:80b5ede8 r5:00000000 r4:00000000
[   27.047947] [<80152708>] (select_task_rq_fair) from [<80149ed0>] (try_to_wake_up+0x254/0x320)
[   27.047963]  r10:8093c144 r9:800f0093 r8:80b5edfe r7:00000000 r6:80152708 r5:e2c2e564
[   27.047968]  r4:e2c2e040
[   27.047982] [<80149c7c>] (try_to_wake_up) from [<8014a69c>] (wake_up_process+0x18/0x1c)
[   27.047997]  r10:8093c144 r9:80b13b4c r8:80b5edfe r7:80b13afc r6:200f0093 r5:80b1d540
[   27.048002]  r4:e3827d38
[   27.048012] [<8014a684>] (wake_up_process) from [<8016523c>] (swake_up_locked+0x28/0x40)
[   27.048020] [<80165214>] (swake_up_locked) from [<801659ac>] (complete+0x34/0x44)
[   27.048027]  r5:80b1d540 r4:80b1d544
[   27.048038] [<80165978>] (complete) from [<8010df94>] (arch_cpu_idle_dead+0x44/0xbc)
[   27.048049]  r7:80b13afc r6:00000002 r5:00000001 r4:80b13aa8
[   27.048059] [<8010df50>] (arch_cpu_idle_dead) from [<80165d68>] (cpu_startup_entry+0x174/0x1ac)
[   27.048065]  r5:00000000 r4:e48b6000
[   27.048074] [<80165bf4>] (cpu_startup_entry) from [<8010e168>] (secondary_start_kernel+0x15c/0x18c)
[   27.048081]  r7:80b5f9a0 r4:80b272c0
[   27.048089] [<8010e00c>] (secondary_start_kernel) from [<101015ec>] (0x101015ec)
[   27.048101]  r7:80b5f9a0 r6:10c0387d r5:00000051 r4:7488c06a
[   27.048594] CPU1: shutdown



#
# WARNING 2 - seem to be getting this with HZ1000
#
root@atlas:~# sh toggle_all.sh
disabling /sys/devices/system/cpu/cpu1/online
[   16.751149] CPU1: shutdown
enabling /sys/devices/system/cpu/cpu1/online
disabling /sys/devices/system/cpu/cpu2/online
[   17.932491] CPU2: shutdown
enabling /sys/devices/system/cpu/cpu2/online
disabling /sys/devices/system/cpu/cpu3/online
[   19.068119] CPU3: shutdown
enabling /sys/devices/system/cpu/cpu3/online
[   19.086985] 
[   19.086988] ======================================================
[   19.086990] [ INFO: possible circular locking dependency detected ]
[   19.087000] 4.8.12-rt7-yocto-preempt-rt #19 Tainted: G        W      
[   19.087003] -------------------------------------------------------
[   19.087007] sh/595 is trying to acquire lock:
[   19.087039]  (&p->pi_lock){-...-.}, at: [<80149cf0>] try_to_wake_up+0x34/0x320
[   19.087041] 
[   19.087041] but task is already holding lock:
[   19.087065]  (hrtimer_bases.lock){-.....}, at: [<80187e98>] __hrtimer_run_queues+0xfc/0x280
[   19.087067] 
[   19.087067] which lock already depends on the new lock.
[   19.087067] 
[   19.087069] 
[   19.087069] the existing dependency chain (in reverse order) is:
[   19.087077] 
[   19.087077] -> #3 (hrtimer_bases.lock){-.....}:
[   19.087097]        [<80799150>] _raw_spin_lock_irqsave+0x58/0x6c
[   19.087107]        [<80187780>] lock_hrtimer_base+0x30/0x60
[   19.087117]        [<80188494>] hrtimer_start_range_ns+0x24/0x1d8
[   19.087132]        [<801613d4>] enqueue_task_rt+0x2c4/0x318
[   19.087140]        [<8014bc24>] __sched_setscheduler+0x6c8/0xcf8
[   19.087149]        [<8014c2d4>] _sched_setscheduler+0x80/0x88
[   19.087156]        [<8014ca84>] sched_setscheduler+0x14/0x18
[   19.087171]        [<80125c5c>] ktimer_softirqd_set_sched_params+0x30/0x78
[   19.087183]        [<801461c4>] smpboot_thread_fn+0x114/0x2b4
[   19.087196]        [<801425a4>] kthread+0xd8/0xf8
[   19.087210]        [<801079f8>] ret_from_fork+0x14/0x3c
[   19.087219] 
[   19.087219] -> #2 (&rt_b->rt_runtime_lock){-.....}:
[   19.087231]        [<80798f28>] _raw_spin_lock+0x48/0x58
[   19.087241]        [<801612e4>] enqueue_task_rt+0x1d4/0x318
[   19.087249]        [<8014bc24>] __sched_setscheduler+0x6c8/0xcf8
[   19.087256]        [<8014c2d4>] _sched_setscheduler+0x80/0x88
[   19.087263]        [<8014ca84>] sched_setscheduler+0x14/0x18
[   19.087273]        [<80125c5c>] ktimer_softirqd_set_sched_params+0x30/0x78
[   19.087281]        [<801461c4>] smpboot_thread_fn+0x114/0x2b4
[   19.087290]        [<801425a4>] kthread+0xd8/0xf8
[   19.087298]        [<801079f8>] ret_from_fork+0x14/0x3c
[   19.087306] 
[   19.087306] -> #1 (&rq->lock){-...-.}:
[   19.087315]        [<80798f28>] _raw_spin_lock+0x48/0x58
[   19.087323]        [<80158db8>] task_fork_fair+0x28/0x168
[   19.087330]        [<8014b0a4>] sched_fork+0x13c/0x304
[   19.087339]        [<8011e710>] copy_process.part.3+0x41c/0x15d8
[   19.087347]        [<8011fa24>] _do_fork+0xa0/0x568
[   19.087354]        [<8011ff4c>] kernel_thread+0x34/0x3c
[   19.087362]        [<80793700>] rest_init+0x24/0x170
[   19.087375]        [<80a00d38>] start_kernel+0x304/0x360
[   19.087382]        [<1000807c>] 0x1000807c
[   19.087390] 
[   19.087390] -> #0 (&p->pi_lock){-...-.}:
[   19.087402]        [<8016d24c>] lock_acquire+0x70/0x90
[   19.087411]        [<80799150>] _raw_spin_lock_irqsave+0x58/0x6c
[   19.087421]        [<80149cf0>] try_to_wake_up+0x34/0x320
[   19.087431]        [<8014a6dc>] wake_up_process+0x18/0x1c
[   19.087440]        [<80126028>] wakeup_timer_softirqd+0x30/0x38
[   19.087450]        [<80126690>] raise_softirq_irqoff+0x58/0x60
[   19.087462]        [<80187f14>] __hrtimer_run_queues+0x178/0x280
[   19.087471]        [<80188a28>] hrtimer_interrupt+0xb0/0x1fc
[   19.087480]        [<80189024>] hrtimers_dead_cpu+0x194/0x1b8
[   19.087489]        [<80120b1c>] cpuhp_down_callbacks+0x44/0x88
[   19.087497]        [<80793b5c>] _cpu_down.constprop.6+0x1f4/0x3a8
[   19.087505]        [<80121ea0>] do_cpu_down+0x3c/0x60
[   19.087513]        [<80122344>] cpu_down+0x14/0x18
[   19.087529]        [<8045dfe8>] cpu_subsys_offline+0x14/0x18
[   19.087537]        [<80458a48>] device_offline+0x8c/0xbc
[   19.087543]        [<80458b54>] online_store+0x4c/0x7c
[   19.087561]        [<80457548>] dev_attr_store+0x20/0x2c
[   19.087577]        [<802849d0>] sysfs_kf_write+0x50/0x54
[   19.087586]        [<80283f28>] kernfs_fop_write+0xc8/0x1f4
[   19.087595]        [<8020dc68>] __vfs_write+0x20/0x40
[   19.087602]        [<8020dea0>] vfs_write+0xa8/0x1b8
[   19.087610]        [<8020e08c>] SyS_write+0x44/0x98
[   19.087619]        [<80107960>] ret_fast_syscall+0x0/0x1c
[   19.087622] 
[   19.087622] other info that might help us debug this:
[   19.087622] 
[   19.087634] Chain exists of:
[   19.087634]   &p->pi_lock --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
[   19.087634] 
[   19.087638]  Possible unsafe locking scenario:
[   19.087638] 
[   19.087640]        CPU0                    CPU1
[   19.087642]        ----                    ----
[   19.087647]   lock(hrtimer_bases.lock);
[   19.087652]                                lock(&rt_b->rt_runtime_lock);
[   19.087657]                                lock(hrtimer_bases.lock);
[   19.087661]   lock(&p->pi_lock);
[   19.087663] 
[   19.087663]  *** DEADLOCK ***
[   19.087663] 
[   19.087668] 10 locks held by sh/595:
[   19.087688]  #0:  (sb_writers#7){.+.+.+}, at: [<80211de4>] __sb_start_write+0x94/0xc0
[   19.087704]  #1:  (&of->mutex){+.+.+.}, at: [<80283eec>] kernfs_fop_write+0x8c/0x1f4
[   19.087720]  #2:  (s_active#59){.+.+.+}, at: [<80283ef4>] kernfs_fop_write+0x94/0x1f4
[   19.087738]  #3:  (device_hotplug_lock){+.+.+.}, at: [<80457ce4>] lock_device_hotplug_sysfs+0x14/4
[   19.087751]  #4:  (&dev->mutex){......}, at: [<804589fc>] device_offline+0x40/0xbc
[   19.087766]  #5:  (cpu_add_remove_lock){+.+.+.}, at: [<80121e84>] do_cpu_down+0x20/0x60
[   19.087780]  #6:  (cpu_hotplug.lock){++++++}, at: [<80121bdc>] cpu_hotplug_begin+0x0/0xd8
[   19.087795]  #7:  (cpu_hotplug.lock#2){+.+.+.}, at: [<80121c68>] cpu_hotplug_begin+0x8c/0xd8
[   19.087808]  #8:  (&hp->lock){+.+.+.}, at: [<80121044>] takedown_cpu+0x7c/0x170
[   19.087826]  #9:  (hrtimer_bases.lock){-.....}, at: [<80187e98>] __hrtimer_run_queues+0xfc/0x280
[   19.087829] 
[   19.087829] stack backtrace:
[   19.087839] CPU: 0 PID: 595 Comm: sh Tainted: G        W       4.8.12-rt7-yocto-preempt-rt #19
[   19.087843] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[   19.087848] Backtrace: 
[   19.087869] [<8010c518>] (dump_backtrace) from [<8010c6bc>] (show_stack+0x18/0x1c)
[   19.087882]  r7:00000000 r6:20010093 r5:00000000 r4:80b2e008
[   19.087892] [<8010c6a4>] (show_stack) from [<8037a2c0>] (dump_stack+0xb4/0xe8)
[   19.087904] [<8037a20c>] (dump_stack) from [<80168474>] (print_circular_bug+0x1d8/0x310)
[   19.087920]  r9:00000000 r8:e2d13700 r7:80c8a60c r6:80b622fc r5:80c8b92c r4:80c8752c
[   19.087931] [<8016829c>] (print_circular_bug) from [<8016cb68>] (__lock_acquire+0x15f8/0x1918)
[   19.087947]  r10:e2d13700 r9:81329598 r8:0000000a r7:80b2e12c r6:e2d13dc0 r5:e2d13de0
[   19.087953]  r4:80c8a60c r3:e2d13dc0
[   19.087963] [<8016b570>] (__lock_acquire) from [<8016d24c>] (lock_acquire+0x70/0x90)
[   19.087978]  r10:00000001 r9:80b2e12c r8:00000001 r7:00000001 r6:60010093 r5:00000000
[   19.087982]  r4:ffffe000
[   19.087997] [<8016d1dc>] (lock_acquire) from [<80799150>] (_raw_spin_lock_irqsave+0x58/0x6c)
[   19.088010]  r8:00000000 r7:00000000 r6:80149cf0 r5:20010093 r4:e487ae64
[   19.088027] [<807990f8>] (_raw_spin_lock_irqsave) from [<80149cf0>] (try_to_wake_up+0x34/0x320)
[   19.088036]  r6:00000003 r5:e487ae64 r4:e487a940
[   19.088048] [<80149cbc>] (try_to_wake_up) from [<8014a6dc>] (wake_up_process+0x18/0x1c)
[   19.088063]  r10:00000001 r9:80b2e12c r8:00000000 r7:00000004 r6:71ac9bca r5:e577530c
[   19.088067]  r4:00000008
[   19.088080] [<8014a6c4>] (wake_up_process) from [<80126028>] (wakeup_timer_softirqd+0x30/0x38)
[   19.088091] [<80125ff8>] (wakeup_timer_softirqd) from [<80126690>] (raise_softirq_irqoff+0x58/0x6)
[   19.088104] [<80126638>] (raise_softirq_irqoff) from [<80187f14>] (__hrtimer_run_queues+0x178/0x2)
[   19.088111]  r5:e577530c r4:e5775568
[   19.088123] [<80187d9c>] (__hrtimer_run_queues) from [<80188a28>] (hrtimer_interrupt+0xb0/0x1fc)
[   19.088138]  r10:e5775320 r9:e5775360 r8:e5775244 r7:e57753a0 r6:ffffffff r5:00000003
[   19.088142]  r4:e5775200
[   19.088154] [<80188978>] (hrtimer_interrupt) from [<80189024>] (hrtimers_dead_cpu+0x194/0x1b8)
[   19.088169]  r10:e57752cc r9:e5775380 r8:e5775300 r7:00000000 r6:e57993c0 r5:00000001
[   19.088173]  r4:00000000
[   19.088184] [<80188e90>] (hrtimers_dead_cpu) from [<80120b1c>] (cpuhp_down_callbacks+0x44/0x88)
[   19.088199]  r10:00000079 r9:00000000 r8:e5796284 r7:00000044 r6:80b1f8f4 r5:00000003
[   19.088203]  r4:00000000
[   19.088213] [<80120ad8>] (cpuhp_down_callbacks) from [<80793b5c>] (_cpu_down.constprop.6+0x1f4/0x)
[   19.088229]  r9:80a3e2cc r8:64d58000 r7:80b13c84 r6:80a3e284 r5:00000000 r4:00000003
[   19.088239] [<80793968>] (_cpu_down.constprop.6) from [<80121ea0>] (do_cpu_down+0x3c/0x60)
[   19.088253]  r10:e41860cc r9:00000000 r8:00000000 r7:e38e8700 r6:00000000 r5:00000003
[   19.088257]  r4:80b1fdd4
[   19.088267] [<80121e64>] (do_cpu_down) from [<80122344>] (cpu_down+0x14/0x18)
[   19.088290]  r7:e38e8700 r6:e5796044 r5:00000000 r4:e5796010
[   19.088307] [<80122330>] (cpu_down) from [<8045dfe8>] (cpu_subsys_offline+0x14/0x18)
[   19.088319] [<8045dfd4>] (cpu_subsys_offline) from [<80458a48>] (device_offline+0x8c/0xbc)
[   19.088329] [<804589bc>] (device_offline) from [<80458b54>] (online_store+0x4c/0x7c)
[   19.088346]  r7:e38e8700 r6:e5796010 r5:00000002 r4:00000000
[   19.088363] [<80458b08>] (online_store) from [<80457548>] (dev_attr_store+0x20/0x2c)
[   19.088378]  r6:e38e8700 r5:80457528 r4:80458b08
[   19.088395] [<80457528>] (dev_attr_store) from [<802849d0>] (sysfs_kf_write+0x50/0x54)
[   19.088406]  r5:80457528 r4:00000002
[   19.088419] [<80284980>] (sysfs_kf_write) from [<80283f28>] (kernfs_fop_write+0xc8/0x1f4)
[   19.088432]  r7:e38e8700 r6:e2ddff78 r5:00000000 r4:e41860c0
[   19.088442] [<80283e60>] (kernfs_fop_write) from [<8020dc68>] (__vfs_write+0x20/0x40)
[   19.088458]  r10:00000000 r9:00000002 r8:00000000 r7:e2ddff78 r6:0143db30 r5:e3834600
[   19.088462]  r4:80283e60
[   19.088470] [<8020dc48>] (__vfs_write) from [<8020dea0>] (vfs_write+0xa8/0x1b8)
[   19.088477]  r5:e3834600 r4:00000002
[   19.088485] [<8020ddf8>] (vfs_write) from [<8020e08c>] (SyS_write+0x44/0x98)
[   19.088500]  r9:00000002 r8:0143db30 r7:00000000 r6:00000000 r5:e3834600 r4:e3834600
[   19.088511] [<8020e048>] (SyS_write) from [<80107960>] (ret_fast_syscall+0x0/0x1c)
[   19.088526]  r9:e2dde000 r8:80107b48 r7:00000004 r6:45c9a160 r5:0143db30 r4:00000002



root@atlas:~# cat toggle_one.sh 
[ -z $1 ] && echo "usage: $0 <cpu-num>" && exit 1
FILE=/sys/devices/system/cpu/cpu$1/online
while true
do
        echo "off"
        echo 0 > $FILE
        sleep 1
        echo "on"
        echo 1 > $FILE
        sleep 1
done


root@atlas:~# cat toggle_all.sh 
for i in {1..3}
do
        FILE=/sys/devices/system/cpu/cpu${i}/online
        echo "disabling $FILE"
        echo 0 > $FILE
        sleep 1
        echo "enabling $FILE"
        echo 1 > $FILE
done

I think the kernel config options below have resolved the issue.

The cpu-freq options are important but I already knew this and they were not resolving the issue by themselves.

I believe enabling CONFIG_QORIQ_CPUFREQ
and/or disabling the PROVE and DEBUG options have resolved the issue.

Cyclictest worst-case latency is now reading as 81us

Config changes:

< CONFIG_UNINLINE_SPIN_UNLOCK=y
580c579,601
< # CONFIG_CPU_FREQ is not set
---
> CONFIG_CPU_FREQ=y
> # CONFIG_CPU_FREQ_STAT is not set
> CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
> CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> # CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
> # CONFIG_CPU_FREQ_GOV_USERSPACE is not set
> # CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
> # CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
> # CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
> 
> #
> # CPU frequency scaling drivers
> #
> # CONFIG_CPUFREQ_DT is not set
> # CONFIG_ARM_BIG_LITTLE_CPUFREQ is not set
> CONFIG_ARM_IMX6Q_CPUFREQ=y
> # CONFIG_ARM_KIRKWOOD_CPUFREQ is not set
> CONFIG_QORIQ_CPUFREQ=y
618a640
> CONFIG_PM_OPP=y
2756c2778
< # CONFIG_CLK_QORIQ is not set
---
> CONFIG_CLK_QORIQ=y
3122,3123c3144,3145
< CONFIG_DEBUG_SPINLOCK=y
< CONFIG_DEBUG_MUTEXES=y
---
> # CONFIG_DEBUG_SPINLOCK is not set
> # CONFIG_DEBUG_MUTEXES is not set
3125,3127c3147,3148
< CONFIG_DEBUG_LOCK_ALLOC=y
< CONFIG_PROVE_LOCKING=y
< CONFIG_LOCKDEP=y
---
> # CONFIG_DEBUG_LOCK_ALLOC is not set
> # CONFIG_PROVE_LOCKING is not set
3129d3149
< # CONFIG_DEBUG_LOCKDEP is not set
3133,3135c3153
< CONFIG_USING_GET_LOCK_PARENT_IP=y
< CONFIG_TRACE_IRQFLAGS=y
< CONFIG_STACKTRACE=y
---
> # CONFIG_STACKTRACE is not set
3147,3148c3165
< CONFIG_PROVE_RCU=y
< # CONFIG_PROVE_RCU_REPEATEDLY is not set
---
> # CONFIG_PROVE_RCU is not set