I am having realtime latency issues on an imx6q toradex apalis som using the linux-yocto-rt 4.8.12 kernel.
I have the cpu-freq governer set to performance.
Worst latency:
- ondemand governer : >5000us
- performance governer: >3000us
But if I continuously toggle the hotpluggable cpu cores on and off
then I sometimes get one of the warning messages shown below.
After this my realtime performance is much improved.
Worst latency:
- ondemand governer : ~250us
- performance governer: ~130us
Does anyone know what’s going on here?
Thanks,
Mark
#
# WARNING 1 - was getting this with tickless kernel
#
root@atlas:~# sh toggle_one.sh 1
[ 27.047710] ===============================
[ 27.047712] [ INFO: suspicious RCU usage. ]
[ 27.047719] 4.8.12-rt7-yocto-preempt-rt #17 Tainted: G W
[ 27.047721] -------------------------------
[ 27.047727] /home/myuser/poky-morty-toradex/build/tmp/work-shared/atlas/kernel-source/kernel/sched/fair.c:5412 suspicious rcu_dereference_check() usage!
[ 27.047730]
[ 27.047730] other info that might help us debug this:
[ 27.047730]
[ 27.047735]
[ 27.047735] RCU used illegally from offline CPU!
[ 27.047735] rcu_scheduler_active = 1, debug_locks = 0
[ 27.047741] 3 locks held by swapper/1/0:
[ 27.047771] #0: ((cpu_died).wait.lock){......}, at: [<80165994>] complete+0x1c/0x44
[ 27.047795] #1: (&p->pi_lock){-...-.}, at: [<80149cb0>] try_to_wake_up+0x34/0x320
[ 27.047813] #2: (rcu_read_lock){......}, at: [<80152808>] select_task_rq_fair+0x100/0xb50
[ 27.047816]
[ 27.047816] stack backtrace:
[ 27.047823] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 4.8.12-rt7-yocto-preempt-rt #17
[ 27.047829] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 27.047834] Backtrace:
[ 27.047860] [<8010c518>] (dump_backtrace) from [<8010c6bc>] (show_stack+0x18/0x1c)
[ 27.047873] r7:00000000 r6:200f0093 r5:00000000 r4:80b2e008
[ 27.047883] [<8010c6a4>] (show_stack) from [<8037a280>] (dump_stack+0xb4/0xe8)
[ 27.047898] [<8037a1cc>] (dump_stack) from [<8016d758>] (lockdep_rcu_suspicious+0xbc/0x11c)
[ 27.047913] r9:80b13c84 r8:00000003 r7:809382e4 r6:00001524 r5:8093bdd8 r4:e48ab700
[ 27.047925] [<8016d69c>] (lockdep_rcu_suspicious) from [<80153114>] (select_task_rq_fair+0xa0c/0xb50)
[ 27.047936] r7:80a43d40 r6:80b5ede8 r5:00000000 r4:00000000
[ 27.047947] [<80152708>] (select_task_rq_fair) from [<80149ed0>] (try_to_wake_up+0x254/0x320)
[ 27.047963] r10:8093c144 r9:800f0093 r8:80b5edfe r7:00000000 r6:80152708 r5:e2c2e564
[ 27.047968] r4:e2c2e040
[ 27.047982] [<80149c7c>] (try_to_wake_up) from [<8014a69c>] (wake_up_process+0x18/0x1c)
[ 27.047997] r10:8093c144 r9:80b13b4c r8:80b5edfe r7:80b13afc r6:200f0093 r5:80b1d540
[ 27.048002] r4:e3827d38
[ 27.048012] [<8014a684>] (wake_up_process) from [<8016523c>] (swake_up_locked+0x28/0x40)
[ 27.048020] [<80165214>] (swake_up_locked) from [<801659ac>] (complete+0x34/0x44)
[ 27.048027] r5:80b1d540 r4:80b1d544
[ 27.048038] [<80165978>] (complete) from [<8010df94>] (arch_cpu_idle_dead+0x44/0xbc)
[ 27.048049] r7:80b13afc r6:00000002 r5:00000001 r4:80b13aa8
[ 27.048059] [<8010df50>] (arch_cpu_idle_dead) from [<80165d68>] (cpu_startup_entry+0x174/0x1ac)
[ 27.048065] r5:00000000 r4:e48b6000
[ 27.048074] [<80165bf4>] (cpu_startup_entry) from [<8010e168>] (secondary_start_kernel+0x15c/0x18c)
[ 27.048081] r7:80b5f9a0 r4:80b272c0
[ 27.048089] [<8010e00c>] (secondary_start_kernel) from [<101015ec>] (0x101015ec)
[ 27.048101] r7:80b5f9a0 r6:10c0387d r5:00000051 r4:7488c06a
[ 27.048594] CPU1: shutdown
#
# WARNING 2 - seem to be getting this with HZ1000
#
root@atlas:~# sh toggle_all.sh
disabling /sys/devices/system/cpu/cpu1/online
[ 16.751149] CPU1: shutdown
enabling /sys/devices/system/cpu/cpu1/online
disabling /sys/devices/system/cpu/cpu2/online
[ 17.932491] CPU2: shutdown
enabling /sys/devices/system/cpu/cpu2/online
disabling /sys/devices/system/cpu/cpu3/online
[ 19.068119] CPU3: shutdown
enabling /sys/devices/system/cpu/cpu3/online
[ 19.086985]
[ 19.086988] ======================================================
[ 19.086990] [ INFO: possible circular locking dependency detected ]
[ 19.087000] 4.8.12-rt7-yocto-preempt-rt #19 Tainted: G W
[ 19.087003] -------------------------------------------------------
[ 19.087007] sh/595 is trying to acquire lock:
[ 19.087039] (&p->pi_lock){-...-.}, at: [<80149cf0>] try_to_wake_up+0x34/0x320
[ 19.087041]
[ 19.087041] but task is already holding lock:
[ 19.087065] (hrtimer_bases.lock){-.....}, at: [<80187e98>] __hrtimer_run_queues+0xfc/0x280
[ 19.087067]
[ 19.087067] which lock already depends on the new lock.
[ 19.087067]
[ 19.087069]
[ 19.087069] the existing dependency chain (in reverse order) is:
[ 19.087077]
[ 19.087077] -> #3 (hrtimer_bases.lock){-.....}:
[ 19.087097] [<80799150>] _raw_spin_lock_irqsave+0x58/0x6c
[ 19.087107] [<80187780>] lock_hrtimer_base+0x30/0x60
[ 19.087117] [<80188494>] hrtimer_start_range_ns+0x24/0x1d8
[ 19.087132] [<801613d4>] enqueue_task_rt+0x2c4/0x318
[ 19.087140] [<8014bc24>] __sched_setscheduler+0x6c8/0xcf8
[ 19.087149] [<8014c2d4>] _sched_setscheduler+0x80/0x88
[ 19.087156] [<8014ca84>] sched_setscheduler+0x14/0x18
[ 19.087171] [<80125c5c>] ktimer_softirqd_set_sched_params+0x30/0x78
[ 19.087183] [<801461c4>] smpboot_thread_fn+0x114/0x2b4
[ 19.087196] [<801425a4>] kthread+0xd8/0xf8
[ 19.087210] [<801079f8>] ret_from_fork+0x14/0x3c
[ 19.087219]
[ 19.087219] -> #2 (&rt_b->rt_runtime_lock){-.....}:
[ 19.087231] [<80798f28>] _raw_spin_lock+0x48/0x58
[ 19.087241] [<801612e4>] enqueue_task_rt+0x1d4/0x318
[ 19.087249] [<8014bc24>] __sched_setscheduler+0x6c8/0xcf8
[ 19.087256] [<8014c2d4>] _sched_setscheduler+0x80/0x88
[ 19.087263] [<8014ca84>] sched_setscheduler+0x14/0x18
[ 19.087273] [<80125c5c>] ktimer_softirqd_set_sched_params+0x30/0x78
[ 19.087281] [<801461c4>] smpboot_thread_fn+0x114/0x2b4
[ 19.087290] [<801425a4>] kthread+0xd8/0xf8
[ 19.087298] [<801079f8>] ret_from_fork+0x14/0x3c
[ 19.087306]
[ 19.087306] -> #1 (&rq->lock){-...-.}:
[ 19.087315] [<80798f28>] _raw_spin_lock+0x48/0x58
[ 19.087323] [<80158db8>] task_fork_fair+0x28/0x168
[ 19.087330] [<8014b0a4>] sched_fork+0x13c/0x304
[ 19.087339] [<8011e710>] copy_process.part.3+0x41c/0x15d8
[ 19.087347] [<8011fa24>] _do_fork+0xa0/0x568
[ 19.087354] [<8011ff4c>] kernel_thread+0x34/0x3c
[ 19.087362] [<80793700>] rest_init+0x24/0x170
[ 19.087375] [<80a00d38>] start_kernel+0x304/0x360
[ 19.087382] [<1000807c>] 0x1000807c
[ 19.087390]
[ 19.087390] -> #0 (&p->pi_lock){-...-.}:
[ 19.087402] [<8016d24c>] lock_acquire+0x70/0x90
[ 19.087411] [<80799150>] _raw_spin_lock_irqsave+0x58/0x6c
[ 19.087421] [<80149cf0>] try_to_wake_up+0x34/0x320
[ 19.087431] [<8014a6dc>] wake_up_process+0x18/0x1c
[ 19.087440] [<80126028>] wakeup_timer_softirqd+0x30/0x38
[ 19.087450] [<80126690>] raise_softirq_irqoff+0x58/0x60
[ 19.087462] [<80187f14>] __hrtimer_run_queues+0x178/0x280
[ 19.087471] [<80188a28>] hrtimer_interrupt+0xb0/0x1fc
[ 19.087480] [<80189024>] hrtimers_dead_cpu+0x194/0x1b8
[ 19.087489] [<80120b1c>] cpuhp_down_callbacks+0x44/0x88
[ 19.087497] [<80793b5c>] _cpu_down.constprop.6+0x1f4/0x3a8
[ 19.087505] [<80121ea0>] do_cpu_down+0x3c/0x60
[ 19.087513] [<80122344>] cpu_down+0x14/0x18
[ 19.087529] [<8045dfe8>] cpu_subsys_offline+0x14/0x18
[ 19.087537] [<80458a48>] device_offline+0x8c/0xbc
[ 19.087543] [<80458b54>] online_store+0x4c/0x7c
[ 19.087561] [<80457548>] dev_attr_store+0x20/0x2c
[ 19.087577] [<802849d0>] sysfs_kf_write+0x50/0x54
[ 19.087586] [<80283f28>] kernfs_fop_write+0xc8/0x1f4
[ 19.087595] [<8020dc68>] __vfs_write+0x20/0x40
[ 19.087602] [<8020dea0>] vfs_write+0xa8/0x1b8
[ 19.087610] [<8020e08c>] SyS_write+0x44/0x98
[ 19.087619] [<80107960>] ret_fast_syscall+0x0/0x1c
[ 19.087622]
[ 19.087622] other info that might help us debug this:
[ 19.087622]
[ 19.087634] Chain exists of:
[ 19.087634] &p->pi_lock --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock
[ 19.087634]
[ 19.087638] Possible unsafe locking scenario:
[ 19.087638]
[ 19.087640] CPU0 CPU1
[ 19.087642] ---- ----
[ 19.087647] lock(hrtimer_bases.lock);
[ 19.087652] lock(&rt_b->rt_runtime_lock);
[ 19.087657] lock(hrtimer_bases.lock);
[ 19.087661] lock(&p->pi_lock);
[ 19.087663]
[ 19.087663] *** DEADLOCK ***
[ 19.087663]
[ 19.087668] 10 locks held by sh/595:
[ 19.087688] #0: (sb_writers#7){.+.+.+}, at: [<80211de4>] __sb_start_write+0x94/0xc0
[ 19.087704] #1: (&of->mutex){+.+.+.}, at: [<80283eec>] kernfs_fop_write+0x8c/0x1f4
[ 19.087720] #2: (s_active#59){.+.+.+}, at: [<80283ef4>] kernfs_fop_write+0x94/0x1f4
[ 19.087738] #3: (device_hotplug_lock){+.+.+.}, at: [<80457ce4>] lock_device_hotplug_sysfs+0x14/4
[ 19.087751] #4: (&dev->mutex){......}, at: [<804589fc>] device_offline+0x40/0xbc
[ 19.087766] #5: (cpu_add_remove_lock){+.+.+.}, at: [<80121e84>] do_cpu_down+0x20/0x60
[ 19.087780] #6: (cpu_hotplug.lock){++++++}, at: [<80121bdc>] cpu_hotplug_begin+0x0/0xd8
[ 19.087795] #7: (cpu_hotplug.lock#2){+.+.+.}, at: [<80121c68>] cpu_hotplug_begin+0x8c/0xd8
[ 19.087808] #8: (&hp->lock){+.+.+.}, at: [<80121044>] takedown_cpu+0x7c/0x170
[ 19.087826] #9: (hrtimer_bases.lock){-.....}, at: [<80187e98>] __hrtimer_run_queues+0xfc/0x280
[ 19.087829]
[ 19.087829] stack backtrace:
[ 19.087839] CPU: 0 PID: 595 Comm: sh Tainted: G W 4.8.12-rt7-yocto-preempt-rt #19
[ 19.087843] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 19.087848] Backtrace:
[ 19.087869] [<8010c518>] (dump_backtrace) from [<8010c6bc>] (show_stack+0x18/0x1c)
[ 19.087882] r7:00000000 r6:20010093 r5:00000000 r4:80b2e008
[ 19.087892] [<8010c6a4>] (show_stack) from [<8037a2c0>] (dump_stack+0xb4/0xe8)
[ 19.087904] [<8037a20c>] (dump_stack) from [<80168474>] (print_circular_bug+0x1d8/0x310)
[ 19.087920] r9:00000000 r8:e2d13700 r7:80c8a60c r6:80b622fc r5:80c8b92c r4:80c8752c
[ 19.087931] [<8016829c>] (print_circular_bug) from [<8016cb68>] (__lock_acquire+0x15f8/0x1918)
[ 19.087947] r10:e2d13700 r9:81329598 r8:0000000a r7:80b2e12c r6:e2d13dc0 r5:e2d13de0
[ 19.087953] r4:80c8a60c r3:e2d13dc0
[ 19.087963] [<8016b570>] (__lock_acquire) from [<8016d24c>] (lock_acquire+0x70/0x90)
[ 19.087978] r10:00000001 r9:80b2e12c r8:00000001 r7:00000001 r6:60010093 r5:00000000
[ 19.087982] r4:ffffe000
[ 19.087997] [<8016d1dc>] (lock_acquire) from [<80799150>] (_raw_spin_lock_irqsave+0x58/0x6c)
[ 19.088010] r8:00000000 r7:00000000 r6:80149cf0 r5:20010093 r4:e487ae64
[ 19.088027] [<807990f8>] (_raw_spin_lock_irqsave) from [<80149cf0>] (try_to_wake_up+0x34/0x320)
[ 19.088036] r6:00000003 r5:e487ae64 r4:e487a940
[ 19.088048] [<80149cbc>] (try_to_wake_up) from [<8014a6dc>] (wake_up_process+0x18/0x1c)
[ 19.088063] r10:00000001 r9:80b2e12c r8:00000000 r7:00000004 r6:71ac9bca r5:e577530c
[ 19.088067] r4:00000008
[ 19.088080] [<8014a6c4>] (wake_up_process) from [<80126028>] (wakeup_timer_softirqd+0x30/0x38)
[ 19.088091] [<80125ff8>] (wakeup_timer_softirqd) from [<80126690>] (raise_softirq_irqoff+0x58/0x6)
[ 19.088104] [<80126638>] (raise_softirq_irqoff) from [<80187f14>] (__hrtimer_run_queues+0x178/0x2)
[ 19.088111] r5:e577530c r4:e5775568
[ 19.088123] [<80187d9c>] (__hrtimer_run_queues) from [<80188a28>] (hrtimer_interrupt+0xb0/0x1fc)
[ 19.088138] r10:e5775320 r9:e5775360 r8:e5775244 r7:e57753a0 r6:ffffffff r5:00000003
[ 19.088142] r4:e5775200
[ 19.088154] [<80188978>] (hrtimer_interrupt) from [<80189024>] (hrtimers_dead_cpu+0x194/0x1b8)
[ 19.088169] r10:e57752cc r9:e5775380 r8:e5775300 r7:00000000 r6:e57993c0 r5:00000001
[ 19.088173] r4:00000000
[ 19.088184] [<80188e90>] (hrtimers_dead_cpu) from [<80120b1c>] (cpuhp_down_callbacks+0x44/0x88)
[ 19.088199] r10:00000079 r9:00000000 r8:e5796284 r7:00000044 r6:80b1f8f4 r5:00000003
[ 19.088203] r4:00000000
[ 19.088213] [<80120ad8>] (cpuhp_down_callbacks) from [<80793b5c>] (_cpu_down.constprop.6+0x1f4/0x)
[ 19.088229] r9:80a3e2cc r8:64d58000 r7:80b13c84 r6:80a3e284 r5:00000000 r4:00000003
[ 19.088239] [<80793968>] (_cpu_down.constprop.6) from [<80121ea0>] (do_cpu_down+0x3c/0x60)
[ 19.088253] r10:e41860cc r9:00000000 r8:00000000 r7:e38e8700 r6:00000000 r5:00000003
[ 19.088257] r4:80b1fdd4
[ 19.088267] [<80121e64>] (do_cpu_down) from [<80122344>] (cpu_down+0x14/0x18)
[ 19.088290] r7:e38e8700 r6:e5796044 r5:00000000 r4:e5796010
[ 19.088307] [<80122330>] (cpu_down) from [<8045dfe8>] (cpu_subsys_offline+0x14/0x18)
[ 19.088319] [<8045dfd4>] (cpu_subsys_offline) from [<80458a48>] (device_offline+0x8c/0xbc)
[ 19.088329] [<804589bc>] (device_offline) from [<80458b54>] (online_store+0x4c/0x7c)
[ 19.088346] r7:e38e8700 r6:e5796010 r5:00000002 r4:00000000
[ 19.088363] [<80458b08>] (online_store) from [<80457548>] (dev_attr_store+0x20/0x2c)
[ 19.088378] r6:e38e8700 r5:80457528 r4:80458b08
[ 19.088395] [<80457528>] (dev_attr_store) from [<802849d0>] (sysfs_kf_write+0x50/0x54)
[ 19.088406] r5:80457528 r4:00000002
[ 19.088419] [<80284980>] (sysfs_kf_write) from [<80283f28>] (kernfs_fop_write+0xc8/0x1f4)
[ 19.088432] r7:e38e8700 r6:e2ddff78 r5:00000000 r4:e41860c0
[ 19.088442] [<80283e60>] (kernfs_fop_write) from [<8020dc68>] (__vfs_write+0x20/0x40)
[ 19.088458] r10:00000000 r9:00000002 r8:00000000 r7:e2ddff78 r6:0143db30 r5:e3834600
[ 19.088462] r4:80283e60
[ 19.088470] [<8020dc48>] (__vfs_write) from [<8020dea0>] (vfs_write+0xa8/0x1b8)
[ 19.088477] r5:e3834600 r4:00000002
[ 19.088485] [<8020ddf8>] (vfs_write) from [<8020e08c>] (SyS_write+0x44/0x98)
[ 19.088500] r9:00000002 r8:0143db30 r7:00000000 r6:00000000 r5:e3834600 r4:e3834600
[ 19.088511] [<8020e048>] (SyS_write) from [<80107960>] (ret_fast_syscall+0x0/0x1c)
[ 19.088526] r9:e2dde000 r8:80107b48 r7:00000004 r6:45c9a160 r5:0143db30 r4:00000002
root@atlas:~# cat toggle_one.sh
[ -z $1 ] && echo "usage: $0 <cpu-num>" && exit 1
FILE=/sys/devices/system/cpu/cpu$1/online
while true
do
echo "off"
echo 0 > $FILE
sleep 1
echo "on"
echo 1 > $FILE
sleep 1
done
root@atlas:~# cat toggle_all.sh
for i in {1..3}
do
FILE=/sys/devices/system/cpu/cpu${i}/online
echo "disabling $FILE"
echo 0 > $FILE
sleep 1
echo "enabling $FILE"
echo 1 > $FILE
done