[OpenWrt-Devel] atomic sleep bugs - 19.07 (and probably Master too)

Koen Vandeputte koen.vandeputte at ncentric.com
Fri Aug 2 04:40:06 EDT 2019


On 01.08.19 17:27, Koen Vandeputte wrote:
> Hi All,
>
> I've been playing around the last few days stresstesting latest 19.07 
> on different targets (ar71xx, cns3xxx, imx6, ...) with extra kernel 
> debug features enabled.
>
> I'll post some results here as maybe somebody has a clue. :)
>
>
> Some interesting splats already showed up, actually also *breaking* 
> some functionality while the board is running:
>
> on Mikrotik RB2011 (ar71xx)
>
>
> [   16.885207] eth0: link down
> [   16.919752] BUG: sleeping function called from invalid context at 
> net/core/dev.c:5563
> [   17.013669] in_atomic(): 1, irqs_disabled(): 1, pid: 463, name: ip
> [   17.087839] 2 locks held by ip/463:
> [   17.129668]  #0:  (rtnl_mutex){....}, at: [<80378814>] 
> rtnetlink_rcv_msg+0x2d8/0x380
> [   17.222617]  #1:  (&(&ag->lock)->rlock){....}, at: [<80331900>] 
> ag71xx_hw_disable+0x24/0x94
> [   17.322878] CPU: 0 PID: 463 Comm: ip Not tainted 4.14.134 #0
> [   17.390782] Stack : 805e0000 8058aefc 80558ed0 876278ec 80610000 
> 80610000 87d1b2fc 805b7367
> [   17.491032]         80552f04 000001cf 8061386c 87627ccc 87d5b180 
> 00000001 876278a0 6ae07578
> [   17.591283]         00000000 00000000 80b00000 8762779c 0a55a891 
> 00000000 00000007 00000000
> [   17.691535]         00000099 8f9b5648 00000098 00000000 80000000 
> 87d6e58c 87d6e5b0 00000001
> [   17.791785]         8047095c 87627ccc 87d5b180 87d9ae10 00000003 
> 802cfa54 00000000 80610000
> [   17.892036]         ...
> [   17.921352] Call Trace:
> [   17.950676] [<8006cb0c>] show_stack+0x58/0x100
> [   18.003996] [<800aadf4>] ___might_sleep+0x100/0x120
> [   18.062512] [<8035d7f4>] napi_disable+0x30/0xd8
> [   18.116854] [<80331940>] ag71xx_hw_disable+0x64/0x94
> [   18.176421] [<80331994>] ag71xx_stop+0x24/0x38
> [   18.229729] [<8035b1f0>] __dev_close_many+0xcc/0x104
> [   18.289306] [<8036426c>] __dev_change_flags+0xc8/0x1ac
> [   18.350950] [<80364378>] dev_change_flags+0x28/0x70
> [   18.409473] [<80377c30>] do_setlink+0x31c/0x91c
> [   18.463826] [<8037a700>] rtnl_newlink+0x3ec/0x7f8
> [   18.520261] [<80378838>] rtnetlink_rcv_msg+0x2fc/0x380
> [   18.581938] [<8039bac4>] netlink_rcv_skb+0xd4/0x178
> [   18.640439] [<8039b0a0>] netlink_unicast+0x168/0x250
> [   18.700006] [<8039b664>] netlink_sendmsg+0x3d8/0x434
> [   18.759580] [<803404a4>] ___sys_sendmsg+0x1dc/0x290
> [   18.818098] [<80341500>] __sys_sendmsg+0x54/0x84
> [   18.873504] [<8007212c>] syscall_common+0x34/0x58
>
>
> on Mikrotik RB-912 (ar71xx), using ath9k combined with a USB modem 
> together.  When disabling ath9k or unplugging the modem, these bugs 
> don't appear
> Here, the USB modem stops receiving data after a few seconds:
>
>
> [   37.165493] wlan0: Trigger new scan to find an IBSS to join
> [   37.171546] BUG: sleeping function called from invalid context at 
> kernel/irq/manage.c:112
> [   37.180006] in_atomic(): 1, irqs_disabled(): 1, pid: 9, name: 
> kworker/u2:1
> [   37.187110] 6 locks held by kworker/u2:1/9:
> [   37.191434]  #0:  ("%s"wiphy_name(local->hw.wiphy)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   37.201298]  #1:  ((&sdata->work)){....}, at: [<8009f89c>] 
> process_one_work+0x250/0x4c0
> [   37.209604]  #2:  (&wdev->mtx){....}, at: [<82917130>] 
> ieee80211_ibss_work+0x40/0x5a4 [mac80211]
> [   37.219077]  #3:  (&local->mtx){....}, at: [<8290d914>] 
> ieee80211_request_ibss_scan+0x4c/0x2b8 [mac80211]
> [   37.229246]  #4:  (&sc->mutex){....}, at: [<82ae4878>] 
> ath9k_ps_restore+0xd54/0x11dc [ath9k]
> [   37.238159]  #5:  (&irq_desc_lock_class){....}, at: [<800c29d8>] 
> __irq_get_desc_lock+0x8c/0xb0
> [   37.247113] CPU: 0 PID: 9 Comm: kworker/u2:1 Tainted: G W 4.14.134 #0
> [   37.255282] Workqueue: phy0 ieee80211_ibss_leave [mac80211]
> [   37.261055] Stack : 82870080 82870158 82860d60 800c0fa0 80610000 
> 00000000 00000001 805b0000
> [   37.269743]         80552e7c 8384db24 00000348 800c1e04 82860d60 
> 00000000 8384db00 21bc325c
> [   37.278434]         00000000 00000000 00000006 8384d9d4 acbbbd0d 
> 00000000 00000007 00000000
> [   37.287125]         00000132 805c0000 00000131 00000000 00000000 
> 828626d8 828703c8 00000348
> [   37.295816]         82870080 82870158 82860d60 828703c8 00000002 
> 802cfa54 00000000 80610000
> [   37.304508]         ...
> [   37.307055] Call Trace:
> [   37.309601] [<8006cb0c>] show_stack+0x58/0x100
> [   37.314219] [<800aadf4>] ___might_sleep+0x100/0x120
> [   37.319280] [<800c3344>] synchronize_irq+0x3c/0xa0
> [   37.324245] [<800c6594>] __irq_disable+0x64/0xb4
> [   37.329024] [<800c35f4>] __disable_irq_nosync+0x3c/0x68
> [   37.334434] [<800c363c>] disable_irq+0x14/0x38
> [   37.339195] [<82ae5b48>] ath9k_calculate_summary_state+0x53c/0x6f4 
> [ath9k]
> [   37.357669] ttyS ttyS0: 1 input overrun(s)
> [   38.170588] BUG: sleeping function called from invalid context at 
> kernel/irq/manage.c:112
> [   38.179059] in_atomic(): 1, irqs_disabled(): 1, pid: 9, name: 
> kworker/u2:1
> [   38.186167] 5 locks held by kworker/u2:1/9:
> [   38.190492]  #0:  ("%s"wiphy_name(local->hw.wiphy)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   38.200365]  #1: ((&(&local->scan_work)->work)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   38.209925]  #2:  (&local->mtx){....}, at: [<8290d338>] 
> ieee80211_scan_work+0x44/0x578 [mac80211]
> [   38.219497]  #3:  (&sc->mutex){....}, at: [<82ae4878>] 
> ath9k_ps_restore+0xd54/0x11dc [ath9k]
> [   38.228410]  #4:  (&irq_desc_lock_class){....}, at: [<800c29d8>] 
> __irq_get_desc_lock+0x8c/0xb0
> [   38.237363] CPU: 0 PID: 9 Comm: kworker/u2:1 Tainted: G W 4.14.134 #0
> [   38.245533] Workqueue: phy0 ieee80211_scan_work [mac80211]
> [   38.251224] Stack : 82870080 82870158 82860d60 800c0fa0 80610000 
> 00000000 00000001 805b0000
> [   38.259912]         80552e7c 8384db94 000003c0 800c1e04 82860d60 
> 00000000 8384db70 21bc325c
> [   38.268604]         00000000 00000000 00000005 8384da44 1c85fff7 
> 00000000 00000007 00000000
> [   38.277295]         0000014b 805c0000 0000014a 00000000 00000000 
> 828626d8 82870440 000003c0
> [   38.285986]         82870080 82870158 82860d60 82870440 00000001 
> 802cfa54 00000000 80610000
> [   38.294677]         ...
> [   38.297223] Call Trace:
> [   38.299762] [<8006cb0c>] show_stack+0x58/0x100
> [   38.304381] [<800aadf4>] ___might_sleep+0x100/0x120
> [   38.309440] [<800c3344>] synchronize_irq+0x3c/0xa0
> [   38.314406] [<800c6594>] __irq_disable+0x64/0xb4
> [   38.319184] [<800c35f4>] __disable_irq_nosync+0x3c/0x68
> [   38.324595] [<800c363c>] disable_irq+0x14/0x38
> [   38.329355] [<82ae5b48>] ath9k_calculate_summary_state+0x53c/0x6f4 
> [ath9k]
> [   39.210598] BUG: sleeping function called from invalid context at 
> kernel/irq/manage.c:112
> [   39.219073] in_atomic(): 1, irqs_disabled(): 1, pid: 9, name: 
> kworker/u2:1
> [   39.226182] 5 locks held by kworker/u2:1/9:
> [   39.230506]  #0:  ("%s"wiphy_name(local->hw.wiphy)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   39.240379]  #1: ((&(&local->scan_work)->work)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   39.249940]  #2:  (&local->mtx){....}, at: [<8290d338>] 
> ieee80211_scan_work+0x44/0x578 [mac80211]
> [   39.259519]  #3:  (&sc->mutex){....}, at: [<82ae4878>] 
> ath9k_ps_restore+0xd54/0x11dc [ath9k]
> [   39.268434]  #4:  (&irq_desc_lock_class){....}, at: [<800c29d8>] 
> __irq_get_desc_lock+0x8c/0xb0
> [   39.277386] CPU: 0 PID: 9 Comm: kworker/u2:1 Tainted: G W 4.14.134 #0
> [   39.285555] Workqueue: phy0 ieee80211_scan_work [mac80211]
> [   39.291238] Stack : 82870080 82870158 82860d60 800c0fa0 80610000 
> 00000000 00000001 805b0000
> [   39.299927]         80552e7c 8384db94 00000444 800c1e04 82860d60 
> 00000000 8384db70 21bc325c
> [   39.308618]         00000000 00000000 00000005 8384da44 1c85fff7 
> 00000000 00000007 00000000
> [   39.317309]         00000163 805c0000 00000162 00000000 00000000 
> 828626d8 828704c4 00000444
> [   39.326000]         82870080 82870158 82860d60 828704c4 00000001 
> 802cfa54 00000000 80610000
> [   39.334691]         ...
> [   39.337238] Call Trace:
> [   39.339786] [<8006cb0c>] show_stack+0x58/0x100
> [   39.344404] [<800aadf4>] ___might_sleep+0x100/0x120
> [   39.349464] [<800c3344>] synchronize_irq+0x3c/0xa0
> [   39.354428] [<800c6594>] __irq_disable+0x64/0xb4
> [   39.359207] [<800c35f4>] __disable_irq_nosync+0x3c/0x68
> [   39.364618] [<800c363c>] disable_irq+0x14/0x38
> [   39.369377] [<82ae5b48>] ath9k_calculate_summary_state+0x53c/0x6f4 
> [ath9k]
> [   39.379637] ttyS ttyS0: 2 input overrun(s)
> [   44.124920] wlan0: Trigger new scan to find an IBSS to join
> [   44.130870] BUG: sleeping function called from invalid context at 
> kernel/irq/manage.c:112
> [   44.139328] in_atomic(): 1, irqs_disabled(): 1, pid: 5, name: 
> kworker/u2:0
> [   44.146431] 6 locks held by kworker/u2:0/5:
> [   44.150755]  #0:  ("%s"wiphy_name(local->hw.wiphy)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   44.160620]  #1:  ((&sdata->work)){....}, at: [<8009f89c>] 
> process_one_work+0x250/0x4c0
> [   44.168925]  #2:  (&wdev->mtx){....}, at: [<82917130>] 
> ieee80211_ibss_work+0x40/0x5a4 [mac80211]
> [   44.178408]  #3:  (&local->mtx){....}, at: [<8290d914>] 
> ieee80211_request_ibss_scan+0x4c/0x2b8 [mac80211]
> [   44.188576]  #4:  (&sc->mutex){....}, at: [<82ae4878>] 
> ath9k_ps_restore+0xd54/0x11dc [ath9k]
> [   44.197490]  #5:  (&irq_desc_lock_class){....}, at: [<800c29d8>] 
> __irq_get_desc_lock+0x8c/0xb0
> [   44.206444] CPU: 0 PID: 5 Comm: kworker/u2:0 Tainted: G W 4.14.134 #0
> [   44.214616] Workqueue: phy0 ieee80211_ibss_leave [mac80211]
> [   44.220394] Stack : 82870080 82870158 82860d60 800c0fa0 80610000 
> 00000000 00000001 805b0000
> [   44.229082]         80552e7c 83839b24 00000348 800c1e04 82860d60 
> 00000000 83839b00 21802cc8
> [   44.237773]         00000000 00000000 00000006 838399d4 acbbbd0d 
> 00000000 00000007 00000000
> [   44.246464]         0000017e 805c0000 0000017d 00000000 00000000 
> 828626d8 828703c8 00000348
> [   44.255155]         82870080 82870158 82860d60 828703c8 00000002 
> 802cfa54 00000000 80610000
> [   44.263847]         ...
> [   44.266394] Call Trace:
> [   44.268940] [<8006cb0c>] show_stack+0x58/0x100
> [   44.273561] [<800aadf4>] ___might_sleep+0x100/0x120
> [   44.278620] [<800c3344>] synchronize_irq+0x3c/0xa0
> [   44.283584] [<800c6594>] __irq_disable+0x64/0xb4
> [   44.288362] [<800c35f4>] __disable_irq_nosync+0x3c/0x68
> [   44.293765] [<800c363c>] disable_irq+0x14/0x38
> [   44.298526] [<82ae5b48>] ath9k_calculate_summary_state+0x53c/0x6f4 
> [ath9k]
> [   44.315077] ttyS ttyS0: 1 input overrun(s)
> [   45.130588] BUG: sleeping function called from invalid context at 
> kernel/irq/manage.c:112
> [   45.139061] in_atomic(): 1, irqs_disabled(): 1, pid: 5, name: 
> kworker/u2:0
> [   45.146171] 5 locks held by kworker/u2:0/5:
> [   45.150495]  #0:  ("%s"wiphy_name(local->hw.wiphy)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   45.160368]  #1: ((&(&local->scan_work)->work)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   45.169927]  #2:  (&local->mtx){....}, at: [<8290d338>] 
> ieee80211_scan_work+0x44/0x578 [mac80211]
> [   45.179499]  #3:  (&sc->mutex){....}, at: [<82ae4878>] 
> ath9k_ps_restore+0xd54/0x11dc [ath9k]
> [   45.188412]  #4:  (&irq_desc_lock_class){....}, at: [<800c29d8>] 
> __irq_get_desc_lock+0x8c/0xb0
> [   45.197366] CPU: 0 PID: 5 Comm: kworker/u2:0 Tainted: G W 4.14.134 #0
> [   45.205535] Workqueue: phy0 ieee80211_scan_work [mac80211]
> [   45.211226] Stack : 82870080 82870158 82860d60 800c0fa0 80610000 
> 00000000 00000001 805b0000
> [   45.219914]         80552e7c 83839b94 000003c0 800c1e04 82860d60 
> 00000000 83839b70 21802cc8
> [   45.228606]         00000000 00000000 00000005 83839a44 1c85fff7 
> 00000000 00000007 00000000
> [   45.237297]         00000197 805c0000 00000196 00000000 00000000 
> 828626d8 82870440 000003c0
> [   45.245988]         82870080 82870158 82860d60 82870440 00000001 
> 802cfa54 00000000 80610000
> [   45.254680]         ...
> [   45.257226] Call Trace:
> [   45.259764] [<8006cb0c>] show_stack+0x58/0x100
> [   45.264384] [<800aadf4>] ___might_sleep+0x100/0x120
> [   45.269444] [<800c3344>] synchronize_irq+0x3c/0xa0
> [   45.274408] [<800c6594>] __irq_disable+0x64/0xb4
> [   45.279187] [<800c35f4>] __disable_irq_nosync+0x3c/0x68
> [   45.284598] [<800c363c>] disable_irq+0x14/0x38
> [   45.289357] [<82ae5b48>] ath9k_calculate_summary_state+0x53c/0x6f4 
> [ath9k]
> [   46.170598] BUG: sleeping function called from invalid context at 
> kernel/irq/manage.c:112
> [   46.179076] in_atomic(): 1, irqs_disabled(): 1, pid: 5, name: 
> kworker/u2:0
> [   46.186184] 5 locks held by kworker/u2:0/5:
> [   46.190509]  #0:  ("%s"wiphy_name(local->hw.wiphy)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   46.200382]  #1: ((&(&local->scan_work)->work)){....}, at: 
> [<8009f89c>] process_one_work+0x250/0x4c0
> [   46.209942]  #2:  (&local->mtx){....}, at: [<8290d338>] 
> ieee80211_scan_work+0x44/0x578 [mac80211]
> [   46.219522]  #3:  (&sc->mutex){....}, at: [<82ae4878>] 
> ath9k_ps_restore+0xd54/0x11dc [ath9k]
> [   46.228435]  #4:  (&irq_desc_lock_class){....}, at: [<800c29d8>] 
> __irq_get_desc_lock+0x8c/0xb0
> [   46.237389] CPU: 0 PID: 5 Comm: kworker/u2:0 Tainted: G W 4.14.134 #0
> [   46.245559] Workqueue: phy0 ieee80211_scan_work [mac80211]
> [   46.251249] Stack : 82870080 82870158 82860d60 800c0fa0 80610000 
> 00000000 00000001 805b0000
> [   46.259938]         80552e7c 83839b94 00000444 800c1e04 82860d60 
> 00000000 83839b70 21802cc8
> [   46.268629]         00000000 00000000 00000005 83839a44 1c85fff7 
> 00000000 00000007 00000000
> [   46.277320]         000001af 805c0000 000001ae 00000000 00000000 
> 828626d8 828704c4 00000444
> [   46.286011]         82870080 82870158 82860d60 828704c4 00000001 
> 802cfa54 00000000 80610000
> [   46.294702]         ...
> [   46.297249] Call Trace:
> [   46.299787] [<8006cb0c>] show_stack+0x58/0x100
> [   46.304406] [<800aadf4>] ___might_sleep+0x100/0x120
> [   46.309466] [<800c3344>] synchronize_irq+0x3c/0xa0
> [   46.314431] [<800c6594>] __irq_disable+0x64/0xb4
> [   46.319209] [<800c35f4>] __disable_irq_nosync+0x3c/0x68
> [   46.324620] [<800c363c>] disable_irq+0x14/0x38
> [   46.329379] [<82ae5b48>] ath9k_calculate_summary_state+0x53c/0x6f4 
> [ath9k]
> [   46.339657] ttyS ttyS0: 2 input overrun(s)
> [   46.579616] wlan0: Creating new IBSS network, BSSID c8:a9:46:87:7d:ad
>
>
> cns3xxx;
>
> None so far ..
>
> imx6:
>
> None so far
>
> Regards,
>
> Koen
>
Adding another one seen on ar71xx - Gl.MiFi:


[   13.152502] eth0: link down
[   13.155063] BUG: sleeping function called from invalid context at 
net/core/dev.c:5563
[   13.161685] in_atomic(): 1, irqs_disabled(): 1, pid: 456, name: ip
[   13.167848] 2 locks held by ip/456:
[   13.171315]  #0:  (rtnl_mutex){....}, at: [<8032aba4>] 
rtnetlink_rcv_msg+0x2d8/0x380
[   13.179039]  #1:  (&(&ag->lock)->rlock){....}, at: [<802e3c40>] 
ag71xx_hw_disable+0x24/0x94
[   13.187385] CPU: 0 PID: 456 Comm: ip Not tainted 4.14.134 #0
[   13.193019] Stack : 805a0000 80547894 8051de14 831c98ec 805f0000 
805f0000 83918efc 805733a7
[   13.201348]         80517e44 000001c8 805f38bc 831c9ccc 8390d0c0 
00000001 831c98a0 b0502ae3
[   13.209681]         00000000 00000000 80ae0000 831c979c 6138f004 
00000000 00000007 00000000
[   13.218016]         00000091 b3400000 00000090 00000000 80000000 
839c9d8c 839c9db0 00000001
[   13.226347]         80428b7c 831c9ccc 8390d0c0 83b51210 00000003 
00000000 00000000 805f0000
[   13.234680]         ...
[   13.237114] Call Trace:
[   13.239556] [<8006c88c>] show_stack+0x58/0x100
[   13.244006] [<800aab74>] ___might_sleep+0x100/0x120
[   13.248853] [<8030fb84>] napi_disable+0x30/0xd8
[   13.253354] [<802e3c80>] ag71xx_hw_disable+0x64/0x94
[   13.258304] [<802e3cd4>] ag71xx_stop+0x24/0x38
[   13.262731] [<8030d580>] __dev_close_many+0xcc/0x104
[   13.267697] [<803165fc>] __dev_change_flags+0xc8/0x1ac
[   13.272801] [<80316708>] dev_change_flags+0x28/0x70
[   13.277662] [<80329fc0>] do_setlink+0x31c/0x91c
[   13.282178] [<8032ca90>] rtnl_newlink+0x3ec/0x7f8
[   13.286865] [<8032abc8>] rtnetlink_rcv_msg+0x2fc/0x380
[   13.292019] [<8034de64>] netlink_rcv_skb+0xd4/0x178
[   13.296849] [<8034d440>] netlink_unicast+0x168/0x250
[   13.301797] [<8034da04>] netlink_sendmsg+0x3d8/0x434
[   13.306753] [<802f2774>] ___sys_sendmsg+0x1dc/0x290
[   13.311607] [<802f37d0>] __sys_sendmsg+0x54/0x84
[   13.316225] [<80071eac>] syscall_common+0x34/0x58



_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


More information about the openwrt-devel mailing list