
Thread: "watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker ...]"

  1. #1
    Junior Member
    Join Date
    Oct 2020
    Posts
    2

    "watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker ...]"

    Code:
    "watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker ...]"
    Is it possible Teensyduino could cause this?
    This lockup has hit my system twice in two days, both times during repeated development and flashing of a Teensy 4. I work on it for hours at a time, so two occurrences could still be a coincidence, but it *has* only happened while I was actively coding and flashing the board. When it happens, many programs keep working, but Arduino and Teensyduino's little window lock up completely (and can't be killed), and then other software begins to fail. I can't shut down cleanly because things keep freezing, so I have to cold-reset.

    Syslog does not show the "soft lockup" message; instead, it shows this log:

    Code:
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 0 at kernel/workqueue.c:1444 __queue_work.cold.52+0xc/0x35
    Modules linked in: ...  nvidia_drm(POE) ... nvidia_modeset(POE) ...
    CPU: 3 PID: 0 Comm: swapper/4 Tainted: P           OE     4.19.0-12-amd64 #1 Debian 4.19.152-1
    Hardware name: ASUS All Series/Z97-PRO, BIOS 3503 04/18/2018
    RIP: 0010:__queue_work.cold.52+0xc/0x35
    Code: b6 ff ff 48 c7 c7 50 64 a5 89 c6 05 2f 01 09 01 01 45 31 ed e8 75 35 04 00 e9 51 ba ff ff 48 c7 c7 f8 e2 a3 89 e8 64 35 04 00 <0f> 0b 48 8b 3b c6 07 00 0f 1f 40 00 e9 37 be ff ff 48 c7 c7 98 64
    RSP: 0018:ffff9c0f7ed03d68 EFLAGS: 00010046
    RAX: 0000000000000024 RBX: ffff9c0f7ed25900 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff9c0f7ed166b8 RDI: ffff9c0f7ed166b8
    RBP: 0000000000000200 R08: 0000000000001003 R09: 0000000000000004
    R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000004
    R13: ffff9c0f7e818c00 R14: ffffffff89aeb720 R15: ffff9c0cbcf9d790
    FS:  0000000000000000(0000) GS:ffff9c0f7ed00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f50c84b1000 CR3: 000000081ea0a006 CR4: 00000000001606e0
    Call Trace:
     <IRQ>
     queue_work_on+0x34/0x40
     __usb_hcd_giveback_urb+0x84/0x140 [usbcore]
     xhci_giveback_urb_in_irq.isra.40+0x7d/0xf0 [xhci_hcd]
     xhci_td_cleanup+0xfb/0x160 [xhci_hcd]
     xhci_irq+0x627/0x2330 [xhci_hcd]
     ? rt2800usb_txstatus_timeout.isra.9+0xe0/0xe0 [rt2800usb]
     __handle_irq_event_percpu+0x46/0x190
     handle_irq_event_percpu+0x30/0x80
     handle_irq_event+0x3c/0x5c
     handle_edge_irq+0x97/0x1e0
     handle_irq+0x1f/0x30
     do_IRQ+0x49/0xe0
     common_interrupt+0xf/0xf
     </IRQ>
    RIP: 0010:cpuidle_enter_state+0xb9/0x320
    Code: e8 dc b3 b0 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 3b 02 00 00 31 ff e8 0e a6 b6 ff fb 66 0f 1f 44 00 00 <48> b8 ff ff ff ff f3 01 00 00 48 2b 1c 24 ba ff ff ff 7f 48 39 c3
    RSP: 0018:ffffbdf9c31d3e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
    RAX: ffff9c0f7ed220c0 RBX: 00007b6ba08e6b3e RCX: 000000000000001f
    RDX: 00007b6ba08e6b3e RSI: 000000002004c001 RDI: 0000000000000000
    RBP: ffff9c0f7ed2a310 R08: 0000000000000002 R09: 0000000000021980
    R10: 0001ed7ba350115c R11: ffff9c0f7ed210a8 R12: 0000000000000004
    R13: ffffffff89cb8978 R14: 0000000000000004 R15: 0000000000000000
     do_idle+0x228/0x270
     cpu_startup_entry+0x6f/0x80
     start_secondary+0x1a4/0x200
     secondary_startup_64+0xa4/0xb0
    ---[ end trace 1fe9bed41dff0c67 ]---
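
A trace in this format can be summarized mechanically. Below is a minimal Python sketch (not part of the original post) that counts which loadable modules appear in the frames of a call trace. The names `FRAME_RE`, `modules_in_trace`, and `SAMPLE` are made up for illustration, and the sample is only a few lines copied from the log above. Frames prefixed with `?` are skipped, since the kernel marks those as unreliable.

```python
import re
from collections import Counter

# Matches trace frames such as "xhci_irq+0x627/0x2330 [xhci_hcd]".
# Group 1: optional "? " marker, group 2: symbol, group 3: optional module name.
FRAME_RE = re.compile(
    r"^\s*(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def modules_in_trace(log_text):
    """Count modules named in call-trace frames, ignoring '?' frames."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match is None or match.group(1):
            continue  # not a frame, or a frame the kernel flagged as unreliable
        module = match.group(3)
        if module:  # frames without [module] belong to the core kernel
            counts[module] += 1
    return counts


# A few lines copied from the log in this post.
SAMPLE = """\
 <IRQ>
 queue_work_on+0x34/0x40
 __usb_hcd_giveback_urb+0x84/0x140 [usbcore]
 xhci_giveback_urb_in_irq.isra.40+0x7d/0xf0 [xhci_hcd]
 xhci_td_cleanup+0xfb/0x160 [xhci_hcd]
 xhci_irq+0x627/0x2330 [xhci_hcd]
 ? rt2800usb_txstatus_timeout.isra.9+0xe0/0xe0 [rt2800usb]
 __handle_irq_event_percpu+0x46/0x190
 </IRQ>
"""

if __name__ == "__main__":
    for name, count in modules_in_trace(SAMPLE).most_common():
        print(name, count)
```

On the sample lines this counts `xhci_hcd` three times and `usbcore` once, which suggests looking at the USB host controller path first. It is only a rough triage aid and says nothing about the root cause.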

  2. #2
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    8,528
    Sorry, I am probably not much help on this one, as I am guessing it is a Linux setup and I don't know enough of the internals.

    But it might help others who may be able to help more to know some additional information, like what type of machine and which OS.
    From your syslog output I am assuming 64-bit Linux.

    Likewise, which versions of Arduino and Teensyduino? Or are you using some other build setup?
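
For anyone gathering these details, here is a minimal Python sketch that prints the machine and OS information using the standard `platform` module. The function name `system_report` and its parameters are hypothetical; the Arduino and Teensyduino versions are not detected automatically and have to be filled in by hand.

```python
import platform


def system_report(arduino_version="(fill in)", teensyduino_version="(fill in)"):
    """Return report lines; the tool versions are supplied manually, not detected."""
    return [
        f"OS: {platform.system()} {platform.release()}",
        f"Architecture: {platform.machine()}",
        f"Arduino: {arduino_version}",
        f"Teensyduino: {teensyduino_version}",
    ]


if __name__ == "__main__":
    print("\n".join(system_report()))
```

Pasting the output at the top of a report saves a round of guessing.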

  3. #3
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    13,516
    On Win 10 a build yesterday showed a register dump compile failure.

    I did not read (beyond seeing the compile fail and the register spew) or save the compile output; I just hit "F7" in Sublime to recompile, and then it worked.

    This is a first AFAIK in the years here. May have just been a disk I/O error?

    > but the OP's error is not during compile
    Last edited by defragster; 11-04-2020 at 05:29 PM.

  4. #4
    Senior Member+ KurtE's Avatar
    Join Date
    Jan 2014
    Posts
    8,528
    Again from the message above:
    Code:
    ------------[ cut here ]------------
    WARNING: CPU: 3 PID: 0 at kernel/workqueue.c:1444 __queue_work.cold.52+0xc/0x35
    Modules linked in: ...  nvidia_drm(POE) ... nvidia_modeset(POE) ...
    CPU: 3 PID: 0 Comm: swapper/4 Tainted: P           OE     4.19.0-12-amd64 #1 Debian 4.19.152-1
    I am somewhat sure that it is a form of Linux...
    But a Mac might do this as well? It is not Linux, but it is Unix-like underneath.

    If it is a Mac, maybe it ran out of memory in the Terminal monitor? The new beta does mention a fix: "Serial Monitor fix memory leak on MacOS".

    But reading the dump information more, it looks like a PC running some form of Linux.
    Code:
    Hardware name: ASUS All Series/Z97-PRO
    I am also guessing it has something to do with the display: Modules linked in: ... nvidia_drm(POE) ... nvidia_modeset(POE) ...

  5. #5
    Junior Member
    Join Date
    Oct 2020
    Posts
    2

    Details update [sorry]

    Okay, one would think I would know better by now.
    This is a Debian Stable (Buster) system, using the backports repository to get the latest available updates.
    I snipped the list of modules in the log because it ran to several lines of 1000+ characters each. I realize now that leaving only the official non-free Nvidia module in there was misleading; my intent was just to note that I *am* using the "outside" Nvidia GPU drivers (not that they are likely related to the problem).
    It is, as shown in the logs, an ASUS Z97-Pro motherboard with an Intel i7-4790K (4 cores, 8 threads) CPU.

    Sorry to make you guys have to guess at that. That was irresponsible of me!
