CPU soft lockup - possibly dualgpu related? #6906

Open
My1 opened this issue Jan 16, 2024 · 20 comments
Labels
bug Something isn't working

Comments

@My1

My1 commented Jan 16, 2024

Bug Description

After installing RustDesk and possibly running it (the problem appears after a delay), I get notifications on my laptop about a CPU soft lockup that mention rustdesk:rcs0:12345 (12345 being the process ID).

Besides the main rustdesk process, htop also shows one or more rustdesk --check-hwcodec-config processes.

Interestingly, this has only happened on my laptop so far. My theory is that it might be because the system has two GPUs: the integrated Intel HD 620 and the AMD Radeon 520.

The processes don't even respond to SIGKILL; the only workaround is to apt remove the package and force a shutdown, since the system hangs even during a reboot.
A BIOS update did not help either.

How to Reproduce

  1. Install RustDesk (I used QApt with the .deb file)
  2. Start RustDesk
  3. Wait

Expected Behavior

the CPU not locking up

Operating system(s) on local side and remote side

Linux, Kubuntu 22.04 (no remote side)

RustDesk Version(s) on local side and remote side

1.2.3 (no remote)

Screenshots

image

image

Additional Context

Laptop Model: HP 17-by0320ng
Intel i3 7020U
Intel HD 620
Radeon 520
16GB RAM

@My1 My1 added the bug Something isn't working label Jan 16, 2024
@rustdesk
Owner

rustdesk commented Jan 17, 2024

Anyway, please check out our nightly build, https://github.com/rustdesk/rustdesk/releases/tag/nightly; there are a lot of related changes there. We will reopen if the issue still occurs in the nightly build.

@My1
Author

My1 commented Jan 17, 2024

Screenshot_20240117_082749

Also, here is something I can see during reboot. If you know how or where to get the rest, I can get it.
20240117_083307

@rustdesk
Owner

rustdesk commented Jan 17, 2024

@21pages follow up, rewrite this stupid --check-hwcodec-config.

  • stuck for 143 seconds with 100% cpu usage is crazy
  • only trigger check if there are hardware config changes, no need to check every time at startup.
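
A minimal sketch of the second bullet, written in Python for brevity rather than in RustDesk's own Rust: store the probe result next to a fingerprint of the hardware configuration, and rerun the probe only when the fingerprint changes. `hardware_fingerprint`, `load_or_probe`, the cache file layout, and the choice of inputs (GPU identifiers plus driver versions) are all assumptions made for illustration, not RustDesk's implementation.

```python
import hashlib
import json
from pathlib import Path


def hardware_fingerprint(gpu_ids, driver_versions):
    """Hash the inputs that should invalidate a cached codec check."""
    payload = json.dumps(
        {"gpus": sorted(gpu_ids), "drivers": sorted(driver_versions)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def load_or_probe(cache_path, fingerprint, probe):
    """Return (codecs, probe_ran); reuse the cache if the fingerprint matches."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        try:
            cached = json.loads(cache_path.read_text())
        except (OSError, ValueError):
            cached = None  # unreadable or corrupt cache: fall through and probe
        if (
            isinstance(cached, dict)
            and cached.get("fingerprint") == fingerprint
            and "codecs" in cached
        ):
            return cached["codecs"], False
    codecs = probe()  # the expensive (and possibly dangerous) hardware check
    cache_path.write_text(
        json.dumps({"fingerprint": fingerprint, "codecs": codecs})
    )
    return codecs, True
```

Because driver versions are part of the fingerprint in this sketch, a driver update would trigger a fresh probe, so a machine whose VAAPI stack is fixed or broken by an update would be checked again.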

@rustdesk rustdesk reopened this Jan 17, 2024
@rustdesk
Owner

@My1 Thanks

@My1
Author

My1 commented Jan 17, 2024

Oh, 143 is cute; I think it easily got over 300 the first time, if not significantly more.

I think this is more than just 100% load. It seems crazier than that, as the process can't be killed and even blocks rebooting.
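
For anyone trying to confirm this kind of hang: roughly speaking, signals (SIGKILL included) are only acted on when a task is on its way back to user space, so a thread stuck inside kernel code cannot be killed until it gets out. A small sketch, assuming a Linux /proc filesystem, that reads the state letter of a process from /proc/<pid>/stat. The helper names are hypothetical.

```python
STATE_NAMES = {
    "R": "running",
    "S": "sleeping",
    "D": "uninterruptible sleep",
    "Z": "zombie",
    "T": "stopped",
}


def parse_proc_stat(stat_line):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so locate it by the first "(" and the last ")".
    """
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren].strip())
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def describe_state(pid):
    """Return a one-line description of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as fh:
        _, comm, state = parse_proc_stat(fh.read())
    return f"{comm} (pid {pid}): {state} ({STATE_NAMES.get(state, 'other')})"
```

Calling `describe_state(pid)` with the PID from the lockup message returns the command name and state. Which letter a soft-locked task shows may vary, so treat it as a hint only.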

@21pages
Collaborator

21pages commented Jan 17, 2024

@My1 Does this one work for you?
https://github.com/21pages/rustdesk/releases/download/nightly/rustdesk-1.2.4-x86_64-no-hwcodec.deb

If it works for you, could you try https://github.com/21pages/test/releases/download/test/available_no_loop and see whether it will cause the cpu soft lockup?

If "available_no_loop" doesn't cause the cpu soft lockup, could you try https://github.com/21pages/test/releases/download/test/available_loop?

If the cpu soft lockup happens, please show the terminal log.

@My1
Author

My1 commented Jan 17, 2024

The no-hwcodec build installed and ran fine (except that decoding 4K on the CPU is a bit strenuous, for obvious reasons lol).

available_no_loop dies when run as root; without root it doesn't open the GUI.

logs
my1@my1-qb:~/Downloads$ sudo ./available_no_loop 
[2024-01-17T16:27:46Z DEBUG hwcodec::encode] prepare yuv 3.123814ms
[2024-01-17T16:27:46Z TRACE hwcodec::encode] start checking "h264_nvenc"
[h264_nvenc @ 0x55d07644db00] Cannot load libcuda.so.1
[h264_nvenc @ 0x55d07644db00] Nvenc unloaded
avcodec_open2: Operation not permitted
[2024-01-17T16:27:46Z DEBUG hwcodec::encode] h264_nvenc new failed 825.493µs
[2024-01-17T16:27:46Z TRACE hwcodec::encode] finish checking "h264_nvenc"
[2024-01-17T16:27:46Z TRACE hwcodec::encode] start checking "h264_amf"
[h264_amf @ 0x55d07644e380] DLL libamfrt64.so.1 failed to open
avcodec_open2: Unknown error occurred
[2024-01-17T16:27:46Z DEBUG hwcodec::encode] h264_amf new failed 223.948µs
[2024-01-17T16:27:46Z TRACE hwcodec::encode] finish checking "h264_amf"
[2024-01-17T16:27:46Z TRACE hwcodec::encode] start checking "hevc_nvenc"
[hevc_nvenc @ 0x55d07644ee00] Cannot load libcuda.so.1
[hevc_nvenc @ 0x55d07644ee00] Nvenc unloaded
avcodec_open2: Operation not permitted
[2024-01-17T16:27:46Z DEBUG hwcodec::encode] hevc_nvenc new failed 357.734µs
[2024-01-17T16:27:46Z TRACE hwcodec::encode] finish checking "hevc_nvenc"
[2024-01-17T16:27:46Z TRACE hwcodec::encode] start checking "hevc_amf"
[hevc_amf @ 0x55d07644ea40] DLL libamfrt64.so.1 failed to open
avcodec_open2: Unknown error occurred
[2024-01-17T16:27:46Z DEBUG hwcodec::encode] hevc_amf new failed 212.34µs
[2024-01-17T16:27:46Z TRACE hwcodec::encode] finish checking "hevc_amf"
[2024-01-17T16:27:46Z INFO  available] available encoder count: 0
[2024-01-17T16:27:46Z INFO  hwcodec::decode] start checking CodecInfo { name: "h264", format: H264, vendor: OTHER, score: 94, hwdevice: AV_HWDEVICE_TYPE_CUDA }
Failed to create specified HW device:Operation not permitted
reset failed
[2024-01-17T16:27:46Z DEBUG hwcodec::decode] name:h264 device:AV_HWDEVICE_TYPE_CUDA new failed:160.613µs
[2024-01-17T16:27:46Z INFO  hwcodec::decode] finish checking CodecInfo { name: "h264", format: H264, vendor: OTHER, score: 94, hwdevice: AV_HWDEVICE_TYPE_CUDA }
[2024-01-17T16:27:46Z INFO  hwcodec::decode] start checking CodecInfo { name: "hevc", format: H265, vendor: OTHER, score: 95, hwdevice: AV_HWDEVICE_TYPE_CUDA }
Failed to create specified HW device:Operation not permitted
reset failed
[2024-01-17T16:27:46Z DEBUG hwcodec::decode] name:hevc device:AV_HWDEVICE_TYPE_CUDA new failed:146.732µs
[2024-01-17T16:27:46Z INFO  hwcodec::decode] finish checking CodecInfo { name: "hevc", format: H265, vendor: OTHER, score: 95, hwdevice: AV_HWDEVICE_TYPE_CUDA }
[2024-01-17T16:27:46Z INFO  hwcodec::decode] start checking CodecInfo { name: "h264", format: H264, vendor: OTHER, score: 70, hwdevice: AV_HWDEVICE_TYPE_VAAPI }
[2024-01-17T16:27:46Z DEBUG hwcodec::decode] name:h264 device:AV_HWDEVICE_TYPE_VAAPI new:65.391378ms
radeon: The kernel rejected CS, see dmesg for more information (-22).
avcodec_receive_frame no loop ret: 0
radeon: The kernel rejected CS, see dmesg for more information (-2).
[2024-01-17T16:27:46Z DEBUG hwcodec::decode] name:h264 device:AV_HWDEVICE_TYPE_VAAPI decode:54.943768ms

Message from syslogd@my1-qb at Jan 17 17:28:12 ...
 kernel:[31876.122599] watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [available:rcs0:19073]

@21pages
Collaborator

21pages commented Jan 18, 2024

Using VA-API on AMD GPUs may trigger this issue; see ValveSoftware/steam-for-linux#8508.

  1. Please run vainfo and show the terminal output.
  2. Please try https://github.com/21pages/test/releases/download/test/available_no_vaapi to see if it has the issue. If it works, https://github.com/21pages/rustdesk/releases/download/nightly/rustdesk-1.2.4-x86_64-no-vaapi.deb should work for you.
  3. If no-vaapi doesn't have this issue, please try ffmpeg with vaapi:

@My1
Author

My1 commented Jan 18, 2024

VAinfo
my1@my1-qb:~$ DRI_PRIME=1 vainfo
libva info: VA-API version 1.14.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_14
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.14 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.3.1 ()
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD


my1@my1-qb:~$ DRI_PRIME=1 LIBVA_DRIVER_NAME=radeonsi vainfo 
libva info: VA-API version 1.14.0
libva info: User environment variable requested driver 'radeonsi'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_14
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.14 (libva 2.12.0)
vainfo: Driver version: Mesa Gallium driver 23.0.4-0ubuntu1~22.04.1 for HAINAN (, LLVM 15.0.7, DRM 2.50, 5.15.0-91-generic)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc

lockup on ffmpeg with DRI_PRIME=1

Terminal
my1@my1-qb:~/Downloads$ sudo ./available_no_vaapi 
[sudo] password for my1: 
[2024-01-18T07:00:04Z DEBUG hwcodec::encode] prepare yuv 2.089308ms
[2024-01-18T07:00:04Z TRACE hwcodec::encode] start checking "h264_nvenc"
[h264_nvenc @ 0x55fff31cdb00] Cannot load libcuda.so.1
[h264_nvenc @ 0x55fff31cdb00] Nvenc unloaded
avcodec_open2: Operation not permitted
[2024-01-18T07:00:04Z DEBUG hwcodec::encode] h264_nvenc new failed 564.078µs
[2024-01-18T07:00:04Z TRACE hwcodec::encode] finish checking "h264_nvenc"
[2024-01-18T07:00:04Z TRACE hwcodec::encode] start checking "h264_amf"
[h264_amf @ 0x55fff31ce380] DLL libamfrt64.so.1 failed to open
avcodec_open2: Unknown error occurred
[2024-01-18T07:00:04Z DEBUG hwcodec::encode] h264_amf new failed 242.515µs
[2024-01-18T07:00:04Z TRACE hwcodec::encode] finish checking "h264_amf"
[2024-01-18T07:00:04Z TRACE hwcodec::encode] start checking "hevc_nvenc"
[hevc_nvenc @ 0x55fff31cee00] Cannot load libcuda.so.1
[hevc_nvenc @ 0x55fff31cee00] Nvenc unloaded
avcodec_open2: Operation not permitted
[2024-01-18T07:00:04Z DEBUG hwcodec::encode] hevc_nvenc new failed 266.074µs
[2024-01-18T07:00:04Z TRACE hwcodec::encode] finish checking "hevc_nvenc"
[2024-01-18T07:00:04Z TRACE hwcodec::encode] start checking "hevc_amf"
[hevc_amf @ 0x55fff31cea40] DLL libamfrt64.so.1 failed to open
avcodec_open2: Unknown error occurred
[2024-01-18T07:00:04Z DEBUG hwcodec::encode] hevc_amf new failed 195.071µs
[2024-01-18T07:00:04Z TRACE hwcodec::encode] finish checking "hevc_amf"
[2024-01-18T07:00:04Z INFO  available] available encoder count: 0
[2024-01-18T07:00:04Z INFO  hwcodec::decode] start checking CodecInfo { name: "h264", format: H264, vendor: OTHER, score: 94, hwdevice: AV_HWDEVICE_TYPE_CUDA }
Failed to create specified HW device:Operation not permitted
reset failed
[2024-01-18T07:00:04Z DEBUG hwcodec::decode] name:h264 device:AV_HWDEVICE_TYPE_CUDA new failed:144.988µs
[2024-01-18T07:00:04Z INFO  hwcodec::decode] finish checking CodecInfo { name: "h264", format: H264, vendor: OTHER, score: 94, hwdevice: AV_HWDEVICE_TYPE_CUDA }
[2024-01-18T07:00:04Z INFO  hwcodec::decode] start checking CodecInfo { name: "hevc", format: H265, vendor: OTHER, score: 95, hwdevice: AV_HWDEVICE_TYPE_CUDA }
Failed to create specified HW device:Operation not permitted
reset failed
[2024-01-18T07:00:04Z DEBUG hwcodec::decode] name:hevc device:AV_HWDEVICE_TYPE_CUDA new failed:120.552µs
[2024-01-18T07:00:04Z INFO  hwcodec::decode] finish checking CodecInfo { name: "hevc", format: H265, vendor: OTHER, score: 95, hwdevice: AV_HWDEVICE_TYPE_CUDA }
[2024-01-18T07:00:04Z INFO  available] available decoder count: 0


-------------


DRI_PRIME=1 ffmpeg -hwaccel vaapi  -i 1920_1080.264 -vf 'format=nv12' -c:v rawvideo -pix_fmt nv12 output.yuv
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
[h264 @ 0x563618de36c0] decoding for stream 0 failed
Input #0, h264, from '1920_1080.264':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: h264 (Main), yuv420p(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 60 tbr, 1200k tbn, 120 tbc
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))
Press [q] to stop, [?] for help
radeon: The kernel rejected CS, see dmesg for more information (-22).
radeon: The kernel rejected CS, see dmesg for more information (-2).
Output #0, rawvideo, to 'output.yuv':
  Metadata:
    encoder         : Lavf58.76.100
  Stream #0:0: Video: rawvideo (NV12 / 0x3231564E), nv12(tv, progressive), 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 1492992 kb/s, 60 fps, 60 tbn
    Metadata:
      encoder         : Lavc58.134.100 rawvideo
frame=    1 fps=0.0 q=-0.0 Lsize=    3038kB time=00:00:00.01 bitrate=1492962.1kbits/s speed=0.291x    
video:3038kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%

----> CPU Lockup here

@21pages
Collaborator

21pages commented Jan 18, 2024

Thanks, VAAPI causes the issue.

@21pages
Collaborator

21pages commented Jan 18, 2024

https://www.reddit.com/r/linux_gaming/comments/wjkqtw/dri_prime_operates_backwards/
With DRI_PRIME=0, does ffmpeg run without problems? And does DRI_PRIME=0 vainfo report the AMD GPU?

@My1
Author

My1 commented Jan 18, 2024

I had never used DRI_PRIME=0 explicitly before, but both =0 and unset select the Intel GPU here. vainfo doesn't seem to care about DRI_PRIME alone, though: to get the AMD GPU in vainfo I need both DRI_PRIME=1 and LIBVA_DRIVER_NAME=radeonsi (output of that is in my earlier comment).

my1@my1-qb:~$ DRI_PRIME=0 vainfo
libva info: VA-API version 1.14.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_14
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.14 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 22.3.1 ()
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD

my1@my1-qb:~$ LIBVA_DRIVER_NAME=radeonsi vainfo
libva info: VA-API version 1.14.0
libva info: User environment variable requested driver 'radeonsi'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_14
iris: driver missing
iris: driver missing
libva error: /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so init failed
libva info: va_openDriver() returns 2
vaInitialize failed with error code 2 (resource allocation failed),exit
my1@my1-qb:~$ LIBVA_DRIVER_NAME=radeonsi DRI_PRIME=0 vainfo
libva info: VA-API version 1.14.0
libva info: User environment variable requested driver 'radeonsi'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_1_14
iris: driver missing
iris: driver missing
libva error: /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so init failed
libva info: va_openDriver() returns 2
vaInitialize failed with error code 2 (resource allocation failed),exit

additionally,

my1@my1-qb:~$ DRI_PRIME=0 glxinfo | grep "OpenGL renderer"
OpenGL renderer string: Mesa Intel(R) HD Graphics 620 (KBL GT2)
my1@my1-qb:~$ DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
OpenGL renderer string: HAINAN (, LLVM 15.0.7, DRM 2.50, 5.15.0-91-generic)

@21pages
Collaborator

21pages commented Jan 18, 2024

Thanks

@My1
Author

My1 commented Jan 18, 2024

Edited the glxinfo output above to include both DRI_PRIME values.

@My1
Author

My1 commented Jan 18, 2024

The fun question is obviously whether there's a way to detect these "broken" implementations (whatever exactly is at fault) and try something else, like the Intel GPU, which can not only encode H264 but also H265.
Screenshot_20240118_084507

Additionally, the two desktops I use, with an RX 560 and an RX 580, don't have this issue.
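
One hedged way to approach the detection question is to run each candidate probe in its own process with a deadline and move on to the next candidate when it fails or times out. The sketch below assumes the probes are separate executables (as the available_* test binaries in this thread are); `probe_with_deadline` and `first_working` are hypothetical names. It only keeps the caller from blocking: it cannot undo a kernel-level lockup like the one reported here, which is also why it deliberately does not wait for the child after killing it.

```python
import subprocess


def probe_with_deadline(argv, deadline_s):
    """Run one probe command; return 'ok', 'failed', or 'timeout'."""
    try:
        proc = subprocess.Popen(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        return "failed"  # e.g. the probe binary does not exist
    try:
        return "ok" if proc.wait(timeout=deadline_s) == 0 else "failed"
    except subprocess.TimeoutExpired:
        # Request a kill, but do NOT wait for the child: a task stuck in
        # the kernel may never exit, and waiting would hang the caller too.
        proc.kill()
        return "timeout"


def first_working(candidates, deadline_s=5.0):
    """Return the name of the first candidate whose probe exits cleanly."""
    for name, argv in candidates:
        if probe_with_deadline(argv, deadline_s) == "ok":
            return name
    return None
```

Note that `subprocess.run(..., timeout=...)` would not be a safe substitute here, because on POSIX it waits for the child to exit after killing it.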

@21pages
Copy link
Collaborator

21pages commented Jan 18, 2024

Yes, we'll take the ability table into consideration.
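
As a rough illustration of what such an ability table could look like, the sketch below parses vainfo output of the form posted earlier in this thread into a driver string and a profile-to-entrypoints map. `parse_vainfo` and `can_decode` are hypothetical helpers, and the approach is an assumption, not how RustDesk or hwcodec does it. `VAEntrypointVLD` is the VA-API decode entrypoint. The table is a necessary filter rather than a sufficient one: the radeonsi driver above advertises H264 VLD and still locks up.

```python
import re

DRIVER_RE = re.compile(r"^vainfo: Driver version:\s*(.+?)\s*$")
PROFILE_RE = re.compile(r"^\s*(VAProfile\w+)\s*:\s*(VAEntrypoint\w+)\s*$")


def parse_vainfo(text):
    """Return (driver_string, {profile: set_of_entrypoints}) from vainfo output."""
    driver = None
    abilities = {}
    for line in text.splitlines():
        match = DRIVER_RE.match(line)
        if match:
            driver = match.group(1)
            continue
        match = PROFILE_RE.match(line)
        if match:
            abilities.setdefault(match.group(1), set()).add(match.group(2))
    return driver, abilities


def can_decode(abilities, profile):
    """True if the driver advertises the decode entrypoint for this profile."""
    return "VAEntrypointVLD" in abilities.get(profile, set())
```

With the Intel output from this thread, `can_decode(abilities, "VAProfileHEVCMain")` would be true, while the HAINAN output lists no HEVC profile at all.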

@rustdesk
Owner

rustdesk commented Feb 23, 2024

vaapi cause the issue

How about the status of this issue? @21pages

@21pages
Collaborator

21pages commented Feb 23, 2024

FFmpeg has the same problem on his computer. I haven't removed VAAPI decoding because it is currently the only hardware decoder for Intel and AMD graphics cards on Linux.

@rustdesk
Owner

@21pages please submit an issue to FFmpeg and cite it here.

@21pages
Collaborator

21pages commented Feb 23, 2024
