When building tdx-reference-multimedia-image, the build hangs at around task 5000. How can I solve this?

Host PC: WSL on Windows 11. 16GB RAM.

It seems that I can run 8 tasks at a time when building, and twice I had to restart the build. The first time, it shut down on me. The second time, it was hung for at least an hour. I wish I had recorded which task it was.

I would then force-close it (CTRL+C), run rm bitbake.lock in the build folder, and rerun the build with bitbake tdx-reference-multimedia-image.

I’m still building, but the task count is further along now. It was at around 5000; now it’s in the 5300 range (the total is 8672). Unfortunately, it seems that one task hangs, which I think depends on other tasks, which in turn depend on yet other tasks, and that is probably why my build just stays hung.

Any suggestions?
Also - what happens if there’s an ERROR: task failed with exit code '1'? Does running bitbake again fix this?

Edit: Running for the third time, almost at 5400 tasks, but it’s been 20 minutes so far. Attached is what the screen looks like.
2024-02-09-build-hung.txt (3.6 KB)

Anyone?

I ran it on the server and it works, but I’d like to know what’s the issue with my laptop - is it simply because my laptop isn’t powerful enough, or did I need to reset/fix the build somehow?

EDIT: Ran it on the server, but it got disconnected - I assumed it worked. I ran it again and I’m facing the same issue. Hangs around 5987 tasks, most of the tasks are related to Qt. It’s been running for an hour now.

Hello @srzm,

I was facing the same issue with WSL when I used too many parallel threads/tasks with too little RAM. Try to increase the RAM or decrease the number of cores in your WSL config (%USERPROFILE%/.wslconfig).

This one is mine:

# Settings apply across all Linux distros running on WSL 2
[wsl2]

# Limits VM memory to use no more than 24 GB, this can be set as whole numbers using GB or MB
memory=24GB

# Sets the VM to use 12 virtual processors
processors=12

# Sets amount of swap storage space, default is 25% of available RAM
swap=8GB
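
Changes to .wslconfig only take effect after WSL has been restarted (for example with wsl --shutdown from a Windows prompt). A minimal sketch for checking, from inside the distro, what the VM actually sees afterwards; the output labels are only illustrative:

```shell
# Run inside the WSL distro after WSL has been restarted.
# nproc reports the number of processors visible to the VM.
echo "CPUs visible to WSL: $(nproc)"

# MemTotal and SwapTotal in /proc/meminfo are reported in kB; convert to GiB.
awk '/^(MemTotal|SwapTotal):/ { printf "%s %.1f GiB\n", $1, $2 / 1024 / 1024 }' /proc/meminfo
```

If the numbers do not match the [wsl2] section, the file is probably not being picked up from %USERPROFILE%.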

Best regards,
Markus

Unfortunately, this did not solve the problem. A new error has come up:

ERROR: Task (/home/srzm/oe-core/build/…/layers/meta-toradex-demos/recipes-benchmark/cpuburn/cpuburn-a53_git.bb:do_fetch) failed with exit code ‘1’

This is another problem.
Your machine cannot download the sources for the cpuburn recipe. You can try the following:

  • clean & restart the build process with
bitbake world -c clean
bitbake tdx-reference-multimedia-image

or

  • rerun the specific recipe with
bitbake cpuburn-a53

If fetching the sources fails again, you can copy the URL from the error message, paste it into your browser, and download the file manually into the download folder of your build environment. Then restart the build of the reference image.

If that fails too, your company firewall may be blocking the git protocol. In that case you can replace it with https.

Best regards,
Markus

Thank you Markus, you’ve been truly helpful!

The world command line didn’t work:

srzm:~/oe-core/build$ bitbake world -c clean
WARNING: You have included the meta-tpm layer, but ‘tpm or tpm2’ has not been enabled in your DISTRO_FEATURES. Some bbappend files and preferred version setting may not take effect. See the meta-tpm README for details on enabling tpm support.
Loading cache: 100% |####################################################################################| Time: 0:00:01
Loaded 4751 entries from dependency cache.
WARNING: No recipes in default available for:
/home/srzm/oe-core/build/…/layers/meta-toradex-nxp/recipes-bsp/imx-mkimage/imx-mkimage_1.0.bbappend
/home/srzm/oe-core/build/…/layers/meta-toradex-nxp/recipes-multimedia/gstreamer/gstreamer1.0-plugins-base_1.20.0.imx.bbappend
WARNING: preferred version 3.19.0.imx of optee-client not available (for item optee-client)
WARNING: versions of optee-client available: 3.14.0 3.16.0
NOTE: Resolving any missing task queue dependencies
ERROR: Nothing PROVIDES ‘optee-os’ (but /home/srzm/oe-core/build/…/layers/meta-freescale/recipes-security/smw/smw_git.bb DEPENDS on or otherwise requires it)
optee-os was skipped: incompatible with machine verdin-imx8mp (not in COMPATIBLE_MACHINE)
optee-os was skipped: missing required machine feature ‘optee’ (not in MACHINE_FEATURES)
optee-os was skipped: incompatible with machine verdin-imx8mp (not in COMPATIBLE_MACHINE)
ERROR: Required build target ‘meta-world-pkgdata’ has no buildable providers.
Missing or unbuildable dependency chain was: [‘meta-world-pkgdata’, ‘smw’, ‘optee-os’]

So I’d assume world doesn’t work here, so I did

bitbake tdx-reference-minimal-image -c clean
bitbake tdx-reference-minimal-image

Ran into the same issue. So… cpuburn it is.

I do

bitbake tdx-reference-minimal-image -c clean
bitbake cpuburn-a53

It does indeed fail.

WARNING: /home/srzm/oe-core/build/…/layers/meta-toradex-demos/recipes-benchmark/cpuburn/cpuburn-a53_git.bb:do_fetch is tainted from a forced run
Initialising tasks: 100% |###############################################################################| Time: 0:00:01
Sstate summary: Wanted 11 Local 4 Mirrors 0 Missed 7 Current 225 (36% match, 97% complete)
NOTE: Executing Tasks
WARNING: cpuburn-a53-git-r0 do_fetch: Failed to fetch URL https://raw.githubusercontent.com/ssvb/cpuburn-arm/dd5c5ba58d2b0b23cfab4a286f9d3f5510000f20/cpuburn-a8.S;name=ssvb, attempting MIRRORS if available
ERROR: cpuburn-a53-git-r0 do_fetch: Fetcher failure: Fetch command export PSEUDO_DISABLED=1; export DBUS_SESSION_BUS_ADDRESS=“unix:path=/run/user/1000/bus”; export PATH=“/home/srzm/oe-core/build/tmp/sysroots-uninative/x86_64-linux/usr/bin:/home/srzm/oe-core/layers/openembedded-core/scripts:/home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/recipe-sysroot-native/usr/bin/aarch64-tdx-linux:/home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/recipe-sysroot/usr/bin/crossscripts:/home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/recipe-sysroot-native/usr/sbin:/home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/recipe-sysroot-native/usr/bin:/home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/recipe-sysroot-native/sbin:/home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/recipe-sysroot-native/bin:/home/srzm/oe-core/layers/openembedded-core/bitbake/bin:/home/srzm/oe-core/build/tmp/hosttools”; export HOME=“/home/srzm”; /usr/bin/env wget -t 2 -T 30 --passive-ftp -P /home/srzm/oe-core/build/…/downloads/cpuburn-a53-git ‘https://raw.githubusercontent.com/ssvb/cpuburn-arm/dd5c5ba58d2b0b23cfab4a286f9d3f5510000f20/cpuburn-a8.S’ --progress=dot -v failed with exit code 5, no output
ERROR: cpuburn-a53-git-r0 do_fetch: Bitbake Fetcher Error: FetchError(‘Unable to fetch URL from any source.’, ‘https://raw.githubusercontent.com/ssvb/cpuburn-arm/dd5c5ba58d2b0b23cfab4a286f9d3f5510000f20/cpuburn-a8.S;name=ssvb’)
ERROR: Logfile of failure stored in: /home/srzm/oe-core/build/tmp/work/cortexa53-tdx-linux/cpuburn-a53/git-r0/temp/log.do_fetch.355968
ERROR: Task (/home/srzm/oe-core/build/…/layers/meta-toradex-demos/recipes-benchmark/cpuburn/cpuburn-a53_git.bb:do_fetch) failed with exit code ‘1’
NOTE: Tasks Summary: Attempted 791 tasks of which 790 didn’t need to be rerun and 1 failed.
NOTE: Writing buildhistory
NOTE: Writing buildhistory took: 3 seconds

Summary: 1 task failed:
/home/srzm/oe-core/build/…/layers/meta-toradex-demos/recipes-benchmark/cpuburn/cpuburn-a53_git.bb:do_fetch
Summary: There were 4 WARNING messages.
Summary: There were 2 ERROR messages, returning a non-zero exit code.

I see two URLs…

  1. https://raw.githubusercontent.com/ssvb/cpuburn-arm/dd5c5ba58d2b0b23cfab4a286f9d3f5510000f20/cpuburn-a8.S

  2. https://raw.githubusercontent.com/ssvb/cpuburn-arm/dd5c5ba58d2b0b23cfab4a286f9d3f5510000f20/cpuburn-a8.S;name=ssvb

The first URL displays the code, but the second URL says “Error 404: Page not found”. Not sure why.

Right now I’m figuring out how to download it manually into the download folder, so I’ll update once I figure that out.
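
On the two URLs: the ;name=ssvb suffix is a BitBake SRC_URI parameter rather than part of the address, which is why the browser returns 404 for the second form. Separately, wget's documented exit status 5 means SSL verification failure, so the CA certificates and the system clock on the build host may be worth checking. A small sketch for getting the plain URL out of the error message; the function name is made up for illustration:

```shell
# Strip BitBake SRC_URI parameters (everything from the first ';' onwards)
# so that only the plain URL remains. Hypothetical helper, not part of BitBake.
strip_src_uri_params() {
    printf '%s\n' "${1%%;*}"
}

src_uri='https://raw.githubusercontent.com/ssvb/cpuburn-arm/dd5c5ba58d2b0b23cfab4a286f9d3f5510000f20/cpuburn-a8.S;name=ssvb'
plain_url=$(strip_src_uri_params "$src_uri")
echo "$plain_url"

# Reachability check from the build host without writing a file (not run here):
#   wget --spider "$plain_url"
```

If the --spider check fails on the build host while the browser on Windows succeeds, the problem is likely in the host's network or certificate setup rather than in the recipe.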

Please note that according to the Yocto Project’s system requirements, while you may use the Windows Subsystem for Linux version 2 (WSL 2) to set up a build host on Windows 10 or later, or Windows Server 2019 or later, builds using WSL 2 are not officially validated.

For more information and tips on setting up WSL 2 for Yocto Project builds, you can visit: this page

We strongly recommend using a dedicated Linux machine for building Yocto projects, especially for large images like the tdx-reference-multimedia-image, to ensure the best performance and compatibility.

I’ve been using WSL2 on a Windows 11 host for an image based on the tdx-reference-multimedia-image for several months without problems.

What puzzles me about the error log is the following line:

WARNING: /home/srzm/oe-core/build/…/layers/meta-toradex-demos/recipes-benchmark/cpuburn/cpuburn-a53_git.bb:do_fetch is tainted from a forced run

It seems that the hang of the build process has caused quite a mess in the environment.
You can try the following:

  • A complete cleanup of the cpuburn-a53 recipe with
    bitbake cpuburn-a53 -c cleanall

  • What I would recommend: a reset of the build environment by deleting the entire contents of the build directory (apart from the conf directory, which must remain)
    rm -rf bitbake-cookerdaemon.log tmp/ buildhistory/ cache/ deploy/
    This does not take much time, because the build results of recipes that have run through completely are retained. So it does not start from scratch.

  • For a complete reset of the build environment, you can also delete the sstate-cache directory
    rm -rf ../sstate-cache/
    But then the build process is started from the very beginning.

Best regards,
Markus

@alex.tx - Thank you. I am running on two machines right now.

  1. WSL on my local laptop (@Mowlwurf - I’ve done your steps and will update when done)
  2. ubuntu server that I SSH to.

I have issues on both. I should open up another topic, but they both have the same issue, which is why I’ve switched to the minimal image rather than the multimedia image.

Right now, Markus is helping me on the local laptop that has WSL. When I bitbake tdx-reference-multimedia-image, it hangs around 5000 like the original post, so I did the same process on the server. Still same issue. It was when it hits the Qt part that it hangs.

When I switched to bitbake tdx-reference-minimal-image on the server, it ran successfully with no issues. So there is an issue with the multimedia image, specifically relating to Qt. I ran the multimedia build again anyway yesterday morning on the server, and this is the current screen (I have not ended it yet). It has been stuck at task 5965 since yesterday.

Setscene tasks: 3587 of 3587
Currently 19 running tasks (5965 of 8672) 68% |################################################ |
0: qtdeclarative-5.15.7+gitAUTOINC+0d60f81bf6-r0 do_package - 23h12m50s (pid 1678484)
1: librsvg-2.52.10-r0 do_compile - 23h12m49s (pid 1678491)
2: qtxmlpatterns-5.15.7+gitAUTOINC+97209f5679-r0 do_compile - 23h12m49s (pid 1678512)
3: qtgraphicaleffects-5.15.7+gitAUTOINC+dfacc1706e-r0 do_compile - 23h12m49s (pid 1678517)
4: qtwebsockets-5.15.7+gitAUTOINC+4fe33a26f2-r0 do_compile - 23h12m48s (pid 1678546)
5: qtquickcontrols-5.15.7+gitAUTOINC+be434da57b-r0 do_compile - 23h12m48s (pid 1678677)
6: qtquickcontrols2-5.15.7+gitAUTOINC+8b7daceeb8-r0 do_compile - 23h12m48s (pid 1678703)
7: librsvg-native-2.52.10-r0 do_compile - 23h12m44s (pid 1679063)
8: qtwayland-5.15.7+gitAUTOINC+533fff12f7-r0 do_compile - 23h12m43s (pid 1679071)
9: qtquicktimeline-5.15.7+gitAUTOINC+fca37ec814-r0 do_compile - 23h12m41s (pid 1679111)
10: qtcoap-5.15.7+gitAUTOINC+628d3b8abd-r0 do_compile - 23h12m40s (pid 1679280)
11: qtsystems-5.15.7+gitAUTOINC+e3332ee38d-r0 do_compile - 23h12m39s (pid 1679299)
12: qtpurchasing-5.15.7+gitAUTOINC+267f179714-r0 do_compile - 23h12m36s (pid 1679406)
13: qtopcua-5.15.7+gitAUTOINC+6d45793cae-r0 do_compile - 23h12m34s (pid 1679480)
14: qtconnectivity-5.15.7+gitAUTOINC+09e33f2513-r0 do_compile - 23h12m34s (pid 1679481)
15: qtlottie-5.15.7+gitAUTOINC+0dcf0bb9cf-r0 do_compile - 23h12m34s (pid 1679483)
16: qt3d-5.15.7+gitAUTOINC+bf79d391c0-r0 do_compile - 23h12m34s (pid 1679484)
17: qtremoteobjects-5.15.7+gitAUTOINC+62bf1183d1-r0 do_compile - 23h12m34s (pid 1679489)
18: qtsensors-5.15.7+gitAUTOINC+d9574231fd-r0 do_compile - 23h9m0s (pid 1679455)

Side note: How to do block code with scrolling feature?

My advice for WSL about the relationship between the number of cores and RAM also applies to your server.
While the build is running, you should monitor your RAM usage (for example with htop). When it gets too high, the build process ends up in a deadlock.
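
As a non-interactive alternative to htop, a rough sketch that reads the same figures from /proc/meminfo; the function name and output format are assumptions for illustration:

```shell
# Print available memory and free swap in MiB.
# /proc/meminfo reports both values in kB.
mem_snapshot() {
    awk '/^MemAvailable:/ { avail = $2 }
         /^SwapFree:/     { swap = $2 }
         END { printf "MemAvailable=%d MiB SwapFree=%d MiB\n", avail / 1024, swap / 1024 }' /proc/meminfo
}

mem_snapshot

# To log a snapshot every 30 seconds in a second terminal while bitbake runs:
#   while sleep 30; do printf '%s ' "$(date +%T)"; mem_snapshot; done
```

If MemAvailable and SwapFree both approach zero shortly before the task counter stops moving, that points to memory exhaustion rather than a problem in a specific recipe.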

To reduce the RAM usage, you have to reduce the number of cores used on your server. For how to do this, please see this thread: https://community.nxp.com/t5/i-MX-Processors/How-can-I-reduce-the-number-of-parallel-yocto-builds/m-p/1068774/highlight/true#M157026

Best regards,
Markus


This thread has two answers: both machines had hangs, and they required two different solutions.

TL;DR - When running bitbake tdx-reference-multimedia-image, the build hangs at around 5000 tasks due to deadlocking. One solution applies to the server (Ubuntu) and the other to the local laptop (Windows 11, Ubuntu via WSL).

SERVER - Qt KEEPS HANGING AND GETS DEADLOCKED

As @Mowlwurf mentioned, the problem arises when there are too many parallel threads/tasks with too little RAM. When the build is running and RAM usage gets too high, the build process ends up in a deadlock, so we need to reduce RAM usage by reducing the number of cores used. I solved the problem by checking how many cores the server had and decreasing the number used until it stopped deadlocking. Build completed!

For others that want to know…
On the server, I used nproc to get my core count, which was 36. In the build/conf/local.conf file, I put

BB_NUMBER_THREADS = "24"
PARALLEL_MAKE = "-j 15"

The Yocto documentation recommends that PARALLEL_MAKE not be more than 20. I tried 20, but it deadlocked too; 15 worked. (see Post 9)

Then I went ahead and cleaned everything: I removed everything in the build folder EXCEPT the conf folder.

Then it works.
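
The "remove everything except conf" step could be sketched as below. The helper name and the safety check are assumptions, not something from the Toradex documentation. In this thread's layout the sstate-cache sits outside the build folder (see rm -rf ../sstate-cache/ in Markus's post), so the helper leaves it alone.

```shell
# Remove everything in a build directory except conf/ (hypothetical helper).
clean_build_dir() {
    _build_dir=$1
    # Refuse to run on a directory that does not look like a BitBake build dir.
    if [ ! -f "$_build_dir/conf/local.conf" ]; then
        echo "no conf/local.conf in $_build_dir, refusing to delete" >&2
        return 1
    fi
    # Delete every top-level entry except conf.
    find "$_build_dir" -mindepth 1 -maxdepth 1 ! -name conf -exec rm -rf {} +
}

# Usage (not run here): clean_build_dir ~/oe-core/build
```

The check for conf/local.conf is only there to avoid running rm -rf against the wrong directory by accident.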

LOCAL WSL LAPTOP - TAINTED URL FROM FORCED RUN

Similar problem to the server, except in this case we adjust the WSL config file first (see first post by Markus). I believe that because the build hung/got deadlocked and I had to force-close bitbake, it came up with issues. This caused the do_fetch task of one recipe to be marked as tainted, so we ran -c cleanall and removed some files/folders (see Post 6 by Markus for instructions).

This now works as well. This thread has been solved! All credit goes to @Mowlwurf.

I will use the first reply as the solution because that’s truly the answer to the question.

@srzm

I’m glad to hear your problem was resolved. I’ve seen similar issues building on machines with small amounts of memory, especially when building Qt and friends, as that seems to just use everything you’ve got and more. In my case, I was actually seeing OOM Killer messages in the system logs indicating that the kernel was explicitly killing things due to out-of-memory conditions.
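
A quick way to look for such OOM Killer messages might be the following filter over the kernel log. The function name is made up, and the patterns are the usual kernel wording, which can vary between kernel versions:

```shell
# Keep only kernel log lines that suggest the OOM killer was active.
# Reads from stdin; returns success even when nothing matches.
find_oom_events() {
    grep -Ei 'out of memory|oom-killer' || true
}

# Usage (not run here; reading the kernel log may need sudo):
#   dmesg | find_oom_events
#   journalctl -k | find_oom_events
```

An empty result does not rule out memory pressure; a build can also stall while swapping heavily without anything being killed.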

I’ve never tried to build on WSL, so I cannot comment on that, but I do recall a recommendation of 4GB per core from someone at a conference once. My main build machine is a Threadripper with 48 cores and 192GB RAM, so I no longer see these issues. Before increasing the RAM, I had to reduce the parallel settings like you did.
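
As a starting point only, the 4GB-per-core rule of thumb can be turned into a first guess for the parallel settings. This sketch assumes one job per 4 GiB of RAM, capped at the core count; the helper name is hypothetical, and the values that actually worked in this thread were found by trial and error:

```shell
# Suggest a job count from the core count and RAM in GiB,
# assuming roughly 4 GiB per parallel job (rule of thumb, not a guarantee).
suggest_jobs() {
    _cpus=$1
    _ram_gib=$2
    _jobs=$(( _ram_gib / 4 ))
    if [ "$_jobs" -gt "$_cpus" ]; then _jobs=$_cpus; fi
    if [ "$_jobs" -lt 1 ]; then _jobs=1; fi
    echo "$_jobs"
}

# Apply it to the current machine and print candidate local.conf lines.
cpus=$(nproc)
ram_gib=$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024) }' /proc/meminfo)
n=$(suggest_jobs "$cpus" "$ram_gib")
echo "BB_NUMBER_THREADS = \"$n\""
echo "PARALLEL_MAKE = \"-j $n\""
```

For a 16GB laptop this gives at most 4 (fewer if WSL exposes less memory to the VM), which is well below the 8 tasks mentioned in the first post.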

Drew
