~/tcbworkdir> docker push localhost:5000/custom_weston
Using default tag: latest
The push refers to repository [localhost:5000/custom_weston]
Get "http://localhost:5000/v2/": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Something similar happens with other containers that have bound ports.
I use an nginx container to serve the Torizon images for use with the Easy Installer.
docker run --rm -d -p 4321:80 --name web -v ~/easy_install_images:/usr/share/nginx/html nginx
Before running tcb I can access localhost:4321 and browse the images.
After running tcb, localhost:4321 becomes unreachable.
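For completeness, the check itself is just a plain HTTP request against the published port (nothing specific to nginx):

curl -I http://localhost:4321
# before running tcb this returns the nginx response headers; afterwards the request just hangs and times out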
This sounds like an odd bug. Hmm, as far as I know the only command that really interacts with Docker registries is the bundle command. Technically the build command does as well, but it uses the same logic as the bundle command. With this command, the only thing tcb does is make some pull requests to the registry to pull down the necessary container images. Given this, I can't imagine it does anything to mess with the ports in any way.
But first let's try to analyze the issue. First of all, it might be important to know how exactly you are setting up this local registry. Second, since you're using the tcb build command here, what is the *.yaml file that you are passing to the command?
Once I have this info I can try to reproduce the issue on my side and take a closer look at it on my setup.
I have narrowed it down, and you are right: it is not every command, it is just the bundle command that causes this effect. The build command also does the same, but only when there is a "bundle: compose-file" setup in the yaml, which I am guessing internally calls the bundle command.
I think I am setting up the local registry the standard way:
docker run -d -p 5000:5000 --restart=always --name registry registry:2
This is the bundle command that I use:
(jcabecerans/test_cvc2 is private, hence the login)
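I'm not pasting the exact invocation here, but it has the standard form, roughly like the following (docker-compose.yml and the username are placeholders, and the exact flags may vary between tcb versions):

torizoncore-builder bundle docker-compose.yml --platform linux/arm/v7 --login <docker-hub-username>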
I originally detected this issue when working with the nginx container. I didn't post anything back then; I just restarted my PC and nginx would work again (until the next tcb bundle/build).
However, now I am trying to use a local Docker registry to build the bundle, and the moment I run tcb bundle the local registry becomes inaccessible, so there is no real way around it…
Okay, I was able to reproduce this. However, I'm still not sure of the root cause. I tried reproducing on both a Windows machine and a Linux machine, and was only able to reproduce it on the Windows machine.
I don't think tcb is causing the issue, at least not directly. Since the issue only occurs on Windows, there must be some Windows-specific component to the bug. Even stranger is that when I start a new registry container it is still inaccessible, meaning that the issue isn't just with that instance of the registry; somehow it persists beyond it.
Restarting Docker Desktop at least seems to restore usability of the registry, so it does seem related to Docker Desktop, perhaps its networking. If I check the logs of the registry container while it's inaccessible, I don't see any of my requests to the registry getting through.
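For reference, this is roughly how I check the registry state, using nothing tcb-specific, just the registry's standard HTTP API and the container logs:

# a healthy local registry answers this with HTTP 200
curl -v http://localhost:5000/v2/
# when the issue is present the request times out and no matching entry appears in the logs
docker logs registry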
This is a really puzzling issue. It will take more time to investigate, especially since we don't have too many Windows-based developers. Please let me know if you figure out anything on your side.
Good to know it is not only me.
I will let you know if I stumble upon a solution.
I will use remote registries for the time being then.
I am not a Docker expert by any means. I could be completely off, but my intuition is telling me…
Given that:
The containers that use ports are inaccessible.
Those containers keep running and detect no fault.
It is possible to start new containers.
This behaviour happens not only with the registry but also with nginx.
To me it just looks like tcb hijacks the communication “bus” and the other containers are left with a severed connection to the outside world.
(taken from tcb-env-setup.sh)
alias torizoncore-builder='docker run --rm -it'"$volumes"'-v $(pwd):/workdir -v '"$storage"':/storage --net=host -v /var/run/docker.sock:/var/run/docker.sock torizon/torizoncore-builder:'"$chosen_tag"
We use this setup because the bundle command starts what's called a "docker in docker" (dind) container, so that we have an isolated environment to pull and bundle the containers. That said, it would be odd if this were what causes the issue on Windows, since it's a fairly common method.
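To give a rough idea of the pattern (a simplified illustration only; the image, container name and parameters tcb actually uses may differ):

# start an inner Docker daemon in a privileged container, exposing its API without TLS
docker run -d --privileged --name dind-example -e DOCKER_TLS_CERTDIR="" -p 23750:2375 docker:dind
# point a client at that inner daemon; images pulled here stay isolated from the host daemon
docker -H tcp://127.0.0.1:23750 pull registry:2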
Do you happen to know if any other containers are affected negatively other than the local registry container? I’m curious if this is a general issue or just a specific interaction between tcb and the local registry container.
Oh, very interesting, nice to learn new things
I have tried a couple of extra containers that use ports and they all show the same symptoms: they keep running, but the port is not accessible.
Based on your observations, it seems the problem narrows down to container ports. This matches what I observed previously with the registry container: when it was in the "not working" state, its logs showed that it was no longer receiving any requests. This lines up with your observation of ports not working.
Possibly the container network stack is somehow affected on Windows?
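One thing that might be worth checking on the Windows side while a container is in this state (just standard diagnostics, nothing tcb-specific, and purely a suggestion on my part):

# is anything on the Windows host still listening on the published port?
netstat -ano | findstr ":5000"
# does Docker still report the port as published?
docker port registry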
Unfortunately, there are still no good answers here about the root cause. This will probably need further investigation by our team here at Toradex.
For the time being I would suggest an alternative solution. Perhaps try setting up a container registry that doesn't rely on the local registry container itself. I can't guarantee when we'd be able to fix this on our side, so an alternative solution may be necessary to keep you from being blocked for too long.
Alright, thanks for your support!
Please let me know if/when there is a fix.
I'll give it a try. I was thinking of hosting the Docker registry on another PC on the local network; that should also work.
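Roughly what I have in mind (the IP address is just an example; since the registry would be served over plain HTTP, the machines pushing to it would also need its address listed under insecure-registries in their Docker daemon.json):

# on the other PC in the LAN
docker run -d -p 5000:5000 --restart=always --name registry registry:2
# on my machine, tag and push against that PC instead of localhost
docker tag custom_weston 192.168.1.50:5000/custom_weston
docker push 192.168.1.50:5000/custom_weston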
Do you have any update on this topic? My colleagues are struggling with the same problem - they cannot use tcb with a local registry because of this error.
This issue hasn't been tackled yet, since it's a rather niche issue among our customer base at the moment. I'd suggest using alternative registries, as I can't guarantee when this will be looked at.
During the investigation, our team could no longer reproduce the issue.
Using the setup script to set up TorizonCore Builder on WSL, it was still possible to interact with the registry over curl after executing the bundle command.
Could you please check on your side if this issue is gone?
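For reference, the kind of curl check meant here is just a request against the registry's HTTP API after the bundle command has finished, along these lines:

# returns HTTP 200 and a JSON list of repositories when the registry is reachable
curl http://localhost:5000/v2/_catalog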