Connection to Docker Daemon at "tcp://127.0.0.1:22376" is refused

Hello,
I’m trying to automate the building of customized Toradex images using CI/CD.
My CI/CD configuration looks like this:

image: .xxxx/xxxx/dind-gcloud-buildx:v0.0.1
variables:
  ENV: dev

stages:
  - build

build:
  stage: build
  services:
    - name: docker:20.10.17-dind
  script:
    - cd toradexbuild
    - bash build.sh

My Docker image is Alpine-based and runs in a Linux virtual machine on a Windows host.

+ uname -a
Linux runner-3tqsqp1-project-145-concurrent-0 4.19.0-24-cloud-amd64 #1 SMP Debian 4.19.282-1 (2023-04-29) x86_64 Linux

The build.sh looks like:

#!/bin/bash
set -xe
shopt -s expand_aliases

output_folder="output"
password="auth key"

echo "------------------ Start building ... ------------------"
# source torizoncore-builder
source tcb-env-setup.sh -- --privileged <<< "yes" || true

torizoncore-builder build --force --file tcbuild.yaml --set PASSWORD="$password" --set OUTPUT="$output_folder"

Everything works fine when I run the build in the dind-gcloud-buildx:v0.0.1 Docker container, but the CI/CD pipeline gives me this:

An unexpected Exception occurred. Please provide the following stack trace to
the Toradex TorizonCore support team:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/usr/local/lib/python3.9/dist-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/usr/local/lib/python3.9/dist-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connectionpool.py", line 491, in _make_request
    raise new_e
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connectionpool.py", line 1092, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connection.py", line 218, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f976d18d220>: Failed to establish a new connection: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.9/dist-packages/urllib3/connectionpool.py", line 844, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.9/dist-packages/urllib3/util/retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='127.0.0.1', port=22376): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f976d18d220>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/docker/api/client.py", line 214, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/local/lib/python3.9/dist-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/local/lib/python3.9/dist-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/docker/api/client.py", line 237, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/lib/python3.9/dist-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.9/dist-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='127.0.0.1', port=22376): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f976d18d220>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builder/tcbuilder/backend/bundle.py", line 556, in download_containers_by_compose_file
    dind_client = manager.get_client()
  File "/builder/tcbuilder/backend/bundle.py", line 321, in get_client
    dind_client = docker.DockerClient(base_url=self.docker_host, tls=tls_config)
  File "/usr/local/lib/python3.9/dist-packages/docker/client.py", line 45, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/lib/python3.9/dist-packages/docker/api/client.py", line 221, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: HTTPSConnectionPool(host='127.0.0.1', port=22376): Max retries exceeded with url: /version (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f976d18d220>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/docker/api/client.py", line 268, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.9/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.41/containers/c0edfa412f020d8fa89f777f35ed5069d3ebcad5c1221693f8e808a471beba56/stop
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/builder/torizoncore-builder", line 221, in <module>
    mainargs.func(mainargs)
  File "/builder/tcbuilder/cli/build.py", line 479, in do_build
    build(args.config_fname, args.storage_directory,
  File "/builder/tcbuilder/cli/build.py", line 465, in build
    raise exc
  File "/builder/tcbuilder/cli/build.py", line 454, in build
    handle_output_section(
  File "/builder/tcbuilder/cli/build.py", line 303, in handle_output_section
    handle_bundle_output(
  File "/builder/tcbuilder/cli/build.py", line 364, in handle_bundle_output
    download_containers_by_compose_file(**download_params)
  File "/builder/tcbuilder/backend/bundle.py", line 598, in download_containers_by_compose_file
    manager.stop()
  File "/builder/tcbuilder/backend/bundle.py", line 295, in stop
    self.dind_container.stop()
  File "/usr/local/lib/python3.9/dist-packages/docker/models/containers.py", line 438, in stop
    return self.client.api.stop(self.id, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/docker/api/container.py", line 1203, in stop
    self._raise_for_status(res)
  File "/usr/local/lib/python3.9/dist-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e) from e
  File "/usr/local/lib/python3.9/dist-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation) from e
docker.errors.NotFound: 404 Client Error for http+docker://localhost/v1.41/containers/c0edfa412f020d8fa89f777f35ed5069d3ebcad5c1221693f8e808a471beba56/stop: Not Found ("No such container: c0edfa412f020d8fa89f777f35ed5069d3ebcad5c1221693f8e808a471beba56")
Deploying commit ref: tcbuilder-20231115120425

The flag -- --network=host is added to the build command. Log:

docker run --rm -v /deploy -v /builds/toradexbuild:/workdir -v storage:/storage -v /var/run/docker.sock:/var/run/docker.sock --network=host -e TCB_CONTAINER_NAME=tcb_1700058794 --name tcb_1700058794 --privileged torizon/torizoncore-builder:3 build --force --file tcbuild.yaml --set PASSWORD=ya29.c.c0AYxxxxxxxxxxxxxxxxxxxxxx --set OUTPUT=toradex-image-dev-0460d08b
Building image as per configuration file 'tcbuild.yaml'...

I’m using the latest version of tcb-env-setup.sh.

Can you help me, please?
Thanks!

Greetings @Samir1,

My Docker image is Alpine-based and runs in a Linux virtual machine on a Windows host.

Is it just with the Alpine image that you see this issue, since you said it works fine with dind-gcloud-buildx:v0.0.1? Or do other distros in CI/CD give you an issue as well?

For reference, this is the CI file we use in our own building and testing of TorizonCore Builder: https://github.com/toradex/torizoncore-builder/blob/bullseye/.gitlab-ci.yml

We use docker:latest as our CI/CD base image so that the docker-in-docker capabilities work right out of the box. Keep in mind that TorizonCore Builder itself runs in a Docker container, and if you’re bundling containers, as it seems you are, then an additional docker-in-docker container will be used for the bundling process.

That said, it’s not very clear why you’re getting this issue with Alpine. Unfortunately, the errors shown in the logs are fairly generic Docker API errors that don’t give many specifics.
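
A minimal sketch, assuming Python 3 is available in the CI job, of a probe for the endpoint named in the traceback above (127.0.0.1:22376). The helper name wait_for_port and its retry parameters are hypothetical and are not part of TorizonCore Builder. The check only tells whether something accepts TCP connections on that port, not whether TLS or the Docker API behind it is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, attempts: int = 5, delay: float = 1.0) -> bool:
    """Return True as soon as a TCP connection to host:port succeeds.

    Hypothetical helper for diagnosis only; it retries a few times because
    a freshly started daemon may need a moment before it listens.
    """
    for attempt in range(attempts):
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # when nothing is listening or the attempt times out.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


# Example call for the address in the traceback:
#   wait_for_port("127.0.0.1", 22376)
```

If the probe never succeeds while the build is running, the inner Docker daemon probably never started listening, which would be consistent with the later "No such container" error when the builder tries to stop it. That is one reading of the trace, not a confirmed diagnosis.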

Looking through our known issues, I see these, which have a similar error message:

Perhaps you’re affected by either of these.

Best Regards,
Jeremias

Hello @jeremias.tx,
Thanks for your response!

Here is my Dockerfile:

FROM docker:20.10.17-dind
COPY --from=docker/buildx-bin:latest /buildx /usr/libexec/docker/cli-plugins/docker-buildx
RUN apk update && apk upgrade

  • When this Docker image (dind-gcloud-buildx:v0.0.1) is built and run in a Debian VM on a Windows host:

-----> The CI/CD fails.

  • When the same Docker image is built and run on a native Linux machine (Ubuntu 22.04):

-----> The CI/CD works fine.

  • When I build on my native Linux machine (without dind):

-----> The CI/CD works fine.

(Of course, I adapt the CI/CD configuration for each scenario.)

I really don’t understand what causes the problem.

Thank you so much!

Based on your observations, it sounds like it only fails when you’re running a VM on a Windows machine. Perhaps the issue is due to your VM setup or settings? It’s hard to say, since there are quite a lot of factors in a VM setup.

But at least it seems that on native Linux machines your process works without issue.
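
Since the failing and working setups differ mainly in the host, one low-effort step is to capture the same few facts on each runner and diff them. The sketch below is an assumption about what might be worth comparing (kernel release, the DOCKER_HOST and DOCKER_TLS_CERTDIR variables, and whether /var/run/docker.sock is a socket). The function name docker_env_snapshot is hypothetical.

```python
import json
import os
import platform
import stat


def docker_env_snapshot(sock_path: str = "/var/run/docker.sock") -> dict:
    """Collect a few host facts that may differ between the VM and native Linux."""
    try:
        # True only if the path exists and is a Unix domain socket.
        is_socket = stat.S_ISSOCK(os.stat(sock_path).st_mode)
    except OSError:
        is_socket = False
    return {
        "kernel": platform.release(),
        "machine": platform.machine(),
        "docker_host_env": os.environ.get("DOCKER_HOST"),
        "docker_tls_certdir_env": os.environ.get("DOCKER_TLS_CERTDIR"),
        "docker_sock_is_socket": is_socket,
    }


# Print the snapshot as JSON so the output of two runners can be diffed.
print(json.dumps(docker_env_snapshot(), indent=2, sort_keys=True))
```

Running this at the start of the job on both the Debian VM runner and the native Ubuntu runner gives two JSON documents to compare. It will not catch every VM-related difference, only the ones listed.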

Best Regards,
Jeremias