TCB, private registry and certificates problem

Hi,
we have set up a private Docker registry on our premises (not Docker Hub).
The configuration uses a certificate and key issued by the corporation. Verification:

  1. Push/pull is possible from the developer machine by WSL.
  2. openssl s_client shows proper certificate for the registry

Problem:
torizoncore-builder fails during “bundle” stage.

What I have already tried: extending the torizoncore-builder image as below.

  1. adding corporate CA to the torizoncore-builder image (dockerfile)
    ADD certs/XXX.crt /usr/local/share/ca-certificates/
    RUN update-ca-certificates
  2. adding corporate CA to the python cert storage (dockerfile)
    RUN cat /etc/ssl/certs/XXX.pem >> /usr/local/lib/python3.9/dist-packages/certifi/cacert.pem
  3. adding corporate CA to docker config in torizoncore-builder image (changes overlay)
    /etc/docker/certs.d/XXXX:443/ca.crt

When I run the torizoncore-builder image interactively, I can see the proper certificates of the registry (e.g. via curl).
When I pass insecure-registry as “dind_param” it works, but that is not a production solution.
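
For reference, the same kind of trust check can be done from Python inside any container. This is only a minimal sketch: the function names `make_context` and `registry_is_trusted` are made up for illustration, and it only tells you whether the trust store of the environment where it runs accepts the registry certificate. It says nothing about other containers.

```python
import socket
import ssl
from typing import Optional


def make_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a verifying TLS context; ca_file adds extra trusted CAs."""
    # create_default_context() loads the system trust store and enables
    # both certificate and hostname verification.
    context = ssl.create_default_context()
    if ca_file is not None:
        # Add the corporate CA on top of the system defaults.
        context.load_verify_locations(cafile=ca_file)
    return context


def registry_is_trusted(host: str, port: int = 443,
                        ca_file: Optional[str] = None,
                        timeout: float = 5.0) -> bool:
    """Return True if the TLS handshake with host:port verifies."""
    context = make_context(ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except ssl.SSLCertVerificationError:
        # Python's counterpart of "certificate signed by unknown authority".
        return False
```

Calling `registry_is_trusted("my.registry.addr", 443)` with and without `ca_file` shows whether the corporate CA is the missing piece in that particular environment.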

Do you have any suggestions? I am running out of ideas.

Logs

Building image as per configuration file 'tcbuild.yaml'...
=>> Handling input section
Unpacking Toradex Easy Installer image.
Copying Toradex Easy Installer image.
Unpacking TorizonCore Toradex Easy Installer image.
Importing OSTree revision 0c834097c0c3e79ebb47cb9f7f09cc3241dfa8445bea61bdb37cce6869162dd1 from local repository...
1088 metadata, 12667 content objects imported; 407.0 MB content written                                                                                                                                                   
Unpacked OSTree from Toradex Easy Installer image:
  Commit checksum: 0c834097c0c3e79ebb47cb9f7f09cc3241dfa8445bea61bdb37cce6869162dd1
  TorizonCore Version: 5.6.0+build.13
=>> Handling customization section
=> Handling device-tree subsection
=> Selecting custom device-tree 'device-trees/dts-arm32/XXXX'
'XXXXX' compiles successfully.
warning: removing currently applied device tree overlays
Device tree XXXX successfully applied.

=>> Handling output section
Applying changes from STORAGE/dt.
Applying changes from WORKDIR/changes1.
XXX has been generated for changes and is ready to be deployed.
Deploying commit ref: tcbuilder-20220523215952
Pulling OSTree with ref tcbuilder-20220523215952 from local archive repository...
  Commit checksum: XXX
  TorizonCore Version: 5.6.0+build.13-tcbuilder.20220523215952
  Default kernel arguments: quiet logo.nologo vt.global_cursor_default=0 plymouth.ignore-serial-consoles splash fbcon=map:3

1088 metadata, 12682 content objects imported; 407.1 MB content written                                                                                                                                                   
Pulling done.
Deploying OSTree with checksum .....
Deploying done.
Copy files not under OSTree control from original deployment.
Packing rootfs...
Packing rootfs done.
Updating TorizonCore image in place.
Bundling images to directory XXXX
Starting DIND container
Using Docker host "tcp://127.0.0.1:22376"
Connecting to Docker Daemon at "tcp://127.0.0.1:22376"
Fetching container image XXXX
Stopping DIND container
Removing output directory 'XXXX' due to build errors
Error: Error: container images download failed: 500 Server Error for https://127.0.0.1:22376/v1.40/images/create?tag=1.0.0&fromImage=XXXX: Internal Server Error ("Get https://XXXXXX: x509: certificate signed by unknown authority")

Greetings @marek.kucinski,

You’re on the right track, but I think there’s a bit of a misunderstanding. When TorizonCore Builder pulls images with bundle, it’s not the TorizonCore Builder container that does this.

We start another container in parallel, running “docker-in-docker”, to pull the images for the bundle command. So passing your certs to the TorizonCore Builder container wouldn’t actually do anything for you here.

What you probably want to try is the --dind-param switch for the bundle command. This switch passes dockerd arguments to this “docker-in-docker” container that does the image pull. Looking at the list of dockerd arguments: dockerd | Docker Documentation

I do see some arguments related to passing TLS certs, though they seem to take a path string telling dockerd where to find the cert files. Given this, I’m not sure it would work: you can pass the file paths, but I don’t see a way to pass the cert files themselves into the docker-in-docker container.

Let me check this on my side to make sure, but give this a try if you have a chance.

Best Regards,
Jeremias

Hi Jeremias,
is there any documentation of the architecture of the TCB solution? To be honest, I did my best to understand the source code, but it is not the shortest way to find a solution :)

According to your answer:

  1. Is the DinD executed in parallel on the host or “embedded” in the TCB container? The DinD name suggests a nested approach (so inside TCB). If the former, what is the image source and what are the parameters?

  2. For sure DinD tries to check the CA chain during pull, but I could not find the CA store.

  3. Even if I pass obviously wrong parameters via --dind-param, I do not see any errors. How could I debug the DinD execution?

  4. I did try to pass “--dind-param” to the bundle command. I’ve tried multiple combinations of:
    a) --tls
    b) --tlscacert
    c) --tlsverify

  5. Even if I do succeed with ‘bundle --dind-param’, I then have to split “torizoncore-builder build”:
    a) remove “bundle” from docker-compose
    b) call: build > bundle > combine
    c) it is not so fluent e.g. intermediate directories are not deleted

  6. I think that --tlsXX commands are to secure the connection to the dockerd control socket (not passed to container executed). Documentation here is quite vague, however see here:
    Protect the Docker daemon socket | Docker Documentation

Additional question:
How could we contribute to the code of TCB? Some optimizations would be welcome :)


Taking a closer look at all this again I don’t think we have the features or capabilities for such a use-case. Probably what you want is closer to what’s described in this Docker article: https://docs.docker.com/engine/security/certificates/

Is the DinD executed in parallel on the host or “embedded” in the TCB container? The DinD name suggests a nested approach (so inside TCB). If the former, what is the image source and what are the parameters?

In parallel. You can see the docker run parameters in this part of the code: https://github.com/toradex/torizoncore-builder/blob/bullseye/tcbuilder/backend/bundle.py#L252

For sure DinD tries to check the CA chain during pull, but I could not find the CA store.

The article I linked earlier suggests that Docker additionally checks certs under /etc/docker/certs.d. Now for this case I would assume this would need to be in the /etc inside the DIND container.
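
As a rough illustration of the layout Docker expects under certs.d (one sub-directory per registry, named after the registry host and port, containing ca.crt), here is a minimal sketch. The helper name `install_registry_ca` and the example registry name are assumptions for illustration; it only prepares the directory on disk and does not get it into the DinD container.

```python
from pathlib import Path


def install_registry_ca(certs_d: Path, registry: str, ca_pem: bytes) -> Path:
    """Write ca_pem to <certs_d>/<registry>/ca.crt and return that path.

    `registry` must match the host[:port] used in the image reference,
    e.g. "my.registry.addr:443".
    """
    registry_dir = certs_d / registry
    registry_dir.mkdir(parents=True, exist_ok=True)
    ca_path = registry_dir / "ca.crt"
    ca_path.write_bytes(ca_pem)
    return ca_path
```

For example, `install_registry_ca(Path("certs.d"), "my.registry.addr:443", Path("corporate-ca.pem").read_bytes())` would produce certs.d/my.registry.addr:443/ca.crt, where corporate-ca.pem is a placeholder for the actual CA file.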

Even if I pass obviously wrong parameters via --dind-param, I do not see any errors. How could I debug the DinD execution?

Really? If I pass wrong parameters to --dind-param, I get errors. You can see more of what’s going on by running TorizonCore Builder with --log-level debug

Even if I do succeed with ‘bundle --dind-param’, I then have to split “torizoncore-builder build”

I’m not quite sure what you mean by this. The capabilities of bundle and combine should be covered by build. Or am I misunderstanding you?

I think that --tlsXX commands are to secure the connection to the dockerd control socket (not passed to container executed). Documentation here is quite vague, however see here:

I think you’re right this only affects the daemon and not necessarily the client doing the docker pull.

How could we contribute to the code of TCB? Some optimizations would be welcome

We appreciate feedback and suggestions but we don’t really have a process for this kind of open-source contributions yet.

In any case I’ll need to discuss possible options or ideas with the team here in Toradex. Hopefully there is an idea or workaround that could help you now. Worst-case this becomes a pending feature request that requires development on our side.

Best Regards,
Jeremias

Hi Jeremias,
first of all, thank you for the rapid reaction. It really builds our trust in Toradex :)

Under the provided link there is a disclaimer:
On Linux any root certificates authorities are merged with the system defaults, including the host’s root CA set. If you are running Docker on Windows Server, or Docker Desktop for Windows with Windows containers, the system default certificates are only used when no custom root certificates are configured.
As I do run Windows + WSL:
a) Windows Cert store is filled with corporate CA (verified by certmgr on PC and in web browser by connection to registry with HTTPS)
b) WSL has proper CA’s (add; update-ca-certificates; verify by /etc/ssl/certs/ca-certificates; verify by docker pull from our registry working)
c) TorizonCore-Builder has CA (copy in the image; update-ca-certificates; verify by /etc/ssl/certs/ca-certificates)
Nevertheless, I will try to pass all three parameters, including a client certificate. Maybe passing only tlscacert is not enough to establish trust in the registry; perhaps only mutual (client + server) authentication is supported.

When I look at linked code - I think the certificates directory is created during runtime. Maybe if I properly format host-workdir parameter I could force DinD to bind (mount) my local certificates storage and override those auto-created? Could you help me with that?

However, I am not sure whether that is possible, due to this part of the code (…generated…).

Just to be sure: is the dind image: 19.03.8-dind c814ba3a41a3 ?

How could I bind “outer” (or OS) /etc/docker/certs.d to the dind? I think the only way would be to make fork image (by FROM) and change the bundle.py to use it.

I will check --log-level.

Regarding:
“The capabilities of bundle and combine should be covered by build.”
You are right, but how can I pass “--dind-param” to the build command? Can all parameters of the nested subcommands (like bundle/combine) be given to the build command, which then passes them on to bundle and combine? So could I call it as below?
torizoncore-builder build --dind-param="...."

Again thank you for your valuable input. I am sure we will sort it out!

When I look at linked code - I think the certificates directory is created during runtime. Maybe if I properly format host-workdir parameter I could force DinD to bind (mount) my local certificates storage and override those auto-created? Could you help me with that?

The certs you’re looking at in this code snippet are the certs that the DinD container autogenerates itself as described here: Docker

We create a temp directory for these certs so that the TorizonCore Builder container can connect to, and make calls to, the Docker instance running in the DinD container.

If you could, let’s try an experiment detached from TorizonCore Builder. Try spinning up a DinD container manually in your WSL instance. See if bindmounting your certs to /etc/docker/certs.d inside this DinD container is really enough to allow you to pull your images from your corporate registry. If you can figure out what DinD needs to access your registry then it’d be easier for us to figure out what changes are needed in TorizonCore Builder. When you spin up this test DinD container try to mimic the run flags that we use in TorizonCore Builder.

Just to be sure: is the dind image: 19.03.8-dind c814ba3a41a3 ?

Correct.

How could I bind “outer” (or OS) /etc/docker/certs.d to the dind? I think the only way would be to make fork image (by FROM) and change the bundle.py to use it.

I believe this would be the way too, off the top of my head. But as I said above try doing a test on a separate DinD container outside of TorizonCore Builder first. That way you don’t have to re-build TorizonCore Builder and such.

Are all nested subcommands parameters (like bundle/combine) able to be passed to the build command which will pass it later to bundle and combine ? So could I call as below?

Well almost every command and their arguments should be available as part of the yaml schema that the build command uses. See the schema here: TorizonCore Builder Tool “build” command | Toradex Developer Center

However, the --dind-param argument is not in the schema yet and can’t be specified there. We have a ticket in our backlog to address this, but we haven’t gotten around to it. In the meantime, you can use the bundle command to produce a directory with the bundle output, then pass this directory to the build command as per the schema.

With that said please try the DinD test so that we can figure out how best to modify TorizonCore Builder for your use-case.

Best Regards,
Jeremias

Hi Jeremias,
for your information: the topic is still valid, but I have concurrent tasks and do not have much time at the moment to work on this.
I will keep you informed of any progress. Please do not close the topic.
Best regards,
MK

No worries just remember to inform me once you have a chance to try what I suggested. We are still interested in supporting your registry use case and the investigation/test on your side would be greatly appreciated.

Best Regards,
Jeremias

Hi Jeremias,
I can confirm that adding a volume bind with the registry certificate to docker:dind works. The steps required to verify this are below.

  1. Prepare certs dir:

certs.d/my.registry.addr:443/ca.crt

  2. Create the DIND host container

docker run -it --rm --privileged --name DIND --network some-network --network-alias docker -e DOCKER_TLS_CERTDIR=/certs -v some-docker-certs-ca:/certs/ca -v some-docker-certs-client:/certs/client -v $(pwd)/certs.d/:/etc/docker/certs.d/ docker:dind

verified with both docker:dind and docker:19.03.8-dind

  3. Run docker TEST container:

docker run -it --rm --name TEST --network some-network -e DOCKER_TLS_CERTDIR=/certs -v some-docker-certs-client:/certs/client:ro -v $(pwd)/certs.d/:/etc/docker/certs.d/ docker:latest sh

  4. Pull the image from TEST through DIND

docker pull my.registry.addr:443/test:latest
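
The DIND step above can also be scripted. The following is a minimal sketch that only assembles the argument list for the DIND container, using the same flags as the command above; the function name `dind_run_command` is hypothetical and nothing here is part of TorizonCore Builder.

```python
from typing import List


def dind_run_command(certs_d: str,
                     image: str = "docker:19.03.8-dind",
                     network: str = "some-network") -> List[str]:
    """Build the `docker run` argv for a DinD container that trusts
    the registries described in the host directory `certs_d`."""
    return [
        "docker", "run", "-it", "--rm", "--privileged",
        "--name", "DIND",
        "--network", network, "--network-alias", "docker",
        # Certs for the daemon API socket (unrelated to registry trust).
        "-e", "DOCKER_TLS_CERTDIR=/certs",
        "-v", "some-docker-certs-ca:/certs/ca",
        "-v", "some-docker-certs-client:/certs/client",
        # Bind mount that makes the registry CA visible to dockerd.
        "-v", f"{certs_d}:/etc/docker/certs.d/",
        image,
    ]
```

The list can be passed to `subprocess.run` from a terminal (the `-it` flags need one), or printed with `shlex.join` to get a copy-pasteable command line.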

Info1: certs provided by DOCKER_TLS_CERTDIR are used to secure the daemon connectivity between DIND and TEST. Those have nothing to do with registry certificates and trust to them.

Info2: ca.crt could be:
  - a top-level certificate
  - an intermediate certificate
  - a domain certificate

as long as the certificate of the registry server being connected to is at or below the provided certificate in the chain (so that the trust chain can be verified).

At the moment I do not have an easy solution for merging this into torizoncore-builder.
Maybe by adding a new parameter to mount a local directory to /etc/docker/certs.d/? It would be the most flexible option (I think). here?

Next step would be support for user certificates for login - probably would work out-of-the-box if those could be supplied to the DIND.

Thank you for testing and verifying this method. From what you’ve described, it does seem to work with DIND as I thought. So here’s what we will do from the Toradex side.

Using your test here as motivation, I’ll make a feature request to the team to add this, which would give users the option to access a container registry using certificates rather than password-based authentication. At this time, though, I can’t say when the team will do this work.

As for your case, in the short term you’ll probably need to implement a fix in TorizonCore Builder yourself and use the modified TorizonCore Builder. As a suggestion, the list here in the code: torizoncore-builder/bundle.py at 649312930eea02cc2dc3b456381aede1d16062e3 · toradex/torizoncore-builder · GitHub

is the list of mount points/volumes that get passed to the DIND container that is run by TorizonCore Builder. If you add the mounts for your certificates to this list, that should more or less work.

Best Regards,
Jeremias

Hi Jeremias,
I have prepared a patch for the bundle command. It probably does not follow Python clean-code rules, but it works :)
Now the command

bundle --registry-certs /my/local/dir/certs.d docker-compose.yml

bind-mounts the certificates for the registries. No more errors like the one below:

Error: container images download failed: 500 Server Error …
x509: certificate signed by unknown authority

Please find the patch attached below. I hope it will speed things up on your side.
Sadly, incorporating it into the “build” command of torizoncore-builder (and tcbuild.yml) is beyond my capabilities (in both skill and time).
Please let me know when this feature is included in torizoncore-builder.

diff --git a/tcbuilder/backend/bundle.py b/tcbuilder/backend/bundle.py
index 34478b8..3a17d65 100755
--- a/tcbuilder/backend/bundle.py
+++ b/tcbuilder/backend/bundle.py
@@ -122,7 +122,7 @@ class DindManager(DockerManager):
     DIND_CONTAINER_NAME = "tcb-fetch-dind"
     TAR_CONTAINER_NAME = "tcb-build-tar"
 
-    def __init__(self, output_dir, host_workdir):
+    def __init__(self, output_dir, host_workdir, registry_certs=None):
         super(DindManager, self).__init__(output_dir)
 
         # Create certificate directory based on date/time.
@@ -133,6 +133,8 @@ class DindManager(DockerManager):
         # Certificates and output directory as accessible from our container.
         self.cert_dir = cert_dir
         self.bundle_dir = output_dir
+        # Certificates used to access the registry
+        self.registry_certs_dir = registry_certs
 
         if isinstance(host_workdir, str):
             # How to access certs and output directory from other containers.
@@ -227,7 +229,7 @@ class DindManager(DockerManager):
         if default_platform is not None:
             log.debug(f"Default platform: {default_platform}")
             _environ['DOCKER_DEFAULT_PLATFORM'] = default_platform
-
+        
         _mounts = [
             docker.types.Mount(
                 source=self.cert_dir_host[0],
@@ -240,8 +242,19 @@ class DindManager(DockerManager):
                 type='volume',
                 target='/var/lib/docker/',
                 read_only=False
-            )
+            ),
+            
         ]
+        if self.registry_certs_dir is not None:
+            log.debug(f"Using custom registry certs.d: {self.registry_certs_dir}")
+            _mounts.append(
+                docker.types.Mount(
+                source=self.registry_certs_dir,
+                type='bind',
+                target='/etc/docker/certs.d/',
+                read_only=False
+            )
+            )
         log.debug(f"Volume mapping for DinD: {_mounts}")
 
         # Augment DinD program arguments.
@@ -446,7 +459,7 @@ def login_to_registries(client, logins):
 # pylint: disable=too-many-arguments,too-many-locals
 def download_containers_by_compose_file(
         output_dir, compose_file, host_workdir, logins, output_filename,
-        platform=None, dind_params=None, use_host_docker=False, show_progress=True):
+        platform=None, dind_params=None, registry_certs=None, use_host_docker=False, show_progress=True):
     """
     Creates a container bundle using Docker (either Host Docker or Docker in Docker)
 
@@ -457,6 +470,7 @@ def download_containers_by_compose_file(
     :param logins: List of logins to perform: each element of the list must
                    be either a 2-tuple: (USERNAME, PASSWORD) or a 3-tuple:
                    (REGISTRY, USERNAME, PASSWORD) or equivalent iterable.
+    :param registry_certs: mount directory with certs.d of registries.
     :param output_filename: Output filename of the processed Docker Compose
                             YAML.
     :param platform: Container Platform to fetch (if an image is multi-arch
@@ -502,7 +516,7 @@ def download_containers_by_compose_file(
         manager = DockerManager(output_dir)
     else:
         log.debug("Using DindManager")
-        manager = DindManager(output_dir, host_workdir)
+        manager = DindManager(output_dir, host_workdir, registry_certs=registry_certs)
 
     network = get_own_network()
 
diff --git a/tcbuilder/cli/bundle.py b/tcbuilder/cli/bundle.py
index d8a5da9..7cc9405 100644
--- a/tcbuilder/cli/bundle.py
+++ b/tcbuilder/cli/bundle.py
@@ -17,12 +17,13 @@ log = logging.getLogger("torizon." + __name__)
 
 # pylint: disable=too-many-arguments
 def bundle(bundle_dir, compose_file, force=False, platform=None,
-           logins=None, dind_params=None):
+           logins=None, dind_params=None, registry_certs=None):
     """Main handler of the bundle command (CLI layer)
 
     :param bundle_dir: Name of bundle directory (that will be created in the
                        working directory).
     :param compose_file: Relative path to the input compose file.
+    :param registry_certs: Directory with registry certificates to be mounted.
     :param force: Whether or not to overwrite the (output) bundle directory
                   if it already exists.
     :param platform: Default platform to use when fetching multi-platform
@@ -50,6 +51,7 @@ def bundle(bundle_dir, compose_file, force=False, platform=None,
     bundle_be.download_containers_by_compose_file(
         bundle_dir, compose_file, host_workdir, logins,
         output_filename=common.DOCKER_BUNDLE_FILENAME,
+        registry_certs=registry_certs,
         platform=platform,
         dind_params=dind_params)
 
@@ -96,7 +98,8 @@ def do_bundle(args):
            force=args.force,
            platform=args.platform,
            logins=logins,
-           dind_params=args.dind_params)
+           dind_params=args.dind_params,
+           registry_certs=args.registry_certs)
 
     common.set_output_ownership(args.bundle_directory)
 
@@ -150,6 +153,9 @@ def init_parser(subparsers):
               "tool (can be employed multiple times). The parameter will be processed "
               "by the Docker daemon (dockerd) running in the container. Please see "
               "Docker documentation for more information."))
+    subparser.add_argument(
+        "--registry-certs", dest="registry_certs",
+        help=("Enables to pass the registry certs.d directory."))
 
     # Temporary solution to provide better messages (DEPRECATED since 2021-05-25).
     subparser.add_argument(

Looks good and thank you for providing your changes as a reference. This will be a big help to the team when they get around to this task.

I’ll try to remember to inform you once we’ve formally added this feature on our side. In the meantime, feel free to use your patched version of TorizonCore Builder for your use case.

Best Regards,
Jeremias

Please be aware that supporting private registries requires providing a certificate for the private registry. We found an issue with the solution proposed above when building on Windows.
Let’s assume that a registry has the address:
some-internal-registry.corporation.net:443
Then, to provide the certificates, a directory of the form:
/etc/docker/certs.d/some-internal-registry.corporation.net:443/
must be handed to the TCB DinD. It must contain the expected certificate with the proper file name.
Sadly, Windows cannot handle directory names containing “:”, so even a repository checkout cannot create this directory (and, to be honest, it fails silently without notice).

It does not affect WSL-based builds (or Linux in general).
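
A build script could detect this situation up front instead of failing silently. The sketch below is only an illustration with a hypothetical helper name; it checks a registry directory name against the characters that Windows reserves in file and directory names.

```python
# Characters that Windows does not allow in file or directory names.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def windows_unsafe_chars(name: str) -> set:
    """Return the characters in `name` that Windows would reject."""
    return {ch for ch in name if ch in WINDOWS_RESERVED}


if __name__ == "__main__":
    registry = "some-internal-registry.corporation.net:443"
    bad = windows_unsafe_chars(registry)
    if bad:
        print(f"'{registry}' cannot be a directory name on Windows: {sorted(bad)}")
```

A registry reachable on the default port could be referenced without the “:443” suffix, in which case the directory name contains no reserved character; whether that is acceptable depends on how the images are referenced in the compose file.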

It is related to the Feature Release TCB-119

Thank you for keeping us informed of this. The team is actually in the process of working on a fix for this issue. I’ll relay your message here so they can check if this issue affects the implementation they are working on.

Best Regards,
Jeremias

Just to give an update, the team has added a feature to allow the bundle command to accept cacerts to validate alternate registries. This can be seen on the public TorizonCore Builder repo here: platform push: --cacert-to and --login-to parameters added · toradex/torizoncore-builder@019bf7a · GitHub

We have not yet created a new release of TorizonCore Builder with this change, however. To get this fix now, you can either use the setup script to pull the early-access tag of TorizonCore Builder, or build the latest TorizonCore Builder tool from the repo source.

Best Regards,
Jeremias