Torizon cloud remote access when network interface changes

Hi,

This is more of a general question for now. I am working with IoT devices that use Tezi 6.6.1 b14 as the core OS (with a few local modifications that are irrelevant to this question).
Our devices support 4G, Ethernet, and Wi-Fi connectivity. One device has been struggling to connect to the dgw torizon io server (remote access service log below).
The device definitely has an internet connection (pings to Google are fine), and at least two interfaces have an active internet connection (usually Wi-Fi and 4G).
Given that, what could cause the timeouts while fetching sessions?

Remote access service log (as a new user I’m not allowed to upload files and may post at most 2 links, so I had to edit some links out):

Jun 20 14:47:16 verdin-am62-15207038 rac[855]: [2024-06-20T14:47:16Z INFO ] No remote session found for device
Jun 20 14:47:19 verdin-am62-15207038 rac[855]: [2024-06-20T14:47:19Z INFO ] Received new session
Jun 20 14:47:19 verdin-am62-15207038 rac[855]: [2024-06-20T14:47:19Z WARN ] config file exists
Jun 20 14:47:19 verdin-am62-15207038 rac[2001]: Server listening on 127.0.0.1 port 45957.
Jun 20 14:47:20 verdin-am62-15207038 rac[855]: [2024-06-20T14:47:20Z INFO ] ssh to ssh://9d73125b-fb1e-4e37-b0c1-161ba892d824@ras.torizon.io:2222
Jun 20 14:47:26 verdin-am62-15207038 rac[855]: [2024-06-20T14:47:26Z INFO ] Accepting server public key: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF/USc7Hk35wU0N14xAoFwwGVJJITuP+zsmvNQirE976
Jun 20 14:47:26 verdin-am62-15207038 rac[855]: [2024-06-20T14:47:26Z INFO ] requesting remote port forwarding to localhost:8661 (-R)
Jun 20 14:48:25 verdin-am62-15207038 rac[855]: [2024-06-20T14:48:25Z INFO ] Received connection from 127.0.0.1 on 127.0.0.1:8661 handling with SpawnedSshd(SpawnedSshdSession { sshd_path: "/usr/sbin/sshd", config_dir: "/home/torizon/run/rac", host_key_path: None, used_port: Some(45957), strict_mode: true })
Jun 20 14:49:05 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:05Z ERROR] could not get sessions, trying later
Jun 20 14:49:05 verdin-am62-15207038 rac[855]: 0: Error in ssh session loop
Jun 20 14:49:05 verdin-am62-15207038 rac[855]: 1: could not get remote-sessions: Failed to fetch https: dgw torizon io/director/remote-sessions.json: Transport 'other' error fetching 'https dgw torizon io/director/remote-sessions.json': error sending request for url (https dgw torizon io/director/remote-sessions.json): operation timed out
Jun 20 14:49:05 verdin-am62-15207038 rac[855]:
Jun 20 14:49:05 verdin-am62-15207038 rac[855]: Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Jun 20 14:49:05 verdin-am62-15207038 rac[855]: Run with RUST_BACKTRACE=full to include source snippets.
Jun 20 14:49:05 verdin-am62-15207038 rac[2004]: Connection closed by 127.0.0.1 port 35468 [preauth]
Jun 20 14:49:05 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:05Z INFO ] ssh ↔ spawned sshd channel ended: Custom { kind: BrokenPipe, error: "channel closed" }
Jun 20 14:49:13 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:13Z INFO ] Received new session
Jun 20 14:49:13 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:13Z WARN ] config file exists
Jun 20 14:49:13 verdin-am62-15207038 rac[2007]: Server listening on 127.0.0.1 port 34583.
Jun 20 14:49:15 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:15Z INFO ] ssh to ssh://9d73125b-fb1e-4e37-b0c1-161ba892d824@ras.torizon.io:2222
Jun 20 14:49:21 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:21Z INFO ] Accepting server public key: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF/USc7Hk35wU0N14xAoFwwGVJJITuP+zsmvNQirE976
Jun 20 14:49:21 verdin-am62-15207038 rac[855]: [2024-06-20T14:49:21Z INFO ] requesting remote port forwarding to localhost:8661 (-R)
Jun 20 14:50:41 verdin-am62-15207038 rac[855]: [2024-06-20T14:50:41Z ERROR] could not get sessions, trying later
Jun 20 14:50:41 verdin-am62-15207038 rac[855]: 0: Error in ssh session loop
Jun 20 14:50:41 verdin-am62-15207038 rac[855]: 1: error sending request for url (https dgw torizon io/ras/sessions): operation timed out
Jun 20 14:50:41 verdin-am62-15207038 rac[855]: 2: operation timed out
Jun 20 14:50:41 verdin-am62-15207038 rac[855]:
Jun 20 14:50:41 verdin-am62-15207038 rac[855]: Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Jun 20 14:50:41 verdin-am62-15207038 rac[855]: Run with RUST_BACKTRACE=full to include source snippets.
Jun 20 14:50:50 verdin-am62-15207038 rac[855]: [2024-06-20T14:50:50Z INFO ] Received new session
Jun 20 14:50:50 verdin-am62-15207038 rac[855]: [2024-06-20T14:50:50Z WARN ] config file exists
Jun 20 14:50:50 verdin-am62-15207038 rac[2033]: Server listening on 127.0.0.1 port 40705.
Jun 20 14:50:51 verdin-am62-15207038 rac[855]: [2024-06-20T14:50:51Z INFO ] ssh to ssh://9d73125b-fb1e-4e37-b0c1-161ba892d824@ras.torizon.io:2222
Jun 20 14:50:56 verdin-am62-15207038 rac[855]: [2024-06-20T14:50:56Z INFO ] Accepting server public key: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIF/USc7Hk35wU0N14xAoFwwGVJJITuP+zsmvNQirE976
Jun 20 14:50:57 verdin-am62-15207038 rac[855]: [2024-06-20T14:50:57Z INFO ] requesting remote port forwarding to localhost:8661 (-R)
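
To see at a glance which requests are timing out, log lines like the ones above can be filtered by URL. This is only a minimal Python sketch, not part of the remote access client; it assumes the error wording stays exactly as it appears in this log, and `count_timeouts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the reqwest-style error text seen in the log above (assumed stable).
TIMEOUT_RE = re.compile(
    r"error sending request for url \((?P<url>[^)]+)\): operation timed out"
)


def count_timeouts(log_text: str) -> Counter:
    """Count timed-out requests per URL in a block of journal output."""
    return Counter(m.group("url") for m in TIMEOUT_RE.finditer(log_text))


# Two shortened lines taken from the log above.
SAMPLE_LOG = """\
rac[855]: 1: could not get remote-sessions: error sending request for url (https dgw torizon io/director/remote-sessions.json): operation timed out
rac[855]: 1: error sending request for url (https dgw torizon io/ras/sessions): operation timed out
"""

for url, hits in count_timeouts(SAMPLE_LOG).items():
    print(f"{hits} timeout(s): {url}")
```

In this log only the HTTPS requests to the gateway time out, while the SSH connection to ras.torizon.io:2222 gets as far as accepting the server key.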

Hi, I’m the Toradex AI Assistant. I’m in early beta, and I would appreciate your feedback.
The Toradex Support team will follow up with you shortly!

Handshake,

The issue you’re encountering with the Torizon Cloud remote access service timing out, despite having an active internet connection, could be attributed to several factors. Given the context of your setup with multiple network interfaces (4G, Ethernet, WiFi) and the log entries provided, here are some potential causes and troubleshooting steps:

  1. Network Interface Priority: When multiple network interfaces are active, the system might prioritize them differently, which could affect how outbound connections are managed. Ensure that the system’s routing table is configured to prioritize the interface with a stable internet connection for outbound traffic.

  2. Firewall or Security Settings: Verify that any firewall or security settings on the device or within your network allow outbound connections on the ports used by the Torizon Cloud remote access service. According to your log, the SSH tunnel goes to ras.torizon.io on port 2222, and the session list is fetched over HTTPS from dgw torizon io, so both must be reachable.

  3. DNS Resolution: Ensure that the DNS resolution is working correctly across all network interfaces. The device should be able to resolve ras.torizon.io and other relevant domains without delay. Misconfigured DNS settings might cause timeouts during the session fetching process.

  4. Network Stability: Even if the internet connection appears to be active, intermittent connectivity issues or high latency on the network can lead to timeouts. This can be more common with mobile networks like 4G. Monitoring the network’s stability and switching to the most stable connection available might help.

  5. Remote Access Service Configuration: Review the configuration of the remote access service on your device. Ensure that it’s correctly set up to handle multiple network interfaces and that there are no conflicting settings that might hinder its ability to establish a connection.

  6. System Time: Ensure that the system time on your device is accurate. Significant time discrepancies can lead to SSL/TLS handshake failures, affecting the ability to establish secure connections.

  7. Debugging Logs: Increase the verbosity of the logging for the remote access service and network-related services. This might provide more insights into where the connection process is failing.

  8. Network Requirements: Revisit the network requirements for the Torizon Cloud, ensuring that all prerequisites are met for a successful connection. This includes not just open ports but also any required protocols or outbound traffic rules.
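
The interface-priority check from point 1 can be done directly on the device. The following is a minimal Python sketch, not a Torizon tool; it assumes the usual text format of `ip route show default` and that the default route with the lowest metric is preferred. The function names, interface names, and addresses are hypothetical.

```python
import subprocess
from typing import List, NamedTuple, Optional


class DefaultRoute(NamedTuple):
    interface: str
    gateway: Optional[str]
    metric: int


def parse_default_routes(ip_route_output: str) -> List[DefaultRoute]:
    """Parse `ip route` text and return default routes, lowest metric first."""
    routes = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue

        def value_after(key: str) -> Optional[str]:
            # Return the token following `key`, or None if it is absent.
            if key in fields:
                i = fields.index(key) + 1
                return fields[i] if i < len(fields) else None
            return None

        dev = value_after("dev")
        if dev is None:
            continue
        metric = value_after("metric")
        # `ip route` omits the metric when it is 0.
        routes.append(DefaultRoute(dev, value_after("via"), int(metric) if metric else 0))
    return sorted(routes, key=lambda r: r.metric)


def read_default_routes() -> List[DefaultRoute]:
    """Run `ip route show default` on the device and parse its output."""
    out = subprocess.run(
        ["ip", "route", "show", "default"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_default_routes(out)


# Hypothetical sample: Wi-Fi and 4G both provide a default route.
SAMPLE = """\
default via 192.168.1.1 dev wlan0 proto dhcp metric 600
default via 10.64.64.64 dev wwan0 proto static metric 100
192.168.1.0/24 dev wlan0 proto kernel scope link src 192.168.1.23 metric 600
"""

for route in parse_default_routes(SAMPLE):
    print(f"{route.interface}: metric {route.metric}, gateway {route.gateway}")
```

In the sample, `wwan0` is listed first because its metric is lower, so outbound traffic would leave over 4G even though Wi-Fi is also connected.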

If the issue persists after going through these steps, it might help to isolate the problem further by temporarily disabling all but one network interface to see whether the issue is tied to a specific type of connection. Additionally, testing the remote access functionality in a different network environment can help rule out network-specific causes.

Please, also check if these links can help you solve your problem:

Remote Access | Toradex Developer Center.

Greetings @stasgil,

Let me forward your logs here to our team and see if we can come up with anything.

I also have some initial questions if you don’t mind. You said you have several devices, but this is the only one that is having issues with remote access, correct?

Is there anything special about this particular device compared to the rest? Has it always had issues like this? Or was it able to use remote-access in the past?

Best Regards,
Jeremias

Remote SSH was stable on this device before, yes. The problems with remote SSH sessions started after I updated the device from Tezi_6.6.0-devel-202403+build.22 to Tezi 6.6.1 b14 to test a particular service that switched the Wi-Fi network connection every 5 minutes.
Since then I have reverted the OS version and removed my changes, but the remote SSH session problem persists.
So, as a correction to my original post: our devices currently use Tezi_6.6.0-devel-202403+build.22.

Jeremias,

Thank you for your quick response. After further investigation it turned out that our SIM data limit had been reached and the 4G interface was configured as the highest-priority route, so no data could be transferred over the remote connection. This question can be considered closed.
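
For anyone hitting something similar, a per-interface connection test would have shown this sooner than a plain ping. Below is a rough Python sketch, not something we run in production; it assumes Linux (`SO_BINDTODEVICE`), usually needs root or an equivalent capability, and `can_connect_via` is a hypothetical helper name. Note that the DNS lookup itself is not pinned to the interface.

```python
import socket


def can_connect_via(interface: str, host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds over `interface`.

    Linux only: the socket is pinned to the interface with SO_BINDTODEVICE,
    so the routing-table priority between interfaces is bypassed.
    """
    try:
        # Name resolution uses the system resolver and is NOT bound to the interface.
        family, socktype, proto, _, sockaddr = socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM
        )[0]
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.setsockopt(
                socket.SOL_SOCKET, socket.SO_BINDTODEVICE, interface.encode() + b"\0"
            )
            sock.connect(sockaddr)
            return True
    except OSError:
        # Covers DNS failures, timeouts, refused connections, and bind errors.
        return False
```

For example, calling `can_connect_via("wwan0", "ras.torizon.io", 2222)` and then the same with `"wlan0"` compares the two links (the interface names are placeholders for whatever the device uses). A successful TCP handshake does not prove that a full HTTPS request completes, so this is only a first indication.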

Thank you for letting us know. Glad to hear you were able to find the root cause.