.NET TcpClient behaviour on WCE 7 compared to desktop .NET

The following simple data receiving loop, using a socket connection, works well when compiled for desktop Windows (using the full .NET Framework), but after 2-3 minutes it loses the TCP/IP connection and throws an exception when running on a Colibri T20 (using .NET CF 3.5).

(I have also noticed that, before the exception occurs, several byte values received over the TCP/IP socket are wrong. This does not happen on desktop Windows.)

Any ideas?

        byte[] buffer = new byte[2400];
        TcpClient client2 = new TcpClient();
        client2.Connect("192.168.1.10", 23);
        NetworkStream stream = client2.GetStream();
        while (true)
        {
            // Read() returns the number of bytes actually read (ignored here)
            stream.Read(buffer, 0, 2400);
        }

Adding a Stopwatch and some debug messages, we have now confirmed that, occasionally, the intervals between successive executions of the stream.Read() call are longer than expected.

It seems as if a different thread takes over the CPU core on which this code is running, so our code does not read the data fast enough.

The delay is in the hundreds of milliseconds, and, given the data rate, this means the receive buffer (in the TcpClient) fills up with around 24,000 bytes per 100 ms of delay.
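For illustration, the instrumented loop looks roughly like this (a minimal sketch with illustrative names, not our exact code; it assumes the stream from the snippet above plus System.Diagnostics for Stopwatch and Debug):

        // Sketch: measure the gap between successive stream.Read() calls.
        // Gaps of hundreds of ms would let the TcpClient buffer fill up.
        byte[] buffer = new byte[2400];
        Stopwatch sw = Stopwatch.StartNew();
        while (true)
        {
            long before = sw.ElapsedMilliseconds;
            int read = stream.Read(buffer, 0, buffer.Length);   // blocks until data is available
            long gap = sw.ElapsedMilliseconds - before;
            if (gap > 50)                                        // arbitrary reporting threshold
                Debug.WriteLine("Read gap: " + gap + " ms, bytes read: " + read);
        }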

But we have configured the buffers to be much larger:

        client2.ReceiveBufferSize = 480000;
        client2.SendBufferSize = 480000;

Identifying the offending thread and changing priorities could be an option. Still, the problem is puzzling given the size of the buffer.

Dear @Henry

In order to reproduce the error as closely as possible to your real setup, can you please give me some information about the data source:

  • How many data packets do you send (per second)?
  • What is the size of one packet?

Regards, Andy

Dear @andy.tx

Our Colibri T20 is receiving 1600 packets per second, each packet 15 bytes x 10 samples = 150 bytes/packet.

The total data flow is 240,000 bytes per second. The data comes from a remote ADC over the TCP/IP socket.

We lose an occasional byte after several seconds, always at the same time as we see a delay of several hundred ms that we attribute to another thread taking up CPU time.

Dear @Henry

I’m afraid I was not able to reproduce the problem. I simulated your environment as follows:

Test setup

On my PC I started our internal tool WinSock.exe with the following command line parameters:

winsock.exe -l 150 -i 5 -n 8 8123

The tool acts as a TCP server. It waits for an incoming connection on port 8123, then sends 8 packets of 150 bytes each, every 5 ms. This leads to a throughput of 240 kB/s.
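In case you want to reproduce the setup without WinSock.exe, a roughly equivalent sender can be sketched in a few lines of C# (illustrative only, not the actual tool; it needs System.Net, System.Net.Sockets and System.Threading):

        // Sketch: listen on port 8123 and, once a client connects, send
        // 8 packets of 150 bytes every 5 ms (~240 kB/s), like the test tool.
        TcpListener listener = new TcpListener(IPAddress.Any, 8123);
        listener.Start();
        using (TcpClient client = listener.AcceptTcpClient())
        using (NetworkStream ns = client.GetStream())
        {
            byte[] packet = new byte[150];
            byte seq = 0;
            while (true)
            {
                for (int i = 0; i < 8; i++)
                {
                    packet[0] = seq++;              // sequence byte so lost packets can be detected
                    ns.Write(packet, 0, packet.Length);
                }
                Thread.Sleep(5);                    // note: actual sleep resolution may be coarser
            }
        }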

On the Colibri T20, I ran the TcpReceive.exe application to receive the data and verify that no data is lost. The Colibri T20 is in its factory default configuration, plus .NET CF 3.5 is installed.

Conclusion

I received more than 5 million packets (750 MB) without losing a single byte. I tried to stress the WinCE system by running other applications and limiting the CPU frequency to 216 MHz, but the data was still received reliably.

...
All data received up to packet #5056865
All data received up to packet #5056880
All data received up to packet #5056888
...

Can you please try to locate the difference in your setup? Does my setup work on your hardware, too? The C# application TcpReceive.exe will require some adaptations (IP address) in order to match your environment.

Regards, Andy

@andy.tx

We haven’t figured out this problem yet, but we are getting there.

Running our data receiver code on Windows and stressing the system with CPU-hungry applications, we can eventually reproduce the problem. Setting the priority of our thread to high solves the issue.

Wireshark shows the receiver sending duplicate ACKs and asking for retransmissions as a result of missing packets in the sequence.

On the embedded target, setting the priority to 1 improved the situation (it takes longer before bytes start going missing). But sooner or later 100 to 200 ms seem to be taken up by other threads, and the data receiver becomes too slow to get all the data.

The fact that this happens on the embedded target even at priority 1 is a bit surprising.
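For reference, this is the kind of priority change we are experimenting with (a sketch only; receiverThread is an illustrative variable, and the coredll.dll P/Invoke below is our own assumption rather than code taken from Toradex's tool; it needs System.Runtime.InteropServices):

        // Desktop .NET: raise the managed priority of the receiver thread.
        receiverThread.Priority = ThreadPriority.Highest;

        // Windows CE: the native 0..255 range (0 = highest priority) can be set via
        // CeSetThreadPriority from coredll.dll; priority 1 corresponds to what we
        // set with the Toradex tool.
        [DllImport("coredll.dll")]
        static extern IntPtr GetCurrentThread();
        [DllImport("coredll.dll")]
        static extern bool CeSetThreadPriority(IntPtr hThread, int nPriority);

        // ... then, from within the receiver thread:
        // CeSetThreadPriority(GetCurrentThread(), 1);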

Dear @Henry

One possible explanation is that after a while the .NET garbage collector kicks in, causing a longer delay.
If my guess is correct, this could only be solved by receiving the data in a native application outside the .NET framework.
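
One way to check this hypothesis (a sketch only, I have not run this on the T20 myself): force collections from a background thread and see whether the read gaps line up with them.

        // Sketch: periodically force a full collection; if the receive-loop
        // delays correlate with these calls, the GC is a likely culprit.
        Thread gcThread = new Thread(delegate()
        {
            while (true)
            {
                Thread.Sleep(2000);
                int before = Environment.TickCount;
                GC.Collect();
                Debug.WriteLine("GC.Collect() took " + (Environment.TickCount - before) + " ms");
            }
        });
        gcThread.IsBackground = true;
        gcThread.Start();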

Regards, Andy

Dear @Henry,

Thank you for posting your inputs.

If possible, could you please share a reproducible demo application with us, so we can reproduce the issue and try to find a workaround or solution.

Also, if you try the data reading loop in a native C++ application, please let us know the result.

Dear @Henry,

Please use the attachment feature (clip icon) or https://share.toradex.com/ to share the VS project with us.
As you said, this could be architecture dependent. However, let us try it on our side.

Dear @Henry,

Thank you for sharing the demo project. I am looking into the issue and will get back to you as soon as I find something.

@raja.tx @andy.tx

We have figured out the problem. Displaying debug messages using the following code sometimes blocks the data receiver code for >100 ms. Removing this code eliminates the data corruption and data loss (buffer overflow).

this.Invoke((System.Threading.ThreadStart)delegate
{
    this.label1.Text = "Waited " + elapsedbeforebyte.ToString() + "ms for the first byte to arrive \n";
    ...
});
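
A variant that avoids blocking the receiver thread is to post the update asynchronously with Control.BeginInvoke and to throttle how often it runs. This is only a sketch (lastUiUpdate is an illustrative int field, not from our project):

        // Sketch: BeginInvoke queues the update on the UI thread and returns
        // immediately, so the receiver thread never waits for the UI thread.
        // Throttling keeps the number of queued updates low.
        if (Environment.TickCount - lastUiUpdate > 500)          // at most one label update per 500 ms
        {
            lastUiUpdate = Environment.TickCount;
            this.BeginInvoke((System.Threading.ThreadStart)delegate
            {
                this.label1.Text = "Waited " + elapsedbeforebyte.ToString() + "ms for the first byte to arrive \n";
            });
        }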

Dear @Henry,

Thank you for sharing the solution here. Best regards and good luck!

Dear @andy.tx

We still haven’t given up on the C# data reading loop using TcpClient and NetworkStream.Read.

But we have made some interesting observations:

  1. If we raise the priority of the data reading thread to 1 with Toradex’s tool, we see corrupted bytes coming through the socket from time to time.

  2. If we decrease the priority of this same thread, data corruption doesn’t happen anymore, but we think we start losing full TCP packets. We assume this is the case because we often see that the number of bytes available for reading, when the data receiver code calls Read, reaches 32767 bytes (see the sketch below). It never goes above this value. Since our TcpClient buffer is much larger, this value seems to indicate an overflow of an intermediate buffer limited to 32 KB in size (we don’t see this limit when running the code on the Windows desktop).
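This is roughly how we watch the available byte count (a sketch; client2, stream and buffer are from the earlier snippets, maxAvailable is an illustrative field, and we assume Socket.Available behaves the same on the CF as on the desktop):

        // Sketch: log how many bytes are queued in the stack before each Read.
        // On the Colibri this value tops out at 32767; on the desktop it can
        // grow much further before we read it out.
        int available = client2.Client.Available;        // underlying Socket of the TcpClient
        if (available > maxAvailable)
        {
            maxAvailable = available;
            Debug.WriteLine("New maximum available: " + maxAvailable + " bytes");
        }
        int read = stream.Read(buffer, 0, buffer.Length);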

Dear @raja.tx

Sure, I am happy to send the code we are using at the receiving end (please advise on how to send the VS solution).

However, I think what we are seeing could be architecture dependent.

Our data generator is a custom remote data acquisition board using a Cypress PSoC and a W5500 as an SPI-to-TCP/IP bridge. The buffer size on the W5500 is 32 KB (http://www.wiznet.io/product-item/w5500/). There is no buffering on the PSoC.

At the receiving end we use three ASIX AX88772 adapters with a 20 KB RX buffer (we needed three Ethernet ports, of which only one is being used for testing), and a USB hub to get the data into the Colibri via a USB port.

If you use a PC as the data sender, the buffer size at that end might be bigger than the one at our data generation end. Your setup could be much more robust against a busy/slow receiver, as the sender would be capable of holding the data for longer, waiting for the receiver to be ready for reception.

Dear @raja.tx

This is the VS2008 solution we are using:
https://share.toradex.com/6f9qn4qajoi4gb9

Click the button: a data receiver worker thread is created, and a 20-second delay is provided to play with the priorities and affinities of this worker thread before the socket connection to the remote data sender at 192.168.1.10 is created.

After the socket is created, our data generator waits for about 1 second and then starts sending data at 240,000 bytes per second. We measure the time it takes to receive the slowest block of 24,000 bytes.

Our data comes from a data acquisition board, in 2’s complement format. We know that the baseline level for our 24-bit converter should be around 8000000, so if we get words far above or below this value plus the noise level, we assume we are detecting corrupted data.

In our hands, if we stress the CPU, we immediately see corrupted words and longer-than-expected latencies for the 24,000-byte test block.
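
For illustration, the corruption check is along these lines (a sketch only; the framing, byte order and noise band are assumptions on my part, and the actual decoding is in the shared VS2008 project):

        // Sketch: treat every 3 bytes of the received block as one 24-bit word
        // and flag values far away from the expected ADC baseline.
        const int Baseline = 8000000;            // expected idle level of the 24-bit converter
        const int NoiseBand = 50000;             // assumed tolerance around the baseline
        int corruptedWords = 0;
        for (int i = 0; i + 2 < count; i += 3)   // 'block' and 'count' come from the Read call
        {
            int sample = (block[i] << 16) | (block[i + 1] << 8) | block[i + 2];
            if (Math.Abs(sample - Baseline) > NoiseBand)
                corruptedWords++;
        }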