That’s something I don’t understand. Latency three to four orders of magnitude greater than the clock period (10 µs compared to less than 700 ps) is absolutely ridiculous.
Not really. Have you ever heard of hardware/software partitioning? That’s exactly the underlying reason for this.
I can understand some overhead, but at this level… In fact, even though the latency is huge, the jitter - a much more important problem - is even worse.
To achieve jitter as low as you expect, you would definitely need an RTOS, or even have to run this on a dedicated MCU.
Now, using the microcontroller, why not? But how many GPIOs are available?
There are actually more than 40 GPIOs available, as both the parallel camera and the parallel display interfaces are not supported on Apalis TK1, and we just routed those MXM3 pins to the K20.
How can one send data to the Arm cores without adding latency?
I’m afraid that is not really possible.
Are they using shared memory ?
No, they are using a dedicated SPI interface.
Is there a possibility to use DMA to copy data directly to the RAM, independently of the CPU ?
No.
This is something I don’t understand. It is very easy to use, for example, a high-performance audio codec. How can audio signals (sometimes multichannel at 384 kS/s) be handled correctly without a huge amount of jitter?
This involves buffering and dedicated hardware, e.g. the AVP.
If this works, it means that there are low-latency mechanisms to get data. How can a stereo audio stream at 192 kS/s with 24-bit data (2.6 µs per sample) be handled with a latency of 10 µs and random jitter that can be even larger?
None of them process data in chunks as small as 64 bytes; they buffer a lot more.
That’s not logical: codecs, DACs and ADCs are widely used without any problems, even using the main processor for signal processing.
Yes, using kilobytes if not megabytes of buffering.
How can HDMI connections work with such awful timings? That’s not logical: HDMI connections work very well even at high resolutions.
All done in hardware, really.
How can PCIe multi-channel audio cards work flawlessly on a standard PC with software plugins, with only the latency of the signal processing (which can be large, but pretty constant)? That’s not logical.
Just buffering, really. Plus they may even drop some samples without you ever noticing.
So, there must exist some way to acquire and emit data with better timings than that! At least, I could accept a large latency if it were guaranteed to be stable, but that is very far from being the case!
Not without running an RTOS or changing the way you plan to go about transferring data. Which makes me wonder what exactly you plan on doing with those 64-byte chunks, and, if they are coming off an FPGA, why one could not pre-process and buffer them some more, so that this whole senseless discussion would be rendered obsolete.
I know that; this seems mainly related to Linux, so are there alternatives for the OS?
No, this is not related to Linux at all. I don’t think any other general-purpose OS will meet what you expect, be it M$ or fruity.
Even an RTOS could be OK, as I only need to do signal processing on several cores. Are there systems and toolchains other than Linux available for the TK1 board with a minimal set of drivers (at least GPIO and Ethernet)?
No, so far I have not heard of any. The biggest hurdle on TK1 would be how to make use of all the GPU functionality in such a case. However, depending on what exactly you are trying to achieve, you may just run U-Boot and get away with your own dedicated native stuff, making use of certain low-level drivers available there, e.g. for PCIe.