Hey,
Does any of you have experience with getting moderately fast data transfer (e.g. 50 MByte/s) going from a Xilinx Artix-7 FPGA to an ARM Cortex CPU, in this case the one on the TK1 board?
I have looked at the Xilinx XDMA driver, but Xilinx explicitly states that it is only guaranteed to work on x86 systems. One explanation I found is that ARM systems may not be cache coherent: if the FPGA, acting as DMA master, shoves data into RAM, the CPU might still use old, cached data.
But I’d be “happy” if I was at that stage in the hierarchy of problems.
I don’t know whether other aspects of the driver are x86-only without making that obvious (e.g. by using x86-only kernel functions), which might explain unexpected behavior…
So if anyone knows an alternative to the XDMA driver, feel free to mention it. (Although that would also require a matching IP core if not xdma-compatible, right?)
Or if anyone actually has experience with getting this to work on an ARM target…
As the Xilinx forum hasn’t been much help with any details, I’m also looking in other places, unlikely as it may be that somebody else has used this kind of hardware combination.
Btw., I’m aware of Xillybus, especially the price tag for its IP core…
My experience so far:
I got the XDMA driver compiled against Linux kernel 3.10.105 (Linux 4 Tegra for TK1).
The user interrupts via MSI-X IRQs do get allocated by the driver, but poll()'ing them via the provided character devices does not work. For now, that forces me to poll the FPGA’s registers to learn the buffer write position and when I can fetch a block of data… which drains the CPU horribly.
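For reference, this is roughly the blocking wait I would expect to work on an event device instead of busy-polling registers. It is only a minimal sketch: wait_readable() is a made-up helper name, and the device path in the usage note (/dev/xdma0_events_0) is an assumption about how the driver names its user-interrupt devices, not something I have verified on the TK1.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: block until fd becomes readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set). */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    /* Restart the wait if a signal interrupts poll(). */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;                      /* 0: timeout, -1: error */

    if (pfd.revents & (POLLERR | POLLNVAL)) {
        errno = EIO;                    /* descriptor is in an error state */
        return -1;
    }
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```

Usage would be something like opening the (assumed) event device with open("/dev/xdma0_events_0", O_RDONLY) and calling wait_readable(fd, 1000) in a loop, fetching a block only when it returns 1. With a working interrupt path the process sleeps in the kernel in between, instead of spinning on register reads.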
Reading and writing the registers exposed by the FPGA configuration via /dev/xdma0_user (the device file without DMA) works. Using the DMA (via the AXI memory-mapped interface), however, silently freezes the CPU (no interesting kernel messages), although reading a single block of data from the FPGA’s designated DMA-able memory as a test works, so I guess there is no fundamental error in how I use it.
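To make the register access concrete, here is a minimal sketch of reading one 32-bit register through the non-DMA device file with a positioned read. read_reg32() is a hypothetical helper name, and the register offset in the usage note is made up. It also assumes the device accepts pread() at a byte offset; mapping the register space with mmap() would be the alternative if it does not.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read one 32-bit register at byte offset `offset`.
 * Returns 0 on success, -1 on error or short read (errno is set). */
static int read_reg32(int fd, off_t offset, uint32_t *value)
{
    ssize_t n = pread(fd, value, sizeof *value, offset);

    if (n < 0)
        return -1;                      /* errno set by pread() */
    if (n != (ssize_t)sizeof *value) {
        errno = EIO;                    /* short read: treat as I/O error */
        return -1;
    }
    return 0;
}
```

Usage would be along the lines of fd = open("/dev/xdma0_user", O_RDONLY) followed by read_reg32(fd, 0x10, &wr_pos), where 0x10 stands in for wherever the design puts its write-position register. The value comes back in host byte order, so no swapping is done here.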
Whether I get this CPU freeze depends on a) the driver (legacy vs. current version) and b) changes to the FPGA configuration, even in parts which should have nothing to do with the DMA memory or the data arbiter register file…
The fact that user interrupts do not work, so that the polling scheme is easily disturbed by other things going on on the CPU, makes this harder to debug (with lots of time-consuming kernel prints and such).