The problem is that the word “transfer” is used in two different domains here: in the Linux software abstraction, an SPI transfer is one unit handed to the driver. It is part of the driver's job to split such a transfer down to whatever the hardware supports (depending on FIFO, DMA, etc.).
The DSPI uses the word “transfer” in the sense of bursts from its FIFO to the transfer engine, which are typically 8 bits wide. That is why you also see the delays between bytes…
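
To make the two meanings concrete, here is a minimal sketch of the Linux-side view, using the generic spi_sync() consumer API (send_one_transfer() is just an illustrative name):

    #include <linux/slab.h>
    #include <linux/spi/spi.h>

    static int send_one_transfer(struct spi_device *spi)
    {
        struct spi_transfer xfer = { };
        struct spi_message msg;
        u8 *buf;
        int ret;

        /* DMA-safe buffer, as the SPI core requires */
        buf = kmalloc(4, GFP_KERNEL);
        if (!buf)
            return -ENOMEM;
        buf[0] = 0xde; buf[1] = 0xad; buf[2] = 0xbe; buf[3] = 0xef;

        xfer.tx_buf = buf;
        xfer.len = 4;    /* ONE transfer at the Linux level */

        spi_message_init(&msg);
        spi_message_add_tail(&xfer, &msg);

        /* The DSPI driver splits this into four 8-bit FIFO bursts
         * ("transfers" in RM terms), hence the inter-byte delays. */
        ret = spi_sync(spi, &msg);

        kfree(buf);
        return ret;
    }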
Optional SPI slave node properties:
- fsl,spi-cs-sck-delay: a delay in nanoseconds between activating chip
  select and the start of clock signal, at the start of a transfer.
- fsl,spi-sck-cs-delay: a delay in nanoseconds between stopping the clock
  signal and deactivating chip select, at the end of a transfer.
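
For reference, a hedged sketch of how a controller driver could pick these two properties up from the device tree; read_cs_delays() is a hypothetical helper, and the conversion of the nanosecond values into the CTAR prescaler/scaler fields is left out:

    #include <linux/of.h>
    #include <linux/printk.h>

    static void read_cs_delays(struct device_node *np,
                               u32 *cs_sck_ns, u32 *sck_cs_ns)
    {
        /* Both properties are optional; of_property_read_u32()
         * leaves the value untouched when the property is absent. */
        *cs_sck_ns = 0;
        *sck_cs_ns = 0;
        of_property_read_u32(np, "fsl,spi-cs-sck-delay", cs_sck_ns);
        of_property_read_u32(np, "fsl,spi-sck-cs-delay", sck_cs_ns);

        /* cs_sck_ns corresponds to t_CSC (PCS assertion to first SCK
         * edge), sck_cs_ns to t_ASC (last SCK edge to PCS negation). */
        pr_debug("t_CSC=%u ns, t_ASC=%u ns\n", *cs_sck_ns, *sck_cs_ns);
    }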
That said, I agree that using the word “transfer” in the Linux kernel binding description is misleading, especially since it also talks about the beginning/end of chip select. Normally, the IP would return the chip select to high after each (IP-level) transfer (8 bits), but the driver also uses continuous mode (see SPIx_PUSHR, CONT).
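
Roughly, continuous mode looks like the sketch below, assuming the Vybrid PUSHR layout (CONT at bit 31, one-hot PCS field at bits 21:16, data in the low half); push_frames() is a simplification, the real driver also has to watch the TX FIFO fill level:

    #include <linux/bits.h>
    #include <linux/io.h>
    #include <linux/types.h>

    #define SPI_PUSHR          0x34              /* TX FIFO register offset */
    #define SPI_PUSHR_CONT     BIT(31)           /* keep PCS asserted */
    #define SPI_PUSHR_PCS(cs)  BIT(16 + (cs))    /* one-hot PCS select */

    static void push_frames(void __iomem *base, const u8 *buf,
                            size_t len, unsigned int cs)
    {
        size_t i;

        for (i = 0; i < len; i++) {
            u32 cmd = SPI_PUSHR_PCS(cs) | buf[i];

            /* CONT on all but the last frame: PCS stays low for the
             * whole Linux-level transfer instead of being deasserted
             * after every 8-bit IP-level transfer. */
            if (i != len - 1)
                cmd |= SPI_PUSHR_CONT;

            writel(cmd, base + SPI_PUSHR);
        }
    }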
There is also the FCPCS bit, which allows masking these delays (“This bit enables the masking of ‘After SCK (t_ASC)’ and ‘PCS to SCK (t_CSC)’ delays when operating in Continuous PCS mode.”; see also 12.4.4.4.6 Fast Continuous Selection Format in the Vybrid RM). It seems that would allow removing the delays, leaving just a gap of “half of the baudrate” (i.e. half an SCK period), which, I guess, would lead to a smooth continuous clock signal… It seems not entirely trivial to do that masking, though.
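
If someone wants to experiment with that, something along these lines might be a starting point. Heavily hedged: the MCR bit position below is an assumption that needs to be checked against the MCR description in the RM, and FCPCS only has an effect while PUSHR[CONT] is in use:

    #include <linux/bits.h>
    #include <linux/io.h>

    #define SPI_MCR        0x00     /* module configuration register */
    #define SPI_MCR_FCPCS  BIT(2)   /* ASSUMED bit position -- verify in RM */

    static void enable_fast_continuous_pcs(void __iomem *base)
    {
        u32 mcr = readl(base + SPI_MCR);

        /* Mask t_CSC/t_ASC in continuous PCS mode, leaving only the
         * "half of the baudrate" gap between frames. */
        writel(mcr | SPI_MCR_FCPCS, base + SPI_MCR);
    }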