UBI fs / nand read/write speed

What is the expected read/write speed of the Colibri iMX6ULL module? I use the full-featured dd from coreutils to benchmark it:

opkg update
opkg install coreutils
dd bs=1MiB count=100 if=/dev/zero of=trash status=progress
reboot
dd bs=1MiB if=trash of=/dev/null status=progress

Since UBIFS compression is on by default, I am of course more interested in results with compression turned off:
chattr -c trash

or when data is not compressible:
dd bs=1M count=100 if=/dev/urandom of=trash status=progress
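
The reboot between the write and the read is only there to get rid of cached data. A minimal sketch of the same measurement without a reboot, assuming root access and GNU dd from coreutils; bench_nand is a hypothetical helper name, and the numbers it reports still depend on the compression setting of the target file:

```shell
#!/bin/sh
# bench_nand FILE COUNT_MIB [SRC]
# Hypothetical helper: writes COUNT_MIB MiB from SRC to FILE, then reads it back.
bench_nand() {
    file=$1
    count=$2
    src=${3:-/dev/urandom}   # incompressible data by default

    # conv=fsync makes dd flush the file to flash before it reports,
    # so the write figure is not inflated by the page cache.
    dd if="$src" of="$file" bs=1M count="$count" conv=fsync || return 1
    sync

    # Drop the page cache so the read comes from flash, not RAM (needs root).
    if ! { echo 3 > /proc/sys/vm/drop_caches; } 2>/dev/null; then
        echo "warning: could not drop caches, read speed may be cached" >&2
    fi

    dd if="$file" of=/dev/null bs=1M
}

# Example: bench_nand trash 100
```

dd prints its transfer summary on stderr after each pass, so the write and read rates appear one after the other.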

With compression enabled I get ~70MB/s for write and ~7MB/s for read. But after ‘chattr -c’ the read speed is 2MB/s in the best case, and the write speed starts at tens of MB/s and drops to 9MB/s. If I increase the block count to 200, it keeps dropping, down to 2.6MB/s at the end.

Are these figures expected? I would expect better ones. At least the Colibri iMX7D and Colibri VF61 seem to have faster flash. The iMX7D, with a very similar NAND chip, has about 4x better read speed and about 3x better write speed. Perhaps something is not right with the NAND module clocks?

Thanks

Hi @Edward

We use Bonnie++ for testing our hardware, including the read/write speed of the flash. Comparing the values from iMX7 and iMX6ULL, we could not see a big difference between the read/write speeds of these two modules.

That being said, the iMX6ULL also has a lower CPU clock speed than the iMX7 and is single core, so I would expect a lower speed.

Could you run the test with Bonnie++ on your side and check whether you see a big difference in the read/write values between the modules?

Best regards,
Jaski

Here’s the output of /bonnie++ -d /1 -u root -s 128 -r 64 on uncompressed /1 on iMX7D:
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
colibri-imx7   128M   170  99  3394   1  3306   5   693  99 +++++ +++ 700.5 289
Latency             54129us   22827ms    3412ms   13459us     155us   20598us
Version  1.97       ------Sequential Create------ --------Random Create--------
colibri-imx7        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  4249  53  3873  54  5580  75  5923  80 +++++ +++  2754  67
Latency             59945us     334ms     125ms    2566us     191us    2085us
1.97,1.97,colibri-imx7,1,1569240982,128M,,170,99,3394,1,3306,5,693,99,+++++,+++,700.5,289,16,,,,,4249,53,3873,54,5580,75,5923,80,+++++,+++,2754,67,54129us,22827ms,3412ms,13459us,155us,20598us,59945us,334ms,125ms,2566us,191us,2085us

That is about 2x better sequential write with pretty much the same memory chip. Yes, the CPU load is surprising: 15% on iMX6ULL compared to 1% on iMX7D. But why is rewrite faster than write on iMX6ULL?
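
The ratios can be checked directly from the two Bonnie++ CSV summary lines posted in this thread. A rough sketch, assuming the field positions seen in these 1.97 CSV lines (field 10 is block write K/sec, field 12 is rewrite K/sec); bonnie_ratio is a hypothetical helper name:

```shell
#!/bin/sh
# CSV lines copied from the thread, cut after the seek columns.
imx7='1.97,1.97,colibri-imx7,1,1569240982,128M,,170,99,3394,1,3306,5,693,99,+++++,+++,700.5,289'
imx6='1.97,1.97,rcserv,1,1569318927,128M,,138,93,1607,15,1673,16,548,97,+++++,+++,2135,229'

# bonnie_ratio CSV_A CSV_B
# Prints how many times faster A is than B for block write and rewrite.
# Assumption: field 10 = block write K/sec, field 12 = rewrite K/sec.
bonnie_ratio() {
    printf '%s\n%s\n' "$1" "$2" | awk -F, '
        NR == 1 { write = $10; rewrite = $12 }
        NR == 2 { printf "block write x%.2f, rewrite x%.2f\n", write / $10, rewrite / $12 }'
}

bonnie_ratio "$imx7" "$imx6"
# prints: block write x2.11, rewrite x1.98
```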

Well, 2MB/s and 6MB/s are both in the range of cheap promotional flash drives. But looking at what different NAND chip manufacturers offer, it looks like I need a decent SD card if better performance is needed, and there is no option to swap the NAND on the Colibri for a faster one, right? Thank you.

Hi Jaski,

Thanks for your reply. Ugh. Bonnie++ takes ages to complete, even on a PC. Running it from the iMX6ULL against a USB key reports sequential output about 1.5 times lower: other tools confirm 12+ MiB/s, but Bonnie++ shows only 8. I trust dd more, but OK.
The second problem with Bonnie++ is that by default it tries to write 2x the amount of free RAM, which is 2x 400+ MB, while I have only 300+ MB on NAND. It succeeds anyway, which means the data is compressible and UBIFS is able to fit such a large file. OK, so I create a folder and disable compression on it:
root@rcserv:/# mkdir 1
root@rcserv:/# chattr -c 1

and

root@rcserv:/# /bonnie++ -d /1 -u root
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...Can't write block.: No such file or directory
Can't write block 44722.
root@rcserv:/#

Can’t write block, which must be due to the test file being too large. Retrying with lowered data and RAM sizes:

root@rcserv:/# /bonnie++ -d /1 -u root -s 128 -r 64
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
rcserv         128M   138  93  1607  15  1673  16   548  97 +++++ +++  2135 229
Latency               104ms     118ms     106ms   52918us    1013us   50179us
Version  1.97       ------Sequential Create------ --------Random Create--------
rcserv              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  2293  44  2064  36  3335  60  2900  55 +++++ +++  1182  49
Latency             17385us     461ms    9277us     101ms    1028us    8241us
1.97,1.97,rcserv,1,1569318927,128M,,138,93,1607,15,1673,16,548,97,+++++,+++,2135,229,16,,,,,2293,44,2064,36,3335,60,2900,55,+++++,+++,1182,49,104ms,118ms,106ms,52918us,1013us,50179us,17385us,461ms,9277us,101ms,1028us,8241us
root@rcserv:/#

As you can see, sequential block output is 1607 K/sec. This is quite close to what I see with dd (2.0-2.6MB/s). From the NAND datasheet, block (128k) erase time is 1ms and page (2k) program time is 300us. Block reprogram speed is therefore limited to 128k / (0.001 + 128/2*0.0003) = 6+ MB/s. Hm, not a lot, but I saw 4x better NAND performance on iMX7D. The memory chips differ (AB/AC suffix), but the datasheets give the same erase/program times.
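
The same estimate, written out as a small script. This is only a sketch of the arithmetic above: it counts erase and page program time alone and ignores bus transfer, ECC and filesystem overhead, so it is an upper bound. nand_write_bound is a hypothetical helper name:

```shell
#!/bin/sh
# nand_write_bound BLOCK_BYTES PAGE_BYTES T_ERASE_S T_PROG_S
# Upper bound on erase+program throughput from datasheet timings.
nand_write_bound() {
    awk -v block="$1" -v page="$2" -v t_erase="$3" -v t_prog="$4" 'BEGIN {
        pages = block / page                  # pages per erase block
        t_block = t_erase + pages * t_prog    # seconds to erase and reprogram one block
        rate = block / t_block                # bytes per second
        printf "%.2f MB/s (%.2f MiB/s)\n", rate / 1000000, rate / 1048576
    }'
}

# Figures quoted above: 128 KiB block, 2 KiB page, 1 ms erase, 300 us page program
nand_write_bound 131072 2048 0.001 0.0003
# prints: 6.49 MB/s (6.19 MiB/s)
```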

Thanks and regards,
Edward

Hi @Edward

Thanks for your input.

That is about 2x better sequential write with pretty much the same memory chip. Yes, the CPU load is surprising: 15% on iMX6ULL compared to 1% on iMX7D. But why is rewrite faster than write on iMX6ULL?

Yeah, at first glance this may be surprising, but if you consider the slower CPU clock, slower cache and slower RAM of the iMX6ULL, this gives a good explanation of the results.

But looking at what different NAND chip manufacturers offer, it looks like I need a decent SD card if better performance is needed, and there is no option to swap the NAND on the Colibri for a faster one, right? Thank you.

What speed are you looking for?

Best regards,
Jaski

Yeah, at first glance this may be surprising, but if you consider the slower CPU clock, slower cache and slower RAM of the iMX6ULL, this gives a good explanation of the results.

The VF61 CPU is slower than the iMX6ULL, but as I remember the VF61 Colibri had better write performance. I looked at the MX30LF4G28AC/AB specifications and compared them to other Macronix parts: they all seem to have typical write and rewrite speeds of about 6+ MiB/s, and the worst-case specs are much worse than that. This is in the range of cheap, semi-working promotional USB pen drives, which is why I expected more from onboard memory. I also looked at other parallel NAND offerings; the Micron parts for which I found specs have typical write/rewrite speeds of about 10 MiB/s. Still not a lot. It looks like if better storage performance is needed, I should switch to a decent SD card or USB memory. Of course, if the data is compressible, UBIFS compression may help and onboard storage will be fine, so it all depends on the application, and it should be tested whether it is better to keep data on external or internal memory.
Thanks, solved.

Thanks for your feedback. I will check for the results of VF61 and get back to you.

In any case, important data should be kept or backed up outside the module.

Best regards,
Jaski