I am building a C++ application for an Apalis iMX6Q that makes extensive use of doubles, and I would like to make sure that I am getting the maximum performance. However, I am a bit confused about NEON vs. VFPv3 and when/how each of them is used.
Given the following trivial test program:
int main()
{
    double a = 2.234535463524;
    double c = 4.23462354234;
    double b = b/a;
    return 0;
}
With arm-gcc 6.2.0, I can specify the FPU (-mfpu) as either neon or vfpv3. However, when I compile the above code (with -O0 to prevent the code from being optimized away), I see no difference between the two options. In both cases, the division gets translated to a vdiv, like so:
vldr.64 d18, [fp, #-28]
vldr.64 d17, [fp, #-12]
vdiv.f64 d16, d18, d17
vstr.64 d16, [fp, #-28]
This is confusing, since both the NEON and the VFP instruction sets include a vdiv instruction. So what determines whether a floating-point operation is executed on the NEON unit or on the VFP unit?
Thanks in advance,
Jeroen