Performance comparison of SuperPI built with the gcc, icc, and icx compilers

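SuperPI is a well-known pi-calculation program that is commonly used to compare single-threaded CPU performance. Its source code is now open and can be downloaded from GitHub.

Building SuperPI is straightforward: as long as make and a compiler are installed on the machine, running make all compiles the source. If needed, you can also switch to a different compiler by editing the makefile, to better match the characteristics of the hardware platform.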
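As a rough illustration of switching compilers, the sketch below builds one copy of the sources per compiler. Instead of editing the makefile, it overrides CC on the make command line, which only works if the makefile's rules actually use $(CC). That, and the SRC_DIR name for the unpacked source tree, are assumptions rather than details taken from the SuperPI repository. The icc and icx builds also assume the Intel compilers are already on PATH.

```shell
# Sketch only. SRC_DIR is a hypothetical name for the unpacked source tree.
SRC_DIR=${SRC_DIR:-SuperPI}

# build_all: copy the sources once per compiler and build each copy.
build_all() {
    for cc in gcc icc icx; do
        cp -r "$SRC_DIR" "SuperPI-$cc" || return 1
        # Assumption: the makefile invokes $(CC); otherwise edit the makefile by hand.
        make -C "SuperPI-$cc" all CC="$cc" || return 1
    done
}

# Build only when the source tree is actually present.
if [ -d "$SRC_DIR" ]; then
    build_all
fi
```

If the makefile hard-codes the compiler name, the command-line override has no effect and the makefile has to be edited in each copy.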

Test environment:

CPU: Intel Core i9 12900KF

Memory: 2×32GB DDR4 3000

Operating system: Ubuntu 20.04.4 LTS

Compiler versions:

$ gcc -v

.....

gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

$ icc -v

icc version 2021.5.0 (gcc version 9.4.0 compatibility)

$ icx -v

Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)

......

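Unpack SuperPI three times and rename the directories to SuperPI-gcc, SuperPI-icc, and SuperPI-icx. Set the compiler in each copy's makefile to gcc, icc, and icx respectively, then build with make all. The resulting main executables compare as follows: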

$ ll ./SuperPI-*/pi_css5

-rwxrwxr-x 1 uxr uxr 1093360 May  11 11:57 ./SuperPI-gcc/pi_css5*

-rwxrwxr-x 1 uxr uxr 1301584 May  11 12:06 ./SuperPI-icc/pi_css5*

-rwxrwxr-x 1 uxr uxr 1099448 May  11 12:04 ./SuperPI-icx/pi_css5*

The gcc build is the smallest; the icx build is slightly larger than the gcc build; the icc build is the largest, nearly 20% (about 19%) bigger than the gcc build.
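The size comparison can be checked with a few lines of shell. This is only a sketch: the byte counts are copied from the listing above, and pct_larger is a hypothetical helper name, not part of SuperPI.

```shell
# pct_larger: how much larger $2 is than baseline $1, in percent (hypothetical helper).
pct_larger() {
    awk -v base="$1" -v size="$2" 'BEGIN { printf "%.2f", (size - base) / base * 100 }'
}

# Sizes in bytes, taken from the ll listing.
gcc_size=1093360
icc_size=1301584
icx_size=1099448

echo "icc vs gcc: +$(pct_larger "$gcc_size" "$icc_size")%"   # +19.04%
echo "icx vs gcc: +$(pct_larger "$gcc_size" "$icx_size")%"   # +0.56%
```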

Test command:

$ ./pi_css5 $((1<<26))

This command computes pi to 2^26 decimal places, i.e. 64M (about 67 million) digits.

Test results:
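The argument is ordinary shell arithmetic, so the digit count can be confirmed before starting a run. A minimal sketch (the variable name digits is only for illustration):

```shell
# 1 shifted left by 26 bits is 2^26.
digits=$((1 << 26))
echo "$digits"                          # 67108864
echo "$((digits / 1024 / 1024))M"       # 64M
```

The value matches the "calculating 67108864 digits of PI..." line the program prints at startup.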

gcc:

Calculation of PI using FFT and AGM, ver. LG1.1.2-MP1.5.2a.memsave

initializing...

nfft= 16777216

radix= 10000

error_margin= 0.365078

calculating 67108864 digits of PI...

AGM iteration

precision= 48: 3.80 sec

precision= 80: 3.77 sec

precision= 176: 3.77 sec

precision= 352: 3.78 sec

precision= 688: 3.78 sec

precision= 1392: 3.78 sec

precision= 2784: 3.77 sec

precision= 5584: 3.78 sec

precision= 11168: 3.78 sec

precision= 22336: 3.78 sec

precision= 44688: 3.77 sec

precision= 89408: 3.80 sec

precision= 178816: 3.77 sec

precision= 357648: 3.77 sec

precision= 715312: 3.78 sec

precision= 1430640: 3.78 sec

precision= 2861280: 3.77 sec

precision= 5722592: 3.77 sec

precision= 11445200: 3.77 sec

precision= 22890416: 3.77 sec

precision= 45780848: 3.77 sec

precision= 91561728: 3.78 sec

writing pi67108864.txt...

93.72 sec. (real time)

icc:

Calculation of PI using FFT and AGM, ver. LG1.1.2-MP1.5.2a.memsave

initializing...

nfft= 16777216

radix= 10000

error_margin= 0.365078

calculating 67108864 digits of PI...

AGM iteration

precision= 48: 3.84 sec

precision= 80: 3.81 sec

precision= 176: 3.82 sec

precision= 352: 3.86 sec

precision= 688: 3.86 sec

precision= 1392: 3.86 sec

precision= 2784: 3.86 sec

precision= 5584: 3.82 sec

precision= 11168: 3.82 sec

precision= 22336: 3.82 sec

precision= 44688: 3.82 sec

precision= 89408: 3.82 sec

precision= 178816: 3.81 sec

precision= 357648: 3.81 sec

precision= 715312: 3.82 sec

precision= 1430640: 3.82 sec

precision= 2861280: 3.81 sec

precision= 5722592: 3.82 sec

precision= 11445200: 3.82 sec

precision= 22890416: 3.82 sec

precision= 45780848: 3.81 sec

precision= 91561728: 3.82 sec

writing pi67108864.txt...

94.92 sec. (real time)

icx:

Calculation of PI using FFT and AGM, ver. LG1.1.2-MP1.5.2a.memsave

initializing...

nfft= 16777216

radix= 10000

error_margin= 0.365078

calculating 67108864 digits of PI...

AGM iteration

precision= 48: 3.78 sec

precision= 80: 3.75 sec

precision= 176: 3.75 sec

precision= 352: 3.75 sec

precision= 688: 3.78 sec

precision= 1392: 3.75 sec

precision= 2784: 3.76 sec

precision= 5584: 3.75 sec

precision= 11168: 3.75 sec

precision= 22336: 3.75 sec

precision= 44688: 3.76 sec

precision= 89408: 3.76 sec

precision= 178816: 3.75 sec

precision= 357648: 3.75 sec

precision= 715312: 3.76 sec

precision= 1430640: 3.76 sec

precision= 2861280: 3.75 sec

precision= 5722592: 3.75 sec

precision= 11445200: 3.75 sec

precision= 22890416: 3.75 sec

precision= 45780848: 3.75 sec

precision= 91561728: 3.76 sec

writing pi67108864.txt...

93.29 sec. (real time)

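Conclusion

Of the three compilers tested, gcc 9.4.0, used as the baseline, performed better than icc and worse than icx. icx was the fastest, but by less than 1% over gcc (93.29 s versus 93.72 s); icc was the slowest, but only about 1% behind gcc (94.92 s). Allowing for measurement error, the conclusion is that for single-threaded programs the three compilers produce executables with essentially the same performance.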
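The percentages above follow directly from the reported wall-clock times. As a sketch, with rel_diff being a hypothetical helper name and the times copied from the three runs:

```shell
# rel_diff: signed difference of time $2 relative to baseline $1, in percent.
rel_diff() {
    awk -v base="$1" -v t="$2" 'BEGIN { printf "%+.2f", (t - base) / base * 100 }'
}

# "real time" values in seconds from the 2^26-digit runs.
gcc_time=93.72
icc_time=94.92
icx_time=93.29

echo "icc vs gcc: $(rel_diff "$gcc_time" "$icc_time")%"   # +1.28%
echo "icx vs gcc: $(rel_diff "$gcc_time" "$icx_time")%"   # -0.46%
```

A positive value means the build took longer than the gcc baseline.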

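Afterword:

After the 2^26-digit run, I also ran the 2^27 and 2^28 tests. Because of a limitation in the program, the number of digits actually produced was 0.75×2^27 and 0.75×2^28, roughly 100 million and 200 million. The results were: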

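gcc:

100 million digits: 203.22 s

200 million digits: 467.73 s

icc:

100 million digits: 200.50 s

200 million digits: 465.03 s

icx:

100 million digits: 201.77 s

200 million digits: 460.28 s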

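In these longer runs gcc, rather than icc, is the slowest of the three, but the gap between the fastest and slowest build stays under 2%. The conclusion therefore still holds: for single-threaded programs, the three compilers produce executables with essentially the same performance.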
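For reference, the digit counts and the spread between builds in the longer runs can be worked out as follows. This is a sketch using the figures reported above; spread_pct is a hypothetical helper name.

```shell
# Digits actually produced: 0.75 times the 2^27 and 2^28 arguments.
digits_27=$(( (1 << 27) * 3 / 4 ))
digits_28=$(( (1 << 28) * 3 / 4 ))
echo "$digits_27"    # 100663296
echo "$digits_28"    # 201326592

# spread_pct: gap between the slowest and fastest time, relative to the fastest, in percent.
spread_pct() {
    printf '%s\n' "$@" | awk '
        NR == 1 { min = max = $1 + 0 }
        { v = $1 + 0; if (v < min) min = v; if (v > max) max = v }
        END { printf "%.2f", (max - min) / min * 100 }'
}

echo "100 million digits: $(spread_pct 203.22 200.50 201.77)%"   # 1.36%
echo "200 million digits: $(spread_pct 467.73 465.03 460.28)%"   # 1.62%
```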
