
Willus.com's MinGW/Gnu C Tips

  Overview
This page has some useful links for the MinGW C compiler and lists some tips that I've learned about using MinGW.

Benchmarks

DEC 29, 2011 See my latest 2011 Win32/64 C Compiler Benchmarks.

I did my first compiler benchmarks in 2002. Below are some more recent ones that are more limited in scope.

NOV 19, 2011 (There are older 2010 benchmarks below. Scroll down for those.)
Today I benchmarked MinGW gcc 4.5.2 (32-bit and 64-bit), gcc 4.4.x (32-bit and 64-bit), and Fabrice Bellard's TCC, an amazingly small compiler that emphasizes compile speed. All benchmarks were compiled and run on my home PC, a Core i5-670 with 3.7 GHz turbo-boost speed running 64-bit Windows 7 (SP 1). TCC lives up to its claims of compiling almost 10 times faster than gcc, but, as might be expected, that compile performance comes with a significant penalty in executable performance (3 - 4 times slower in these two benchmarks). For what it is intended to be, a very lightweight, pseudo-scriptable C compiler, TCC is a remarkable achievement.

Benchmark #1, Image manipulation: crop and re-size 163 images
Lines of code: Approx. 250 K
Compiler   | Compiler Options     | Exe Type | Compile Time | Exe Size | Exe Run Time
gcc 4.4.0  | -O3 -ffast-math -m32 | 32-bit   | 70.0 s       | 2.09 MiB | 105.4 s
gcc 4.4.5  | -O3 -ffast-math -m64 | 64-bit   | 74.0 s       | 2.33 MiB | 78.9 s
gcc 4.5.2  | -O3 -ffast-math -m32 | 32-bit   | 93.5 s       | 2.24 MiB | 105.7 s
gcc 4.5.2  | -O3 -ffast-math -m64 | 64-bit   | 92.2 s       | 2.28 MiB | 76.5 s
tcc 0.9.25 | (none)               | 32-bit   | 10.2 s       | 2.22 MiB | 325.1 s


Benchmark #2, Beam-Wave Interaction Simulator (heavy use of floating point)
Lines of code: Approx. 350 K
Compiler   | Compiler Options     | Exe Type | Compile Time | Exe Size | Exe Run Time
gcc 4.4.0  | -O3 -ffast-math -m32 | 32-bit   | 99.5 s       | 3.54 MiB | 29.0 s
gcc 4.4.5  | -O3 -ffast-math -m64 | 64-bit   | 102.3 s      | 3.98 MiB | 25.3 s
gcc 4.5.2  | -O3 -ffast-math -m32 | 32-bit   | 127.1 s      | 3.79 MiB | 27.6 s
gcc 4.5.2  | -O3 -ffast-math -m64 | 64-bit   | 126.4 s      | 3.95 MiB | 25.8 s
tcc 0.9.25 | (none)               | 32-bit   | 13.0 s       | 3.96 MiB | 93.8 s



FEB 2, 2010
I was able to benchmark some different versions of MinGW gcc with both 32-bit and 64-bit executables, mostly under Windows 7 64-bit (freshly installed). I ran a few benchmarks. Interestingly, for one benchmark, going from gcc 3.X to gcc 4.X made a factor of two difference in the speed. For another, it made almost no difference. Other notes: both benchmarks are single threaded; using -march=native and -mtune=native (-march=native already implies -mtune=native; with neither flag, gcc tunes for a generic CPU) didn't buy much over just using -O3 -ffast-math; and 64-bit bought about a 15% - 25% improvement over 32-bit.

Benchmark #1, Crop and re-size 200 images
Compiler/Flags                                        | Compiled on | Exe Type | Exe Run Time | Run on
gcc 3.4.2 -O3 -ffast-math                             | AMD 3200+   | 32-bit   | 548 s        | Win XP/AMD 3200+ 2.0 GHz
gcc 3.4.2 -O3 -ffast-math                             | AMD 3200+   | 32-bit   | 451 s        | Win 7/Core i5 3.46 GHz
gcc 4.4.3 (no optimization)                           | Core i5     | 64-bit   | 420 s        | Win 7/Core i5 3.46 GHz
gcc 3.4.2 -O3 -ffast-math                             | Core i5     | 32-bit   | 254 s        | Win 7/Core i5 3.46 GHz
gcc 4.4.0 -O3 -ffast-math                             | Core i5     | 32-bit   | 115 s        | Win 7/Core i5 3.46 GHz
gcc 4.4.3 -O3 -ffast-math                             | Core i5     | 64-bit   | 90 s         | Win 7/Core i5 3.46 GHz
gcc 4.4.3 -O3 -ffast-math -march=native -mtune=native | Core i5     | 64-bit   | 90 s         | Win 7/Core i5 3.46 GHz

Benchmark #2, Beam-Wave Interaction Simulator
Compiler/Flags            | Compiled on | Exe Type | Exe Run Time | Run on
gcc 3.2 -O3 -ffast-math   | AMD 3200+   | 32-bit   | 79 s         | Win XP/AMD 3200+ 2.0 GHz
gcc 3.2 -O3 -ffast-math   | AMD 3200+   | 32-bit   | 40 s         | Win 7/Core i5 3.7 GHz*
gcc 3.4.2 -O3 -ffast-math | Core i5     | 32-bit   | 28.9 s       | Win 7/Core i5 3.7 GHz*
gcc 4.4.0 -O3 -ffast-math | Core i5     | 32-bit   | 29.0 s       | Win 7/Core i5 3.7 GHz*
gcc 4.4.3 -O3 -ffast-math | Core i5     | 64-bit   | 25.8 s       | Win 7/Core i5 3.7 GHz*
* 3.7 GHz is the clock speed in turbo-boost mode.

Benchmark #3, Info-Zip's Zip 3.1/Unzip 6.0
All versions were run on a Core i5-670 with turbo-boost (Windows 7).
Run times were to zip and unzip (-t) 10.4 GB of mostly JPEG files.
Compiler/Flags | Compiled on | Exe Type | Zip Run Time | Unzip Run Time
gcc 3.2 -O3    | AMD 3200+   | 32-bit   | 607 s        | 110 s
gcc 3.4.2 -O3  | Core i5     | 32-bit   | 569 s        | 111 s
gcc 4.4.0 -O3  | Core i5     | 32-bit   | 543 s        | 109 s
gcc 4.4.3 -O3  | Core i5     | 64-bit   | 513 s        | 110 s


Benchmark #4, Info-Zip's Zip 3.1/Unzip 6.0 using bzip2 compression
All versions were run on a Core i5-670 with turbo-boost (Windows 7).
Run times were to zip (-Z bzip2) and unzip (-t) 2.3 GB of mostly JPEG files.
(Note the strangely large zip run time for gcc 4.4.0 32-bit.)
Compiler/Flags | Compiled on | Exe Type | Zip Run Time | Unzip Run Time
gcc 3.2 -O3    | AMD 3200+   | 32-bit   | 448 s        | 200 s
gcc 3.4.2 -O3  | Core i5     | 32-bit   | 486 s        | 197 s
gcc 4.4.0 -O3  | Core i5     | 32-bit   | 711 s        | 217 s
gcc 4.4.3 -O3  | Core i5     | 64-bit   | 389 s        | 181 s


Starter Links and References
There are several good links to MinGW documentation at the MinGW site. Advanced users can quickly find answers to many technical questions by searching through the MinGW users mail archive. For introductions, try Colin Peters' Programming Win32 with GNU C and C++ page (local mirror here). A development environment for MinGW can be obtained from the folks at bloodshed.net.

Compile Flags
I do a lot of numeric (double precision floating point) programming, and my experience is that the -O3 -ffast-math compile flag combination consistently yields just about the best result. Be warned that -ffast-math takes some math shortcuts and does not follow all IEEE error-handling conventions, so if you use this flag, you should verify your results, especially if you need very high accuracy. My experience is that the speed boost is well worth it. I have seen on the MinGW users mail archive that -Os (optimize for small code size) can also yield the best results in some cases, perhaps because it allows the code to fit better into the CPU's L1 or L2 cache. I also prefer to use the -Wall flag, which turns on most of gcc's commonly useful warnings (despite the name, not every one; -Wextra adds more). This is an excellent practice.

Some Fast Math Functions

[Note 2011: My inline math functions are essentially not necessary anymore with recent versions of gcc, but I leave this page up out of historical interest. See Note 3 below.]

When MinGW's pow function became 10x slower in release 3.0 and caused some of my codes which used it heavily to become much slower, I started investigating ways to implement some faster math functions. I first patched the 3.0 pow() function to go back to how it was in 2.0, but then I decided to be more aggressive. The floating point unit in most modern Intel and AMD CPUs (e.g. Pentiums and Athlons) has many built-in transcendental functions such as sine, cosine, arc-tangent, etc. These built-ins are automatically used by the Microsoft C run-time library DLL which MinGW links to by default, but making calls to the DLL typically incurs significant overhead. You can use the header file here to in-line some of these functions for faster performance on Pentiums and Athlons. It requires use of the -ffast-math compile flag. I took some of the code from Chapter 14 (pp. 807-808) of the Art of Assembly Language link below. Note that the exp() and atan2() in-line versions are actually slower on a 64-bit Opteron compile (SuSE Linux 8.0). Also note that these in-line functions do not do any error checking or trapping of any kind.

NOTE! My in-line pow() function now returns correct results if the first argument is zero (Rev 1.01).

NOTE 2! GCC v4.0 will include a more complete set of fast math intrinsics for x87-compatible processors, including fsincos.

NOTE 3! (4-11-2010) I've noticed lately that the difference between my in-lines and the gcc 3.x/4.x defaults depends significantly on what arguments are sent to the functions. Sometimes mine are faster; sometimes the gcc defaults are faster. In general, with gcc 4.x, I've found that only my sincos in-line gives me any benefit over the gcc default on Core 2 processors, and it's not by much.

x87inline.h   |   x87test.c   |   Art of Assembly
In-line Assy How-To   |   In-line Assy Linux Docs   |   Gnu C In-line Assy docs

Results:    PIII   |   P4 Xeon   |   Opteron (32-bit)   |   Opteron (64-bit)


MinGW and Win32 DLLs
A very cool feature in MinGW is that you can put DLL files directly in the compile/link command (just like .o files) to be linked into your program without the need to create library stub files. For more about how to do that, try Colin Peters' DLL Page (local mirror here), which is part of his Win32 Programming Page (local mirror here). Colin Peters started MinGW.

Also, regarding function name mangling (decorating) in MinGW, try Wu YongWei's Page (local archived copy).

No Globbing
By default, if you run a MinGW-compiled command-line utility and pass it a wildcard argument such as *.c, it acts just like a Unix utility: the start-up code looks for every file ending in .c in your current directory and replaces the *.c argument with the names of those files, so your program never actually sees the *.c. To prevent this "globbing," put CRT_noglob.o (in the MinGW library directory) at the beginning of your link list when linking.

No Console Window
To make sure your application doesn't open a console window, use the -mwindows flag when linking.

Binary stdout
If you want the output from the stdout FILE stream to be binary (no translation of CR/LF chars):

  #ifdef __MINGW32__
  /* Required header files */
  #include <fcntl.h>  /* _O_BINARY */
  #include <io.h>     /* _setmode() */
  #endif
...
  #ifdef __MINGW32__
  /* Switch stdout to binary mode */
  _setmode(_fileno(stdout),_O_BINARY);
  #endif


Seeing all predefined macros
Here's how to see all predefined macros from gcc. Create a file test.c that has one line in it:

  int main(void) {}

Then compile with this command:

  gcc -dM -E test.c

Even easier (I got an e-mail tip on this):

  echo . | gcc -dM -E -

(These work with any gcc port.)

 
MinGW Links
MinGW64 Home
MinGW Home
Users Mail Archive
MinGW Bug List
Gnu C Home

This page last modified
Saturday, 19-Nov-2011 09:41:53 MST