parallel compiles - parallelizing the software build/compile process



for so long everybody's been doing builds single-threaded and very few people have thought of parallelizing their compiles to save build time (which is usually quite long). old unix build tools like make don't help any in this regard unless

  • you are on a *NIX box
  • you can use & to background the command
  • you hand-craft the makefile and skip autoconf and automake and ./configure, possibly ignoring other kinds of systems.

most of us have computers with multiple threads or cores and lots of RAM and sparing 2 threads/cores isn't much of a problem.

it's not every day I improve someone's build processes. However, I feel like this single-threaded build method has gone on long enough, and I am about to even change my own build code. I need the speed on my old box. even on a new box, I would still like the speed and I want to use my cpu to its potential. why waste cycles?

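Who knows, perhaps SOMEONE will come up with a program called pmake (parallel make), pautoconf, and pautomake. or maybe even just enhance make to work in parallel (GNU make's -j option already runs independent targets in parallel, if the makefile's dependencies allow it).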

building for multiple cpu's is simple enough for anyone with knowledge of basic parallel computing. well, maybe that's not so simple for anyone who is just starting out with computers...


the compile.cmd you see below does not download well as a .cmd file - it just views as a text file like you see below. so I have provided it as a zip file, so if you want to modify it, you can.

Download Now - source and executable (7/2/2012, <1KB)


to get started, see the batch file package I have listed there. you can configure gw2.cmd to your liking and use it to compile your stuff; it adds a manifest to your applications for windows vista/7/8 compatibility. it generates both 32 and 64-bit exe's in one swath, but sequentially right now. I intend to make it compile for x86 and x64 in parallel. it currently keeps the 2 targets apart by using separate directories, 32\ and 64\.

I thought about parallelizing the compilation process. it would not be hard, but with no blocking, and when compiling many exe's or many separate obj files, you would have a zillion compilers consuming all your RAM and CPU (not to mention the hard disk would be thrashing...). blocking is simply waiting for a file that the job creates when its build process is over (not the exe, which is written gradually), so you can block on that. it sort of acts like the win32 call WaitForMultipleObjects() on some mutexes, because that's basically what we are reproducing, except with files. and because we don't have mutexes, we have to use 4 files as semaphores (failure and done for each of the 2 targets) while trying to avoid deadlock conditions.
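To make the file-based blocking concrete, here is a minimal Python sketch (not the batch file itself). The marker names failure and done match the batch file; the function names, the polling interval, and the timeout are assumptions added for illustration. The timeout is one way to avoid waiting forever if a job dies without leaving a marker.

```python
import os
import tempfile
import threading
import time


def wait_for_jobs(job_dirs, poll_seconds=0.05, timeout_seconds=10.0):
    """Block until any job reports failure or every job reports done.

    Returns "failure", "done", or "timeout".
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if any(os.path.exists(os.path.join(d, "failure")) for d in job_dirs):
            return "failure"
        if all(os.path.exists(os.path.join(d, "done")) for d in job_dirs):
            return "done"
        time.sleep(poll_seconds)
    return "timeout"


def touch(path):
    """Create an empty marker file."""
    with open(path, "w"):
        pass


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    dirs = [os.path.join(root, name) for name in ("32", "64")]
    for d in dirs:
        os.mkdir(d)
    # Simulate two jobs that finish a little later, on background timers.
    for d in dirs:
        threading.Timer(0.2, touch, args=(os.path.join(d, "done"),)).start()
    print(wait_for_jobs(dirs))
```

Polling is cruder than a real wait on kernel objects, but it only needs a file system, which is why the same idea also works from a plain batch file.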

unless you do it in a similar fashion to what I will code or describe, it would hang if the compile failed, so checking the errorlevel (checking for failure) on the compiler would be critical. make does this automatically and bails if there is a failure by default.
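The exit-status check can be sketched in Python as follows. This is an illustration only: run_compiler and the .err path are hypothetical names, and the "compiler" in the demo is a Python one-liner standing in for g++ so the sketch runs anywhere.

```python
import os
import subprocess
import sys
import tempfile


def run_compiler(command, err_path):
    """Run one compiler command, save its output to err_path, return True on success."""
    with open(err_path, "w") as err_file:
        # stderr is merged into stdout so error messages land in the same file.
        code = subprocess.call(command, stdout=err_file, stderr=subprocess.STDOUT)
    # A non-zero exit status is the equivalent of "if errorlevel 1" in a batch file.
    return code == 0


if __name__ == "__main__":
    # Stand-in for a failing compiler: prints a message to stderr and exits with status 1.
    fake_compiler = [sys.executable, "-c", "import sys; sys.exit('error: build failed')"]
    err_path = os.path.join(tempfile.mkdtemp(), "job.err")
    if not run_compiler(fake_compiler, err_path):
        print("compile failed, see", err_path)
```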

well, I think I just solved part of the parallel problem. :-) you use START to fire off the jobs in the cmd shell using an extra parameter in the cmd shell to tell the batch file that this is a 32-bit or 64-bit compile job, and not just a compiler argument list.

what to parallelize

there are many ways to parallel compile depending on the kinds of builds you need:

  • parallelizing compiles for multiple processor targets as I have done here. (arm, atom, x86, x64)
  • parallelizing compiles for multiple platforms (mac, windows) if you have a compiler which affords this (such as embarcadero).
  • parallelizing the various projects you are building, if there are few enough of them. I wouldn't overload the system though. too many can cause the hard disk to thrash, peak the CPU, and max out your RAM and virtual memory or swap, and cause the compilers to crash/SEGFAULT or something similar (or just quit for "apparently no reason"). I would be careful with this idea. if you are a developer like me, your projects tend to grow in number.
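One way to keep many projects from overloading the machine is to cap how many compilers run at once. The following Python sketch assumes a hypothetical build_all helper and a default cap of 2 jobs; the commands in the demo are stand-ins for real compiler invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_all(commands, max_jobs=2):
    """Run every command, at most max_jobs at a time; return exit codes in input order."""
    def run(command):
        return subprocess.call(command)

    # Each worker thread just waits on one child process, so the pool size
    # is the limit on how many compilers are alive at once.
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    # Stand-ins for four project builds; the exit codes are chosen for the demo.
    fake_builds = [
        [sys.executable, "-c", "import sys; sys.exit(%d)" % code]
        for code in (0, 1, 0, 2)
    ]
    print(build_all(fake_builds, max_jobs=2))
```

Raising max_jobs trades memory and disk pressure for speed, so a value near the number of cores you can spare is a reasonable starting point.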


here is the rundown of the basic algorithm of the batch file:

main process:
	clear all target jobs' directories (make clean)
	fire off 1 background/child process for each target processor for the job
	wait until any compiler failure or all jobs have ended

target processor job (one of these for each target proc):
	compile job
	indicate compiler failure if any in target job's directory by making a file exist
	indicate compiler job has ended in target job's directory by making a file exist
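The rundown above can be sketched end to end in Python. This is an illustration under assumptions, not the batch file: build and JOB_SOURCE are hypothetical names, each target job is a small Python child process, and the compiler commands in the demo are stand-ins. Unlike the rundown, the sketch also stops if every child has exited without leaving markers, and it waits for the remaining children before returning.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import time

# Target job: run the compiler command, then leave marker files in the job directory.
JOB_SOURCE = """
import os, subprocess, sys
job_dir, command = sys.argv[1], sys.argv[2:]
if subprocess.call(command) != 0:
    open(os.path.join(job_dir, "failure"), "w").close()
open(os.path.join(job_dir, "done"), "w").close()
"""


def build(targets, root, poll_seconds=0.05):
    """targets maps a directory name (e.g. "32") to that target's compiler command.

    Returns True if every job finished without a failure marker.
    """
    job_dirs = [os.path.join(root, name) for name in targets]
    for job_dir in job_dirs:  # clear all target directories (make clean)
        shutil.rmtree(job_dir, ignore_errors=True)
        os.makedirs(job_dir)
    children = [  # fire off one child process per target
        subprocess.Popen([sys.executable, "-c", JOB_SOURCE, job_dir] + list(command))
        for job_dir, command in zip(job_dirs, targets.values())
    ]
    while True:  # wait until any failure or all jobs have ended
        exited = all(child.poll() is not None for child in children)
        done = all(os.path.exists(os.path.join(d, "done")) for d in job_dirs)
        # Checked after "done": a job writes failure before done, so this sees it.
        failed = any(os.path.exists(os.path.join(d, "failure")) for d in job_dirs)
        if failed or done or exited:
            break
        time.sleep(poll_seconds)
    for child in children:
        child.wait()
    return done and not failed


if __name__ == "__main__":
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(build({"32": ok, "64": bad}, tempfile.mkdtemp()))
```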

compiler switches for auto-threading/auto-parallelization


gcc switches for auto-threading (however, it's got a fixed number of threads, maybe OpenMP is the thing to use instead, still uses OpenMP/libgomp-1):

gcc switches: -Wall -Wextra -v -save-temps -fopenmp -lgomp -ftree-parallelize-loops=0 -floop-parallelize-all -ftree-slp-vectorize -O2 -fprefetch-loop-arrays -floop-nest-optimize

PLEASE NOTE that you CANNOT currently use anything other than -O2 with -ftree-parallelize-loops. 11/8/2014


with MSVC++ the switches to auto-parallelize your code are: /Qpar /Qpar-report:1 /volatile:ms /Qvec-report:1


use LINQ in .Net to do auto-parallelization.


not supported I think.

code (.cmd batch file)

@echo off
rem ----- name this file compile.cmd
rem ----- this is meant to work with the 20110812 build of the 32 and 64-bit windows zip files of the mingw-w64 compiler's auto builds.
set gpp32=c:\mingw-w32-bin_i686-mingw_20110812\bin\i686-w64-mingw32-g++.exe
set gpp64=c:\mingw-w64-bin_i686-mingw_20110812\bin\x86_64-w64-mingw32-g++.exe
if /i @%1@==@--32@ goto x32
if /i @%1@==@--64@ goto x64

rem --------------- main process section
rem ----- clear the target directories (make clean)
mkdir 32\ 64\ 2>nul
del /q 32\* 64\* 2>nul
rem ----- %~n1 is the first argument without path or extension, e.g. hello for hello.cpp
set outfile=%~n1
rem ----- start the 32-bit and 64-bit compile processes. the extra first argument tells the child which job it is.
start compile.cmd --32 %1 %2 %3 %4 %5 %6 %7 %8 %9
start compile.cmd --64 %1 %2 %3 %4 %5 %6 %7 %8 %9

rem ----- block until any failure or all are done. warnings from -W -Wall do not count as failures unless you add -Werror.
:lupe
if exist 32\failure goto end
if exist 64\failure goto end
if exist 32\done if exist 64\done goto end
rem ----- wait a second between checks (timeout needs windows vista or later)
timeout /t 1 /nobreak >nul
goto lupe

rem --------------- child process section: 64-bit
:x64
rem ----- get rid of the --64 argument by shifting the commandline arguments by 1
shift
set outfile=%~n1
rem ----- 2>&1 puts the compiler's error messages (stderr) into the .err file too
%gpp64% -W -Wall -o 64\%outfile%.exe %1 %2 %3 %4 %5 %6 %7 %8 %9 > 64\%outfile%.err 2>&1
rem ----- you can further expand this beyond just the %9
rem ----- if you are smart about it using variables... commandline is 8192 characters long
rem ----- indicate compiler error with file existence of 64\failure
if errorlevel 1 echo. > 64\failure
rem ----- create file for blocking for main process.
echo. > 64\done
rem ----- close the window that START opened for this child
exit

rem --------------- child process section: 32-bit
:x32
rem ----- get rid of the --32 argument by shifting the commandline arguments by 1
shift
set outfile=%~n1
rem ----- 2>&1 puts the compiler's error messages (stderr) into the .err file too
%gpp32% -W -Wall -o 32\%outfile%.exe %1 %2 %3 %4 %5 %6 %7 %8 %9 > 32\%outfile%.err 2>&1
rem ----- you can further expand this beyond just the %9
rem ----- if you are smart about it using variables... commandline is 8192 characters long
rem ----- indicate compiler error with file existence of 32\failure
if errorlevel 1 echo. > 32\failure
rem ----- create file for blocking for main process.
echo. > 32\done
rem ----- close the window that START opened for this child
exit

rem --------------- main process: show the compiler output of both jobs
:end
type 32\%outfile%.err 64\%outfile%.err