
I've spent the last week or so modifying the C++ code for my game to optimize the included headers with a tool called include-what-you-use. This uses the Clang compiler to analyze dependencies and suggests what files to add or remove. Unfortunately, while I hoped this process would improve compile times for my game, it actually made them about 50% worse.
Background
My adult visual novel and management sim Up There They Love is built using a custom engine split into two main parts: a C++ backend and an HTML/CSS/JavaScript frontend. The game executable uses QHttpServer to set up a web server and the Chromium Embedded Framework to loop back on itself and render the frontend.
Right now, I have 291 source files for the C++ backend, about half of which are generated with my Panini framework for code generation. Instead of figuring out all the dependencies for each file, I dumped all my dependencies in a precompiled header. But there are trade-offs with that approach.
Precompiled headers
Precompiled headers (PCH) are a compiler feature, not part of the C++ language itself, for managing your dependencies. You designate one header that the compiler processes once, and the saved result is then reused when compiling your source files. If done well, precompiled headers can significantly speed up your average compilation time.
The main downside to precompiled headers is that they reduce the effectiveness of incremental builds. In an incremental build, only the files affected by your changes are recompiled, and the previous output is reused for everything else. For most code changes, incremental building speeds up compilation time significantly. But when you change the precompiled header, or any header it includes, every source file that uses it has to be recompiled, so you effectively compile the entire project from scratch. On large projects, this can significantly increase iteration times.
Using precompiled headers also requires discipline from all team members to include only headers and constants that rarely, if ever, change. And as we all should know, relying on discipline is rarely a good policy. What tends to happen is that headers get included in the PCH "just in case" because it takes considerable time to figure out the actual dependencies for each source file.
But what if you could determine the necessary includes automatically? The include-what-you-use program could tell me the exact headers I needed for my project instead of relying on a precompiled header.
Setting up IWYU
The first step to using include-what-you-use is figuring out how to correctly set up the compiler flags. It took me a while to realize that most of the arguments the program accepts are flags for Clang because this isn't spelled out in the documentation. Luckily, if you use CMake, you can set CMAKE_CXX_INCLUDE_WHAT_YOU_USE, and CMake will run the program alongside your compiler with the same flags for each source file:
cmake -DCMAKE_CXX_COMPILER="%VCINSTALLDIR%/bin/cl.exe" -DCMAKE_CXX_INCLUDE_WHAT_YOU_USE=include-what-you-use -G Ninja ...
Unfortunately for me, I don't use CMake. Instead, I generate the Visual Studio project files by hand because I prefer the control, and I'm a masochist like that. I'll spare you the boring details about figuring out the exact compile flags I needed for IWYU, but suffice it to say that I wrote a Python script that ran the program on each of my source files. Here are the flags that worked for me:
include-what-you-use.exe
-Xiwyu --verbose=3 -Xiwyu --comment_style=short -Xiwyu --max_line_length=100 -Xiwyu --pch_in_code
-std=c++17 -fmsc-version=1936 -w -Wno-invalid-token-paste
-march=x86-64 -fms-compatibility -fms-extensions -fdelayed-template-parsing
-Xiwyu --mapping_file=.\properties/msvc.imp
-Xiwyu --mapping_file=.\properties/sssg.imp
-Xiwyu --mapping_file=.\source/server/_Generated/_generated.imp
-DWIN32 -D_WIN32 -D_WINDOWS -D_CRT_SECURE_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE
-DWIN32_LEAN_AND_MEAN -DVC_EXTRALEAN -DUNICODE -DNOMINMAX -D_MSVC_LANG=201703 -DNDEBUG -DQT_NO_DEBUG
-DQT_DLL -DQT_CORE_LIB -DQT_NETWORK_LIB -DUSE_CEF
-isystemC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\include
-isystem..\parts/chromium-embedded
-isystem..\parts/fmod-2.02.07/inc
-isystem..\parts\grantlee-master/include
-isystem..\parts/json-package/include
-isystem..\parts\panini-master/include
-isystem..\parts\qhttp-master/src
-isystemD:\SDK\Qt5.13\5.13.0\msvc2017_64/include
-isystem..\parts/sdl-package/include
-isystem..\parts\sdl-image-master
--include-directory=./dependencies/QtForward
--include-directory=../generated
--include-directory=./source/InkWrapper
--include-directory=./source/server
D:\Projects\SSSG\source/server\Views\Chromium\ChromiumViewService.cpp
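A minimal sketch of how a script could assemble and run that command for every source file. This is not the actual script: the flag lists are shortened stand-ins for the listing above, and the function names are made up for illustration. Passing the arguments as a list rather than one string sidesteps quoting problems with paths like "Program Files (x86)".

```python
import subprocess
from pathlib import Path

# Hypothetical, shortened flag lists; the full set is in the listing above.
IWYU_OPTIONS = ["--verbose=3", "--comment_style=short", "--max_line_length=100"]
CLANG_FLAGS = ["-std=c++17", "-fms-compatibility", "-fms-extensions", "-DNOMINMAX"]


def build_command(source, iwyu="include-what-you-use"):
    """Return the argument list for one IWYU run on `source`."""
    command = [iwyu]
    for option in IWYU_OPTIONS:
        # Each of IWYU's own options has to be prefixed with -Xiwyu;
        # everything else is handed to Clang.
        command += ["-Xiwyu", option]
    command += CLANG_FLAGS
    command.append(str(source))
    return command


def run_iwyu(source_root):
    """Run IWYU on every .cpp file under `source_root`; return {path: report}."""
    reports = {}
    for source in sorted(Path(source_root).rglob("*.cpp")):
        result = subprocess.run(build_command(source), capture_output=True, text=True)
        # IWYU prints its suggestions to stderr, not stdout.
        reports[source] = result.stderr
    return reports
```

Calling `run_iwyu("source/server")` would then return one text report per file, ready to be printed or filtered.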
With the Python script, I could run IWYU on my entire codebase and get a nice to-do list of errors and warnings to fix. I even went so far as to include a "watch mode" to automatically detect changes in the source folder. This watch mode would then run the files through the IWYU program again to see if it made things better or worse.
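One simple way to build a watch mode like this is to poll file modification times. The sketch below assumes that approach; the post doesn't say how the real script detects changes, and the names `snapshot`, `changed_files`, and `watch` are hypothetical.

```python
import time
from pathlib import Path


def snapshot(source_root):
    """Map each .cpp/.hpp file under `source_root` to its last-modified time."""
    return {
        path: path.stat().st_mtime
        for path in Path(source_root).rglob("*")
        if path.suffix in {".cpp", ".hpp"}
    }


def changed_files(before, after):
    """Return files that are new or modified in `after` compared to `before`."""
    return sorted(path for path, mtime in after.items() if before.get(path) != mtime)


def watch(source_root, on_change, interval=1.0):
    """Poll `source_root` forever, calling `on_change(path)` for each changed file."""
    before = snapshot(source_root)
    while True:
        time.sleep(interval)
        after = snapshot(source_root)
        for path in changed_files(before, after):
            on_change(path)
        before = after
```

Here `on_change` would be whatever reruns IWYU on a single file. Polling is crude compared to OS file notifications, but it needs nothing outside the standard library.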
A practical example
Here's what the includes for my ChromiumViewService.cpp looked like when I used a precompiled header:
#include "PCH.hpp"
#include "ChromiumViewService.hpp"
#include <include/cef_parser.h>
#include "Services/ServiceLocator.hpp"
#include "Views/Chromium/ScriptingService.hpp"
#include "Game.hpp"
#include "MapSDLScanCodeToVirtualKey.hpp"
#include "StringHelpers.hpp"
And after I had "optimized" them with include-what-you-use:
#include "ChromiumViewService.hpp"
#include <corecrt_wstring.h> // for wcslen
#include <include/base/cef_ref_counted.h> // for scoped_refptr
#include <include/cef_browser.h> // for CefBrowser, CefBrowserHost
#include <include/cef_command_line.h> // for CefCommandLine
#include <include/cef_frame.h> // for CefFrame
#include <include/cef_parser.h> // for CefBase64Encode, CefURIEncode
#include <include/cef_request_context.h> // for CefRequestContext
#include <include/cef_values.h> // for CefDictionaryValue
#include <include/cef_version.h> // for CEF_VERSION
#include <ostream> // for char_traits, strlen, operator<<, string, size_t, basic_ostream, memcpy, operator+, log, stringstream
#include <string> // for operator<<
#include <QtCore/qbytearray.h> // for QByteArray
#include <QtCore/qdir.h> // for QDir
#include <QtCore/qfile.h> // for QFile
#include <QtCore/qglobal.h> // for qMin
#include <QtCore/qmutex.h> // for QMutex, QMutexLocker
#include <QtCore/qstandardpaths.h> // for QStandardPaths
#include <QtCore/qurl.h> // for QUrl
#include <SDL_blendmode.h> // for SDL_BLENDMODE_BLEND
#include <SDL_error.h> // for SDL_GetError
#include <SDL_filesystem.h> // for SDL_GetBasePath
#include <SDL_keycode.h> // for KMOD_RALT, KMOD_RCTRL, KMOD_RSHIFT, KMOD_CAPS, KMOD_LALT, KMOD_LCTRL, KMOD_LSHIFT, KMOD_NUM
#include <SDL_mouse.h> // for SDL_CreateSystemCursor, SDL_GetDefaultCursor, SDL_SetCursor, SDL_SYSTEM_CURSOR_CROSSHAIR, SDL_SYSTEM_CURSOR_HAND, SDL_Cursor, SDL_SYSTEM_CURSOR_ARROW, SDL_SYSTEM_CURSOR_IBEAM, SDL_SYSTEM_CURSOR_NO, SDL_SYSTEM_CURSOR_SIZEALL, SDL_SYSTEM_CURSOR_SIZENESW, SDL_SYSTEM_CURSOR_SIZENS, SDL_SYSTEM_CURSOR_SIZENWSE, SDL_SYSTEM_CURSOR_SIZEWE, SDL_SYSTEM_CURSOR_WAIT
#include <SDL_pixels.h> // for SDL_PIXELFORMAT_ARGB8888
#include "Logging.hpp" // for Logging, SSSG_LOG_ERROR, SSSG_LOG_TRACE, SSSG_LOG_INFO
#include "MapSDLScanCodeToVirtualKey.hpp" // for mapSDLScanCodeToVirtualKey
#include "Platform.hpp" // for IsBuildShipping, SSSG_BUILD_DEBUG, SSSG_BUILD_SHIPPING
#include "Services/ServiceLocator.hpp" // for ServiceLocator
#include "StringHelpers.hpp" // for stringToString
#include "Views/Chromium/ScriptingService.hpp" // for ScriptingService
#include "_Generated/Constants.hpp" // for WINDOW_SCROLL_MULTIPLIER
The comments after each include are generated automatically by IWYU to help keep things organized. Getting the includes right can be very fiddly. At one point, I had a clean report from the IWYU script on my local machine, but the same changes broke on my build machine. That was because my local machine used headers from Visual Studio 2022 while the build machine was still on VS 2017.
Results
The whole optimization process took about a week to complete. Luckily, this was all unpaid labor on a hobby project; otherwise, this next part would be extremely frustrating.
After getting a clean bill of health from my build machine, I decided to check how much faster my game compiles now. My idea was to focus on the build performance of two targets, Debug and Shipping. The former has all compiler optimizations disabled, while the latter enables every optimization possible and even strips out the internal debugging tools. The Shipping target gives the compiler more to do than the Debug target and should always be slower to complete.
I'm using Visual Studio 2022 to compile my game. To get clean measurements, I selected "Project Only" -> "Rebuild Only This Project". This option cleans the project output before compiling, which ensures that build shortcuts like incremental builds don't skew the samples. I took ten samples for each target and calculated their average and median values. I then reran all these tests with the precompiled header option completely disabled to see if that made a difference. And finally, I rolled back all my changes to before I started applying suggestions from IWYU and ran the same tests for a third time.
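Summarizing a set of timing samples takes only a few lines with Python's standard library. The helper name and the numbers in the test are made up for illustration; they are not measurements from this project.

```python
from statistics import mean, median


def summarize(samples):
    """Return the average and median of a list of build times in seconds."""
    return {"average": mean(samples), "median": median(samples)}
```

Reporting both values is useful because a single slow outlier (say, a virus scan kicking in mid-build) pulls the average up while barely moving the median.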
Here are the results:
For my Debug target, the IWYU-optimized version compiles 38% slower on average and nearly 48% slower when I disable precompiled headers entirely. It's even worse for my Shipping target. The IWYU-optimized version compiles 45% slower with PCH enabled and 48% slower when I disable precompiled headers entirely.
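For reference, a figure like "38% slower" is usually computed relative to the baseline build time. The sketch below assumes that convention, with the pre-IWYU build as the baseline; the function name and the values in the test are hypothetical.

```python
def percent_slower(baseline_seconds, candidate_seconds):
    """Return how much slower the candidate is, as a percentage of the baseline."""
    return (candidate_seconds - baseline_seconds) / baseline_seconds * 100.0
```

Under this definition, a build that goes from 80 seconds to 120 seconds is 50% slower, and a negative result means the candidate is faster.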
Conclusion
What, exactly, is the point of "optimizing" your source files to only "include what you use" if it makes compile times 50% worse on average? It's frustrating that nobody seems to have put this central assumption of IWYU to the test. But I don't think it's all bad news.
IWYU at least forced me to compile my game with something that isn't the Visual Studio compiler. Clang has already found some errors in my code that MSVC++ happily accepted. This makes my code more resilient, and fixing these small issues will make porting the game to other platforms much easier.
Still, it would have been nice to know beforehand that I should have just stuck with a precompiled header. I won't undo all my work now, but I hope my tale can prevent someone else from making the same mistake. ✌
