Writing simulations of ever larger virtual flocks is a minor obsession I have returned to a handful of times over the years. My last serious attempt at cramming in as many particles as possible led me to a) spatial algorithms, and b) Rust.
In such a flocking simulator each agent reacts to the position and velocity of its neighbouring flock mates. This generates emergent (and so very pretty) flocking patterns that swirl and undulate.
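For anyone unfamiliar with how each agent "reacts" to its neighbours: the classic Reynolds-style boids rules are cohesion, alignment, and separation. Here is a rough sketch in Rust; the rule weights are made-up illustrative values, not the ones from my simulation:

```rust
// A minimal 2D boid with the three classic steering rules.
// All constants here are illustrative, not tuned values.

#[derive(Clone, Copy, Debug)]
struct Boid {
    pos: (f32, f32),
    vel: (f32, f32),
}

/// Compute a new velocity for `boid` from its neighbours.
fn steer(boid: Boid, neighbours: &[Boid]) -> (f32, f32) {
    if neighbours.is_empty() {
        return boid.vel;
    }
    let n = neighbours.len() as f32;
    let (mut cx, mut cy) = (0.0, 0.0); // centre of mass (cohesion)
    let (mut ax, mut ay) = (0.0, 0.0); // average velocity (alignment)
    let (mut sx, mut sy) = (0.0, 0.0); // push away (separation)
    for nb in neighbours {
        cx += nb.pos.0;
        cy += nb.pos.1;
        ax += nb.vel.0;
        ay += nb.vel.1;
        sx += boid.pos.0 - nb.pos.0;
        sy += boid.pos.1 - nb.pos.1;
    }
    // Illustrative weights for each rule.
    let (w_coh, w_ali, w_sep) = (0.01, 0.05, 0.05);
    (
        boid.vel.0
            + w_coh * (cx / n - boid.pos.0)
            + w_ali * (ax / n - boid.vel.0)
            + w_sep * sx,
        boid.vel.1
            + w_coh * (cy / n - boid.pos.1)
            + w_ali * (ay / n - boid.vel.1)
            + w_sep * sy,
    )
}
```

The expensive part is finding `neighbours`, which is why the lookup structure below matters so much.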
To make nearest neighbour lookup fast, it turns out you can eschew fancy recursive trees in favour of a very mechanically sympathetic data structure: a flat, contiguous array of values that provides quick but approximate results.1
Here is what 100,000 agents (aka boids) looks like. The video compression is kind of meh, but hopefully you get the idea.
Possibly in the grip of extreme hubris, I think it could be fun to have a go at a simulation that can do a cool 1 million boids at 60 FPS. Any and all suggestions welcome as to how to do this. Calculation of each update is "embarrassingly parallel", so I have the vague idea of using SIMD somehow....
1) The paper I adapted the data structure from is titled "Simple and efficient approximate nearest neighbor search using spatial sorting" and was published at the 2015 28th SIBGRAPI Conference on Graphics, Patterns and Images.
Essentially you place each agent in a 2D grid, where the arrangement within the grid roughly resembles the arrangement of boids in the 2D simulation space.
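In code terms, the grid can just be a flat array indexed as `row * width + col`, so "cells near each other in space" become "indices near each other in the array". Here is a hedged sketch of an approximate neighbour lookup over such a layout; the grid dimensions and window radius are illustrative, and the real paper's scheme differs in details:

```rust
// Sketch: boids live in a flat Vec laid out as a W×H grid.
// Index (row * W + col) roughly mirrors each boid's position in
// simulation space, so spatial neighbours sit at nearby indices.
// W, H, and the window radius are illustrative choices.

const W: usize = 4; // grid columns (illustrative)
const H: usize = 4; // grid rows (illustrative)

/// Approximate neighbour lookup: collect the flat-array indices in a
/// small window of grid cells around (row, col), clamped to the grid.
fn neighbour_indices(row: usize, col: usize, radius: usize) -> Vec<usize> {
    let r0 = row.saturating_sub(radius);
    let r1 = (row + radius).min(H - 1);
    let c0 = col.saturating_sub(radius);
    let c1 = (col + radius).min(W - 1);
    let mut out = Vec::new();
    for r in r0..=r1 {
        for c in c0..=c1 {
            out.push(r * W + c);
        }
    }
    out
}
```

Because the window is a fixed-size scan over contiguous memory, this is very cache-friendly compared to chasing pointers through a tree, which is what "mechanically sympathetic" is getting at.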
From one frame to the next, the accuracy of the grid only degrades gradually, so it can be efficiently re-sorted using an adaptive sorting algorithm, one that runs well on data that is already nearly sorted. Even better, you can dedicate a fixed number of sorting passes per frame, and as long as you do enough sorting on average, the NN lookups will stay good enough.
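One way to picture the "budgeted passes" idea is a bubble-style exchange pass that you simply cap per frame. This is a simplified sketch, sorting bare x-coordinates rather than full agents, and it is not necessarily the exact pass the paper uses:

```rust
// Sketch of bounded adaptive sorting: a single left-to-right sweep
// that swaps adjacent out-of-order values. On nearly-sorted data a
// handful of these passes per frame keeps the order "good enough".
// Keys here are plain f32 x-coordinates for illustration.

/// One exchange pass; returns true if anything moved.
fn sort_pass(xs: &mut [f32]) -> bool {
    let mut swapped = false;
    for i in 1..xs.len() {
        if xs[i - 1] > xs[i] {
            xs.swap(i - 1, i);
            swapped = true;
        }
    }
    swapped
}

/// Spend at most `budget` passes this frame.
fn resort(xs: &mut [f32], budget: usize) {
    for _ in 0..budget {
        if !sort_pass(xs) {
            break; // already sorted, stop early
        }
    }
}
```

The nice property is that the cost per frame is bounded and predictable, and any disorder you fail to fix this frame just gets mopped up over the next few frames.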
(If anyone is truly interested I could do a more detailed writeup of this with diagrams and stuff.)