Recent changes in git:
started porting from transform feedback to compute shaders with the calculation engine in a reusable library (lots of work still to do, so this isn't live yet)
323b505 rename symbols (all with mightymandel prefix, private with additional _ prefix)
373e9e7 remove unimplemented status
6475c03 reduce file count; build a shared library
938ae1e conditional compaction; debug output
f7f60ed first compute shader test
ba46c6d start of library API with test of threading
The primary issue with transform feedback is that the CPU has to wait for the GPU to finish before it can issue the next commands to the GPU. With compute shaders and conditional rendering the throughput should increase, at the cost of a little latency when checking whether rendering is done. Compute shaders are also more sane than vertex/geometry shaders when doing computations.
The library API will be quite simple: there'll be functions to create/destroy a mightymandel context, start a render, stop the render, wait until the render is done, get current raw iteration results (whether done or not, for progressive display), and clean up after rendering. The rendering runs in a separate thread so your main loop isn't stolen. This work will bump the version requirement to OpenGL 4.3, so the no-deep-zoom OpenGL 3.3 version will no longer be supported.
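To make the shape of that API a bit more concrete, here is a rough sketch. None of it is the real library: every function name and the struct layout are hypothetical (only the mightymandel prefix comes from the rename commit), the results are plain integer iteration counts, and a CPU escape-time loop in a POSIX thread stands in for the GPU compute work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context; the real library's layout is not published yet. */
typedef struct {
  int width, height, maxiters;
  int *iters;              /* shared results, guarded by lock; 0 = not escaped (yet) */
  pthread_mutex_t lock;
  pthread_t thread;
  atomic_int stop;         /* set to request early termination */
  int started;             /* nonzero while a thread still needs joining */
} mightymandel_context;

mightymandel_context *mightymandel_new(int width, int height) {
  assert(width > 0 && height > 0);
  mightymandel_context *ctx = calloc(1, sizeof *ctx);
  if (!ctx) return NULL;
  ctx->width = width;
  ctx->height = height;
  ctx->iters = calloc((size_t)width * height, sizeof *ctx->iters);
  if (!ctx->iters) { free(ctx); return NULL; }
  pthread_mutex_init(&ctx->lock, NULL);
  atomic_init(&ctx->stop, 0);
  return ctx;
}

/* Worker: CPU stand-in for the GPU; publishes one finished row at a time. */
static void *mightymandel_worker(void *arg) {
  mightymandel_context *ctx = arg;
  int *row = malloc((size_t)ctx->width * sizeof *row);
  if (!row) return NULL;
  for (int j = 0; j < ctx->height && !atomic_load(&ctx->stop); ++j) {
    for (int i = 0; i < ctx->width; ++i) {
      double cx = -2.0 + 3.0 * (i + 0.5) / ctx->width;  /* fixed view [-2,1] x [-1.5,1.5] */
      double cy = -1.5 + 3.0 * (j + 0.5) / ctx->height;
      double x = 0, y = 0;
      int n = 0;
      while (n < ctx->maxiters && x * x + y * y <= 4.0) {
        double t = x * x - y * y + cx;
        y = 2 * x * y + cy;
        x = t;
        ++n;
      }
      row[i] = n < ctx->maxiters ? n : 0;
    }
    pthread_mutex_lock(&ctx->lock);
    memcpy(ctx->iters + (size_t)j * ctx->width, row, (size_t)ctx->width * sizeof *row);
    pthread_mutex_unlock(&ctx->lock);
  }
  free(row);
  return NULL;
}

/* Start rendering in a separate thread; returns 0 on success, -1 on failure. */
int mightymandel_start(mightymandel_context *ctx, int maxiters) {
  assert(ctx && !ctx->started && maxiters > 0);
  ctx->maxiters = maxiters;
  atomic_store(&ctx->stop, 0);
  if (pthread_create(&ctx->thread, NULL, mightymandel_worker, ctx)) return -1;
  ctx->started = 1;
  return 0;
}

/* Block until the current render (if any) has finished. */
void mightymandel_wait(mightymandel_context *ctx) {
  assert(ctx);
  if (ctx->started) { pthread_join(ctx->thread, NULL); ctx->started = 0; }
}

/* Ask the render to finish early, then wait for the thread to exit. */
void mightymandel_stop(mightymandel_context *ctx) {
  assert(ctx);
  atomic_store(&ctx->stop, 1);
  mightymandel_wait(ctx);
}

/* Copy the current raw results (complete or not) into out[width * height]. */
void mightymandel_get_results(mightymandel_context *ctx, int *out) {
  assert(ctx && out);
  pthread_mutex_lock(&ctx->lock);
  memcpy(out, ctx->iters, (size_t)ctx->width * ctx->height * sizeof *out);
  pthread_mutex_unlock(&ctx->lock);
}

/* Clean up after rendering: reset the results; only valid while no render is running. */
void mightymandel_clear(mightymandel_context *ctx) {
  assert(ctx && !ctx->started);
  memset(ctx->iters, 0, (size_t)ctx->width * ctx->height * sizeof *ctx->iters);
}

void mightymandel_delete(mightymandel_context *ctx) {
  mightymandel_stop(ctx);
  pthread_mutex_destroy(&ctx->lock);
  free(ctx->iters);
  free(ctx);
}
```

A caller would create a context, call mightymandel_start(), poll mightymandel_get_results() from its own main loop for progressive display, and finish with mightymandel_wait() or mightymandel_stop(). Publishing whole rows under a mutex is just the simplest way to keep the sketch free of data races; the real thing may well do something quite different.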
changes to the existing engine
2936c26 accumulate error flags to improve reference point finding
2da20be new series approximation in fpxx_approx2; flag --order
f23d2b2 series approximation code generator
Accumulating error flags means that points that are glitched with multiple references are given higher priority for consideration as the next reference. This seems to reduce the total number of references needed.
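One way to picture the accumulation, as a minimal sketch rather than the actual mightymandel code: keep a per-pixel counter, add the glitch flag to it after each reference pass, and pick the still-glitched pixel with the highest count as the next reference. The function names, the array layout, and the "take the maximum" selection rule are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Add this pass's glitch flags (0 or 1 per pixel) into the running totals. */
void accumulate_glitches(unsigned *total, const unsigned char *glitched, size_t n) {
  assert(total && glitched);
  for (size_t i = 0; i < n; ++i) total[i] += glitched[i];
}

/* Return the index of the still-glitched pixel with the highest total,
   or n if no pixel is glitched. Ties go to the lowest index. */
size_t next_reference(const unsigned *total, const unsigned char *glitched, size_t n) {
  assert(total && glitched);
  size_t best = n;
  for (size_t i = 0; i < n; ++i)
    if (glitched[i] && (best == n || total[i] > total[best])) best = i;
  return best;
}
```

With a single pass every glitched pixel has the same count, so this degenerates to "pick any glitched pixel"; the priority only starts to matter once a pixel has survived several references.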
The series approximation code generator is written in Haskell, so you now need ghc to compile mightymandel. After updating the source from git, run
make -C src generate
before following the previous build instructions. Code is generated for various numbers of coefficients, the highest being 64. The number used can be selected at runtime with the --order N flag; I usually use 12 as a reasonable trade-off between series approximation calculation time and per-pixel calculation time. You also need --no-de. The default --order is 0, which selects the old non-parallel 3rd order method; use --help to see the available orders. The coefficient calculations use MPFR and are parallelized using OpenMP, which gives a speed boost if you have several cores. The eventual aim is to do the arbitrary precision maths on the GPU, using OpenGL compute shaders instead of MPFR+OpenMP (GLSL has uaddCarry() and usubBorrow(), which should make this less painful).
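To show why uaddCarry() helps, here is a small sketch of multi-limb addition written in C rather than GLSL. The helper uadd_carry() is a stand-in that mimics what GLSL's uaddCarry(x, y, carry) does (sum modulo 2^32, carry set to 1 on overflow); limbs_add() and the little-endian limb layout are assumptions for illustration, not code from mightymandel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C stand-in for GLSL uaddCarry(x, y, carry): sum modulo 2^32, carry = 1 on overflow. */
static uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry) {
  uint32_t s = x + y;   /* unsigned arithmetic wraps modulo 2^32 */
  *carry = s < x;       /* wrapped if and only if the sum is smaller than an operand */
  return s;
}

/* r = a + b for n-limb numbers, least significant limb first.
   Returns the carry out of the most significant limb. */
uint32_t limbs_add(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n) {
  assert(r && a && b);
  uint32_t carry = 0;
  for (size_t i = 0; i < n; ++i) {
    uint32_t c1, c2;
    uint32_t s = uadd_carry(a[i], b[i], &c1);
    r[i] = uadd_carry(s, carry, &c2);
    carry = c1 | c2;    /* at most one of the two additions can overflow */
  }
  return carry;
}
```

Subtraction with usubBorrow() follows the same pattern with a borrow in place of the carry.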
Here's a diagram showing how the 4th order series approximation coefficient recurrence is parallelized. Higher orders would be even more incomprehensible.
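For readers who want the recurrence itself, here is a hedged sketch in plain double precision. It assumes the usual perturbation step delta' = 2 Z delta + delta^2 + delta0 with delta written as a polynomial in delta0, which gives a'[k] = 2 Z a[k] + sum over i from 1 to k-1 of a[i] a[k-i], plus 1 when k = 1. For order 4 that is A' = 2ZA + 1, B' = 2ZB + A^2, C' = 2ZC + 2AB, D' = 2ZD + 2AC + B^2. The function names are made up, and the generated MPFR code surely schedules the work differently.

```c
#include <assert.h>
#include <complex.h>

/* One step of the coefficient recurrence at reference value Z.
   Index 0 is unused so that a[k] multiplies delta0^k; in and out must not alias.
   Each out[k] reads only old coefficients, so the k loop could run in parallel. */
void series_step(int order, double complex Z,
                 const double complex *in, double complex *out) {
  assert(order >= 1 && in != out);
  for (int k = 1; k <= order; ++k) {
    double complex s = 2 * Z * in[k] + (k == 1 ? 1 : 0);
    for (int i = 1; i < k; ++i) s += in[i] * in[k - i];
    out[k] = s;
  }
}

/* Evaluate the truncated series sum_{k=1..order} a[k] * d^k by Horner's rule. */
double complex series_eval(int order, const double complex *a, double complex d) {
  assert(order >= 1 && a);
  double complex s = 0;
  for (int k = order; k >= 1; --k) s = (s + a[k]) * d;
  return s;
}
```

Starting from all-zero coefficients at Z = 0 and stepping along the reference orbit Z' = Z^2 + c, series_eval() should track the directly iterated delta closely for small delta0 and few iterations.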