Thanks for that, you've made some amazing improvements on it: it is about 10 times faster than mine!
And indeed much faster than using long double!
SIMD, no C++ operations, normalization only when needed, assembler, etc?
At what location and zoom level did you measure the 10x difference? MM uses automatic scale reduction whenever possible, and this makes comparisons a little trickier. The basic idea is the following: when calculating a perturbed orbit, the magnitude of its absolute value grows steadily (with some small jumps up and down and some bigger jumps downwards) from |delta_0| to the bailout value. The rate of the growth also increases as we approach the bailout.
In the classical algorithm without SA, if |delta_0| < 10^-308, long double must be used to avoid underflow. But if SA can skip N iterations where N is large enough (more than 90% of the length of the reference orbit), the approximated value will likely be in the range of doubles: |delta_N| > 10^-308. If "Ignore small addends" is turned on, delta_0 is no longer added to the calculated delta_i (since delta_0 is tens or hundreds of orders of magnitude smaller), and all the iterations can be calculated using doubles*. This also enables the use of SIMD codepaths, resulting in a further 2x-8x speedup.
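In sketch form it looks like this (the perturbed step is the standard delta_(n+1) = 2*Z_n*delta_n + delta_n^2 + delta_0; the names and the signature are illustrative only, not the actual implementation):

[code]
#include <complex>
#include <cstddef>
#include <vector>

// Continue iterating after SA has skipped N iterations. With
// "Ignore small addends" on, delta_0 is dropped from the step:
// when |delta_0| is tens of orders of magnitude below |delta_n|,
// adding it cannot change a double anyway, and the loop stays in
// plain, SIMD-friendly double arithmetic.
int continueIteration(const std::vector<std::complex<double>>& Z, // reference orbit
                      std::complex<double> delta0,    // pixel offset from the reference
                      std::complex<double> deltaN,    // SA result after N skipped iterations
                      std::size_t N, double bailout2, // bailout radius squared
                      bool ignoreSmallAddends)
{
    std::complex<double> d = deltaN;
    for (std::size_t n = N; n + 1 < Z.size(); ++n) {
        d = 2.0 * Z[n] * d + d * d;          // perturbed step
        if (!ignoreSmallAddends)
            d += delta0;                     // skipped when provably negligible
        if (std::norm(Z[n + 1] + d) > bailout2)
            return static_cast<int>(n + 1);  // escaped at this iteration
    }
    return -1;                               // no escape within the reference orbit
}
[/code]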
* There is one more thing to check: the previously mentioned bigger downward jumps in log|delta_i| are caused by sudden drops in the magnitude of the reference orbit (|Z_m|). So we have to be sure that minValue = |delta_N| * min_(m>N) |Z_m| > 10^-308.

This method can also be taken one step further: if minValue > 10^-38, the remaining iterations can be calculated using floats instead of doubles, increasing the speed further by up to 2x when using SIMD. (This is not yet enabled in the current beta.) It also works in the opposite direction: if minValue is between 10^-600 and 10^-308, scaled doubles can be used, which are a little slower than ordinary doubles but much faster than long doubles.
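The selection could be sketched like this (the thresholds are the approximate ones above, near FLT_MIN and DBL_MIN; the names are illustrative, and the decision is done in log10 so nothing underflows while we decide):

[code]
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <limits>
#include <vector>

enum class Precision { Float, Double, ScaledDouble, LongDouble };

// Pick the cheapest type whose range still holds the remaining
// iterations. log10AbsDeltaN comes from the (high-precision) SA result.
Precision choosePrecision(double log10AbsDeltaN,
                          const std::vector<std::complex<double>>& Z,
                          std::size_t N)
{
    // min over m > N of |Z_m|: the worst sudden drop of the reference orbit
    double minZ = std::numeric_limits<double>::infinity();
    for (std::size_t m = N + 1; m < Z.size(); ++m)
        minZ = std::min(minZ, std::abs(Z[m]));

    const double log10MinValue = log10AbsDeltaN + std::log10(minZ);

    if (log10MinValue > -38.0)  return Precision::Float;        // ~FLT_MIN
    if (log10MinValue > -308.0) return Precision::Double;       // ~DBL_MIN
    if (log10MinValue > -600.0) return Precision::ScaledDouble; // mantissa + exponent pair
    return Precision::LongDouble;                               // 80-bit fallback
}
[/code]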
I found some issues with the location that was given by simon.snake, zoomed further a little bit to 1448.
The approximation breaks and it gets very slow when it is changed; fewer terms make it a little better, but the image is distorted unless the approximation is turned off completely.
I will have to check this. Maybe the normalization has to be done more frequently.
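To illustrate what that normalization involves: the scaled doubles mentioned above can be represented as a double mantissa plus a separate integer exponent, renormalized only when the mantissa drifts out of a safe band. A minimal sketch, assuming that representation (illustrative only, not the actual implementation):

[code]
#include <cmath>

// A "scaled double": the value is mant * 2^exp, with the exponent in
// a separate int so magnitudes far below DBL_MIN survive. mant is
// kept within [2^-256, 2^256] between normalizations, so a product of
// two in-band values (at most 2^512) can never overflow a double.
struct ScaledDouble {
    double mant;
    int    exp;   // value = mant * 2^exp

    void normalize() {
        int e;
        mant = std::frexp(mant, &e); // |mant| now in [0.5, 1)
        exp += e;
    }

    ScaledDouble operator*(const ScaledDouble& o) const {
        ScaledDouble r{mant * o.mant, exp + o.exp};
        // normalize only when the mantissa leaves the safe band
        if (r.mant != 0.0 &&
            (std::fabs(r.mant) > 0x1p256 || std::fabs(r.mant) < 0x1p-256))
            r.normalize();
        return r;
    }
};
[/code]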
I don't quite follow your description of how you determine how many iterations you can skip.
In KF I simply use a couple of reference pixels, in the corners, calculated with perturbation starting from iteration 0, and compare how many iterations come out correct within a given tolerance, about 0.1%.
I found that checking the corners only is not always enough. In fact, when the image is centered on an embedded Julia set, there are some cases where the borders are rendered correctly but the center is corrupted because the SA skips too many iterations. Evaluating the skippable iterations locally, on small regions, might help, but it would also be more costly.
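For illustration, here is roughly how such a probe check could look, with interior probes added; the series layout and all names are made up for the sketch, not taken from either program:

[code]
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Evaluate the series delta_N ~ A*d0 + B*d0^2 + C*d0^3 + ... (Horner).
// coefN = {A, B, C, ...} are the coefficients valid at iteration N.
cplx evalSeries(const std::vector<cplx>& coefN, cplx d0)
{
    cplx s{0.0, 0.0};
    for (auto it = coefN.rbegin(); it != coefN.rend(); ++it)
        s = s * d0 + *it;
    return s * d0; // the series has no constant term
}

// Iterate each probe with plain perturbation from iteration 0 up to
// the candidate skip count N (requires Z.size() >= N) and check that
// the series result agrees within the tolerance. Probing interior
// points (e.g. the image center) as well as the corners guards
// against the embedded-Julia case above.
bool skipIsSafe(const std::vector<cplx>& Z,       // reference orbit, Z[0] = 0
                const std::vector<cplx>& probes,  // probe offsets d0 (corners + interior)
                const std::vector<cplx>& coefN,   // series coefficients at iteration N
                std::size_t N, double tol = 1e-3) // ~0.1% tolerance
{
    for (cplx d0 : probes) {
        cplx d{0.0, 0.0};                         // delta_0 = 0 since z_0 = 0
        for (std::size_t n = 0; n < N; ++n)
            d = 2.0 * Z[n] * d + d * d + d0;      // exact perturbation step
        if (std::abs(evalSeries(coefN, d0) - d) > tol * std::abs(d))
            return false;                         // SA would skip too many iterations
    }
    return true;                                  // all probes agree: skipping N is safe
}
[/code]

One could then search over N with this predicate to find the largest safe skip count.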
Did you remove all auto-glitch correction? Anyway, if you implement Pauldelbrot's method, your program will be super!
Yes, I had to disable glitch correction, as I mentioned under Known issues in the changelog. I started to implement Pauldelbrot's method, but the whole thing is in an intermediate state now, and neither the old nor the new method works correctly.