Logo by Fiery - Contribute your own Logo!

END OF AN ERA, FRACTALFORUMS.COM IS CONTINUED ON FRACTALFORUMS.ORG

it was a great time but no longer maintainable by c.Kleinhuis contact him for any data retrieval,
thanks and see you perhaps in 10 years again

this forum will stay online for reference
News: Check out the originating "3d Mandelbulb" thread here
 
*
Welcome, Guest. Please login or register. March 29, 2024, 01:00:14 PM


Login with username, password and session length


The All New FractalForums is now in Public Beta Testing! Visit FractalForums.org and check it out!


Pages: 1 ... 5 6 [7] 8 9 ... 17   Go Down
  Print  
Share this topic on DiggShare this topic on FacebookShare this topic on GoogleShare this topic on RedditShare this topic on StumbleUponShare this topic on Twitter
Author Topic: SuperFractalThing: Arbitrary precision mandelbrot set rendering in Java.  (Read 98994 times)
0 Members and 1 Guest are viewing this topic.
mrflay
Alien
***
Posts: 36


« Reply #90 on: June 26, 2013, 11:40:26 PM »

Can you give the location parameters?
I had a similar problem and that was due to the scaling and  the limitation of the double datatype, which should not occur until about e600 if you are scaling the variables.
The location was in the text file attached. To this post, I've attached the text file with the coordinates for the image I linked to.

* tf_e405_b.txt (0.92 KB - downloaded 116 times.)
Logged
cbuchner1
Fractal Phenom
******
Posts: 443


« Reply #91 on: June 27, 2013, 10:49:30 AM »

My above E-300 inner loop has 8 adds and 7 multiplies (I *could* make a third, "was smaller than E-300 originally" version with 5 adds by dropping the delta-zero term which will be negligible in these cases; and 2 multiplies and an add are needed to calculate radius for bailout, or I could drop to 5 multiplies and 7 adds) and my below E-300 inner loop has 4 adds and 4 multiplies. If anyone has optimized this loop to remove more multiplies I'd be curious to know how.

I am kind of thinking that you could achieve a speed-up of factor 2 by going to SSE2, and a speed-up of factor 4 by going to Intel AVX. But I don't think software built on top of a Java VM has that kind of low level access.
Logged
Kalles Fraktaler
Fractal Senior
******
Posts: 1458



kallesfraktaler
WWW
« Reply #92 on: June 27, 2013, 12:16:51 PM »

The location was in the text file attached. To this post, I've attached the text file with the coordinates for the image I linked to.
My program is able to zoom beyond this. And yes, it is very time consuming due to the huge number of iterations.
I recall now that the problem I had was that the location just did not get updated, so maybe it is not relevant.
I would like to ask you again, why the coefficients, why equation (3), (4) and (5)? I am only using equation (1).

I create tables with all the delta-values, and when the zoom-level is above e300 the ap-library has a parameter for stripping zeroes when converting to double.
The variable m_nScalingOffset is an integer, and m_nScaling is a double. These variables are actually applied for all zoom levels, however I have not noticed any performance difference.
Code:
m_nScalingOffset=0;
m_nScaling=1;
for(i=300;i<m_nZoom;i++){
m_nScalingOffset++;
m_nScaling=m_nScaling*10;
}
The value m_nScalingOffset is used when the delta table is assigned (m_db_cdr and m_db_cdi are tables of type double, cr, ci, rref and iref are ap variables):
Code:
m_db_cdr[x][y]=(cr-rref).ToDouble(m_nScalingOffset);
m_db_cdi[x][y]=(ci-iref).ToDouble(m_nScalingOffset);
The standard mandelbrot function is calculated with the ap-library for the reference point, by default the middle point in the image to be rendered. Each value is stored as a double in arrays m_db_dxr and m_db_dxi. All other variables are the ap-library, except i and nMaxIter.
Code:
for(i=0;i<nMaxIter;i++){
m_db_dxr[il] = xr.ToDouble();
m_db_dxi[i] = xi.ToDouble();
xin = (xr*xi).Double() + iref;
xrn = sr - si + rref;
xr = xrn;
xi = xin;
sr = xr.Square();
si = xi.Square();
}
Now the delta values are in practical multiplied with the m_nScaling value. So when they are used in function (1), they need to be divided away. Any delta*delta need an additional division of m_nScaling. So here is my implementation of the function (1):
Code:
Dnr = (m_db_dxr[antal]*m_dbDr[x][y] - m_db_dxi[antal]*m_dbDi[x][y])*2 + m_dbDr[x][y]*m_dbDr[x][y]/m_nScaling - m_dbDi[x][y]*m_dbDi[x][y]/m_nScaling + m_db_cdr[x][y];
Dni = (m_db_dxr[antal]*m_dbDi[x][y] + m_db_dxi[antal]*m_dbDr[x][y] + m_dbDr[x][y]*m_dbDi[x][y]/m_nScaling)*2 + m_db_cdi[x][y];
yr=m_db_dxr[antal]+m_dbDr[x][y]/m_nScaling;
yi=m_db_dxi[antal]+m_dbDi[x][y]/m_nScaling;
m_dbDi[x][y] = Dni;
m_dbDr[x][y] = Dnr;

I hope this can also help Pauldelbrot.
What is "Series approximation" that my app is lacking?
Unfortunately I still haven't got the SFT java app working on my machine, so I have never tested it. It would be nice if someone could do a comparison on the performance. I suspect that there isn't any big difference, since Java, or any language, would not add any overhead on simple arithmetic with hardware datatypes?
Logged

Want to create DEEP Mandelbrot fractals 100 times faster than the commercial programs, for FREE? One hour or one minute? Three months or one day? Try Kalles Fraktaler http://www.chillheimer.de/kallesfraktaler
http://www.facebook.com/kallesfraktaler
hapf
Fractal Lover
**
Posts: 219


« Reply #93 on: June 28, 2013, 02:48:36 PM »

I would like to ask you again, why the coefficients, why equation (3), (4) and (5)? I am only using equation (1).
What is "Series approximation" that my app is lacking?
I think the formula for delta2 in http://www.superfractalthing.co.nf/sft_maths.pdf is incorrect. With the correct
version you get the formulas 3) TO 5).  surprised
This approach is very interesting indeed. I can't say I understand it well yet but in theory it promises huge speed ups compared to the traditional approach.
Coming back to your question. The "series approximation" is (2). Formula (1) for delta(n+1) is exact, but uses delta(n) which makes it recursive and expenisve to compute. Formula (2) for delta(n) is an approximation since it ignores all terms with delta(0)^m, 4 <= m <= 2^n.
But it is NOT recursive which makes it far more efficient to compute once all the An, Bn and Cn have been computed and stored. With formula (2) you can do a binary search for the n where delta(n) gets > 2 and regular iterations would be stopped, PROVIDED the approximation is good enough (e.g. the missing terms don't distort delta (n) too much). When it works a pixel is computed not in O(n) time but
O(log n) time. A huge speed up for large n.
I have some questions as well.
- The reference point should be in the Mandelbrot set or at least not go to infinity till the maximal iteration allowed for the image is reached,  right? At least for all points using it as reference. How is this point found best so that the distances for all the points/pixels in the image remain small enough? Should it best be in the image itself? If not, how far can it be away till we get precision issues when using doubles only?
- What to do when (2) does not converge fast enough with A, B and C alone? Fallback on (1)? Search a new and better reference point?
- Does a Cn term << the Bn term guarantee that all the other terms are "irrelevant"? What should the minimal size difference be?

« Last Edit: June 28, 2013, 04:04:03 PM by hapf » Logged
Kalles Fraktaler
Fractal Senior
******
Posts: 1458



kallesfraktaler
WWW
« Reply #94 on: June 28, 2013, 03:51:34 PM »

Thanks a lot, it is so obvious when you point it out to me!
Logged

Want to create DEEP Mandelbrot fractals 100 times faster than the commercial programs, for FREE? One hour or one minute? Three months or one day? Try Kalles Fraktaler http://www.chillheimer.de/kallesfraktaler
http://www.facebook.com/kallesfraktaler
claude
Fractal Bachius
*
Posts: 563



WWW
« Reply #95 on: June 28, 2013, 04:20:30 PM »

- The reference point should be in the Mandelbrot set or at least not go to infinity till the maximal iteration allowed for the image is reached,  right? At least for all points using it as reference.

Yes.  If it's a periodic point that's a bonus, because if the period is say P then the coefficents for the approximation (eq. 3-5) are eventually periodic, and I think it works out that the approximation is good until about 2P.  Furthermore you only need to compute P X_n at high precision for the later eq1 iterations, because they repeat after that.

Quote
How is this point found best so that the distances for all the points/pixels in the image remain small enough? Should it best be in the image itself? If not, how far can it be away till we get precision issues when using doubles only?

I think the "best period" has to do with "partials":

Define the partials as each n such that |X_n| is less than |X_m| for all m < n, starting with n = 1, X_1 = C.

Then if P is a partial of a point C, Newton's method can be used to find the centre of period P, which is likely to be a good reference point.

The problem is that if you have an image with lots of different C, they might have different partials, and (so far) the only way I found to compute partials accurately is with full precision iteration.  (Using eq1 for partials fails at the same places that the reference point isn't good enough, namely deep embedded Julia sets).

A reference point outside the image could be a good one, provided you have enough bits of precision to distinguish neighbouring pixels in the delta.  A formula "G" involving the distance of a pixel from the reference and the distance between neighbouring pixels could be worked out.

Quote
- What to do when (2) does not converge fast enough with A, B and C alone? Fallback on (1)? Search a new and better reference point?

(2) should be good until about 2P, but only for points that have P as a partial.

Embedded Julia sets are hard because there are a lot of diverging partials, see the last image on:
http://mrob.com/pub/muency/atomdomain.html
http://mrob.com/images/0-muency/ejs2-7e-17-apd.jpg

Each coloured region would have a different best reference point - depending on the formula "G" mentioned above the outermost yellow region might be ok for the whole image, but for deep embedded Julia sets it would fail, and the innermost blue region wouldn't be a suitable reference for most of the image.
Logged
Ben Weiss
Forums Newbie
*
Posts: 3


« Reply #96 on: June 29, 2013, 12:23:20 AM »

Interesting to read this thread.

I discovered and implemented this same perturbation technique in the engine for KPT Fraxplorer in the 1990's, allowing it to do zooms to about 10^300. (You can still find this software sold for PC as part of Corel's KPT Collection.) The trick (as you've discovered) is to try to find the deepest point in the window to use as the high-precision "central orbit" around which the double-precision perturbations are computed. For deep M-set zooms, the central point will ideally be in a mini-Mandel, if there is one. Deep zooms containing multiple significant-sized mini-Mandels are problematic, though the minis tend to get sparser (relative to their sizes) as you zoom deeper.

Bignum libraries are helpful here, because multiplications on 1024-bit numbers can be done significantly faster than the naive n^2 schoolbook approach. (See the Karatsuba algorithm, for instance.)
Logged
Kalles Fraktaler
Fractal Senior
******
Posts: 1458



kallesfraktaler
WWW
« Reply #97 on: June 29, 2013, 11:04:19 AM »

The location was in the text file attached. To this post, I've attached the text file with the coordinates for the image I linked to.
If you blame Windows or the hardware you are usually very wrong, but this time I really think I found a bug, and it's more probably in Intel's CPU than in Windows.

Between e300 and e600 I am using scalinig and that works fine when I tested the 32-bit version of my program. But when I tested the 64-bit version, it does not work from e451 (or something). It is the exact same code, just compiled with different options.

I need to investigate this further, to be sure that it is not because of me. But it seems that precision is lost when multiplying and dividing with a large value of format 1eP (where 151<=P<=300) - by 64-bit code and not by 32-bit code - in the same machine and processor.
Logged

Want to create DEEP Mandelbrot fractals 100 times faster than the commercial programs, for FREE? One hour or one minute? Three months or one day? Try Kalles Fraktaler http://www.chillheimer.de/kallesfraktaler
http://www.facebook.com/kallesfraktaler
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #98 on: June 29, 2013, 12:00:31 PM »

If you blame Windows or the hardware you are usually very wrong, but this time I really think I found a bug, and it's more probably in Intel's CPU than in Windows.

Between e300 and e600 I am using scalinig and that works fine when I tested the 32-bit version of my program. But when I tested the 64-bit version, it does not work from e451 (or something). It is the exact same code, just compiled with different options.

I need to investigate this further, to be sure that it is not because of me. But it seems that precision is lost when multiplying and dividing with a large value of format 1eP (where 151<=P<=300) - by 64-bit code and not by 32-bit code - in the same machine and processor.

If your program is a Windows C++ application, I think there is a more likely explanation:

64-bit code use SSE2 per default (which means 64-bit of internal precision for doubles), while 32-bit code use x87 FPU (which had 80-bit of internal precision) per default.

Try running with fp:strict (or is precise?) (http://msdn.microsoft.com/en-us/library/e7s85ffb%28v=vs.80%29.aspx) and see if the output matches.
Logged
Kalles Fraktaler
Fractal Senior
******
Posts: 1458



kallesfraktaler
WWW
« Reply #99 on: June 29, 2013, 03:05:35 PM »

If your program is a Windows C++ application, I think there is a more likely explanation:

64-bit code use SSE2 per default (which means 64-bit of internal precision for doubles), while 32-bit code use x87 FPU (which had 80-bit of internal precision) per default.

Try running with fp:strict (or is precise?) (http://msdn.microsoft.com/en-us/library/e7s85ffb%28v=vs.80%29.aspx) and see if the output matches.
Microsoft's C++ compiler does not support 80-bit fp anymore since at least VC5. So no, I don't think this is the case here.
I've tried all fp options with the same result, works in 32-bit but not in 64-bit. I suspect this is relevant on mcflays problem too
Logged

Want to create DEEP Mandelbrot fractals 100 times faster than the commercial programs, for FREE? One hour or one minute? Three months or one day? Try Kalles Fraktaler http://www.chillheimer.de/kallesfraktaler
http://www.facebook.com/kallesfraktaler
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #100 on: June 29, 2013, 06:17:39 PM »

Microsoft's C++ compiler does not support 80-bit fp anymore since at least VC5. So no, I don't think this is the case here.

All versions of VC++ for x86 supports x87 FPU instructions (which means 80-bit registers may be used for intermediates).

Even Visual Studio 2012 (where /arch:SSE2 became default), will emit x87 instructions: "...As a result, your code will actually use a mixture of both x87 and SSE/SSE2 for floating-point computations. "
http://msdn.microsoft.com/en-us/library/7t5yh4fd%28v=vs.110%29.aspx
Logged
Kalles Fraktaler
Fractal Senior
******
Posts: 1458



kallesfraktaler
WWW
« Reply #101 on: June 30, 2013, 01:08:15 AM »

Syntopia, thanks a lot for this info, I will investigate this further!
Logged

Want to create DEEP Mandelbrot fractals 100 times faster than the commercial programs, for FREE? One hour or one minute? Three months or one day? Try Kalles Fraktaler http://www.chillheimer.de/kallesfraktaler
http://www.facebook.com/kallesfraktaler
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #102 on: June 30, 2013, 09:40:23 AM »

Here is an example, which will give different results on 32-bit (if using x87) and 64-bit:

Code:
int _tmain(int argc, _TCHAR* argv[])
{
double a = 1E-323;
double b = 1E-1;
double c = 1E1;
printf("a         : %.34e\n", a);
double d = (a*b)*c;
printf("(a*0.1)*10: %.34e\n", d); // <--d will be zero on SSE2, d will be equal to a on x87
}

When running this on VC++ 2008 (x86) with default settings, I get equal results, as expected since x87 registers have 15 bits of exponent.
When running this on VC++ 2013 (x64) with default settings, I get a zero for the product. SSE registers only have 11 bits of exponent (standard doubles).

The default settings for VC++ 2013 has changed, but when running this on VC++ 2013 (x86) with "/arch:IA32", I get equal results, and with "/arch:SSE2", I get a zero for the product. But, as the docs state, the compile will choose as it seems fit whether to use SSE2 or x87 in this mode.

Running with strict fp, does not change this behavior, which is rather interesting. According to the docs, the strictness seems to be only enforced at function calls.

Bruce Dawson (Fractal Extreme) has written more about intermediates: http://www.altdevblogaday.com/2012/03/22/intermediate-floating-point-precision/

Now, since you cannot force a 32-bit program to only use SSE registers, and you cannot make a 64-bit program use x87 registers, I guess that means you cannot create programs (in VC++ at least) that will guarantee the same results on both 32-bit and 64-bit?
Logged
Kalles Fraktaler
Fractal Senior
******
Posts: 1458



kallesfraktaler
WWW
« Reply #103 on: June 30, 2013, 10:51:40 PM »

Now, since you cannot force a 32-bit program to only use SSE registers, and you cannot make a 64-bit program use x87 registers, I guess that means you cannot create programs (in VC++ at least) that will guarantee the same results on both 32-bit and 64-bit?
And what about other languages, as Java?
Logged

Want to create DEEP Mandelbrot fractals 100 times faster than the commercial programs, for FREE? One hour or one minute? Three months or one day? Try Kalles Fraktaler http://www.chillheimer.de/kallesfraktaler
http://www.facebook.com/kallesfraktaler
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #104 on: June 30, 2013, 11:06:38 PM »

I wondered about that as well, since we use 32-bit and 64-bit Java code at work for scientific calculations.

According to the JLS, if you use the 'strictfp' keyword: "Within an FP-strict expression, all intermediate values must be elements of the float value set or the double value set, implying that the results of all FP-strict expressions must be those predicted by IEEE 754 arithmetic on operands represented using single and double formats. "

As I read this, Java calculations (with strictfp) must be exactly equal on all platforms.
Logged
Pages: 1 ... 5 6 [7] 8 9 ... 17   Go Down
  Print  
 
Jump to:  

Related Topics
Subject Started by Replies Views Last post
Java applet for exploring the Mandelbrot set Announcements & News Paton 5 6602 Last post March 26, 2007, 06:03:34 PM
by Paton
What range/precision for fractional escape counts for Mandelbrot/Julia sets? Programming Duncan C 7 9935 Last post May 01, 2007, 08:23:13 PM
by Duncan C
Java Mandelbrot segment Help & Support fractalwizz 10 1897 Last post December 29, 2008, 08:01:24 PM
by cKleinhuis
[Java] Double-double library for 128-bit precision. Programming Zom-B 10 16970 Last post December 20, 2010, 04:03:48 AM
by David Makin
*Continued* SuperFractalThing: Arbitrary precision mandelbrot set rendering in Java. Announcements & News « 1 2 ... 23 24 » hapf 347 47000 Last post September 28, 2017, 10:20:19 PM
by claude

Powered by MySQL Powered by PHP Powered by SMF 1.1.21 | SMF © 2015, Simple Machines

Valid XHTML 1.0! Valid CSS! Dilber MC Theme by HarzeM
Page created in 0.256 seconds with 25 queries. (Pretty URLs adds 0.017s, 2q)