Topic: JWildfireC(UDA) render example 7 (Read 884 times)
thargor6
« on: August 25, 2012, 11:36:01 PM »

(JWildfireC(UDA), resolution=1920x1080, quality=200, filter=on, render time=30.6s on my i7)
cKleinhuis
« Reply #1 on: August 25, 2012, 11:56:22 PM »

Woohoo, if you are finished with the GPU stuff let me know, and be sure to include such params in the package :D
---

divide and conquer - iterate and rule - chaos is No random!
Pauldelbrot
« Reply #2 on: August 26, 2012, 12:06:01 AM »

[Image: Repeating Zooming Self-Silimilar Thumb Up, by Craig]
thargor6
« Reply #3 on: August 26, 2012, 12:56:45 AM »

Quote from cKleinhuis: "if you are finished with the gpu stuff let me know"
I must admit, I'm currently unsure about the CUDA part. Unfortunately, not that many parts of the fractal flame algorithm can be run in parallel. It is possible to parallelize certain parts with some tricks, but when I do so, the memory transfer between CPU and GPU always eats up the advantage of any CUDA computation on my system. In fact, I currently have the complete set ported to the GPU and can choose which parts to run on it. But the GPU always seems to lose: the pure C solution (which I optimized to run in parallel) is just faster.
So I'm currently asking myself whether to create a totally platform-independent CPU solution (without any CUDA features) or to continue with the (for me) more appealing CUDA code.
Any thoughts? :-)
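The trade-off described in this post can be put into a rough cost model. The sketch below is only a back-of-the-envelope illustration: the function names and all numbers are hypothetical, and it assumes that offloading a stage costs the kernel time plus a fixed host/device copy time, which is a simplification of what a real CUDA pipeline does.

```python
def gpu_total_time(cpu_time_s, kernel_speedup, transfer_time_s):
    """Estimated wall time of an offloaded stage.

    cpu_time_s      -- time the stage takes on the CPU (hypothetical)
    kernel_speedup  -- how much faster the bare kernel runs on the GPU
    transfer_time_s -- time spent copying data to and from the GPU
    """
    return cpu_time_s / kernel_speedup + transfer_time_s


def offload_pays_off(cpu_time_s, kernel_speedup, transfer_time_s):
    """True only if kernel time plus copies beats staying on the CPU."""
    return gpu_total_time(cpu_time_s, kernel_speedup, transfer_time_s) < cpu_time_s


def break_even_transfer_time(cpu_time_s, kernel_speedup):
    """Largest copy time for which offloading still breaks even."""
    return cpu_time_s * (1.0 - 1.0 / kernel_speedup)
```

With made-up numbers, a stage that takes 1.0 s on the CPU and runs 10x faster as a kernel still loses if the copies take 0.95 s, which matches the experience reported above: a large kernel speedup does not help when data has to cross the bus at every step.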
taurus
« Reply #4 on: August 26, 2012, 12:57:49 AM »

This looks great! Much depth in here.
A pity that you implement the GPU code in a proprietary NVIDIA language :(
when life offers you a lemon, get yourself some salt and tequila!
thargor6
« Reply #5 on: August 26, 2012, 01:11:53 AM »

Quote from taurus: "this looks great! much depth in here. a pitty, that you implement gpu code in proprietary nvidia language"
Thank you very much :-)

That was just an "overlapping" post :-) I'm unsure about the CUDA part, please see above.
I just chose CUDA as a platform for playing around because it works for ME; I'm no NVIDIA advocate or anything :-) If the parallel CUDA implementation worked great, there would be no reason not to move it to OpenCL or something similar. I'm just here in the forums to share such thoughts ;-)
« Last Edit: August 26, 2012, 01:13:52 AM by thargor6 »
cKleinhuis
« Reply #6 on: August 26, 2012, 02:26:03 AM »

The reason to move to OpenCL would be that it could then run on ATI cards as well.
It is a pity, but ATI and NVIDIA develop different implementations, and OpenCL is an attempt to unite those branches. So if it can serve your needs, try sticking to OpenCL, because then it runs on any platform (ATI, NVIDIA, Linux, Mac... even mobiles like the iPad...)
thargor6
« Reply #7 on: August 26, 2012, 02:39:28 AM »

Quote from cKleinhuis: "even mobiles like ipad...)"
Yes, this is my dream/vision: that you can "craft" fractal flames with your fingertips, not just download precalculated stuff (like Electric Sheep for Android). You DO it and you LOVE it because you can PLAY with it. But this will need some further investigation/resources; the current project is just one step in this direction :-)
Saquedon
« Reply #8 on: August 26, 2012, 02:52:47 PM »

Very nice... checking out JWildfire...

I doubt you will ever be able to get CUDA to run efficiently with Java though; it just wasn't made for graphics performance.
« Last Edit: August 26, 2012, 02:55:17 PM by Saquedon »
thargor6
« Reply #9 on: August 26, 2012, 10:51:51 PM »

Quote from Saquedon: "I doubt you will ever be able to get CUDA to run efficiently with Java though"
I'm running this with C++, i.e. I'm re-creating the Java renderer from scratch. So I'm relaxed about that, because it will be faster either way :-)
The Java renderer will survive because it renders with more accuracy, has more features (which are not that easy to port) and is much friendlier for trying out new stuff.

Best regards
Syntopia
« Reply #10 on: August 27, 2012, 04:37:48 PM »

Hi Thargor6,

You are really doing some pretty amazing stuff with your JWildfire software.

Just a single question:
- You say you don't see any CUDA performance gains, but how do you parallelize the code for the GPU? This paper gets good speedups, but you need to calculate the inverse transformations: http://www.cs.uaf.edu/~olawlor/2011/gpuifs/
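To illustrate why inverse transformations suit a GPU, here is a minimal CPU-only sketch of a "gather" style IFS renderer. It is not taken from the linked paper: the three Sierpinski maps, the function names and the nearest-pixel lookup are all assumptions chosen for brevity. The point is only that each output pixel is computed independently by looking backwards through the inverse maps, so there is no scattered writing into a shared histogram.

```python
# Hypothetical stand-in IFS: three maps f(x, y) = (0.5*x + a, 0.5*y + b),
# i.e. a Sierpinski triangle. Only the offsets (a, b) are stored.
OFFSETS = [(0.0, 0.0), (0.5, 0.0), (0.25, 0.5)]


def inverse(x, y, a, b):
    """Inverse of f(x, y) = (0.5*x + a, 0.5*y + b)."""
    return 2.0 * (x - a), 2.0 * (y - b)


def gather_step(prev, grid):
    """One iteration: each pixel asks whether any inverse map lands on the set."""
    nxt = [[False] * grid for _ in range(grid)]
    for row in range(grid):          # on a GPU: one thread per (row, col)
        for col in range(grid):
            x = (col + 0.5) / grid   # pixel centre in the unit square
            y = (row + 0.5) / grid
            for a, b in OFFSETS:
                qx, qy = inverse(x, y, a, b)
                inside = 0.0 <= qx < 1.0 and 0.0 <= qy < 1.0
                if inside and prev[int(qy * grid)][int(qx * grid)]:
                    nxt[row][col] = True
                    break
    return nxt


def render_by_gather(grid=64, iterations=4):
    """Start from the full unit square and apply the gather step repeatedly."""
    image = [[True] * grid for _ in range(grid)]
    for _ in range(iterations):
        image = gather_step(image, grid)
    return image
```

Each pass keeps three quarters of the previously covered area for this particular IFS, so the image converges towards the triangle. The price is the one named in the post: every transform needs a computable inverse, which is easy for affine maps but not for many nonlinear flame variations.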

Regarding mobile devices, I don't think you can run OpenCL on them: it is not supported on iOS, and Android uses RenderScript.

thargor6
« Reply #11 on: August 27, 2012, 09:20:34 PM »

Hello Syntopia,

thanks for your feedback on my work and the interesting link, I will study this as soon as possible :-)

You are asking the right question: the algorithm I use is not optimal for CUDA. But I was still surprised by the poor results.

I'm using the "standard" way of iterating, but with a large pool of points. At every iteration step I randomly choose a small number of points from this pool and perform the same operation on all of them (in parallel on the GPU). After the iteration the points are written back to the pool. This ensures both that I can do things in parallel and that I walk through many different paths. (One thing I tried before was just iterating the whole pool all the time. This gives interesting but unwanted results.)
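The pool scheme described above can be sketched on the CPU as follows. This is an illustrative reading of the description, not the actual JWildfireC(UDA) code: the three Sierpinski maps stand in for real flame transforms, "the same operation" is assumed to mean one randomly chosen transform per step, and all names and parameters are hypothetical.

```python
import random

# Hypothetical stand-in transforms (Sierpinski triangle contractions).
# A real flame would combine affine maps with nonlinear variations.
TRANSFORMS = [
    lambda x, y: (0.5 * x, 0.5 * y),
    lambda x, y: (0.5 * x + 0.5, 0.5 * y),
    lambda x, y: (0.5 * x + 0.25, 0.5 * y + 0.5),
]


def render_with_pool(pool_size=4096, batch_size=256, steps=2000,
                     warmup_steps=200, grid=64, seed=1):
    """Iterate random batches from a point pool and bin the results."""
    rng = random.Random(seed)
    pool = [(rng.random(), rng.random()) for _ in range(pool_size)]
    hist = [[0] * grid for _ in range(grid)]
    for step in range(steps):
        # Pick a small random subset of the pool for this step.
        batch = rng.sample(range(pool_size), batch_size)
        # Assumption: one transform per step, applied to the whole batch.
        f = rng.choice(TRANSFORMS)
        for i in batch:              # on a GPU: one thread per batch entry
            x, y = f(*pool[i])
            pool[i] = (x, y)         # write the point back to the pool
            if step >= warmup_steps:  # crude warm-up, counted per step
                col = min(int(x * grid), grid - 1)
                row = min(int(y * grid), grid - 1)
                hist[row][col] += 1
    return hist
```

Because every point in a batch runs the same transform, the inner loop has no branching between points, which is what makes it a candidate for a GPU kernel. The batch selection and the write-back are the parts that stay on the coordinating side in this sketch.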

This all works fine, but collecting/preparing/coordinating things on the CPU always eats up any advantage from the parallelism. I must admit that I only tested on my i7 CPU, which is rather fast, while my graphics card is a rather mediocre one. So the solution would not be useless at all if I included some more "tricks", but on the other hand it is not really a "burner" either ;-)

So you have pointed me in the right direction, which I avoided at the first try because I was a little bit lazy and under pressure from the current "load" of features. Users expect a faster renderer with all the features they know from the Java version. Those are rather numerous, so it's really hard to start with a new algorithm under the hood.
But what I have seen is really convincing, so I think it makes sense to have three renderers: a full-fledged Java renderer with integrated compiler for experiments etc., a fast renderer for doing movies in high quality (the thing I'm currently working on) and a realtime renderer with fewer features (at least at first).

Need to be a cat... ;-)

Best regards,
Andreas




