Author Topic: Enhanced rendering using DE - at least on CPU  (Read 2151 times)
David Makin
Global Moderator
Fractal Senior
******
Posts: 2286



Makin' Magic Fractals
WWW
« on: March 29, 2012, 12:17:30 PM »

Hi all, if using DE (or the "any direction" deltaDE such as Buddhi's method) I just realised that a massive speed-up should be possible by changing the render/stepping algorithms to share distance information across adjacent rays - not something I'd normally do as it's not possible using the traditional method in UF.

Basically, if we start with, say, the centre ray and the step distance calculated for it is d, then for every ray adjacent to it we can move the start position forward to the point where that ray is at distance d from the current point on the centre ray (i.e. where it leaves the sphere of radius d).
This applies *on every step*, provided the old position on each adjacent ray is also within the radius d for the next step. Only when the old position on a ray is not within the bounds of the new step on the central ray must we stop moving forward on that adjacent ray and store the final position found as the start point for that ray.
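A minimal sketch of the stepping rule described above, assuming both rays start from one shared camera point and that the DE is a true lower bound on the distance to the surface. The function name `advance_adjacent_ray` and its parameters are hypothetical, chosen only for illustration. It checks whether the adjacent ray's current position lies inside the ball found on the centre ray and, if so, moves it to the point where it leaves that ball.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def advance_adjacent_ray(origin, centre_pos, d, adj_dir, t_adj):
    """Advance an adjacent ray using the DE ball found on the centre ray.

    origin     -- common start point of both rays (assumed shared camera)
    centre_pos -- current sample position on the centre ray
    d          -- distance estimate at centre_pos (radius of the empty ball)
    adj_dir    -- unit direction of the adjacent ray
    t_adj      -- distance already travelled along the adjacent ray

    Returns the new distance along the adjacent ray, or None when its
    current position lies outside the ball and sharing must stop.
    """
    m = [o - c for o, c in zip(origin, centre_pos)]        # ball centre -> origin
    rel = [mi + t_adj * ui for mi, ui in zip(m, adj_dir)]  # ball centre -> current point
    if dot(rel, rel) > d * d:
        return None
    # Solve |m + t * adj_dir|^2 = d^2 and take the larger root (exit point).
    b = dot(m, adj_dir)
    c = dot(m, m) - d * d
    disc = max(b * b - c, 0.0)  # clamp tiny negative rounding errors
    return -b + math.sqrt(disc)

if __name__ == "__main__":
    u = (math.sin(0.1), 0.0, math.cos(0.1))  # neighbour tilted by 0.1 rad
    print(advance_adjacent_ray((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.5, u, 1.0))
```

When the function returns None, the adjacent ray keeps its last position as its start point and continues with ordinary per-ray stepping.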

Has anyone else suggested this or tried it ?

(Note that it's still possible with a little thought to ensure number of steps and similar info remain (reasonably) intact for pseudo-lighting effects etc.)
« Last Edit: March 29, 2012, 12:20:02 PM by David Makin » Logged

The meaning and purpose of life is to give life purpose and meaning.

http://www.fractalgallery.co.uk/
"Makin' Magic Music" on Jango
hobold
Fractal Bachius
*
Posts: 573


« Reply #1 on: March 29, 2012, 04:02:36 PM »

Yes, this has been suggested before. It was implemented in "Gaston", a realtime renderer for quaternion Julia sets on Macs. It did speed up the scalar code path by a factor of two, but the vectorized code path (i.e. SIMD program using what Apple called "Velocity Engine") did not gain much.

It is true that every single DE computation results in information about a solid ball of space, and a bundle of nearby view rays can be intersected with that ball. This makes the most of the precious DE information that we laboriously computed, but it also introduces data dependencies between adjacent view rays. These dependencies are obstacles to the massive brute force parallelism of a GPU (or any other wide SIMD machine).

I think this optimization has much more potential than just a factor of two. But it would require a few more good ideas to spread the DE samples such that the spheres don't overlap too much, but still cover the view nicely. When most rays have approached the surface closely, one should probably switch back to the usual brute force stepping of each individual ray.
Logged
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #2 on: March 29, 2012, 07:29:37 PM »

There is also this old thread: http://www.fractalforums.com/mandelbulb-implementation/major-raymarching-optimization/

I also considered this in Fragmentarium. On a GPU it would be possible to render the DE intersection distances in a lower resolution float buffer, and use these distances as starting points for higher resolution buffers. You could even do this at multiple scales (hierarchically).
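A rough CPU-side sketch of the coarse-to-fine idea, written in Python rather than as a shader. Everything here is an assumption for illustration: `sphere_de` is a stand-in distance estimator (a unit sphere, not a fractal), and `coarse_start` and `march` are hypothetical names. One coarse ray is marched per bundle for as long as its DE ball still covers every ray within `half_angle` of it; the resulting distance then serves as the start point for the full-resolution rays of that bundle.

```python
import math

def sphere_de(p):
    """Stand-in distance estimator: unit sphere at the origin (assumption)."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - 1.0

def point_on_ray(origin, direction, t):
    return tuple(o + t * u for o, u in zip(origin, direction))

def coarse_start(de, origin, axis, half_angle, t_max=100.0, min_gain=1e-6):
    """March one coarse ray and return a start distance that is safe for
    every ray from `origin` within `half_angle` of `axis`."""
    t = 0.0
    while t < t_max:
        d = de(point_on_ray(origin, axis, t))
        # Chord from the axis sample to the outermost ray at the same distance.
        if d < 2.0 * t * math.sin(0.5 * half_angle):
            break  # the ball no longer covers the whole bundle
        # Distance at which the outermost ray leaves the ball.
        t_new = t * math.cos(half_angle) + math.sqrt(
            max(d * d - (t * math.sin(half_angle)) ** 2, 0.0))
        if t_new - t < min_gain:
            break
        t = t_new
    return t

def march(de, origin, direction, t_start=0.0, eps=1e-6, t_max=100.0):
    """Plain per-ray sphere tracing; returns (hit distance or None, DE calls)."""
    t, calls = t_start, 0
    while t < t_max:
        d = de(point_on_ray(origin, direction, t))
        calls += 1
        if d < eps:
            return t, calls
        t += d
    return None, calls

if __name__ == "__main__":
    origin = (0.0, 0.0, -3.0)
    t0 = coarse_start(sphere_de, origin, (0.0, 0.0, 1.0), 0.05)
    fine = (math.sin(0.04), 0.0, math.cos(0.04))
    print(march(sphere_de, origin, fine))              # from the camera
    print(march(sphere_de, origin, fine, t_start=t0))  # from the coarse result
```

The same coarse pass could in principle be repeated at several bundle sizes, each level starting from the distance found by the level above it. In this toy scene the saving per ray is small; the point is only that the coarse DE calls are paid once per bundle rather than once per pixel.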

In the end I decided against it, since the secondary rays (shadows, reflections, occlusions) cannot be accelerated this way. However, since the shadows and occlusions are typically low-frequency, these might be calculated at lower resolution. But I don't know if the programming effort is worth it.
Logged
knighty
Fractal Iambus
***
Posts: 819


« Reply #3 on: March 29, 2012, 10:04:13 PM »

I also remember this one: www.fractalforums.com/mandelbulb-implementation/potential-optimizations-for-de-based-stepping/ :)
Logged
David Makin
Global Moderator
Fractal Senior
******
Posts: 2286



Makin' Magic Fractals
WWW
« Reply #4 on: March 29, 2012, 10:26:55 PM »

I had thought of the intermediate float destination method for GPU, but I've never actually tried rendering to even a float colour buffer so I didn't mention it ;)
As to the optimisation itself, a key thing to note is that as the render resolution increases, the gain from the optimisation grows in proportion to the *pixel area*, not the linear resolution, i.e. doubling the resolution should theoretically quadruple the speed-up (relative to rendering in the usual way).
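A back-of-the-envelope check of that scaling, under idealised assumptions: rays are evenly spaced in angle, the field of view `fov` is in radians across `width` pixels, and the small-angle approximation holds. The function name is hypothetical. It estimates how many rays have their current sample inside one DE ball of radius d at distance t, which is the number of rays a single DE evaluation can serve.

```python
import math

def rays_sharing_one_de(d, t, fov, width):
    """Approximate number of rays whose sample at distance t lies within
    distance d of the centre ray's sample (small-angle approximation)."""
    pixel_angle = fov / width   # angular spacing of adjacent rays
    spacing = t * pixel_angle   # spacing between neighbouring rays at distance t
    return math.pi * (d / spacing) ** 2

if __name__ == "__main__":
    low = rays_sharing_one_de(0.1, 2.0, 1.0, 800)
    high = rays_sharing_one_de(0.1, 2.0, 1.0, 1600)
    print(high / low)  # ratio of rays served per DE evaluation
```

Because the count depends on the square of the ray spacing, halving the spacing (doubling the resolution) quadruples the number of rays that one DE evaluation can serve. Real renders will fall short of this wherever rays are close to the surface.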
Logged

David Makin
Global Moderator
Fractal Senior
******
Posts: 2286



Makin' Magic Fractals
WWW
« Reply #5 on: March 29, 2012, 10:31:10 PM »

> When most rays have approached the surface closely, one should probably switch back to the usual brute force stepping of each individual ray.

That threshold will be directly related to the variation in accuracy in the DE as the surface is approached and to the ray density (i.e. pixel resolution): the higher the pixel resolution, the closer to the surface you can go before the optimisation stops paying off ;)
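One way to put a number on the ray-density part of that threshold, ignoring DE accuracy entirely: the ball found on one ray can only help its neighbour while the DE is at least as large as the gap between the two rays' samples. The sketch below assumes evenly spaced rays from a shared camera, `fov` in radians across `width` pixels; `min_useful_de` is a hypothetical name.

```python
import math

def min_useful_de(t, fov, width):
    """Smallest DE at distance t that still reaches the adjacent ray's
    sample at the same distance (chord between the two samples)."""
    pixel_angle = fov / width
    return 2.0 * t * math.sin(0.5 * pixel_angle)

if __name__ == "__main__":
    print(min_useful_de(2.0, 1.0, 800))
    print(min_useful_de(2.0, 1.0, 1600))
```

Doubling the resolution roughly halves this value, so sharing remains useful about twice as close to the surface.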
« Last Edit: March 29, 2012, 11:24:49 PM by David Makin » Logged

David Makin
Global Moderator
Fractal Senior
******
Posts: 2286



Makin' Magic Fractals
WWW
« Reply #6 on: March 29, 2012, 11:33:34 PM »

Question - on some video cards is it possible to use a vec4 source texture and a vec4 destination?
If so then I need a card like that and I'll get back into proper coding again sharpish ;)
Of course it's possible that mine (ATI Radeon HD 5870) does it; I still haven't really looked into shaders/GLSL/CUDA etc., at least not on PCs ;)
(that's PCs in the general sense)
Logged

Jesse
Download Section
Fractal Schemer
*
Posts: 1013


« Reply #7 on: March 30, 2012, 08:57:50 PM »

> In the end I decided against it, since the secondary rays (shadows, reflections, occlusions) cannot be accelerated this way. However, since the shadows and occlusions are typically low-frequency, these might be calculated at lower resolution. But I don't know if the programming effort is worth it.

I only did a few tests, but I also decided against it, because the cases where you really want more speed are those with bad DEs and low raystep factors (high fudge factors/values). Unfortunately, the benefit of this method is greater the better the distance estimates are, and the larger their values. In the parts where the DEs are small and rendering is slow, the benefit shrinks.
Logged
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #8 on: March 31, 2012, 01:30:39 AM »

> Question - on some video cards is it possible to use a vec4 source texture and a vec4 destination?
> If so then I need a card like that and I'll get back into proper coding again sharpish ;)
> Of course it's possible that mine (ATI Radeon HD 5870) does it; I still haven't really looked into shaders/GLSL/CUDA etc., at least not on PCs ;)
> (that's PCs in the general sense)


GLSL is quite versatile - you can sample from different textures in your pixel shader, and render to an offscreen FrameBufferObject. You can also choose between several data types (for instance, work with 4-component 32-bit floats for colors instead of 8-bit RGB). These features should be available on most modern graphics cards.
Logged
David Makin
Global Moderator
Fractal Senior
******
Posts: 2286



Makin' Magic Fractals
WWW
« Reply #9 on: April 01, 2012, 02:40:14 PM »

> GLSL is quite versatile - you can sample from different textures in your pixel shader, and render to an offscreen FrameBufferObject. You can also choose between several data types (for instance, work with 4-component 32-bit floats for colors instead of 8-bit RGB). These features should be available on most modern graphics cards.

Thanks - I just wasn't sure if cards allowed the destination to be vec4 ;)
Logged

Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #10 on: April 10, 2012, 12:18:48 AM »

Somebody at Revision 2012 (a demo scene party) presented a method (cone sphere tracing) for doing exactly this:
http://www.youtube.com/v/4Q5sgNCN2Jw&rel=1&fs=1&hd=1
Logged