
Fractal Software => Mandelbulber => Topic started by: mancoast on July 31, 2016, 09:43:29 PM




Title: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: mancoast on July 31, 2016, 09:43:29 PM
Greetings,

I am curious about the Volumetric Shader.
It appears that cLights::GetLight is causing some sort of bottleneck in execution.
Please consider the data below.


This first screenshot shows a tree-view summary of CPU instruction usage per function.
99.5% of total execution cycles are spent within the scope of cRenderWorker::doWork.
In this particular hotspots view in VTune, the percentages are shown in the GUI.
(http://i.imgur.com/nIgFlwj.png)
 
We also know that 99.4% of time is spent executing cRenderWorker::RayRecursion.
One level deeper inside RayRecursion, we see 24.4% of time allocated to cRenderWorker::RayMarching.
Also inside RayRecursion, we see 68.2% of time allocated to cRenderWorker::VolumetricShader.
(http://i.imgur.com/no1JRvb.png)
 
In the source code view, we see 24.4% of time allocated to cRenderWorker::RayMarching.
(http://i.imgur.com/mbSLhg4.png)
 
In the source code view, we see 68.2% of time allocated to cRenderWorker::VolumetricShader.
(http://i.imgur.com/xNR6WM9.png)
 
This leads me to believe that we are spending most of our time in VolumetricShader.
Within cRenderWorker::VolumetricShader there are two loops retiring billions of instructions to self.
(http://i.imgur.com/AmvEW6J.png)
 
The loop at line 349 of VolumetricShader contains a function call to cLights::GetLight.
(http://i.imgur.com/IEHzXq7.png)
 
Also, the loop at line 375 of VolumetricShader contains a function call to cLights::GetLight.
(http://i.imgur.com/d9TMGy9.png)
 
It appears that over 50% of all CPU instructions are retired by function calls to cLights::GetLight.
(http://i.imgur.com/SyMNJ3p.png)
 
The loop at line 349 of VolumetricShader contains a function call to cLights::GetLight.
This specific call to cLights::GetLight retires approximately 25% of all CPU instructions.
(http://i.imgur.com/Pqpbvlc.png)
 
Also, the loop at line 375 of VolumetricShader contains a function call to cLights::GetLight.
This other call to cLights::GetLight also retires approximately 25% of all CPU instructions.
(http://i.imgur.com/RScReG3.png)
 

Why are these calls to cLights::GetLight consuming so many CPU cycles?
Any suggestions for optimization?
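
One pattern that might be worth checking, sketched here under assumptions rather than taken from the Mandelbulber source: if the getter returns a light record by value and is called for every light at every volumetric step, the number of calls (and copies) grows with steps times lights. Fetching the enabled lights once before the step loop keeps the call count at one per light. The names `Light`, `LightStore`, `AccumulateNaive` and `AccumulateHoisted` are hypothetical stand-ins, not the real cLights API.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a light record; NOT Mandelbulber's real struct.
struct Light
{
    double position[3];
    double colour[3];
    double intensity;
    bool enabled;
};

// Hypothetical stand-in for a light container with a by-value getter.
class LightStore
{
public:
    explicit LightStore(std::vector<Light> lights) : lights_(std::move(lights)) {}

    // Returns a copy on every call and counts how often it is called.
    Light GetLight(std::size_t index) const
    {
        ++getLightCalls;
        return lights_.at(index);
    }

    std::size_t Count() const { return lights_.size(); }

    mutable std::size_t getLightCalls = 0;

private:
    std::vector<Light> lights_;
};

// Naive version: the getter runs once per light per step.
double AccumulateNaive(const LightStore &store, int steps)
{
    double total = 0.0;
    for (int s = 0; s < steps; ++s)
    {
        for (std::size_t i = 0; i < store.Count(); ++i)
        {
            const Light light = store.GetLight(i);
            if (light.enabled) total += light.intensity;
        }
    }
    return total;
}

// Hoisted version: the getter runs once per light, before the step loop.
double AccumulateHoisted(const LightStore &store, int steps)
{
    std::vector<Light> enabled;
    for (std::size_t i = 0; i < store.Count(); ++i)
    {
        const Light light = store.GetLight(i);
        if (light.enabled) enabled.push_back(light);
    }

    double total = 0.0;
    for (int s = 0; s < steps; ++s)
    {
        for (const Light &light : enabled) total += light.intensity;
    }
    return total;
}
```

Both functions accumulate the same value; only the number of getter calls differs. Whether this matches what actually happens at lines 349 and 375 would need to be confirmed against the source.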

Thanks,
coast

Title: Re: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: Buddhi on July 31, 2016, 10:43:00 PM
Could you attach settings which you used for testing?


Title: Re: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: mancoast on July 31, 2016, 10:59:03 PM
At 6 MB it is too big to attach; here's the link:

https://github.com/mancoast/mandelbulber2/raw/k1om/_menger-coastn_anim.fract


Title: Re: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: taurus on August 01, 2016, 11:21:44 AM
A little noob question in between: did you ever render this animation? Depending on the fps, it should be around 20 minutes long. I wonder whether all the stuff below [frames] was necessary to make your point.


Title: Re: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: Buddhi on August 01, 2016, 09:40:21 PM
It's a HUGE test case! Did you render the whole animation for benchmarking, or only one frame? I'm asking because I would like to do similar profiling using valgrind and then start looking into why cLights::GetLight() was highlighted here.


Title: Re: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: mancoast on August 01, 2016, 11:42:46 PM
Hello,

This render is not yet completed.
For VTune I render a randomly selected frame.
This keeps the results fresh with different samples.

After each VTune run and code change, I run overnight on the servers to get frame-to-frame time differences.

As of now, it's about 50%.

I am excited to test your latest commit with the isAnyLight modification.
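
I have not reproduced the commit here, so the following is only a guess at the general shape of such a guard: work out once whether any light is enabled, then skip the per-step light loop entirely when none is. Only the name isAnyLight comes from this thread; `AuxLight`, `IsAnyLightEnabled`, `StepStats` and `MarchVolume` are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified light record; NOT Mandelbulber's real struct.
struct AuxLight
{
    double intensity;
    bool enabled;
};

// Computed once, outside the volumetric step loop.
bool IsAnyLightEnabled(const std::vector<AuxLight> &lights)
{
    return std::any_of(lights.begin(), lights.end(),
                       [](const AuxLight &l) { return l.enabled; });
}

// Accumulated light plus a counter of inner-loop iterations, for comparison.
struct StepStats
{
    double light;
    long lightLoopIterations;
};

StepStats MarchVolume(const std::vector<AuxLight> &lights, int steps)
{
    const bool isAnyLight = IsAnyLightEnabled(lights);
    StepStats stats{0.0, 0};

    for (int s = 0; s < steps; ++s)
    {
        // Other volumetric effects (fog, glow, ...) would be evaluated here.

        if (!isAnyLight) continue; // no enabled lights: skip the light loop

        for (const AuxLight &l : lights)
        {
            ++stats.lightLoopIterations;
            if (l.enabled) stats.light += l.intensity;
        }
    }
    return stats;
}
```

With every light disabled, the inner loop never runs, so the cost of the light handling drops to a single check per step.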

Thanks,
coast



Title: Re: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching
Post by: Buddhi on August 02, 2016, 11:26:17 PM
Thanks to your enormous animation I have found and fixed another bug. The program allocated too much memory when a flight animation was loaded and previews were used. Now it uses about 20% of its former memory. I have also improved the refresh speed of the animation table.