END OF AN ERA, FRACTALFORUMS.COM IS CONTINUED ON FRACTALFORUMS.ORG

Author Topic: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching  (Read 845 times)
mancoast
Alien
***
Posts: 21


« on: July 31, 2016, 09:43:29 PM »

Greetings,

I am curious about the Volumetric Shader.
It appears that cLights::GetLight is causing some sort of bottleneck in execution.
Please consider the data below.


This first screenshot shows a tree view summary of CPU usage by function.
99.5% of total execution cycles are spent within the scope of cRenderWorker::doWork.
In this particular hotspots view in VTune, percentages are shown in the GUI.

 
We also know that 99.4% of time is spent executing cRenderWorker::RayRecursion.
One level deeper inside RayRecursion, we see 24.4% of time allocated to cRenderWorker::RayMarching.
Also inside RayRecursion, we see 68.2% of time allocated to cRenderWorker::VolumetricShader.

 
The source code view shows the same 24.4% of time allocated to cRenderWorker::RayMarching.

 
The source code view also shows the same 68.2% of time allocated to cRenderWorker::VolumetricShader.

 
This leads me to believe that most of the time is spent in VolumetricShader.
Within cRenderWorker::VolumetricShader, two loops each retire billions of instructions.

 
The loop at line 349 of VolumetricShader contains a call to cLights::GetLight.
This specific call retires approximately 25% of all CPU instructions.

 
The loop at line 375 of VolumetricShader also contains a call to cLights::GetLight.
This second call also retires approximately 25% of all CPU instructions.

 
Together, it appears that over 50% of all CPU instructions are retired by calls to cLights::GetLight.

 

Why are these calls to cLights::GetLight consuming so many CPU cycles?
Any suggestions for optimization?

Thanks,
coast
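
One general approach to a hot spot of this shape, sketched here in C++ under loud assumptions, is to move the light lookup out of the per-step loop. Light, LightSet, getLight, accumulateNaive and accumulateHoisted below are hypothetical stand-ins and not the Mandelbulber classes; the sketch also assumes the light data does not change between marching steps, which may not hold in the real shader. The call counter only exists to make the difference visible.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; NOT the Mandelbulber types.
struct Light {
    bool enabled;
    double intensity;
};

class LightSet {
public:
    explicit LightSet(std::vector<Light> lights) : lights_(std::move(lights)) {}

    // Returns a copy of light i and counts the call so both loop shapes
    // can be compared.
    Light getLight(std::size_t i) const {
        assert(i < lights_.size());
        ++calls_;
        return lights_[i];
    }
    std::size_t count() const { return lights_.size(); }
    std::size_t calls() const { return calls_; }

private:
    std::vector<Light> lights_;
    mutable std::size_t calls_ = 0;
};

// Shape similar to the profiled loops: one lookup per light per step.
double accumulateNaive(const LightSet& set, int steps) {
    double total = 0.0;
    for (int s = 0; s < steps; ++s) {
        for (std::size_t i = 0; i < set.count(); ++i) {
            const Light light = set.getLight(i);
            if (light.enabled) total += light.intensity;
        }
    }
    return total;
}

// Hoisted variant: fetch the enabled lights once, then run the step loop
// over the cached copies only.
double accumulateHoisted(const LightSet& set, int steps) {
    std::vector<Light> active;
    for (std::size_t i = 0; i < set.count(); ++i) {
        const Light light = set.getLight(i);
        if (light.enabled) active.push_back(light);
    }
    double total = 0.0;
    for (int s = 0; s < steps; ++s) {
        for (const Light& light : active) total += light.intensity;
    }
    return total;
}
```

With N lights and S steps, the naive shape performs N*S lookups while the hoisted shape performs N. Whether this applies to the real code depends on what cLights::GetLight actually does per call, which the profile alone does not show.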










Buddhi
Fractal Iambus
***
Posts: 895



« Reply #1 on: July 31, 2016, 10:43:00 PM »

Could you attach the settings you used for testing?

mancoast
Alien
***
Posts: 21


« Reply #2 on: July 31, 2016, 10:59:03 PM »

The file is 6 MB, too big to attach, so here's the link:

https://github.com/mancoast/mandelbulber2/raw/k1om/_menger-coastn_anim.fract
taurus
Fractal Supremo
*****
Posts: 1175



« Reply #3 on: August 01, 2016, 11:21:44 AM »


A quick newbie question in between: did you ever render this animation? Depending on the fps, it should be around 20 minutes long. I wonder whether all the content below [frames] was necessary to show what you mean.

when life offers you a lemon, get yourself some salt and tequila!
Buddhi
Fractal Iambus
***
Posts: 895



« Reply #4 on: August 01, 2016, 09:40:21 PM »


It's a HUGE test case! Did you render the whole animation for benchmarking, or only one frame? I'm asking because I would like to do similar profiling using valgrind and then look into why cLights::GetLight() was highlighted here.

mancoast
Alien
***
Posts: 21


« Reply #5 on: August 01, 2016, 11:42:46 PM »

Hello,

This render is not yet completed.
For VTune I render a randomly selected frame.
This keeps the results fresh with different samples.

After a VTune run and code changes, I render overnight on the servers to get frame-to-frame time differences.

As of now, it's about 50%.

I am excited to test your latest commit with the isAnyLight modification.

Thanks,
coast
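
The commit itself is not shown in this thread, so the following is only a guess at the general idea the name isAnyLight suggests: decide once whether any light is enabled, and skip the per-step light loop when none is. AuxLight, anyLightEnabled and marchVolume are hypothetical names for this C++ sketch and are not taken from the Mandelbulber source.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical light record; NOT the Mandelbulber data layout.
struct AuxLight {
    bool enabled;
    double intensity;
};

// Meant to be evaluated once (for example per frame), not per marching step.
bool anyLightEnabled(const std::vector<AuxLight>& lights) {
    return std::any_of(lights.begin(), lights.end(),
                       [](const AuxLight& l) { return l.enabled; });
}

// Accumulates light intensity over `steps` marching steps.
// lightLoopIterations reports how many inner-loop iterations actually ran.
double marchVolume(const std::vector<AuxLight>& lights, int steps,
                   long& lightLoopIterations) {
    assert(steps >= 0);
    lightLoopIterations = 0;
    const bool isAnyLight = anyLightEnabled(lights);  // assumed guard flag
    double total = 0.0;
    for (int s = 0; s < steps; ++s) {
        // Other per-step work (fog, glow, ...) would still run here.
        if (!isAnyLight) continue;  // skip the light loop entirely
        for (const AuxLight& l : lights) {
            ++lightLoopIterations;
            if (l.enabled) total += l.intensity;
        }
    }
    return total;
}
```

A guard like this only helps scenes where every light is disabled; when at least one light is on, the inner loop still runs for every step.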

Buddhi
Fractal Iambus
***
Posts: 895



« Reply #6 on: August 02, 2016, 11:26:17 PM »

Thanks to your enormous animation I have found and fixed another bug. The program allocated too much memory when a flight animation was loaded and previews were used. Now it uses about 20% of the former memory. I have also improved the refresh speed of the animation table.
