Author Topic: VTune Results; cRenderWorker::RayRecursion; VolumetricShader vs. RayMarching  (Read 523 times)
mancoast
Forums Freshman
**
Posts: 17


« on: July 31, 2016, 09:43:29 PM »

Greetings,

I am curious about the Volumetric Shader.
It appears that cLights::GetLight is causing some sort of bottleneck in execution.
Please consider the data below.


This first screenshot shows a tree-view summary of CPU usage by function.
99.5% of total execution cycles are spent within the scope of cRenderWorker::doWork.
For this particular hotspots view in VTune, percentages are shown in the GUI.

 
We also know that 99.4% of time is spent executing cRenderWorker::RayRecursion.
One level deeper inside RayRecursion, we see 24.4% of time allocated to cRenderWorker::RayMarching.
Also inside RayRecursion, we see 68.2% of time allocated to cRenderWorker::VolumetricShader.

 
In the source code view, we see 24.4% of time allocated to cRenderWorker::RayMarching.

 
In the source code view, we see 68.2% of time allocated to cRenderWorker::VolumetricShader.

 
This leads me to believe that we are spending most of our time in VolumetricShader.
Within cRenderWorker::VolumetricShader there are two loops retiring billions of instructions to self.

 
The loop at line 349 of VolumetricShader contains function call to cLights::GetLight.

 
Also, the loop at line 375 of VolumetricShader contains function call to cLights::GetLight.

 
It appears that over 50% of all CPU instructions are retired by function calls to cLights::GetLight.

 
The loop at line 349 of VolumetricShader contains function call to cLights::GetLight.
This specific call to cLights::GetLight retires approximately 25% of all CPU instructions.

 
Also, the loop at line 375 of VolumetricShader contains function call to cLights::GetLight.
This other call to cLights::GetLight also retires approximately 25% of all CPU instructions.
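The pattern described above, a lookup repeated inside a per-step loop, can be sketched in isolation. This is only a hypothetical illustration: SketchLight, SketchLights and both Accumulate functions are invented names, not Mandelbulber's classes, and the real signature and cost of cLights::GetLight are not shown in this thread. The sketch assumes the lights do not change while a ray is marched, so the lookups can be done once before the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a light record; not Mandelbulber's actual type.
struct SketchLight {
    double intensity;
    bool enabled;
};

// Hypothetical container that counts lookups so both variants can be compared.
class SketchLights {
public:
    explicit SketchLights(std::vector<SketchLight> lights) : lights_(std::move(lights)) {}

    // Returns a copy, as a getter returning by value would.
    SketchLight GetLight(std::size_t i) const {
        ++lookups_;
        return lights_[i];
    }
    std::size_t Count() const { return lights_.size(); }
    std::size_t Lookups() const { return lookups_; }

private:
    std::vector<SketchLight> lights_;
    mutable std::size_t lookups_ = 0;
};

// Variant A: every light is looked up again at every marching step.
double AccumulatePerStep(const SketchLights& lights, int steps) {
    assert(steps >= 0);
    double sum = 0.0;
    for (int s = 0; s < steps; ++s) {
        for (std::size_t i = 0; i < lights.Count(); ++i) {
            SketchLight light = lights.GetLight(i);
            if (light.enabled) sum += light.intensity;
        }
    }
    return sum;
}

// Variant B: enabled lights are gathered once, then reused at every step.
double AccumulateHoisted(const SketchLights& lights, int steps) {
    assert(steps >= 0);
    std::vector<SketchLight> active;
    for (std::size_t i = 0; i < lights.Count(); ++i) {
        SketchLight light = lights.GetLight(i);
        if (light.enabled) active.push_back(light);
    }
    double sum = 0.0;
    for (int s = 0; s < steps; ++s) {
        for (const SketchLight& light : active) sum += light.intensity;
    }
    return sum;
}
```

With three lights and 100 steps, variant A performs 300 lookups and variant B performs 3, while both return the same sum. Whether this helps in the real renderer depends on what GetLight actually does per call.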

 

Why are these calls to cLights::GetLight consuming so many CPU cycles?
Any suggestions for optimization?

Thanks,
coast










Buddhi
Fractal Iambus
***
Posts: 863



« Reply #1 on: July 31, 2016, 10:43:00 PM »

Could you attach settings which you used for testing?

mancoast
Forums Freshman
**
Posts: 17


« Reply #2 on: July 31, 2016, 10:59:03 PM »

At 6 MB the file is too big to attach, so here's the link:

https://github.com/mancoast/mandelbulber2/raw/k1om/_menger-coastn_anim.fract
taurus
Fractal Supremo
*****
Posts: 1144



« Reply #3 on: August 01, 2016, 11:21:44 AM »


A small noob question in between: did you ever render this animation? Depending on the fps, it should run around 20 minutes. I wonder whether everything below [frames] was necessary to show what you mean.

when life offers you a lemon, get yourself some salt and tequila!
Buddhi
Fractal Iambus
***
Posts: 863



« Reply #4 on: August 01, 2016, 09:40:21 PM »


It's a HUGE test case! Did you render the whole animation for benchmarking, or only one frame? I'm asking because I would like to do similar profiling using valgrind and then start looking into why cLights::GetLight() was highlighted here.

mancoast
Forums Freshman
**
Posts: 17


« Reply #5 on: August 01, 2016, 11:42:46 PM »

Hello,

This render is not yet completed.
For VTune I render a randomly selected frame.
This keeps the results fresh with different samples.

After VTune runs and code changes, I run overnight on the servers to get frame-to-frame time differences.

As of now, it's about 50%.

I am excited to test your latest commit with the isAnyLight modification.
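The commit itself is not shown in this thread, so the following is only a guess at the general idea behind a flag with that name: decide once whether any light is enabled, then skip the per-step light loop entirely when none is. AuxLightSketch, IsAnyLightEnabled and MarchVolume are hypothetical names made up for this sketch, not Mandelbulber code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical light record; not Mandelbulber's actual type.
struct AuxLightSketch {
    double intensity;
    bool enabled;
};

// Evaluated once, before marching starts, instead of at every step.
bool IsAnyLightEnabled(const std::vector<AuxLightSketch>& lights) {
    return std::any_of(lights.begin(), lights.end(),
                       [](const AuxLightSketch& l) { return l.enabled; });
}

// Marches a fixed number of steps; the light loop runs only if anyLight is true.
// lightLoopsRun counts how many steps actually entered the light loop.
double MarchVolume(const std::vector<AuxLightSketch>& lights, bool anyLight,
                   int steps, int* lightLoopsRun) {
    assert(lightLoopsRun != nullptr);
    double sum = 0.0;
    for (int s = 0; s < steps; ++s) {
        if (!anyLight) continue;  // nothing can contribute, skip the light loop
        ++*lightLoopsRun;
        for (const AuxLightSketch& l : lights) {
            if (l.enabled) sum += l.intensity;
        }
    }
    return sum;
}
```

A caller would compute the flag once per render, for example `bool anyLight = IsAnyLightEnabled(lights);`, and pass it to every MarchVolume call.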

Thanks,
coast

Buddhi
Fractal Iambus
***
Posts: 863



« Reply #6 on: August 02, 2016, 11:26:17 PM »

Thanks to your enormous animation I have found and fixed another bug. The program allocated too much memory when a flight animation was loaded and previews were used. Now it uses about 20% of the former amount of memory. I have also improved the speed of refreshing the animation table.