VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

mancoast

Alien

Posts: 21

« on: July 27, 2016, 10:05:26 PM »

Hello,

Please review these VTune results.
It appears that the application reaches 100% CPU usage, but with a few costly mutex locks.
The ray marcher is implemented using Qt threads. There is room for optimization, but I am not yet familiar with the codebase.
Please consider starting a discussion that enumerates the requirements for moving to OpenMP or TBB.
I'd be more than happy to take on this development work, but I need direction and specifics.

Thanks,
coast



Buddhi

Fractal Iambus

Posts: 895

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #1 on: July 28, 2016, 10:51:25 PM »

Thanks for this analysis. It's very interesting. It looks like Mandelbulber uses threads in a very efficient way. It's difficult to find areas for significant improvement here (correct me if I'm wrong, because you wrote that you can see room for optimization. Where?).
However, I can't understand some of the numbers in this report. Why does the Random() function show about 6000 s of CPU time when the Compute() function, which is the most CPU-consuming, does not appear in the report at all?
What is the CPI rate?
By the way, I don't know this tool, so my interpretation of the data could be wrong.

As for OpenMP, it's already used for the Depth of Field calculation and for updating the image preview (scaling the image with interpolation). It's difficult to find other places in the program where it could be used, because in those places I use QThreads, which are much more efficient. If you see a place where this kind of optimization would help, please let me know.
In the future I'm going to implement GPU support, as in Mandelbulber 1.21.



My fractal gallery: http://krzysztofmarczak.deviantart.com
fractal animations: http://www.youtube.com/user/xlace
Mandelbulber program: https://github.com/buddhi1980/mandelbulber2

mancoast

Alien

Posts: 21

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #2 on: July 30, 2016, 03:26:32 AM »

Hello buddhi,

The report shows a lot of time spent in lock and unlock.
I think this probably means there is a mutex somewhere in the ray marching algorithm.

This report is from a run on the 240 threads of a Xeon Phi coprocessor.

Thanks,
Coast



Buddhi

Fractal Iambus

Posts: 895

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #3 on: July 30, 2016, 09:30:20 AM »

The only place where I used a mutex intentionally is the cScheduler class:
https://github.com/buddhi1980/mandelbulber2/blob/master/mandelbulber2/src/scheduler.cpp

In the cScheduler::NextLine() and cScheduler::UpdateDoneLines() functions I use QMutex, because there is one common scheduler for all rendering threads. The mutex is needed in the parts that decide which line should be rendered next.
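
For readers following along, the locking pattern described above can be sketched roughly as follows. This is not the actual cScheduler code: the class name LineScheduler and its members are hypothetical, and std::mutex stands in for QMutex so that the sketch compiles without Qt.

```cpp
#include <mutex>
#include <vector>

// Hypothetical stand-in for a scheduler shared by all render threads.
// Not the real cScheduler; std::mutex is used here in place of QMutex.
class LineScheduler
{
public:
	explicit LineScheduler(int lineCount) : taken(lineCount, false), next(0) {}

	// Returns the index of the next line to render, or -1 when none are left.
	// The lock makes "pick a line and mark it as taken" one indivisible step,
	// so two threads can never be handed the same line.
	int NextLine()
	{
		std::lock_guard<std::mutex> lock(mutex);
		const int count = static_cast<int>(taken.size());
		while (next < count && taken[next]) ++next; // skip lines already handed out
		if (next >= count) return -1;
		taken[next] = true;
		return next++;
	}

private:
	std::mutex mutex;
	std::vector<bool> taken;
	int next;
};
```

In a sketch like this the lock is taken only once per image line, so the cost is paid per line rather than per pixel or per ray step.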

The report also shows the Random() function, which uses the rand() function from the C library. Does this mean that rand() uses mutex locks?



Buddhi

Fractal Iambus

Posts: 895

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #4 on: July 30, 2016, 09:45:37 AM »

I have checked how much this Random() function slows down rendering. It's 10%! I can now see in the glibc source code that rand() calls the __libc_lock_lock() function. I need to use my own simple rand() function, which doesn't need to be accurate or thread-safe.
Now I see real benefits from your investigation. Thanks a lot!
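
As an illustration of the idea, a simple generator that avoids the lock could keep its state in an object owned by each render thread. This is only a sketch under assumptions: the class name SimpleRandom, the Random(int max) signature and the inclusive [0, max] range are hypothetical and are not taken from the Mandelbulber source.

```cpp
#include <cstdint>

// Hypothetical lock-free replacement for rand(). Each thread owns its own
// instance, so no state is shared and no mutex is needed.
class SimpleRandom
{
public:
	explicit SimpleRandom(std::uint32_t seed) : state(seed) {}

	// One linear congruential step (constants from Numerical Recipes).
	// Arithmetic wraps modulo 2^32 because the state is unsigned.
	std::uint32_t Next()
	{
		state = state * 1664525u + 1013904223u;
		return state;
	}

	// Pseudo-random integer in [0, max], assuming max >= 0.
	// The low bits of an LCG are weak, so they are shifted away first.
	int Random(int max)
	{
		const std::uint32_t range = static_cast<std::uint32_t>(max) + 1u;
		return static_cast<int>((Next() >> 8) % range);
	}

private:
	std::uint32_t state;
};
```

Seeding each thread's instance differently (for example with the thread index) keeps the threads from producing identical sequences. The quality is far below a proper generator, which matches the stated requirement that it does not need to be accurate.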



mancoast

Alien

Posts: 21

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #5 on: July 31, 2016, 02:29:56 AM »

Hello again buddhi,

I'm very happy that the report is useful.

I'm still curious whether OpenMP or TBB is possible for the ray marching algorithm.
Perhaps, with the trend toward higher core counts, these frameworks offer additional value.
OpenMP and TBB offer a future-proof, highly parallel foundation.

Thanks,
coast



quaz0r

Fractal Molossus

Posts: 652

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #6 on: July 31, 2016, 04:34:31 AM »

openmp is a kludgy hack. if you are interested in next-level parallelism and future-proofing why not go all the way and take a look at coroutines or a nice hpc framework built on coroutines like hpx



Buddhi

Fractal Iambus

Posts: 895

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #7 on: July 31, 2016, 10:02:12 AM »

Quote from: quaz0r on July 31, 2016, 04:34:31 AM

openmp is a kludgy hack. if you are interested in next-level parallelism and future-proofing why not go all the way and take a look at coroutines or a nice hpc framework built on coroutines like hpx

@quaz0r, I agree with you. OpenMP is mostly for lazy programmers who want to get the effect of parallelism just by adding one line to the code. In many cases OpenMP is not efficient and has many limitations. One of the biggest limitations for me is that OpenMP doesn't work over the network. My scheduling algorithm allows tasks to be shared between many computers in a very efficient way.

@mancoast, I was a lazy programmer in one piece of the code: the DOF algorithm. This is the place where I use OpenMP. It doesn't utilize all CPU cores the way I want. It reaches no more than 70% CPU load. This is the place where you can look and make a better implementation of parallelism.
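
One thing that could be tried there is a dynamic OpenMP schedule. The thread does not establish why the DOF loop stops at 70% load, so the sketch below rests on an assumption: that iterations have uneven cost and the default static split leaves some threads idle. The function BlurRows and its variable-radius averaging are hypothetical and are not the Mandelbulber DOF code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a DOF pass: each element is averaged over a
// neighbourhood whose radius varies, so iterations have uneven cost.
std::vector<double> BlurRows(const std::vector<double> &rows)
{
	const int count = static_cast<int>(rows.size());
	std::vector<double> out(rows.size());

// schedule(dynamic, 4) hands out chunks of 4 iterations on demand, so a
// thread that finishes cheap iterations early picks up more work.
// Without -fopenmp (or /openmp) the pragma is ignored and the loop is serial.
#pragma omp parallel for schedule(dynamic, 4)
	for (int i = 0; i < count; ++i)
	{
		const int radius = i % 16; // uneven work per iteration
		const int first = std::max(0, i - radius);
		const int last = std::min(count - 1, i + radius);
		double sum = 0.0;
		for (int j = first; j <= last; ++j) sum += rows[j];
		out[i] = sum / (last - first + 1);
	}
	return out;
}
```

Each iteration writes only its own out[i], so no locking is needed inside the loop. Whether this helps the real DOF code would have to be measured, for example with another VTune run.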



mancoast

Alien

Posts: 21

Re: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks

« Reply #8 on: July 31, 2016, 02:57:32 PM »

Thanks for the pointers.
I will investigate the DOF algorithm and see what I can find.



	Author	Topic: VTune Results; cRenderWorker::RayMarching -> OpenMP or Threading Building Blocks (Read 4098 times)
		Description: we are spending much time in spin lock with threads > 100

END OF AN ERA, FRACTALFORUMS.COM IS CONTINUED ON FRACTALFORUMS.ORG

it was a great time but no longer maintainable by c.Kleinhuis contact him for any data retrieval,
thanks and see you perhaps in 10 years again

The All New FractalForums is now in Public Beta Testing! Visit FractalForums.org and check it out!
