Author Topic: cluster scalability problems (24 + 6 cores)  (Read 3932 times)
ker2x
« on: September 19, 2010, 02:42:57 PM »

Hi!
I'm running the Ultra Fractal client on my local computer (4 cores), connected to a 24-core server (running Wine) and a 6-core server (running Wine).
I launched 24 + 6 nodes from my client and started a 10000x7000 3D rendering. The CPU usage on the 24-core server rarely goes over 200%, instead of being close to 2400%.

What's happening?

(Fiber-optic network on both sides; the ping is ~20 ms, from Toulouse to Paris.)


often times... there are other approaches which are kinda crappy until you put them in the context of parallel machines
(en) http://www.blog-gpgpu.com/ , (fr) http://www.keru.org/ ,
Sysadmin & DBA @ http://www.over-blog.com/
bib
« Reply #1 on: September 19, 2010, 03:16:03 PM »

Although I don't have access to such a setup, I have also seen that behaviour on my local machines, depending on the fractal formula. Have you tested it for example with a basic Mandelbrot zoom?

Between order and disorder reigns a delicious moment. (Paul Valéry)
ker2x
« Reply #2 on: September 19, 2010, 04:21:09 PM »

I found that the nodes are computing very tiny blocks, usually 1 or 2 s of computing per block, which is really too small.
I decided to run the UF5 client directly on the 24-core monster and forget about remote nodes. Much faster, and 2100% CPU \o/

10000x7000 Mandelbox in progress... 22h remaining.
Next I'll try with Mandelbulber 3D, if possible.
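A rough way to reason about the numbers above, as a sketch only and not a description of how UF5 really schedules work: assume the client hands out blocks one at a time, and each hand-off (send the job, wait, collect the result) costs a fixed amount of wall-clock time. The names `busy_cores`, `required_block_seconds` and `dispatch_seconds` are made up for this illustration, and the 0.75 s figure is a hypothetical value picked only because it reproduces roughly 200% CPU with 1.5 s blocks.

```python
def busy_cores(cores, block_seconds, dispatch_seconds):
    """Average number of busy cores under a serial-dispatch model.

    Assumption (not confirmed for UF5): a single client needs
    `dispatch_seconds` of wall-clock time per block and handles one block
    at a time, so it can feed at most block_seconds / dispatch_seconds cores.
    """
    if dispatch_seconds <= 0:
        return float(cores)
    return min(float(cores), block_seconds / dispatch_seconds)


def required_block_seconds(cores, dispatch_seconds):
    """Smallest per-block compute time that keeps every core busy in this model."""
    return cores * dispatch_seconds


if __name__ == "__main__":
    # Hypothetical inputs: 1.5 s blocks (the observed 1 to 2 s range)
    # and an assumed 0.75 s hand-off cost per block.
    print(busy_cores(24, 1.5, 0.75))
    print(required_block_seconds(24, 0.75))
```

Under these assumptions the blocks would need to be roughly an order of magnitude larger before all 24 remote cores stay busy, which fits the observation that blocks of 1 or 2 s are too small for remote nodes.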
ker2x
« Reply #3 on: September 19, 2010, 04:49:38 PM »

I will be able to play with 216 cores (yes, 9x24 cores) for a few days... too bad it won't scale :(

Edit: just for the fun of it:


Cpu0  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 99.7%us,  0.3%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  : 99.7%us,  0.0%sy,  0.0%ni,  0.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 97.3%us,  0.0%sy,  0.0%ni,  2.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 :  0.6%us,  0.6%sy,  0.0%ni, 98.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 :100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
« Last Edit: September 19, 2010, 05:11:04 PM by ker2x »
David Makin
« Reply #4 on: September 19, 2010, 08:24:49 PM »

I think UF5's network capabilities were designed with a local network in mind, not for going via the internet. Hence the blocks handed to each thread are small; this is fine over a local network but subject to relatively large delays over the internet.
Also, I seem to recall that network rendering will proceed at the rate of the *slowest* machine on the network. So if one computer has 24 cores and another networked one has just 6, then it's likely that only 6 will be used on the 24-core system. Likewise, if I networked my P4HT with a quad-core system, even locally, the best speed I would get is double the speed of my P4HT, which is actually slower than the quad-core on its own.
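The "slowest machine" effect recalled above can be sketched with a toy model. This assumes, without confirmation from the UF5 documentation, that every machine receives an equal share of the image and the render ends when the slowest share is done. The function names and the relative speeds are hypothetical; the quad-core is assumed to be three times as fast as the P4HT purely for illustration.

```python
def static_split_rate(speeds):
    """Overall render rate when each machine gets an equal share of the work.

    The job finishes when the slowest machine finishes, so the effective
    rate is (number of machines) * (slowest speed). Assumed model only.
    """
    return len(speeds) * min(speeds)


def dynamic_split_rate(speeds):
    """Overall render rate when idle machines simply pull the next block."""
    return sum(speeds)


if __name__ == "__main__":
    # Speeds in "cores" for the 24-core + 6-core setup from this thread.
    print(static_split_rate([24, 6]), dynamic_split_rate([24, 6]))
    # P4HT = 1.0, quad-core = 3.0 (hypothetical relative speeds).
    print(static_split_rate([1.0, 3.0]), dynamic_split_rate([1.0, 3.0]))
```

With an equal split, the 24 + 6 pair behaves like 12 cores instead of 30, and the P4HT plus quad-core pair ends up slower than the quad-core alone, matching the recollection above. Whether UF5 really splits work this way is not established in this thread.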

The meaning and purpose of life is to give life purpose and meaning.

http://www.fractalgallery.co.uk/
"Makin' Magic Music" on Jango
ker2x
« Reply #5 on: September 19, 2010, 11:22:49 PM »

Mmm, I will try then.
The 9x24 cores will be in the same datacenter, with a very good network :)