Title: cluster scalability problems (24 + 6 cores)
Post by: ker2x on September 19, 2010, 02:42:57 PM

Hi!

I'm running the Ultra Fractal client on my local computer (4 cores), connected to a 24-core server (running Wine) and a 6-core server (running Wine). I launched 24 + 6 nodes from my client and started a 10000x7000 3D rendering. CPU usage on the 24-core server rarely goes over 200%, instead of getting close to 2400%.

What's happening? (Fiber optic network on both sides; the ping is ~20 ms, from Toulouse to Paris.)

Title: Re: cluster scalability problems (24 + 6 cores)
Post by: bib on September 19, 2010, 03:16:03 PM

Although I don't have access to such a setup, I have also seen that behaviour on my local machines, depending on the fractal formula. Have you tested it with, for example, a basic Mandelbrot zoom?

Title: Re: cluster scalability problems (24 + 6 cores)
Post by: ker2x on September 19, 2010, 04:21:09 PM

I found that the nodes are computing very tiny blocks, usually 1 or 2 s of computing per block, which is really too small.
I decided to run the UF5 client directly on the 24-core monster and forget about remote nodes. Much faster, and 2100% CPU \o/

10000x7000 Mandelbox in progress... 22h remaining. Next I'll try with mandelbulber 3D, if possible.

Title: Re: cluster scalability problems (24 + 6 cores)
Post by: ker2x on September 19, 2010, 04:49:38 PM

I will be able to play with 216 cores (yes, 9x24 cores) for a few days... too bad it won't scale :(
Edit: just for the fun of it:

Cpu0 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 99.7%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 99.7%us, 0.0%sy, 0.0%ni, 0.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 97.3%us, 0.0%sy, 0.0%ni, 2.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 : 0.6%us, 0.6%sy, 0.0%ni, 98.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Title: Re: cluster scalability problems (24 + 6 cores)
Post by: David Makin on September 19, 2010, 08:24:49 PM

I think UF5's network capabilities were designed with a local network in mind, not going via the internet - hence the blocks handed to each thread are small. This is fine over a local network but subject to relatively large delays over the internet.

Also, I seem to recall that network rendering will proceed at the rate of the *slowest* machine on the network - so if one computer has 24 cores and another networked one has just 6, then it's likely that only 6 will be used on the 24-core system. Likewise, if I networked my P4HT with a quad-core system, even locally, then the best speed I would get is double the speed of my P4HT, which is actually slower than a quad-core on its own.

Title: Re: cluster scalability problems (24 + 6 cores)
Post by: ker2x on September 19, 2010, 11:22:49 PM

Mmm, I will try then. The 9x24 cores will be in the same datacenter with a very good network :)
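
The figures quoted in this thread (blocks of 1 to 2 s of compute, a ~20 ms ping, and about 200% CPU observed out of a possible 2400%) can be put through a rough back-of-the-envelope model. This is only a sketch under a loud assumption: each node computes one block, then sits idle for some fixed per-block overhead before the next block arrives. Nothing here reflects how Ultra Fractal actually schedules work; the function names and the model itself are hypothetical.

```python
def worker_efficiency(compute_s: float, overhead_s: float) -> float:
    """Fraction of wall time spent computing, ASSUMING each block costs
    compute_s seconds of CPU work plus overhead_s seconds of idle waiting
    that is not overlapped with computation (hypothetical model)."""
    return compute_s / (compute_s + overhead_s)


def overhead_for_efficiency(compute_s: float, efficiency: float) -> float:
    """Per-block idle time that would produce the given efficiency
    under the same assumed model (inverse of worker_efficiency)."""
    return compute_s * (1.0 - efficiency) / efficiency


if __name__ == "__main__":
    ping_s = 0.020  # ~20 ms round trip, as reported in the thread

    # If the only overhead were one round trip per block:
    for block_s in (1.0, 2.0):
        eff = worker_efficiency(block_s, ping_s)
        print(f"{block_s:.0f} s block, one round trip: {eff:.1%} busy")

    # Observed: ~200% CPU on a 24-core machine (2400% maximum).
    observed = 200.0 / 2400.0
    idle_s = overhead_for_efficiency(1.5, observed)
    print(f"observed {observed:.1%} busy needs ~{idle_s:.1f} s idle per 1.5 s block")
```

If this simple model were right, a single 20 ms round trip per block would cost only a percent or two, so the per-block cost behind the observed numbers would have to be far larger than the ping alone (for example several round trips, result transfer, or blocks being handed out one at a time). The thread does not establish which of these, if any, applies.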