Topic: cpu performance
quaz0r
« on: October 16, 2016, 05:43:37 PM »

(my cpu is an intel haswell 4770k)

so i was testing my mandelbrot program yesterday and was experiencing some intermittent shockingly-fast performance, seemingly without rhyme or reason.  it would go really fast for a little while then start going slower.  the instances where it would go really fast, i could hear my cpu fan spin up rather loud (it usually doesnt spin up at all).  in fact, this was the first time ive heard my new cpu fan spin up at all since i got this monstrous new HSF a few months ago.  i just figured the heatsink is so big maybe the fan doesnt need to spin up much (it IS spinning though; you can see it is always spinning at a low setting), though it seemed kind of strange as i would still expect the fan to spin up while under load running mandelbrot renders all the time and such.  anyway the whole thing has me thinking, and wondering what exactly is going on, and what if anything might be able to be better configured to coax out that high performance.

the difference in performance is indeed rather shocking too:  one of the times that it went fast, the render completed in 27 seconds.  so i started re-running the same exact render to see what kind of performance difference i might notice:  the second time it took 2m33s.  the third time it took 3m0s.  the fourth time it took 2m0s.  then on successive tries after that, it would seemingly randomly take either 2m, 2.5m, or 3m.  and i know my code is taking the same path and doing the same exact thing each time too, so its not just that my program is doing something different.  i even have lots of timers and progress bars for all the different stages of computations:  for reference iteration, delta initialization, perturbation iteration, and rendering.  and you can clearly see each stage uniformly going faster or slower across the board.

that is upward of a 6x performance difference!  that is positively insane.  i simply cant picture normal expected operating behavior of either turbo boosting or throttling accounting for that huge of a difference.
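
As a quick check of that figure, the reported timings can be turned into slowdown ratios against the 27-second run. This is only a sketch: `slowdown` is a hypothetical helper name, and the inputs are the times quoted above converted to seconds.

```shell
# slowdown FAST SLOW: print how many times slower SLOW is than FAST.
# Both arguments are wall-clock times in seconds.
slowdown() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

slowdown 27 153   # 2m33s run, prints 5.7
slowdown 27 180   # 3m0s run,  prints 6.7
slowdown 27 120   # 2m0s run,  prints 4.4
```

So the slowest run is roughly 6.7x slower than the fast one, and even the best of the slow runs is about 4.4x slower.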

possibly related (my program is explicitly vectorized), possibly not:  recently i noticed a discussion amongst some HPC people discussing some of the black-box voodoo that goes on inside modern cpus nowadays, specifically intel chips with avx.  one interesting thing they mentioned was that when the cpu first starts getting hit with avx instructions, it will emulate them for a short while before it actually fires up the vector units.  a more concerning thing mentioned was the existence of avx-specific throttling.  googling for avx throttling finds some discussions related to xeon chips, and i found some stuff claiming that avx throttling isnt supposed to happen on desktop chips.  not really sure what to make of this.

so anyhow, does anybody have any experience with / thoughts about any of this?  not only is it maddening to think there could be that much latent performance in my system that i am rarely if ever accessing, it also makes it kind of a ridiculous prospect to work on optimizing your code when your system performance can vary so wildly.  i dont know how i am even supposed to know when my code is better or worse under these conditions...

3dickulus
« Reply #1 on: October 16, 2016, 07:45:40 PM »

Can you run your program w/o desktop and gui, dos/console only, sending timing info to log or console?
Can/do you run a warm-up before starting the render proper? By that I mean a short loop using some AVX instructions to set the "state" for rendering.
I found a similar effect when hacking SFT to C, it ran faster after a quick "dummy warm up" run, only by about 10%.
Your >6X definitely warrants some investigation.

claude
« Reply #2 on: October 17, 2016, 08:04:43 PM »

I suspect something to do with threading affecting the CPU scaling governor operation.  When benchmarking or desirous of fan noise, I run (bash, Linux)
Code:
for core in 0 1 2 3 ; do sudo cpufreq-set -c $core -g performance ; done
Afterwards I run the same command with "ondemand" instead of "performance".  Needs modifying for more / fewer cores too.
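
One way to avoid editing the core list by hand is to generate the commands from the core count. The following is a minimal sketch, not a tested tool: `governor_cmds` is a hypothetical helper, it assumes `nproc` (GNU coreutils) is available, and it only prints the `cpufreq-set` commands rather than running them.

```shell
# governor_cmds GOVERNOR [NCORES]: print one cpufreq-set command per core.
# NCORES defaults to the number of processing units reported by nproc.
# Nothing is executed here; the output is a dry run.
governor_cmds() {
    gov=$1
    n=${2:-$(nproc)}
    core=0
    while [ "$core" -lt "$n" ]; do
        printf 'sudo cpufreq-set -c %s -g %s\n' "$core" "$gov"
        core=$((core + 1))
    done
}

# Dry run for a 4-core machine, matching the loop above.
governor_cmds performance 4
```

To actually apply the setting, the printed commands could be piped to a shell, for example `governor_cmds performance | sh`, and the same call with `ondemand` would switch back afterwards. This only works on a kernel built with cpufreq support.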

skychurch
« Reply #3 on: October 18, 2016, 02:02:17 AM »

Are you employing masm to process via the ymm(N) registers, or are you using intrinsics or some other method?

quaz0r
« Reply #4 on: October 18, 2016, 03:21:18 AM »

Quote from: 3dickulus
Your >6X definitely warrants some investigation.

Quote from: claude
I suspect something to do with threading affecting the CPU scaling governor operation

yeah i dont even build the powersave stuff into my kernel to try to make sure it just runs at the max all the time... need to see what goofy stuff the bios might be doing next time i reboot

Quote from: skychurch
Are you employing masm to process via the ymm(N) registers, or are you using intrinsics or some other method?

its a c++ library that employs intrinsics

hobold
« Reply #5 on: October 18, 2016, 05:04:53 PM »

Quote from: quaz0r
yeah i dont even build the powersave stuff into my kernel to try to make sure it just runs at the max all the time...

In that case the CPU still has the freedom to throttle its clock when it detects overheating. Is the fan / cooler / thermal paste still in place and working as intended?

quaz0r
« Reply #6 on: October 20, 2016, 08:41:38 PM »

i think maybe my cpu is constantly keeping itself throttled to 800mhz...  my bios always says "current speed 800mhz" but i always assumed that the bios is set to only run at the minimum speed because the bios doesnt really need to run at lightning speed or anything.  now im wondering if i was wrong about that.  my /proc/cpuinfo always reports the standard 3500mhz but i wondered if maybe it is just reporting what its set to be and not what it really is running at, especially since i dont have any of the performance setting/monitoring stuff enabled in my kernel.  so i rebooted to a linux usbkey which will have all that stuff enabled to see if the /proc/cpuinfo is different.  indeed it reads 800mhz on all cores.  i tried running a multithreaded pi sieve while on the linux usb key system and the /proc/cpuinfo never changes from 800mhz even though top shows all cores at 100% with the pi thing running.  also like i say my cpu fan never spins up even under load, which would make sense if its always stuck at 800mhz.  ive been looking at all my bios options and trying different stuff for hours and i cant make any progress.  googling for this it sounds like people run some windows program called throttlestop which i guess disables the cpu's internal throttling or something.  not sure how to do it from linux, or why i need to do it at all, or if thats even my problem, or what to do next...

my bios shows the cpu temperature at or below 40 degrees so its not like it is actually burning up or something..
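
The /proc/cpuinfo check described above can be scripted so the per-core clock is easy to watch while a load is running. A minimal sketch, assuming an x86 Linux system whose /proc/cpuinfo has `cpu MHz` lines; `cpu_mhz` is a hypothetical helper name.

```shell
# cpu_mhz [FILE]: print the "cpu MHz" value of every core, one per line.
# Reads cpuinfo-style text from FILE, or from standard input if no file
# is given.
cpu_mhz() {
    awk -F': *' '/^cpu MHz/ { print $2 }' "$@"
}

# Show the current per-core clocks if this system exposes them.
if [ -r /proc/cpuinfo ]; then
    cpu_mhz /proc/cpuinfo
fi
```

Running it repeatedly while all cores are busy should show whether the clock ever leaves 800 MHz. On a kernel without frequency scaling support the reported value may just be the nominal speed, as suspected above.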

quaz0r
« Reply #7 on: October 20, 2016, 09:30:06 PM »

ok after reading more posts from people discussing this BD PROCHOT thing and this windows tool called throttlestop, i finally found a post where they mention how to read and write this crap from linux:  msr-tools

i installed msr-tools, did what they said to check this register or whatever it is:  rdmsr 0x1FC

it reported 0x5d, whatever that means.  i did:  wrmsr 0x1FC 0

to set it to zero, and now my computer runs at full speed!
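
Writing 0 clears every bit in the register, not only the throttling one. A more conservative variant would clear a single bit and keep the rest. The sketch below assumes that BD PROCHOT is controlled by bit 0 of MSR 0x1FC, which is how it is commonly described for Intel chips of this generation but is not confirmed anywhere in this thread; `clear_bit0` is a hypothetical helper name.

```shell
# clear_bit0 VALUE: print VALUE with bit 0 cleared and all other bits kept.
# VALUE may be decimal or 0x-prefixed hex; the result is printed as hex.
clear_bit0() {
    printf '0x%x\n' $(( $1 & ~1 ))
}

clear_bit0 0x5d   # the value reported above, prints 0x5c

# Hypothetical use with msr-tools (needs root and the msr kernel module;
# rdmsr prints hex without a 0x prefix by default):
#   new=$(clear_bit0 "0x$(rdmsr 0x1FC)")
#   wrmsr 0x1FC "$new"
```

The setting is not persistent, so it would have to be reapplied after each reboot, for example from a startup script.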

i cant believe i have this badass system and its been running like a pentium2 this whole god damn time.  what in the serious motherloving fluff.  i dont know if its been running like this the whole time ive had this system or if its a more recent development, but i am both elated to have a system that is like 10x faster than i am used to and so god damn furious that this was even happening... ffs