Topic: Intel Sandy Bridge and AVX extension
ker2x
« on: April 14, 2010, 04:23:19 PM »


To be released in Q1 2011.
Lots of shiny things, including:
The size of the SIMD vector registers is increased from 128-bit XMM registers to 256-bit registers called YMM0-YMM15. Existing 128-bit instructions use the lower half of the YMM registers. Further extensions to 512 or 1024 bits are expected in the future.

woooooooooooooooooooooohooooooooooooooooo \o/

often times... there are other approaches which are kinda crappy until you put them in the context of parallel machines
(en) http://www.blog-gpgpu.com/ , (fr) http://www.keru.org/ ,
Sysadmin & DBA @ http://www.over-blog.com/
hobold
« Reply #1 on: April 14, 2010, 04:35:43 PM »

Beware: the first hardware implementations are unlikely to have full-width SIMD ALUs. The 256-bit vectors will probably be processed as two 128-bit halves, either occupying two (simple/integer) vector ALUs simultaneously, or one (complex/floating-point) ALU for two consecutive clock cycles.

So raw throughput will not be doubled initially, but future chip versions might upgrade the hardware to full width.

Intel's SIMD instruction sets have a few other conceptual limitations (no generic permutes, and a few other processing primitives that require more operands are missing), but fractals are usually embarrassingly parallel. So for our purposes here, AVX should pave the way for ever faster rendering.
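
To make "embarrassingly parallel" concrete, here is a minimal sketch in plain C of eight Mandelbrot points iterated in lock-step, one per 32-bit lane of a 256-bit register. It only models the lane-wise structure and is not intrinsics code. The names `mandel8` and `LANES` are made up for illustration, and whether a given compiler actually turns the inner loop into SIMD instructions is an assumption, not a guarantee.

```c
#include <assert.h>

#define LANES 8  /* hypothetical name: 256 bits / 32-bit float = 8 lanes */

/* Iterate z = z*z + c for LANES independent points in lock-step.
   cre/cim hold the points c; iters receives each point's iteration count. */
void mandel8(const float cre[LANES], const float cim[LANES],
             int max_iter, int iters[LANES])
{
    assert(max_iter >= 0);
    float zre[LANES] = {0}, zim[LANES] = {0};
    for (int l = 0; l < LANES; l++)
        iters[l] = 0;

    for (int n = 0; n < max_iter; n++) {
        int any_active = 0;
        /* Every lane runs the same operations on its own data and never
           reads another lane, which is what makes this data parallel. */
        for (int l = 0; l < LANES; l++) {
            float re2 = zre[l] * zre[l];
            float im2 = zim[l] * zim[l];
            int active = (re2 + im2 <= 4.0f);  /* per-lane mask */
            if (active) {
                zim[l] = 2.0f * zre[l] * zim[l] + cim[l];
                zre[l] = re2 - im2 + cre[l];
                iters[l]++;
                any_active = 1;
            }
        }
        if (!any_active)
            break;  /* all lanes have escaped */
    }
}
```

Note that the whole group keeps iterating until its slowest lane has escaped, so lanes that escape early sit idle. That is the usual price of lock-step execution.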
ker2x
« Reply #2 on: April 14, 2010, 05:28:20 PM »

Indeed. This is just a first step toward a shiny future.
As far as I understand, there is no way yet to do math on the full 256-bit registers, e.g. a single instruction that adds 8x32 bits from two 256-bit registers and puts the result in a third register.
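
For reference, the operation described above (a lane-wise add of eight 32-bit values from two registers into a third) can be sketched in portable C. This is only a model of the semantics, not intrinsics code, and `add8_f32` is a made-up name. As far as the published AVX specification goes, this case does seem to be covered for floating point (`VADDPS` on YMM registers, exposed as the `_mm256_add_ps` intrinsic), while the 256-bit integer equivalent only arrived later with AVX2.

```c
#include <assert.h>
#include <stddef.h>

/* dst = a + b across eight 32-bit float lanes (one 256-bit register's worth).
   dst is a separate third operand, so neither source is overwritten. */
void add8_f32(float dst[8], const float a[8], const float b[8])
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (int lane = 0; lane < 8; lane++)
        dst[lane] = a[lane] + b[lane];
}
```

With AVX enabled, a compiler may be able to reduce the loop body to a single 256-bit add. Without it, the same source would typically become two 128-bit SSE adds.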

hobold
« Reply #3 on: April 14, 2010, 07:59:01 PM »

Well, the processor of the Xbox 360 has a few of these "horizontal" instructions, but they are more of a convenience than a true addition to the SIMD paradigm. If an algorithm is massively data parallel and has relatively weak data dependencies, then a resourceful programmer can usually find a data layout that fits the hardware. And when the data flow patterns are trickier, you typically need something more general, like a permute, to implement them.
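
As an illustration of "finding a data layout that fits the hardware", here is a minimal sketch, assuming a structure-of-arrays layout, of eight 3D dot products computed with lane-wise operations only. The names `Vec3x8`, `dot8` and `LANES` are hypothetical. With the more obvious layout (x, y, z of one vector packed into one register) each dot product would need a horizontal sum across lanes. With this layout, no lane ever reads its neighbour.

```c
#include <assert.h>

#define LANES 8  /* hypothetical name: eight 32-bit lanes per 256-bit register */

/* Structure-of-arrays: the components of vector l live at x[l], y[l], z[l]. */
typedef struct {
    float x[LANES];
    float y[LANES];
    float z[LANES];
} Vec3x8;

/* Eight dot products at once using only lane-wise ("vertical") operations:
   no sum across the lanes of a single register is ever needed. */
void dot8(const Vec3x8 *a, const Vec3x8 *b, float out[LANES])
{
    assert(a && b && out);
    for (int l = 0; l < LANES; l++)
        out[l] = a->x[l] * b->x[l] + a->y[l] * b->y[l] + a->z[l] * b->z[l];
}
```

The trade-off is that the data has to be stored (or rearranged) in this transposed form first, which is where a general permute would come in handy.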

At the moment it seems more likely that the GPU vendors will implement permute (and perhaps conditional split; I heard Nvidia calls it "warp reforming"), because they have more of an incentive to push their hardware toward general-purpose use. Intel already has THE general-purpose processors, so there is less pressure to make the SIMD extensions more general as well.