Author Topic: Shader 2 coding  (Read 6236 times)
David Makin
Global Moderator
Fractal Senior
******
Posts: 2286



Makin' Magic Fractals
WWW
« on: November 14, 2010, 07:01:31 PM »

Hi all,

I'm fairly new to coding for shaders and am just starting a general fractal program that uses shader 2 (OpenGL ES2) for iPhone/iPad etc., and possibly for WebGL too.
The emphasis is going to be on speed rather than deep zooming, so I'm going to stick to float rather than attempting to extend to double.
The best way of enhancing user experience in any graphics software is to keep things as interactive as possible. To this end I want to implement a progressive resolution-increasing algorithm so users get the full view as quickly as possible, i.e. the way UF's progressive rendering works (similar to Xaos, for those unfamiliar with UF).
To do this you render at, say, 1/16 resolution, then fill in the 3 remaining parts of each pixel (top-right and bottom two) at 1/4 resolution, then fill in the 3 remaining parts of each pixel (top-right and bottom two) at full resolution. Obviously for particularly slow renders one could start at 1/32 or 1/64 etc.

My question is: what is the best way to do this using shader 2 fragment shaders? I can quickly think of 2 possible alternatives:

1. Render all the individual boxes as separate sections of a large texture buffer (in a given time), then have a separate shader that combines the boxes to the resolution achieved in that time.
2. Render the first box to one texture buffer, then use that as a source on the next pass, so those pixels are just fetched rather than re-rendered and only the other 3/4 are calculated.

Of course I'm assuming here that I'm right that there is no method whereby the destination for shaders can be set to skip pixels in some way. Though maybe this could be done by fudging the pixel/colour format information of the destination texture?
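
To make the pass structure above concrete, here is a minimal CPU-side sketch in Python (not shader code, since the indexing is easier to check outside a shader). It assumes that "1/16 resolution" means every 4th pixel in each direction; the names progressive_passes, render_progressively and shade are made up for illustration and do not come from UF or Xaos.

```python
def progressive_passes(width, height, start_step=4):
    """Yield (step, new_pixels) from the coarsest pass to the finest.

    Each pass lists only the pixels that no earlier pass has computed,
    i.e. the "other 3" of every 2x2 group after the first pass.
    """
    step = start_step
    first = True
    while step >= 1:
        parent = step * 2  # lattice spacing of the previous (coarser) pass
        new_pixels = [
            (x, y)
            for y in range(0, height, step)
            for x in range(0, width, step)
            if first or x % parent or y % parent
        ]
        yield step, new_pixels
        first = False
        step //= 2


def render_progressively(width, height, shade, start_step=4):
    """Return the final image and the number of shade() calls made."""
    image = [[None] * width for _ in range(height)]
    calls = 0
    for step, new_pixels in progressive_passes(width, height, start_step):
        for x, y in new_pixels:
            value = shade(x, y)
            calls += 1
            # Fill the whole step x step block so a complete (blocky)
            # view is available after every pass.
            for by in range(y, min(y + step, height)):
                for bx in range(x, min(x + step, width)):
                    image[by][bx] = value
    return image, calls
```

Every pixel is shaded exactly once in total, which is the property both alternatives above are trying to keep on the GPU.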


The meaning and purpose of life is to give life purpose and meaning.

http://www.fractalgallery.co.uk/
"Makin' Magic Music" on Jango
marius
Fractal Lover
**
Posts: 206


« Reply #1 on: November 14, 2010, 08:01:35 PM »

There are probably better ways, but have a look at the tweak I did to boxplorer's vertex.glsl to do cross-eyed or over-under 3D. In the vertex shader you can step over the 'grid', skipping rays. A hardware blit can scale it up pretty quickly, I imagine. It's not clear how you'd go from 1/4 res to full res without recomputing the 1/4 resolution rays, though.
A random scatter within the 4x4 or 8x8 block would be nice; then the image will settle to full res once movement stops. That's what mandelflyer appears to do.
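
One way the random scatter could work, as a rough Python sketch: shuffle the offsets inside a block once, then compute one offset per block on each frame, so the image is complete after block*block frames without movement. This is only a guess at the idea; I don't know how mandelflyer actually does it, and scatter_order and pixels_for_frame are made-up names.

```python
import random


def scatter_order(block=4, seed=0):
    """Return every (dx, dy) offset inside a block, in a shuffled order."""
    offsets = [(dx, dy) for dy in range(block) for dx in range(block)]
    random.Random(seed).shuffle(offsets)
    return offsets


def pixels_for_frame(width, height, frame, order, block=4):
    """Pixels to compute on a given frame: one offset in every block."""
    dx, dy = order[frame % len(order)]
    return [
        (x + dx, y + dy)
        for y in range(0, height, block)
        for x in range(0, width, block)
        if x + dx < width and y + dy < height
    ]
```

When the view changes, the frame counter would simply restart at 0 so the coarse scatter is redrawn first.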
David Makin
« Reply #2 on: November 14, 2010, 09:26:50 PM »

Just found and read the info about using stencils - I guess that's the way to skip pixels :)

However, after further consideration of the options I think I'm going with the method where the initial (fractal) rendering is done in small box areas (4 @ 1/32, 4 @ 1/16 etc.) written to areas of a single texture. When time has run out, a separate shader program combines all the areas, from 1/32 res up to the maximum res achieved so far, from that source into a destination texture, which is then displayed.
On the next loop, if there are no changes, the rendering continues (doing 4 @ 1/8, 4 @ 1/4 etc.) until time runs out again, and the combining shader is run again to update the destination texture for display. Of course if there are changes then we simply restart at 1/32.

I think that method is probably less computationally expensive even than using stencils because the fractal rendering is always a "complete" area.
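
A simplified CPU-side Python sketch of the box-and-combine idea, to show the index arithmetic the combining shader would need. It assumes each box holds every stride-th pixel for one (ox, oy) offset and that boxes are rendered coarsest lattice first; the exact box layout described above may differ, and phase_order, render_box and combine are made-up names.

```python
def phase_order(stride=4):
    """Box offsets (ox, oy), coarsest lattice first; stride is a power of two."""
    order, seen = [], set()
    step = stride
    while step >= 1:
        for oy in range(0, stride, step):
            for ox in range(0, stride, step):
                if (ox, oy) not in seen:
                    seen.add((ox, oy))
                    order.append((ox, oy))
        step //= 2
    return order


def render_box(width, height, phase, shade, stride=4):
    """Render one complete box: every stride-th pixel from the given offset."""
    ox, oy = phase
    return [[shade(x, y) for x in range(ox, width, stride)]
            for y in range(oy, height, stride)]


def combine(boxes, width, height, step, stride=4):
    """Build the display image from the boxes finished so far.

    step is the finest lattice spacing whose boxes are all complete.
    """
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Snap to the nearest finished sample up and to the left.
            sx, sy = x - x % step, y - y % step
            box = boxes[(sx % stride, sy % stride)]
            row.append(box[sy // stride][sx // stride])
        image.append(row)
    return image
```

With stride 4 the display can be refreshed after 1, 4 and 16 boxes (step 4, 2 and 1), and every box render is a plain full-rectangle pass with no skipped pixels.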
David Makin
« Reply #3 on: November 15, 2010, 08:29:43 PM »

Just thought I'd mention that shader 2 is noticeably faster on the shader 2 enabled iPhone/iTouch/iPad devices than it is on an older Mac mini (using Intel GMA 950).
cbuchner1
Fractal Phenom
******
Posts: 443


« Reply #4 on: November 16, 2010, 01:39:22 AM »

Don't interpret the lack of responses as lack of interest. It's highly intriguing what you are doing.

Isn't pixel shader 2.0 rather limited in terms of instruction count and branching? Or does "Shader 2" refer to the particular GLSL dialect of OpenGL ES 2.0?

Last time I programmed the GMA 950 it was quite a pain (that involved OpenGL ARB_fragment_program and ARB_vertex_program - a dead end in shader evolution): at most 96 instructions per fragment program, of which 64 could use the ALU, all in an obscure low-level assembly shader language, with no branching permitted and a limited number of registers (16). Still, I was able to calculate some physics (radio propagation and cellular radio coverage) with it at interactive frame rates.

I am quite amazed how much graphics power handheld devices have these days. I recently bought a BeagleBoard-xM ARM development board, and it has PowerVR graphics that can more than compete with the graphics cards I had in my desktop PC 10 years ago, at a fraction of the cost and power consumption.
« Last Edit: November 16, 2010, 01:51:47 AM by cbuchner1 »
cKleinhuis
Administrator
Fractal Senior
*******
Posts: 7044


formerly known as 'Trifox'


WWW
« Reply #5 on: November 16, 2010, 02:24:19 AM »

you can read the article i wrote for the "ShaderX3" book series; in it i describe how to work around the formula size limits of shader 2.0 hardware when calculating large formulas, by using buffers for in-between values.
in fact i think shader2 programming is oldschool ;)
but i see that current mobile hardware has a need for shader2 programming ...

i see you are using OpenGL ES2, which is wonderful, because its shading language (GLSL ES) is a high level language

for colouring your renderings, you can use a simple predefined 1-dimensional texture with an arbitrary gradient
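
a small python sketch of the gradient idea, done on the cpu so it can be checked without a gpu: build the 1-dimensional palette once, then map a colouring value to a texel with wrap-around, roughly what a repeating texture lookup would do. the names make_gradient and lookup are made up, and the stops are assumed to be sorted, distinct and to cover 0.0 to 1.0.

```python
def make_gradient(stops, size=256):
    """Build a 1-D palette by linear interpolation between colour stops.

    stops is a list of (position, (r, g, b)) pairs, sorted by position,
    with the first position 0.0 and the last 1.0 (an assumption of this sketch).
    """
    palette = []
    for i in range(size):
        t = i / (size - 1)
        for (p0, c0), (p1, c1) in zip(stops, stops[1:]):
            if p0 <= t <= p1:
                f = (t - p0) / (p1 - p0)
                palette.append(tuple(round(a + (b - a) * f)
                                     for a, b in zip(c0, c1)))
                break
    return palette


def lookup(palette, t):
    """Nearest-texel lookup that wraps t into [0, 1), like a repeating texture."""
    index = int((t % 1.0) * len(palette))
    return palette[min(index, len(palette) - 1)]
```

on the gpu the palette would be uploaded once as a texture and the fragment shader would only compute the coordinate.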

right now i am out of programming graphics hardware (a real pity), but i find it interesting that mobile devices feature gpu programming ...

---

divide and conquer - iterate and rule - chaos is No random!
David Makin
« Reply #6 on: November 16, 2010, 04:28:13 AM »

Yes, by "shader 2" I meant using OpenGL ES2.
Here's my main fragment shader. It is recompiled whenever the "#define"s change (they are inserted where the bare #define is before compilation), which happens when the user changes the parameters (new formula, colouring etc.).
Initially I tried it with all the # conditionals as plain runtime conditionals based on uniforms, so that recompilation wasn't required, but I discovered that although it worked, it was very slow. It appeared to me that the runtime code was too large and iOS4 ran it in emulation mode on the CPU instead. lyc seemed to think it was just down to the number of branches rather than code size, but to me that didn't make sense, since the branches concerned would not vary the instruction path from one pixel to another; plus the speed change was very abrupt as I reduced the size of the code by removing some options.

Code:
varying highp vec2 pixel;
uniform sampler2D palettes;
uniform highp vec2 pos;
uniform highp vec2 trapcentre;
uniform highp float bailout;
uniform highp float llb;
uniform highp float smallbail;
uniform highp float cscale;
uniform lowp float pal;
uniform lowp float offset;
uniform int maxiter;

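// Placeholder: before compilation the application inserts the generated option
// definitions here (the names tested by the #if directives below, e.g. mandy,
// formula, colouring, usemap, addrot, orbitfx).
#define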

void main()
{
highp vec2 z;
highp vec4 zold;
highp vec2 d1;
highp vec2 d2;
highp vec2 z2;
highp vec2 s;
highp vec4 a;
int i = 0;
a.x = a.y = zold.x = zold.y = 0.0;
#if (mandy==1)
{
z = pos;
}
#else
{
z = pixel;
}
#endif
#if ((usemap==1)||((colouring<5)&&(method==0)))
{
a.x = bailout;
#if (usemap==1)
{
d1 = z-trapcentre;
}
#endif
}
#endif
#if (colouring<5)
{
d2.x = d2.y = 0.0;
}
#elif (colouring==13)
{
d1.x = length(pixel);
}
#endif
z2 = z*z;
a.w = bailout;

do
{
zold.w = zold.y;
zold.z = zold.x;
zold.y = z.y;
zold.x = z.x;
#if (addrot>=4)
{
a.z = z2.x + z2.y;
if (a.z>0.0)
{
a.z = 1.0/sqrt(a.z);
z.y = 2.0*z.x*z.y*a.z;
z.x = (z2.x - z2.y)*a.z;
z2 = z*z;
}
}
#endif

#if (formula==0)
{
z.y = 2.0*z.x*z.y;
z.x = z2.x - z2.y;
}
#elif (formula==1)
{
z.y = z.y*(3.0*z2.x - z2.y);
z.x = z.x*(z2.x - 3.0*z2.y);
}
#elif (formula==2)
{
z.y = 4.0*z.x*z.y*(z2.x - z2.y);
z.x = dot(z2,z2) - 6.0*z2.x*z2.y;
}
#elif (formula==3)
{
a.z = sqrt(z2.x+z2.y);
z2.y = sign(z.y)*sqrt(a.z - z.x);
z2.x = sqrt(a.z + z.x);
z.x = z.x*z2.x - z.y*z2.y;
z.y = z.y*z2.x + zold.x*z2.y;
z *= 0.70710678;
}
#elif (formula==4)
{
z.x = exp(z.x);
z.y = z.x*sin(z.y);
z.x = z.x*cos(zold.y);
}
#endif
#if (mandy==0)
{
z += pos;
}
#else
{
z += pixel;
}
#endif
#if ((addrot==1)||(addrot==3)||(addrot==5)||(addrot==7))
{
z = z + vec2(zold.x,zold.y);
}
#endif
#if ((addrot==2)||(addrot==3)||(addrot==6)||(addrot==7))
{
z = z + vec2(zold.z,zold.w);
}
#endif
if (length(vec2(zold.x,zold.y)-z)<smallbail)
{
i = maxiter;
break;
}
z2 = z*z;
#if (colouring<15)
if ((a.z = z2.x + z2.y)>=bailout)
#else
a.z = z2.x + z2.y;
if (abs(z2.x/z.y)>=bailout)
#endif
{
#if (usemap>0)
{
#if (usemap==1)
{
z.x = abs(d1.x) + 1.0;
z.y = abs(d1.y) + 1.0;
z = log(z);
z.y = -z.y;
}
#else
{
d1.y = log(a.z);
d1.x = 1.0 + atan(z.y,z.x)/6.2831852;
d1.y = 1.0 - (log(d1.y)-llb)/log(d1.y/log(dot(vec2(zold.x,zold.y),vec2(zold.x,zold.y))));
z = d1;
}
#endif
z.x += pal;
z.y += offset;
z *= cscale;
z.x = z.x - floor(z.x);
z.y = z.y - floor(z.y);
break;
}
#endif

#if (colouring<5)
{
#if (colouring==0)
{
z.y = a.x;
}
#elif (colouring==1)
{
z.y = log(a.y + 1.0);
}
#elif (colouring==2)
{
z.y = length(d2 - trapcentre);
}
#elif (colouring==3)
{
z.y = abs(d2.x-trapcentre.x);
}
#else// if (colouring==4)
{
d2 = d2 - trapcentre;
z.y = (5.0/3.1415926)*abs(atan(d2.y,d2.x));
}
#endif
if (z.y>threshold)
{
z.y = threshold;
}
z.y *= 0.15;
}
#else
{
d1.x = log(a.z);
d1.x = (log(d1.x)-llb)/log(d1.y = d1.x/log(dot(vec2(zold.x,zold.y),vec2(zold.x,zold.y))));
#if (orbitfx==1)
{
s.x = z.x;
z.x = z.x - cos(2.0*z.y);
z.y = z.y - 2.0*sin(s.x);
}
#elif (orbitfx==2)
{
s.x = z.x;
z.x = 0.2*sqrt(a.z);
z.y = atan(z.y,s.x);
z.x = 3.1415926*(z.x-floor(z.x));
}
#endif
#if (colouring==15)
{
z.y = float(i)/float(maxiter);//sqrt(log(1.0+1.71828*(float(i) + 1.0)/100.0));
}
#elif (colouring==5)
{
z.y = sqrt(log(1.0+1.71828*(float(i) + 1.0 - d1.x)/100.0));
}
#elif (colouring==6)
{
z.y = ((1.0-d1.x)*abs(atan(z.y,z.x)) + d1.x*abs(atan(zold.y,zold.x)))/3.1415926;
}
#elif (colouring==7)
{
z.x = atan(z.y,z.x);
z.y = atan(zold.y,zold.x);
if (z.x<0.0)
{
z.x += 6.2831852;
}
if (z.y<0.0)
{
z.y += 6.2831852;
}
z.y = (z.x + d1.x*(z.y - z.x) + 0.5)/7.2831852;
}
#elif (colouring==8)
{
a.x /= (float(i)+2.0);
a.y /= (float(i)+1.0);
z.y = 0.5*(a.x + d1.x*(a.y-a.x));
}
#elif (colouring==9)
{
a.x = abs(atan(z.y,z.x)/3.1415926);
z.y = abs(2.0*d1.x - 1.0);
if (a.x>z.y)
{
z.y = a.x;
}
}
#elif (colouring==10)
{
d1.x -= 0.5;
a.x = atan(z.y,z.x)/3.1415926;
z.y = (1.0 - a.x*a.x)*(1.0 - 4.0*d1.x*d1.x);
}
#elif (colouring==11)
{
a.x = atan(z.y,z.x)/6.2831852;
if (a.x<0.0)
{
a.x = 1.0 + a.x;
}
d1.y = a.x*floor(d1.y+0.5);
a.x = abs(1.0 - 2.0*a.x);
z.y = a.x + (1.0-d1.x)*(abs(1.0 - 2.0*(d1.y - floor(d1.y))) - a.x);
}
#elif (colouring==12)
{
a.x = atan(z.y,z.x)/6.2831852;
a.z = atan(zold.y,zold.x)/6.2831852;
d1.y = floor(d1.y+0.5);
if (a.x<0.0)
{
a.x = 1.0 + a.x;
}
a.y = a.x*d1.y;
if (a.z<0.0)
{
a.z = 1.0 + a.z;
}
a.x = abs(1.0 - 2.0*a.x);
d1.y = a.z*d1.y;
a.z = abs(1.0 - 2.0*a.z);
a.x = a.x + (1.0-d1.x)*(abs(1.0 - 2.0*(a.y - floor(a.y))) - a.x);
a.z = a.z + d1.x*(abs(1.0 - 2.0*(d1.y - floor(d1.y))) - a.z);
z.y = a.x + (a.z-a.x)*d1.x;
}
#elif (colouring==13)
{
a.x /= (float(i)+1.0);
if (i>0)
{
a.y /= float(i);
z.y = 2.0*(a.x + (a.y-a.x)*d1.x);
}
else
{
z.y = a.x;
}
}
#elif (colouring==14)
{
if (i>0)
{
a.x /= (float(i)+1.0);
a.y /= float(i);
z.y = sqrt(a.x + (a.y-a.x)*d1.x);
}
else
{
z.y = 0.0;
}
}
#endif
}
#endif

z.y = cscale*(z.y + offset);
z.x = pal;
z.y = z.y - floor(z.y);
break;
}

if (a.z<a.w)
{
a.w = a.z;
}
#if (orbitfx>0)
{
s = z;
#if (orbitfx==1)
{
z.x = z.x - cos(2.0*z.y);
z.y = z.y - 2.0*sin(s.x);
}
#else// if (orbitfx==2)
{
z.x = 0.2*sqrt(a.z);
z.y = atan(z.y,s.x);
z.x = 3.1415926*(z.x-floor(z.x));
}
#endif
}
#endif

#if (usemap==1)
{
a.z = length(d2 = z-trapcentre);
if (a.z<a.x)
{
a.x = a.z;
d1 = d2;
}
}
#elif (usemap==0)
{
#if (colouring<5)
{
d1 = z - trapcentre;
#if (shape==0)
{
a.z = length(d1);
}
#elif (shape==1)
{
a.z = (5.0/3.1415926)*abs(atan(d1.y,d1.x));
}
#elif (shape==2)
{
a.z = abs(d1.x);
}
#else
{
a.z = abs(d1.x);
d1.y = abs(d1.y);
a.z = a.z + d1.y;
}
#endif
if (((method==0)&&(a.z<a.x))||(((method>0)&&(a.z>a.x)&&(a.z<threshold))))
{
a.x = a.z;
d2 = z;
a.y = float(i);
}
}
#elif (colouring==8)
{
a.y = a.x;
a.x = a.x + log(1.0 + 0.5*log(a.z));
}
#elif (colouring==13)
{
a.y = a.x;
d1.y = 0.1 + length(z - pixel);
d2.x = 0.5*abs(d1.y - d1.x);
d1.y = d1.y + d1.x - d2.x;
a.x += (sqrt(a.z) - d2.x)/d1.y;
}
#elif (colouring==14)
{
a.y = a.x;
if (i>0)
{
d1 = z - vec2(zold.x,zold.y);
d2 = vec2(zold.x,zold.y) - vec2(zold.z,zold.w);
a.z = d1.x*d2.x + d1.y*d2.y;
d1.y = d1.y*d2.x - d1.x*d2.y;
a.x += abs(atan(d1.y,a.z));
}
}
#endif
}
#endif

#if (orbitfx>0)
{
z = s;
}
#endif

} while (++i<maxiter);
if (i>=maxiter)
{
z.y = 2.0*sqrt(a.w);//float(maxiter);//
}
gl_FragColor = texture2D(palettes, z);
}
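
The recompile-on-change scheme can be sketched on the host side roughly like this, in Python for brevity. It assumes the template contains a line consisting only of "#define" as the insertion point, as in the listing above. The cache and the compile_fn callback are additions for illustration only; compile_fn stands in for whatever wraps glShaderSource/glCompileShader in the real application.

```python
def build_fragment_source(template, options):
    """Replace the bare '#define' line with one '#define name value' per option."""
    defines = [f"#define {name} {int(value)}"
               for name, value in sorted(options.items())]
    out = []
    for line in template.split("\n"):
        if line.strip() == "#define":
            out.extend(defines)
        else:
            out.append(line)
    return "\n".join(out)


class ShaderCache:
    """Compile each distinct option set once and reuse the result."""

    def __init__(self, template, compile_fn):
        self.template = template
        self.compile_fn = compile_fn  # hypothetical: source string -> program handle
        self.programs = {}

    def get(self, options):
        key = tuple(sorted(options.items()))
        if key not in self.programs:
            source = build_fragment_source(self.template, options)
            self.programs[key] = self.compile_fn(source)
        return self.programs[key]
```

Because the preprocessor removes the unused branches, each compiled variant only contains the code for the selected formula and colouring.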
cKleinhuis
« Reply #7 on: November 16, 2010, 01:14:19 PM »

damn, those are a lot of branches; i think by using constants and recompiling, most of the branches will be removed
cbuchner1
« Reply #8 on: November 16, 2010, 03:40:37 PM »

Quote from: David Makin
Yes, by "shader 2" I meant using OpenGL ES2..

How do you emulate OpenGL ES2 on the Intel GMA 950? As far as I know Intel only provides a (rather buggy) OpenGL 1.4 implementation for this chip - no ES in sight ;) And this chip is EOL'ed too, meaning driver bugs won't ever get fixed.

David Makin
« Reply #9 on: November 16, 2010, 09:04:29 PM »

Quote from: cbuchner1
How do you emulate OpenGL ES2 on the Intel GMA 950? As far as I know Intel only provides a (rather buggy) OpenGL 1.4 implementation for this chip - no ES in sight ;) And this chip is EOL'ed too, meaning driver bugs won't ever get fixed.

If you look here:
http://www.intel.com/products/chipsets/gma950/index.htm
you'll see it's quoted as having DirectX 9 shader 2 acceleration - I assume Apple have simply extended support for this into the Mac implementation of OpenGL ES2.
But to answer your question directly - I don't know exactly. I just wrote the code in Xcode for the iPhone/iPad and it ran fine in the iPad/iPhone simulators on my mini.

Incidentally, if I try this using the WebKit-enhanced Safari:
http://www.ibiblio.org/e-notes/webgl/makin.html
in a full-screen window on the mini I get 0 to 3 fps - on the Mac Pro at work I get 60 to 200 :)
David Makin
« Reply #10 on: November 16, 2010, 09:13:36 PM »

Quote from: cKleinhuis
damn, those are alot branches, i think by using constants and recompiling, most of the branches will be removed

Erm - it is using #defines and recompiling.
I think there is an absolute maximum of 5 runtime conditionals once the "#" conditionals have been applied; the runtime ones aren't nested more than 1 level, and most are just if..endif without any else.
« Last Edit: November 16, 2010, 09:17:36 PM by David Makin »