Author Topic: CPU shader emulator  (Read 5770 times)
TruthSerum
Guest
« on: June 17, 2015, 05:00:11 PM »

Would there be a use case for such a thing? It would be very slow, of course, but it would allow e.g. cloud-based (non-GPU VPS) instances to generate and automate rendering of Shadertoy preview videos.

Given that we have the Khronos reference compiler at hand, it would not be too much effort for us to implement.

I see it consisting of two main components:

  • AST serialisation to your favourite language (not necessarily C++)
  • Support methods for that language that implement the GLSL intrinsic functions

I've done something similar in the past using my own parser (not very good, I must admit), but the C++ support methods can be found here.
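For the second component, a minimal sketch of what such support methods might look like (names and signatures are illustrative, not taken from the linked code):

```cpp
// Illustrative sketch of GLSL intrinsics as C++ support methods.
#include <cassert>
#include <cmath>
#include <algorithm>

struct vec3 { float x, y, z; };

// GLSL intrinsics come in scalar and component-wise vector flavours:
inline float fract(float a) { return a - std::floor(a); }
inline float clamp(float a, float lo, float hi) { return std::min(std::max(a, lo), hi); }
inline float mix(float a, float b, float t) { return a + (b - a) * t; }
inline vec3 mix(vec3 a, vec3 b, float t) {
  return { mix(a.x, b.x, t), mix(a.y, b.y, t), mix(a.z, b.z, t) };
}
inline float dot(vec3 a, vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline float length(vec3 a) { return std::sqrt(dot(a, a)); }
inline vec3 normalize(vec3 a) {
  float l = length(a);
  return { a.x / l, a.y / l, a.z / l };
}
```

The bulk of the effort is overload coverage: every intrinsic needs scalar, vec2, vec3 and vec4 variants to match GLSL's implicit component-wise semantics.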
Logged
claude
Fractal Bachius
*
Posts: 563



WWW
« Reply #1 on: June 17, 2015, 06:34:27 PM »

mesa has a software rasterizer:  http://mesa3d.org/llvmpipe.html

using a GPU for OpenGL on a remote machine seems tricky
http://renderingpipeline.com/2012/05/windowless-opengl/
http://renderingpipeline.com/2012/05/remote-opengl/
(these examples assume the user has an X session already logged in at the remote machine with a monitor connected - normal X usage sends the drawing commands back to the client (your local machine) to do the rendering, which isn't what you want)

this answer suggests using Xvfb
http://stackoverflow.com/a/8961649

the answers here say Xvfb isn't GPU accelerated and doesn't support much advanced OpenGL, and suggest some other alternatives
http://serverfault.com/questions/186805/remote-offscreen-rendering

so if it were me, I'd first try Xvfb with mesa llvmpipe; and if the remote headless server has a beefy GPU then start to look into the other options

(my experience is with linux, no clue about windows or os x)
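for concreteness, the Xvfb + llvmpipe route might look roughly like this on Linux (package names vary by distro; `my-shader-renderer` is a placeholder, not a real tool):

```shell
# Force Mesa's software rasterizer (llvmpipe) instead of any GPU driver:
export LIBGL_ALWAYS_SOFTWARE=1

# xvfb-run starts a throwaway X server in a virtual framebuffer, runs the
# command, then tears the server down - no monitor or X session needed:
xvfb-run --server-args="-screen 0 1920x1080x24" glxinfo | grep "OpenGL renderer"
# should report something like "llvmpipe (LLVM ...)" if the software path is active

# the same wrapper can then drive the actual render job, e.g.:
# xvfb-run --server-args="-screen 0 1920x1080x24" ./my-shader-renderer --output frame.png
```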
Logged
eiffie
Guest
« Reply #2 on: June 17, 2015, 08:20:04 PM »

I was thinking of this for a different purpose. I like to have a CPU script running in parallel with the GPU, feeding it uniforms, textures, etc. It would be nice to have these scripts in a uniform language. I am also dreaming of a website like Shadertoy that lets you program game logic alongside your shader. So: parse the GLSL into JavaScript, then run it with exec?? It would also make a nice debugger, so I see many uses. They can all be achieved with other methods, but this would still be nice to have.
Logged
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #3 on: June 18, 2015, 12:36:04 AM »

Yes, it would be nice to have a control language with GLSL syntax. But most importantly it would need to be dynamically compilable.

There is a nice GLSL to Asm.js draft here: https://github.com/devongovett/glsl.js - which seems abandoned, though. I also think it would be a pretty big task. Stuff like this might become more common once the very new WebAssembly standard begins to be adopted: http://techcrunch.com/2015/06/17/google-microsoft-mozilla-and-others-team-up-to-launch-webassembly-a-new-binary-format-for-the-web/

I also worry a bit about controlling everything from within the browser. What about, for instance, export of images? Is it possible to have a long-running task exporting a series of images to a folder, or is a browser too sandboxed an environment?

I've done something similar in the past using my own parser (not very good, I must admit), but the C++ support methods can be found here.

That vector/matrix library is very nicely done.
Logged
marius
Fractal Lover
**
Posts: 206


« Reply #4 on: June 18, 2015, 08:00:59 AM »

Yes, it would be nice to have a control language with GLSL syntax. But most importantly it would need to be dynamically compilable.
Yeah, the source level interactivity we get w/ glsl is very addictive.

In the past I played a bit with compiling glsl as C++, which goes quite a ways with some minor glsl discipline and a few macros.
https://code.google.com/p/boxplorer2/source/browse/trunk/glsl.h
https://code.google.com/p/boxplorer2/source/browse/trunk/glsl.cc
You can get the output to nigh pixel-match, if you apply sufficient OCD.
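The "minor glsl discipline and a few macros" idea can be sketched like this - a toy version in the spirit of boxplorer2's glsl.h, not the actual header:

```cpp
// Toy sketch of writing a DE once and compiling it as both GLSL and C++
// (illustrative only; the real boxplorer2 glsl.h is far more complete).
#include <cmath>

// In GLSL these are keywords; on the C++ side we define them away so a
// disciplined shader body parses as plain C++.
#define uniform   /* uniforms become ordinary C++ globals */

struct vec3 {
  float x, y, z;
  vec3(float a, float b, float c) : x(a), y(b), z(c) {}
  explicit vec3(float a) : x(a), y(a), z(a) {}  // GLSL-style vec3(0.0)
};
inline vec3 operator-(vec3 a, vec3 b) { return vec3(a.x - b.x, a.y - b.y, a.z - b.z); }
inline float length(vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

uniform float radius = 1.0f;  // in the shader source: "uniform float radius;"

// This body is valid in both languages, so the DE exists exactly once:
float de(vec3 p) {
  return length(p - vec3(0.0f)) - radius;
}
```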

My main goal was to get just the DE coded once and have it be available for the C++ navigator for suitable realtime fly-by speed adjustments.

But it turns out you can get the GLSL to compute and expose its DE via a tiny float FBO fairly efficiently, so I largely lost interest.
And you had to C++-compile the navigator every time you changed a DE. :undecided:
Logged
eiffie
Guest
« Reply #5 on: June 18, 2015, 04:12:07 PM »

It actually works quite well to save a snippet of user-generated code to a text file, execute the compiler, and load the generated library (one DE function) dynamically. I was just playing around with that and was happy with the speed and stability of the process. All I have is Windows-specific code, though.
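A POSIX analogue of that round-trip might look like this (a hypothetical sketch, not eiffie's Windows code; it assumes cc is on the PATH, and error handling is trimmed):

```cpp
// Hypothetical POSIX sketch of the approach: dump a user-supplied DE
// snippet to disk, shell out to the system compiler, and dlopen the
// result. The Windows version would use LoadLibrary/GetProcAddress.
#include <cstdlib>
#include <fstream>
#include <dlfcn.h>

typedef float (*DeFn)(float, float, float);

DeFn compileAndLoad(const char* snippet) {
  // 1. Write the snippet out as a C translation unit.
  std::ofstream("de_snippet.c") << snippet;
  // 2. Build it into a shared object with the system compiler.
  if (std::system("cc -O2 -shared -fPIC de_snippet.c -o de_snippet.so") != 0)
    return nullptr;
  // 3. Load the library and resolve the single DE entry point.
  void* lib = dlopen("./de_snippet.so", RTLD_NOW);
  if (!lib) return nullptr;
  return reinterpret_cast<DeFn>(dlsym(lib, "de"));
}
// Usage: DeFn de = compileAndLoad("float de(float x, float y, float z) { ... }");
```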
Logged
TruthSerum
Guest
« Reply #6 on: June 19, 2015, 05:04:46 PM »

using a GPU for OpenGL on a remote machine seems tricky

The idea was to support systems which don't have GPUs. I think it is rare for cloud services to give you GPU access, because I don't think it's possible to share access to GPUs in the same way as CPU and other resources.

I am also dreaming of a website like shadertoy that let's you program game logic along side your shader.

I think, as eiffie said, one of the main appeals for such a CPU emulator would be to run glsl-style logic alongside the shader, so as to provide you with some state between frames.

Thanks @Syntopia; of course, the code was never designed to be used directly. It was just a support layer behind the GLSL that the programmer would never see. Here is an example of some C++ compiled from a GLSL shader. The main thing to notice is how vector swizzles (.xyzw, .rgba, etc.) are serialized as permute() function calls:

Code:
float sdBox(vec3 p, vec3 b) {
  vec3 d = abs(p) - b;
  return min(max(d.permute(0), max(d.permute(1), d.permute(2))), 0.0f) + length(max(d, 0.0f));
}

float map(vec3 pos) {
  float speed = 1.0f;
  vec3 grid = floor(pos);
  vec3 gmod = mod(grid, 2.0f);
  vec3 rmod = mod(grid, 4.0f) - 2.0f;
  float tm = fract(iGlobalTime * speed);
  rmod *= cos(tm * PI) - 1.0f;
  float g = floor(mod(iGlobalTime * speed, 3.0f));
  if (g == 0.0f) {
    if (gmod.permute(1) * gmod.permute(0) == 1.0f) {
      pos[2] += rmod.permute(0) * rmod.permute(1) * 0.5f;
    }
  } else if (g == 1.0f) {
    if (gmod.permute(1) * gmod.permute(2) == 1.0f) {
      pos[0] += rmod.permute(1);
    }
  } else if (g == 2.0f) {
    if (gmod.permute(2) == 0.0f) {
      pos[1] += rmod.permute(2) * rmod.permute(0) * 0.5f;
    }
  }
  grid = floor(pos);
  pos = pos - grid;
  pos = pos * 2.0f - 1.0f;
  float len = 0.9f;
  float d = sdBox(pos, vec3(len));
  bool skip = false;
  if (mod(grid.permute(0), 2.0f) == 0.0f && mod(grid.permute(1), 2.0f) == 0.0f) {
    skip = true;
  }
  if (mod(grid.permute(0), 2.0f) == 0.0f && mod(grid.permute(2), 2.0f) == 0.0f) {
    skip = true;
  }
  if (mod(grid.permute(1), 2.0f) == 0.0f && mod(grid.permute(2), 2.0f) == 1.0f) {
    skip = true;
  }
  if (skip) {
    d = 100.0f;
    vec3 off = vec3(2.0f, 0.0f, 0.0f);
    for (int i = 0; i < 3; ++i) {
      float a = sdBox(pos + off, vec3(len));
      float b = sdBox(pos - off, vec3(len));
      d = min(d, min(a, b));
      off = off.permute(2, 0, 1);
    }
    d *= 0.5f;
  } else {
    d *= 0.8f;
  }
  return d;
}

vec3 surfaceNormal(vec3 pos) {
  vec3 delta = vec3(0.01f, 0.0f, 0.0f);
  vec3 normal;
  normal[0] = map(pos + delta.permute(0, 1, 2)) - map(pos - delta.permute(0, 1, 2));
  normal[1] = map(pos + delta.permute(1, 0, 2)) - map(pos - delta.permute(1, 0, 2));
  normal[2] = map(pos + delta.permute(2, 1, 0)) - map(pos - delta.permute(2, 1, 0));
  return normalize(normal);
}

float aoc(vec3 origin, vec3 ray) {
  float delta = 0.05f;
  const int samples = 8;
  float r = 0.0f;
  for (int i = 1; i <= samples; ++i) {
    float t = delta * float(i);
    vec3 pos = origin + ray * t;
    float dist = map(pos);
    float len = abs(t - dist);
    r += len * pow(2.0f, -float(i));
  }
  return r;
}

vec3 mainImage(vec2 uv) {
  vec3 eye = normalize(vec3(uv, 1.0f - dot(uv, uv) * 0.33f));
  vec3 origin = vec3(0.0f);
  eye = eye * yrot(iGlobalTime) * xrot(iGlobalTime);
  float speed = 0.5f;
  float j = iGlobalTime * speed;
  float f = fract(j);
  float g = 1.0f - f;
  f = f * f * g + (1.0f - g * g) * f;
  f = f * 2.0f - 1.0f;
  float a = floor(j) + f * floor(mod(j, 2.0f));
  float b = floor(j) + f * floor(mod(j + 1.0f, 2.0f));
  origin.permute(0) += 0.5f + a;
  origin.permute(1) += 0.5f;
  origin.permute(2) += 0.5f + b;
  float t = 0.0f;
  float d = 0.0f;
  for (int i = 0; i < 32; ++i) {
    vec3 pos = origin + eye * t;
    d = map(pos);
    t += d;
  }
  vec3 worldPos = origin + eye * t;
  vec3 norm = surfaceNormal(worldPos);
  float prod = max(0.0f, dot(norm, -eye));
  float amb = 0.0f;
  vec3 ref = reflect(eye, norm);
  vec3 spec = vec3(0.0f);
  prod = pow(1.0f - prod, 2.0f);
  vec3 col = vec3(0.1f, 0.3f, 0.5f);
  spec *= col;
  col = mix(col, spec, prod);
  float shade = pow(max(1.0f - amb, 0.0f), 4.0f);
  float fog = 1.0f / (1.0f + t * t * 0.2f) / shade;
  vec3 final = col;
  final = mix(final, vec3(1.0f), fog);
  fog = 1.0f / (1.0f + t * t * 0.1f);
  return vec3(final * fog);
}
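The permute() support method behind the listing above could be as simple as this toy version (illustrative, not the actual generated support layer):

```cpp
// Toy version of the permute() swizzle mechanism: single-component
// swizzles (d.z ~ d.permute(2)) become index lookups, multi-component
// ones (off.zxy ~ off.permute(2, 0, 1)) become reordering calls.
struct vec3 {
  float v[3];
  float& permute(int i) { return v[i]; }       // writable, as in origin.permute(0) += ...
  float permute(int i) const { return v[i]; }
  vec3 permute(int i, int j, int k) const { return { { v[i], v[j], v[k] } }; }
};
```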

@marius, any details about how your float FBO thing worked? :)
Logged
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #7 on: June 19, 2015, 11:56:36 PM »

The idea was to support systems which don't have GPUs. I think it is rare for cloud services to give you GPU access, because I don't think it's possible to share access to GPUs in the same way as CPU and other resources.

It is possible to get cloud GPU instances. Amazon has several different types, including:

"Amazon is making two of these new GPU instance types available for now. The g2.2xlarge version comes with 15 GiB memory, 60 GB of local storage, 26 EC2 Compute Units (that’s an Intel Sandy Bridge processor running at 2.6 GHz) and a single NVIDIA Kepler GK104 graphics card (with 1536 CUDA cores). The larger cg1.4xlarge version comes with 22 GiB of memory, 1690 GB of local storage, 33.5 EC2 Compute Units and two NVIDIA Tesla “Fermi” M2050 GPUs. On-demand prices start at $0.65 per hour for the smaller instance and $2.10 for the larger one."

The (expensive) Tesla card is capable of running double precision at half the speed of single precision, which could make it very attractive.

I decided to try installing Fragmentarium on an Amazon GPU cloud instance (the cheap one - g2.2xlarge), which has an NVIDIA GRID K520 GPU. It was not entirely trivial, because Windows RDP disables OpenGL. But after setting up a VNC server on the machine, I was able to connect and run Fragmentarium without problems. I used this preconfigured Windows image:
https://aws.amazon.com/marketplace/pp/B00SK9DXLG/ref=srh_res_product_title?ie=UTF8&sr=0-4&qid=1434750392467

The K520 has 2448 GFLOPS versus the 1155 GFLOPS of my laptop's NVIDIA 850M (in single precision).

I did a path-traced render test, which completed in 317s on the K520 versus 475s on my laptop - so for some reason it was not twice as fast, as I would have expected.

The price is $0.767/hr (for Windows, Linux is slightly cheaper).
Logged
TruthSerum
Guest
« Reply #8 on: June 20, 2015, 12:21:53 AM »

Interesting result. That is very expensive, though. I currently pay about $0.01 an hour for a regular VPS instance, which would be capable of running a CPU emulator.
Logged
Syntopia
Fractal Molossus
**
Posts: 681



syntopiadk
WWW
« Reply #9 on: June 20, 2015, 11:37:20 AM »

I'm not sure it is cheaper. The cheapest instances have "variable" performance, which I think means you might be sharing a core with other users.

For the 'compute' instances, take a look at for instance the c4.large instance, which costs $0.11/hour for 2 vCPUs (= 1 core). It is based on a (special) E5-2666, but assuming this is similar to the Intel Xeon E5-2665, it has a performance of around 20 GFlops per core. That is $5.5/TFlops-hour.

The g2.2xlarge has 2 x 2448 GFLOPS (the K520 is a double GPU, which I missed - though I don't think Fragmentarium uses more than half of this) for $0.767/hr. That is about $0.16/TFlops-hour.

Of course the Xeon number is true double-precision floats, whereas the K520 number is single precision.

For double-precision GPUs the only choice is the cg1.4xlarge (2x Tesla M2050), which costs $2.1/hour for 1000 GFlops. That is $2.1/TFlops-hour, which is closer to the CPU number.

So if you can live with single-precision the g2.2xlarge GPU is by far the cheapest choice. Also, any kind of CPU emulation would probably have quite a large performance impact.
« Last Edit: June 20, 2015, 05:24:35 PM by Syntopia, Reason: Fixed a bug in units » Logged
TruthSerum
Guest
« Reply #10 on: June 20, 2015, 02:49:34 PM »

The benefit of the cloud for me would be to keep the thing running all the time and sacrifice performance. Since it's emulated, it is not going to be real-time, so you can't depend on it for that. But to keep a GPU instance running you're talking ~$20 a day?!

Sure it might render faster, but then if you needed results quickly you could just render it locally. I think it's the "always on" connectivity that makes it attractive.
Logged