
 Pages: 1 ... 6 7 [8]   Go Down
 Author Topic: Rendering 3D fractals without distance estimators  (Read 18459 times)
hobold
Fractal Bachius

Posts: 573

 « Reply #105 on: December 14, 2013, 12:40:32 PM »

Quote
Isn't this a quite extreme example?
Not extreme in the sense that Buddhi's test scene was unrealistic for fractals. The scale invariance of fractal attractors means that chunks of the set are often surrounded by smaller pebbles, which are in turn surrounded by grains of sand, which are surrounded by motes of dust ...

So it is quite possible that there are many small objects everywhere; in particular near the camera, too.

However, Buddhi's example is extreme in another sense: the probability of the small object being hit by a randomly selected point on the ray is tiny, regardless of the sample distribution along the ray. A rendering algorithm based on random samples is just not suitable for that kind of object (unless you have prior knowledge of the small object and sample specifically where it is located).

So in my opinion, his test case is really an argument against the brute force raytracer, not an argument against biasing the samples.

There is also a heuristic argument in favour of sample bias: "interesting" objects, i.e. those that the human mind can grasp and decode into "useful" information, usually have a property named "object coherence". That means, looking at just a restricted portion of the object gives enough information that we can make reasonably accurate guesses about the unseen parts that are immediately adjacent. In other words, we expect there to be some surface around some structure, and we expect that the small parts together form a sensible whole.

(Just for completeness: "interesting" also includes some amount of surprise as well. Simple shapes that we can fully grasp immediately don't capture our attention for long.)

Anyway, the biasing of samples was based on the expectation that the object to be rendered has some sort of coherent surface. Likewise, Syntopia's technique of using pixels of previously rendered frames and adjacent pixels to find the currently computed pixel more quickly is equally based on the assumption of object coherence. This is why brute force raytracing works at all: object coherence exists for many of the fractal bodies we like to look at.
 Logged
kram1032
Fractal Senior

Posts: 1863

 « Reply #106 on: December 14, 2013, 12:59:58 PM »

hmm...

so bi-directional path tracers work by tracing paths not just from the camera, bouncing around until they hit light sources, but from the light sources too, which in turn helps to locate really tiny light sources, right?

Now I wonder: would it be possible to do a two-pass ray tracer, where the first pass treats all geometry as light sources and is bi-directional, so it essentially gives an idea of where geometry is located and perhaps what the normals of that geometry look like?
And then you take that knowledge for the second pass, where you have a path tracer that is biased so that all the geometry gets sampled uniformly, but wherever there "is nothing", no paths are wasted at all?
 Logged
knighty
Fractal Iambus

Posts: 819

 « Reply #107 on: December 14, 2013, 06:01:42 PM »

Just remembering an article about random search that I saw some years ago (sorry, I don't remember where exactly). IIRC it roughly says that the "best" strategy is to do a "multiscale" random walk. Maybe this would help to find small or thin features.

@Kram1032: Maybe you are looking for something like this. (There are other great articles at http://graphics.pixar.com.)
 Logged
hobold
Fractal Bachius

Posts: 573

 « Reply #108 on: December 18, 2013, 04:02:19 PM »

Syntopia mentioned that he reduces noise significantly with another technique: stratification. That means randomness is being applied not as raw chaos, but tamed to have a desirable characteristic: sample points should not cluster near each other too much, nor should they leave too large gaps.

Thinking about this some more, I realized that the brute force raytracer does not need to be based on randomness at all. We need randomness only to decouple adjacent pixels from one another, so that we don't get Moiré patterns or other objectionable artifacts.

So what we need is a way to place sample points in an interval that has the following qualities:

- cheap to compute
- sample density is reasonably uniform across the interval (i.e. stratified)
- works for any number of points N (because we keep shortening the sampling interval and then start over)
- the method is incremental, i.e. we can add another point to an existing set of samples, and maintain the other qualities

As it turns out, something like that actually exists. It is related to the Golden Ratio, or more specifically to the Golden Angle. If you look at the stylized flower on the wikipedia page

http://en.wikipedia.org/wiki/Golden_angle

you can probably see how and why this works. New petals are always being added into the largest existing gap. All we need to do is crack the circle open, and roll out its circumference in our [0.0 .. 1.0] interval.
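That gap-filling behaviour can be sanity-checked numerically: by the three-distance theorem, the fractional parts of k·(2 − φ) split the unit circle into gaps of at most three distinct lengths, so the largest gap stays within a small constant factor (at most φ² ≈ 2.618) of the average gap 1/N. A quick standalone check — all function names here are mine, not from the thread:

```c
#include <math.h>
#include <stdlib.h>

/* Fractional parts of k * (2 - phi), for k = 1..n, written into out[]. */
static void golden_sequence(float *out, int n) {
    const float g = 2.0f - 0.5f * (1.0f + sqrtf(5.0f)); /* ~0.381966f */
    float x = 0.0f;
    for (int i = 0; i < n; ++i) {
        x += g;
        if (x >= 1.0f) x -= 1.0f;   /* wrap back into [0.0 .. 1.0) */
        out[i] = x;
    }
}

static int cmp_float(const void *a, const void *b) {
    float fa = *(const float *)a, fb = *(const float *)b;
    return (fa > fb) - (fa < fb);
}

/* Largest gap between neighbouring samples, including the wrap-around
   gap across the seam of the circle. Sorts pts in place. */
static float largest_gap(float *pts, int n) {
    qsort(pts, n, sizeof(float), cmp_float);
    float worst = 1.0f - pts[n - 1] + pts[0];   /* seam gap */
    for (int i = 1; i < n; ++i) {
        float gap = pts[i] - pts[i - 1];
        if (gap > worst) worst = gap;
    }
    return worst;
}
```

With 100 points the largest gap comes out below 3/100, i.e. the sequence never leaves a hole much bigger than the average spacing — exactly the stratification property wanted above.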

So the algorithm for sampling along the ray becomes something like this:

- prepare a constant for the (normalized) Golden Angle:
const float GoldenAngle = 2.0f - 0.5f*(1.0f + sqrtf(5.0f));

- once per ray, a random seed value from [0.0 .. 1.0] is passed in as a parameter RandSeed to decouple the rays from each other

- then the core loop for one ray looks like this

for (number of samples ...) {
    RandSeed += GoldenAngle;
    if ((RandSeed - 1.0f) >= 0.0f) {   // keep wrapping back into [0.0 .. 1.0]
        RandSeed = RandSeed - 1.0f;
    }
    where = RandSeed;                  // current sample location in original parameter interval
    where = (2.0f - where) * where;    // bias sample locations as described above in the thread
    where = closest * where;           // shorten to current ray length

    if (iterate(rayBase + where*rayDirection) == hit) {
        closest = where;               // remember closest hit
    }
}
return closest;

With this method, I could halve the number of samples to 150 per ray, and get visually identical results to the example above with 300 biased random samples.

So, to recap: we started with 500 to 1000 samples for pure random sampling. Biasing sample locations towards the closest hit reduced this to 300 samples. Stratification via Golden Angle sampling enabled a further reduction to 150 samples per ray. All in all, the sample count dropped to somewhere between a quarter and an eighth of the original.
 « Last Edit: December 18, 2013, 04:05:34 PM by hobold » Logged
Syntopia
Fractal Molossus

Posts: 681

 « Reply #109 on: December 18, 2013, 10:56:51 PM »

Very interesting! I actually used golden ratio sequences a couple of months ago in order to produce sets of diverse colors, based on this approach (http://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/)

There is also a chapter on generating low-discrepancy sequences in Physically-Based Rendering (the chapter is freely available here: http://graphics.stanford.edu/~mmp/chapters/pbrt_chapter7.pdf) - I googled a bit to see how their methods compared to golden ratio sets, and it seems golden ratio sets do fine (and they are easier to implement than Hammersley / Halton): http://www.graphics.rwth-aachen.de/media/papers/jgt.pdf

One minor optimization idea: the conditional can be removed by using something like: RandSeed = fract(RandSeed + GoldenAngle)

One final idea: for a given ray, once a point inside the fractal has been found, I think binary search (which is very fast converging) should be used to find the closest boundary to viewer (this will be faster than biasing the samples). After the binary search, the normal (stratified) sampling strategy could continue on the new interval.
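A minimal sketch of that bisection step, with a 1-D stand-in for the inside/outside test (the function names and the toy "object" are mine, not Syntopia's code):

```c
/* Hypothetical hit test: nonzero when the sample is inside the object.
   A 1-D stand-in with the "surface" at t = 0.5 replaces the fractal. */
static int inside_object(float t) {
    return t >= 0.5f;
}

/* Given an outside point lo and an inside point hi along the ray,
   bisect to the boundary; 20 halvings roughly exhaust single precision. */
static float refine_to_surface(float lo, float hi) {
    for (int i = 0; i < 20; ++i) {
        float mid = 0.5f * (lo + hi);
        if (inside_object(mid)) hi = mid;   /* boundary is nearer */
        else                    lo = mid;   /* boundary is farther */
    }
    return hi;   /* closest known inside point */
}
```

Each step halves the uncertainty, so 20 steps shrink the bracket by a factor of about a million — far faster than any sampling strategy once a single inside point is known.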
 Logged
hobold
Fractal Bachius

Posts: 573

 « Reply #110 on: December 18, 2013, 11:37:41 PM »

Quote
One minor optimization idea: the conditional can be removed by using something like: RandSeed = fract(RandSeed + GoldenAngle)

I formulated it in this peculiar way because I was thinking of SIMD machines. The term "RandSeed - 1.0f" is a common subexpression in both the condition and the assignment. A SIMD machine would do a subtraction, a comparison to zero, and then a conditional select.

Special floating point operations working on the internals, like fract() or floor(), used to be slow. Not sure how fast those would run on a GPU, or in a vector instruction set like SSE/AVX.
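For reference, the subtract / compare / select pattern described here can also be written branch-free in scalar C, so a vectorizing compiler can emit a compare-and-blend instead of a jump (a sketch; the function name is mine):

```c
/* Wrap x + g back into [0.0 .. 1.0), written as subtract / compare /
   select so a SIMD compiler can map the conditional to a blend. */
static float wrap_add(float x, float g) {
    float y = x + g;
    float d = y - 1.0f;          /* common subexpression */
    return (d >= 0.0f) ? d : y;  /* conditional select, no control flow */
}
```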

Quote
One final idea: for a given ray, once a point inside the fractal has been found, I think binary search (which is very fast converging) should be used to find the closest boundary to viewer (this will be faster than biasing the samples). After the binary search, the normal (stratified) sampling strategy could continue on the new interval.

I guess this would be very dependent on the object being rendered. Non-fractal surfaces should react well to binary search. I am not so sure about fractals, because of self-similarity and scale invariance. But that is just an excuse, really, and could simply be tested in practice.

For the moment I want to keep the innermost loop simple. Hopefully that will make it easier for the OpenCL compiler (once I port the code at some point in the future ...) to fully utilize whatever hardware is there. If rays can alternate between the two "modes", bisection and stratified sampling, vector utilization could drop to 50% (i.e. the dreaded "warp divergence").
 Logged
eiffie
Guest
 « Reply #111 on: December 19, 2013, 07:40:34 PM »

This seems to work quite well. Here is a sample I tried of a landscape using only 100 samples that runs in real time.

I tried it at shadertoy too but it looks like crap. Texture filtering must be set up differently.
Here is the code:
 Logged
hobold
Fractal Bachius

Posts: 573

 « Reply #112 on: December 19, 2013, 09:24:16 PM »

Surprising that 100 samples look okay-ish, despite the horizon being fairly far away. I would have expected the samples to be spaced too sparsely there. Nice.
 Logged
Syntopia
Fractal Molossus

Posts: 681

 « Reply #113 on: December 19, 2013, 10:12:15 PM »

Quote
I formulated it in this peculiar way because I was thinking of SIMD machines. The term "RandSeed - 1.0f" is a common subexpression in both the condition and the assignment. A SIMD machine would do a subtraction, a comparison to zero, and then a conditional select.

Special floating point operations working on the internals, like fract() or floor(), used to be slow. Not sure how fast those would run on a GPU, or in a vector instruction set like SSE/AVX.

GLSL functions like fract and floor can be expected to be fast. On my machine (Nvidia 310M) floor() takes the same amount of time as an add, while fract() takes twice as long (thus probably implemented as fract(x)=x-floor(x)). Using the conditional takes four times as long.

Quote
I tried it at shadertoy too but it looks like crap. Texture filtering must be set up differently.

Yep, looks weird in shadertoy, even with ANGLE disabled. If I initialize RandSeed to zero (instead of the random value), it looks much better, though.
 Logged
hobold
Fractal Bachius

Posts: 573

 « Reply #114 on: December 20, 2013, 12:22:43 PM »

Quote
GLSL functions like fract and floor can be expected to be fast. On my machine (Nvidia 310M) floor() takes the same amount of time as an add, while fract() takes twice as long (thus probably implemented as fract(x)=x-floor(x)). Using the conditional takes four times as long.
Okay, I really need to find out what GPU hardware is capable of at the instruction level! Do you happen to be aware of any good documentation?
 Logged
Syntopia
Fractal Molossus

Posts: 681

 « Reply #115 on: December 20, 2013, 03:15:45 PM »

Quote
Okay, I really need to find out what GPU hardware is capable of at the instruction level! Do you happen to be aware of any good documentation?

Some of the CUDA documentation is good: http://docs.nvidia.com/cuda/cuda-c-programming-guide/#arithmetic-instructions. But still, it is difficult to know exactly what the GLSL commands are mapped to.

I have also read that some operations are actually free when combined with other arithmetic operations (this should be the case for abs, negate, saturation), but I cannot find any clear documentation. I imagine this kind of information must be very important to game programmers, so it is weird that it is so hard to find.
 Logged
Buddhi
Fractal Iambus

Posts: 895

 « Reply #116 on: December 20, 2013, 03:29:48 PM »

To see how a program is translated into GPU code I use the AMD APP Kernel Analyzer (it's from the AMD SDK). It shows the asm output for the analyzed kernel. An example from the net: http://www.jarredcapellman.com/wp-content/uploads/2012/06/AMDKernelAnalyzer.png
 Logged

hobold
Fractal Bachius

Posts: 573

 « Reply #117 on: January 16, 2014, 05:11:18 PM »

Quote
One final idea: for a given ray, once a point inside the fractal has been found, I think binary search (which is very fast converging) should be used to find the closest boundary to viewer (this will be faster than biasing the samples). After the binary search, the normal (stratified) sampling strategy could continue on the new interval.
I had first rejected this because of thread divergence on GPUs (or other SIMD hardware). Nevertheless I tried the idea in scalar code. When used in isolation, it completely removes noise from smooth surfaces (I used 20 halving steps to pretty much hit the limit of single precision floating point).

So biasing the samples towards the hit is no longer necessary, which improves sample density near the camera. Additionally, as we have now located the surface exactly, in all "simple" cases (non-dusty surface, voluminous object) the remaining samples on the current ray will all hit empty space.

That means we can afford another optimization that was actually detrimental when I tried it without bisection refinement. Specifically: resetting the sample counter (on the current ray) back to zero after a hit has been found and refined. This change is meant to benefit rendering quality in dusty areas, or for grazing rays (i.e. the silhouette of the object). In such hard cases, closer hits may be found very late in the sampling process, and then there are too few samples left to properly check the area in front of the hit.

The "simple" cases used to get very expensive with that change, because they would reset the sample counter way too often. But with bisection refinement, they reset the counter exactly once, and very early in the sampling process. So the quality improvement now invests processor cycles where they matter: in the hard cases of grazing rays or dusty objects.
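Putting the two modifications together, the inner loop might look roughly like this in scalar C. A 1-D toy hit test stands in for the fractal; on a hit, the sample is bisected to the surface and the counter restarts. Since biasing is no longer needed, the sample maps to the ray with a plain multiply. All names and constants here are my own illustration, not the thread's actual code:

```c
static const float GoldenAngle = 0.3819660f;   /* 2 - phi, normalized */

/* Toy 1-D stand-in for the fractal hit test: "inside" beyond t = 0.618. */
static int hit_test(float t) { return t >= 0.618f; }

/* Bisect between an outside point lo and an inside point hi;
   20 halvings roughly exhaust single precision. */
static float bisect(float lo, float hi) {
    for (int i = 0; i < 20; ++i) {
        float mid = 0.5f * (lo + hi);
        if (hit_test(mid)) hi = mid; else lo = mid;
    }
    return hi;   /* closest known inside point */
}

/* Stratified march with bisection refinement and counter reset. */
static float march(float seed, int maxSamples) {
    float closest = 1.0f;                    /* current ray length */
    int i = 0;
    while (i < maxSamples) {
        seed += GoldenAngle;                 /* golden-angle stratification */
        if (seed >= 1.0f) seed -= 1.0f;      /* wrap into [0.0 .. 1.0) */
        float where = seed * closest;        /* no bias term needed now */
        if (hit_test(where)) {
            closest = bisect(0.0f, where);   /* refine hit to the surface */
            i = 0;                           /* restart: resample [0, closest) */
        } else {
            ++i;
        }
    }
    return closest;
}
```

In this simple case the counter resets exactly once: after the first hit is refined, every remaining sample lands in the now-empty interval in front of the surface, just as described above.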

Both these modifications will cost substantial performance on GPUs, but for scalar (multithreaded, of course) implementations they are a win, because the default number of samples per ray can be reduced while the accuracy of the computed depth image is notably improved.
 Logged