TruthSerum
Guest
|
|
« on: May 24, 2015, 06:52:40 PM » |
|
How can I compute the principle axis of a vector without branching. By this I mean a function takes as input an arbitrary vector, and returns a unit vector that points along the main axis of the vector. For example, if I have the vector (2.0, -4.0, 1.0), then the function I desire would return (0.0, -1.0, 0.0), since this input vector is "mostly" pointing in the negative Y direction. Handling for corner cases has to be consistent, but never having more than a single element set to 1.0. So for example, (3.0, 3.0, 1.0) could be (1.0, 0.0, 0.0), or (0.0, 1.0, 0.0), but not (1.0, 1.0, 0.0). I'm interested in the idea itself more than an implementation in any particular language. I'm sure this has been solved before.
|
|
|
Logged
|
|
|
|
cKleinhuis
|
|
« Reply #1 on: May 24, 2015, 07:15:41 PM » |
|
for this particular case i have no concrete solution, BUT any if query can be stated in terms of greater zero or smaller zero, for example if(a>1.0)then a=1.0 could be written (without branching) a=min(a,1.0);
|
|
|
Logged
|
---
divide and conquer - iterate and rule - chaos is No random!
|
|
|
TruthSerum
Guest
|
|
« Reply #2 on: May 24, 2015, 07:42:41 PM » |
|
Thanks, I tried subtracting the smallest component, which should set it to zero. av = abs(v); v -= sign(v) * min(av.x,min(av.y,av.z)); This will not work for corner vectors though: (1,1,1) and others will all become (0,0,0).
|
|
|
Logged
|
|
|
|
hobold
Fractal Bachius
Posts: 573
|
|
« Reply #3 on: May 24, 2015, 10:37:40 PM » |
|
Thanks, I tried subtracting the smallest component, which should set it to zero. av = abs(v); v -= sign(v) * min(av.x,min(av.y,av.z)); This will not work for corner vectors though: (1,1,1) and others will all become (0,0,0). There is a standard trick to tweak algorithms which require distinct (non-equal) values. The basic idea is to append extra least significant digit(s) to the original input values. A simplified illustrative example, with no regard to implementation: Let's say the input vector's coordinates are stored with two fractional digits. So they can assume the values -1.00, -0.99. -0.98 ... and so on in increments of 0.01. In order to make coordinate values unique, we append one extra digit, distinct and specific to each coordinate axis. Let's say "1" for the X axis, "2" for Y, and "3" for Z. So, the problematic input vector (1, 1, 1) would turn into (1.001, 1.002, 1.003). All values are now unique. In practice, with floating point values, it is probably easier to replace a few least significant bits uniquely instead of appending extra bits. So you would mask out the two least significant bits, and replace them uniquely for each coordinate with "00", "01", or "10". This does make the algorithm imprecise. Depending on where the input data is coming from, the imprecision may be lost in the noise.
|
|
|
Logged
|
|
|
|
dom767
|
|
« Reply #4 on: May 24, 2015, 11:58:13 PM » |
|
Love hobold's suggestion here. Very neat. I got down to two compares robustly... vec /= max(x,y,z)
if (abs(x)==1) return vec(x,0,0) if (abs(y)==1) return vec(0,y,0) return vec(0,0,z)
Losing a bit of precision is probably a good payoff for the conditionals!
|
|
|
Logged
|
|
|
|
TruthSerum
Guest
|
|
« Reply #5 on: May 25, 2015, 01:03:16 AM » |
|
Thanks, this routine might do for now actually. I think a==b can be performed using a mask: float m = abs(sign(a-b)); // m=0.0 if a==b, m=1.0 otherwise
But then you'd need to mask out the other values because you can't return, as in your routine. It probably just works out faster to branch anyway. The offset (adding a small fraction) might not be necessary after all. I think that I tend to avoid epsilon/tolerance values if I can help it.
|
|
|
Logged
|
|
|
|
TruthSerum
Guest
|
|
« Reply #6 on: May 26, 2015, 06:22:44 PM » |
|
Thanks to eiffie for his improved formula. Again, using two branches, but saves on the division for normalization. vec3 paxis(vec3 p) { vec3 a=abs(p),r = vec3(1.0,0.0,0.0); if(a.z>=max(a.x,a.y))r=r.yzx; else if(a.y>=a.x)r=r.zxy; return r*sign(p); }
|
|
« Last Edit: May 27, 2015, 12:19:54 PM by TruthSerum »
|
Logged
|
|
|
|
dom767
|
|
« Reply #7 on: May 26, 2015, 07:28:23 PM » |
|
Eiffie's definitely going to win out on HLSL. Be careful if you're doing this on CPU using C++ because a lot of people do sign() using compares anyway to avoid having to understand their systems FP format, and vector swizzling isn't free on CPU... The code I wrote is bugged btw, you'd need something like this... vec /= max(abs(vec.x), abs(vec.y), abs(vec.z));
if (abs(vec.x)==1) vec = vector(vec.x,0,0); else if (abs(vec.y==1) vec = vector(0,vec.y,0); else vec = vector(0,0,vec.z);
return vec;
And anyway, the standard implementation of abs() in C++ uses a compare... (doh!) Assuming HLSL has a floor function, how about this ugly little beast? av = abs(vec) vec /= max(av.x, max(av.y,av.z)); return floor(abs(vec))*vec;
|
|
|
Logged
|
|
|
|
dom767
|
|
« Reply #8 on: May 26, 2015, 07:35:06 PM » |
|
Ahh ignore me. Same old problem with the vec(1,1,1) argument. So we'd need to use hobold's trick to solve that one...
|
|
|
Logged
|
|
|
|
TruthSerum
Guest
|
|
« Reply #9 on: May 26, 2015, 08:28:42 PM » |
|
The (1,1,1) edge case is annoying indeed. I wonder about the floor of negative -1<x<0 values of x, because I think floor(-0.5) = -1, not 0.0 as you would need here. I don't think there is a function for that, trunc(), or similar, in GLSL.
Also, I had problems with x/x being exactly 1 (so the abs(x)==1.0 comparisons failed sometimes). This seems strange because I thought the FPU guaranteed this.
Thanks for the advice about CPU implementations. I don't think branching is as bad though on PC (x86/x64), or so I have heard.
|
|
|
Logged
|
|
|
|
dom767
|
|
« Reply #10 on: May 26, 2015, 09:59:24 PM » |
|
Floor (abs (vec))×vec takes care of the rounding down issue with negative numbers.
Branch cost on CPU is a pipeline flush multiplied by the probability of misprediction. Probably 5-10 cycles average although i lose track of how deep the pipelines are these days...
On GPU the control logic is usually shared between multiple pipelines so 2 branches which split 50/50 reduces execution speed by a factor of 4.
|
|
|
Logged
|
|
|
|
dom767
|
|
« Reply #11 on: May 26, 2015, 11:49:35 PM » |
|
And just in case anyone was after the branchless principle axis algorithm... av = abs(vec) vec /= max(av.x, max(av.y,av.z)); av = floor(abs(vec)) * vector(1,0.9,0.8) av /= max(av.x, max(av.y, av.z)); vec = floor(av) * vec;
branch-free, untested. The *(1, 0.9, 0.8 ) simulates the trick hobold mentioned. If a single value is set, then the /= max() returns the value to 1.0. If multiple values are set this prioritises x over y over z. I doubt very much that this has any useful purpose other than making me feel satisfied.
|
|
|
Logged
|
|
|
|
TruthSerum
Guest
|
|
« Reply #12 on: May 27, 2015, 12:40:35 AM » |
|
I tested it and you will be glad to know that, with some small syntax fixes, it works perfectly. I will leave it here for completeness: vec3 paxis(vec3 vec) { vec3 av = abs(vec); vec /= max(av.x, max(av.y,av.z)); av = floor(abs(vec)) * vec3(1.0,0.9,0.8); av /= max(av.x, max(av.y, av.z)); vec = floor(av) * vec; return vec; }
I would say that the branching version that eiffie supplied seems better for practical purposes though, so I expect I will continue to use that.
|
|
|
Logged
|
|
|
|
dom767
|
|
« Reply #13 on: May 27, 2015, 12:57:01 AM » |
|
Heh, thanks for testing it out. Using Eiffie's far simpler to comprehend (and probably most often faster) solution sounds reasonable to me. Thanks for the challenge though, enjoyed that one.
|
|
|
Logged
|
|
|
|
TruthSerum
Guest
|
|
« Reply #14 on: May 27, 2015, 12:35:36 PM » |
|
I should mention that it seems to be slightly different to the other routine, maybe in the use of the small "epsilon" offsets. (how it should look) (artifacts visible) If you revisit it you will have to double check those values. I didn't try playing with them to try and fix it, but I just thought I should mention that I am seeing artifacts under some circumstances. Annoying because I thought it worked well too.
|
|
|
Logged
|
|
|
|
|