Title: Writing formula using gcc (Re: Blobby)
Post by: knighty on April 04, 2015, 09:14:59 PM
It works here! :horsie: I used this example, posted here a long time ago by jesse:

__attribute__((packed)) struct TIteration3Dext {
    double Cw,Rold,RStopD,x,y,z,w; // use with neg indices before C1. w is also used for 3d ifs and abox analytic DE
    double Px, Py, Pz;     // actual position, never change these! Can be used as input.
    double Cx, Cy, Cz;     //+24 these are the constants for adding. Pxyz or the julia seed. Cw is @-56 (begin of struct)
    void* PVar;            //+48 the actual formulas address of constants and vars, constants at offset 0 increasing, user vars below -8
    float SmoothItD;       //+52
    double Rout;           //+56 the square of the vector length, can be used as input
    int ItResultI;         //+64
    int maxIt;             //+68
    float RStop;           //+72
    int nHybrid[6];        //+76 all formulas iteration counts or single weights in interpol-hybrid
    void* fHPVar[6];       //+100 all formulas pointer to constants+vars, PVars-8=0.5, use PVar for the actual formula
    void* fHybrid[6];      //+124 the formulas addresses
    int CalcSIT;           //+148
    int DoJulia;           //+152
    float LNRStop;         //+156
    int DEoption;          //+160
    float fHln[6];         //+164 for SmoothIts
    int iRepeatFrom;       //+188
    double OTrap;          //+192
    double VaryScale;      //+200 to use in vary by its
    int bFirstIt;          //+208 used also as iteration count, is set to 0 on it-start
    int bTmp;              //+212 tmpBuf, free of use.
    double Dfree1;         //+216
    double Dfree2;         //+224
    double Deriv1;         //+232 for 4D first deriv or as full derivs
    double Deriv2;         //+240
    double Deriv3;         //+248
    float SMatrix4[4][4];  //+256 for 4d rotation, used like most other values only by the programs iteration loop procedure
};

// fastcall is not quite delphi fastcall.
// first two args are ok, third is in ecx in delphi, on stack here.
void __attribute__((fastcall)) formula(
    double* x,    // [eax]
    double* y,    // [edx]
    double* arg,  // [ebp+8], points to TIteration3Dext.C1
    void* dummy   // so we end w/ ret 8 as delphi expects
) {
    // Compute ptr to proper start of TIteration3Dext struct.
    struct TIteration3Dext* cfg = (struct TIteration3Dext*)(arg-7);

    // Read / write some fields as demonstration.. not a real formula ;-)
    double r = cfg->x + cfg->y - cfg->z * cfg->w;
    cfg->Deriv1 = r;
}

compile with 'gcc -c -Os -m32'.
Objdump -D can be used to show the generated code.

Transformed it a little:

#include <math.h>

__attribute__((packed)) struct TIteration3Dext {
    double Cw,Rold,RStopD,x,y,z,w; // use with neg indices before C1. w is also used for 3d ifs and abox analytic DE
    double Px, Py, Pz;     // actual position, never change these! Can be used as input.
    double Cx, Cy, Cz;     //+24 these are the constants for adding. Pxyz or the julia seed. Cw is @-56 (begin of struct)
    void* PVar;            //+48 the actual formulas address of constants and vars, constants at offset 0 increasing, user vars below -8
    float SmoothItD;       //+52
    double Rout;           //+56 the square of the vector length, can be used as input
    int ItResultI;         //+64
    int maxIt;             //+68
    float RStop;           //+72
    int nHybrid[6];        //+76 all formulas iteration counts or single weights in interpol-hybrid
    void* fHPVar[6];       //+100 all formulas pointer to constants+vars, PVars-8=0.5, use PVar for the actual formula
    void* fHybrid[6];      //+124 the formulas addresses
    int CalcSIT;           //+148
    int DoJulia;           //+152
    float LNRStop;         //+156
    int DEoption;          //+160
    float fHln[6];         //+164 for SmoothIts
    int iRepeatFrom;       //+188
    double OTrap;          //+192
    double VaryScale;      //+200 to use in vary by its
    int bFirstIt;          //+208 used also as iteration count, is set to 0 on it-start
    int bTmp;              //+212 tmpBuf, free of use.
    double Dfree1;         //+216
    double Dfree2;         //+224
    double Deriv1;         //+232 for 4D first deriv or as full derivs
    double Deriv2;         //+240
    double Deriv3;         //+248
    float SMatrix4[4][4];  //+256 for 4d rotation, used like most other values only by the programs iteration loop procedure
};

// fastcall is not quite delphi fastcall.
// first two args are ok, third is in ecx in delphi, on stack here.
void __attribute__((fastcall)) formula(
    double* x,    // [eax]
    double* y,    // [edx]
    double* arg,  // [ebp+8], points to TIteration3Dext.C1
    void* dummy   // so we end w/ ret 8 as delphi expects
) {
    // Compute ptr to proper start of TIteration3Dext struct.
    struct TIteration3Dext* cfg = (struct TIteration3Dext*)(arg-7);

    // Read / write some fields as demonstration.. not a real formula ;-)
    cfg->x=fabs(cfg->x);
    cfg->y=fabs(cfg->y);
    cfg->z=fabs(cfg->z);
}

Compiled it using gcc:

gcc -c -O3 -msse2 MB3D-formula-ex.c

Then used objdump:

objdump -D MB3D-formula-ex.o >> out.asm

out.asm looks like this:

MB3D-formula-ex.o:     file format pe-i386

Disassembly of section .text:

00000000 <@formula@16>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 45 08                mov    0x8(%ebp),%eax
   6:   83 e8 38                sub    $0x38,%eax
   9:   dd 40 18                fldl   0x18(%eax)
   c:   d9 e1                   fabs
   e:   dd 58 18                fstpl  0x18(%eax)
  11:   dd 40 20                fldl   0x20(%eax)
  14:   d9 e1                   fabs
  16:   dd 58 20                fstpl  0x20(%eax)
  19:   dd 40 28                fldl   0x28(%eax)
  1c:   d9 e1                   fabs
  1e:   dd 58 28                fstpl  0x28(%eax)
  21:   c9                      leave
  22:   c2 08 00                ret    $0x8
  25:   90                      nop
  26:   90                      nop
  27:   90                      nop

Notice how it doesn't call any external function. I wanted it to use SSE2 but it used the FPU; I don't know why, because I'm a noob. Keeping only the machine code and adding the MB3D formula stuff gives this:

[OPTIONS]
.Version = 2
.DEoption = -1
[CODE]
5589e58b450883e838dd4018d9e1dd5818dd4020d9e1dd5820dd4028d9e1dd5828c9c20800
[END]
Description:

Test of compiling a formula with GCC. It simply sets (x,y,z)=abs(x,y,z)

Saved as gccAbsxyz.m3f and it works! :banana: :chilli: So it is definitely possible to automate it: formula.c --> formula.m3f. Of course there are a lot of details that must be taken into account. (What does DEoption mean, for example?)
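A quick way to cross-check the disassembly against the struct: the `sub $0x38,%eax` should be the `arg-7` rewind (7 doubles), and the three `fldl`/`fstpl` displacements should be the offsets of x, y and z. The sketch below is only an illustration under stated assumptions: `struct IterHead` and the helper names are made up here, only the first ten fields of TIteration3Dext are reproduced, and a double is assumed to be 8 bytes (as on the 32-bit x86 target used in this thread).

```c
#include <stddef.h>

/* Hypothetical cut-down copy of the start of TIteration3Dext:
   only the leading doubles that the abs() example touches. */
struct IterHead {
    double Cw, Rold, RStopD, x, y, z, w;  /* 7 doubles in front of the position */
    double Px, Py, Pz;
};

/* Number of bytes that (arg - 7) steps back when arg is a double*. */
static size_t rewind_bytes(void) { return 7 * sizeof(double); }

/* Byte offsets of the fields read and written by the example formula. */
static size_t offset_x(void) { return offsetof(struct IterHead, x); }
static size_t offset_y(void) { return offsetof(struct IterHead, y); }
static size_t offset_z(void) { return offsetof(struct IterHead, z); }
```

With 8-byte doubles, `rewind_bytes()` gives 0x38 and the three offsets give 0x18, 0x20 and 0x28, which are the constants that show up in the listing.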
Title: Re: Blobby re:knighty
Post by: knighty on April 29, 2015, 05:07:57 PM
My very first real MB3D formula! :horsie: It draws a Mandelbrot set heightfield. Beware! Use only one iteration! Coding it was really a pain because not all math library functions are inlined and constants are put in the .rdata section. One could also do better using asm, which is not my cup of tea. Anyway, it works fine and is fast enough. Here is the C code:

/*
Mandelbrot set height field Formula.
Many thanks to marius, jesse and DarkBeam.
knighty. Apr 2015.

No warranty. This is made by a noob ;)

Compilation using mingw and msys (strictly speaking msys is not required... in principle):
-Compile with (replace "theFormulaFilename" by actual c file name):
    gcc -c -m32 -O3 -mfpmath=387 -ffast-math -march=pentium4 theFormulaFilename.c
 SSE2 can be used by setting: -mfpmath=both and adding the switch: -msse2
 If you use sse2, use the macros defined below. Otherwise the compiler will use
 variables stored in .rdata which will make the formula unusable.
-Verify asm code:
    objdump -D theFormulaFilename.o
 This is necessary to check that there are no library function calls and that
 there is only a .text "segment", no .rdata or something like that or a call to
 an external function.
-Extract machine code:
    objcopy -Obinary -j .text theFormulaFilename.o theFormulaFilename.bin
-Convert machine code from binary to hexadecimal:
    bin2hex theFormulaFilename.bin theFormulaFilename.m3f
 (bin2hex is not part of msys or mingw. any other binary file editor would do the job)
-edit theFormulaFilename.m3f in a text editor to add MB3D stuff.
*/
/*Defines**************************************************/
//#define USE_SSE2
#define MB3D_PACK __attribute__((packed, aligned(1)))
/*Includes*************************************************/
#include <emmintrin.h>
#include <math.h>
/*MB3D structures definitions*************************************************/
//This structure is specific to dIFS. The use of most of it is unknown and should not be modified.
struct TIteration3Dext {
    double something;        // -0x88 ; used in sphereheightmap.m3f
    double dum0;             // -0x80 ; unknown
    double x;                // -0x78
    double y;                // -0x70
    double z;                // -0x68
    double dum1[8];          //unknown
    double DE2T;             // -0x20 ; Output: distance estimate to current object
    double dum2[17];         //unknown
    double accumulatedScale; // +0x70
    double dum3;             // +0x78; unknown
    double OTforCol;         // +0x80
    double dum4[16];         //unknown
    //void * sphericalMap;   // +0x108; pointer to function
} MB3D_PACK;

struct Sconsts{//in the same order as m3f file's [constant] section
    double Two;
    double Half;
    double LogTwo;//logarithm of two
    double BailOut;
    double LogBailout;
    double Eps;
    long long int abscst;//for ABS()
    long long int negcst;//for NEG()
} MB3D_PACK;

//We need to use constants in pconst otherwise an .rdata section is generated
//These are helpers to access constants
#define TWO pconst->Two
#define HALF pconst->Half
#define LOGTWO pconst->LogTwo
#define BAILOUT pconst->BailOut
#define LOG_BAILOUT pconst->LogBailout
#define EPS pconst->Eps

struct Svars{//in the reverse order wrt m3f file
    double OTShift;
    double OTScale;
    double Xcenter;
    double Ycenter;
    double ZoomFactor;
    double depth;
    double slope;
    int Iterations;
    double Dummy;//not used. What is it good for?
} MB3D_PACK;

/*Macros**************************************************/
//this macro changes the type of a variable without conversion.
#define REINTERPRET(x,type) (*((type *) &(x)))
//
#define MAX(x,y) ((x)>(y) ? (x) : (y))
#define MIN(x,y) ((x)<(y) ? (x) : (y))
//with sse2, fabs() and negating generate a logical instruction with a constant stored in .rdata
//use these instead of writing: fabs(bar); or foo=-bar; when using sse2.
#ifdef USE_SSE2
#undef fabs
#define ABS(x) (Abs(x, pconst))
#define NEG(x) (Neg(x, pconst))
#else
#define ABS(x) (fabs((x)))
#define NEG(x) (-(x))
#endif

#ifdef USE_SSE2
static inline double Abs(double x, struct Sconsts* pconst){
    __m128d v=_mm_set1_pd (x);
    __m128d w=_mm_set1_pd (REINTERPRET(pconst->abscst,double));
    __m128d r=_mm_and_pd (v,w);//clear the sign bit
    return _mm_cvtsd_f64 (r);
}
static inline double Neg(double x, struct Sconsts* pconst){
    __m128d v=_mm_set1_pd (x);
    __m128d w=_mm_set1_pd (REINTERPRET(pconst->negcst,double));
    __m128d r=_mm_xor_pd (v,w);//flip the sign bit
    return _mm_cvtsd_f64 (r);
}
#endif
/*----------*/
static inline double power(double x, double y){//had to implement this because gcc doesn't inline pow() (and also generates some constants)
    return exp2(y*log2(x));
}
static inline double length(double x, double y){
    return sqrt(x*x+y*y);
}
/*The actual formula*************************************************/
static inline void TheFormula(struct TIteration3Dext* pctx, struct Svars* pvar, struct Sconsts* pconst)
{
    //
    double x=pctx->x, y=pctx->y, z=pctx->z;
    //scale and translate
    x /= pvar->ZoomFactor; y /= pvar->ZoomFactor;
    x += pvar->Xcenter; y += pvar->Ycenter;
    //Compute DE to Mandelbrot set
    double cx=x, cy=y;
    double r2=x*x+y*y;
    double dx=1., dy=0.;
    int i=0;
    for(;i < pvar->Iterations && r2<BAILOUT; i++){
        double ddx=2.*(x*dx-y*dy)+1.;
        dy=2.*(x*dy+y*dx);
        dx=ddx;
        double xx=x*x-y*y+cx;
        y=2.*x*y+cy;
        x=xx;
        r2=x*x+y*y;
    }
    double dr=length(dx,dy);//sqrt(dx*dx+dy*dy);
    double r=length(x,y);//sqrt(x*x+y*y);
    double lr=log(r);
    double de=HALF*r*(lr-power(TWO,(double)(i-pvar->Iterations))*pconst->LogTwo)/(dr*power(r,power(HALF,(double)(i)+TWO)));
    de = MAX(de,0.);
    //Convert DE to heightfield
    de*=pvar->ZoomFactor;
    double s=pvar->slope;
    double p=pvar->depth;
    s=ABS(s);
    double sp=s*p;
    p=ABS(p);
    double f=(s*de+p)*z+sp*de;
    double g=length(s*z+sp,s*de+p);//sqrt((s*(z+p))*(s*(z+p))+(s*de+p)*(s*de+p));
    de=f/g;
    if(sp>0) de= MAX(de, z);
    pctx->DE2T = ABS(de);
    //Give smoothed iteration for coloring
    double col= -log(MAX(0.,ABS(z/p)-EPS));//
    pctx->OTforCol = col * pvar->OTScale + pvar->OTShift;
    // No translation and no scaling
}
/* Entry point.
Not meant to be modified *************************************************/ // Not a standard calling convention. this is called from an asm code in MB3D. // So we need to save ecx register. the compiler generates the code to save/restore ebp // esi and edi (and ebx?) are not modified (or restored?)!??? // // the arguments of this function are in edi and esi registers. // esi points to the context structure // edi points to constants (negative displacements for variables) void formula(void) {//Big overhead. The only solution I could find. Any Idea? char *siarg; char *diarg; __asm__("push %ecx\n\t");//save ecx __asm__("movl %%esi,%0\n\t":"=r"(siarg)::); __asm__("movl %%edi,%0\n\t":"=r"(diarg)::); // Compute ptr to proper start of TIteration3Dext struct. struct TIteration3Dext* pctx = (struct TIteration3Dext*)(siarg-0x88); // get pointer to constants. struct Sconsts* pconst=(struct Sconsts*) (diarg); // get pointer to variables. struct Svars* pvar=(struct Svars*) (((char*)pconst)-sizeof(struct Svars)); TheFormula(pctx, pvar, pconst); __asm__("pop %ecx");//restore ecx } The resulting formula: [OPTIONS] .Version = 6 .DEoption = 20 .Integer Iterations = 200 .Double Slope = 2 .Double Depth = 1 .Double Zoom Factor = 1 .Double Y Center = 0 .Double X Center = 0 .Double OT Scale = 1 .Double OT Shift = 0 [CONSTANTS] Double = 2 Double = 0.5 Double = 0.69314718055994530941723212145818 Double = 1000000 Double = 6.9077552789821370520539743640531 Double = 0.000000000000001 INT64 = $7FFFFFFFFFFFFFFF INT64 = $8000000000000000 [CODE] 5589E5565383EC305189F689FB81EE880000008D4BBCDD4620DD5DD0DD4120DD55C8DC7E10DD 45C8DC7E18D9C9DC4110DD55E8D9C9DC4118DD55E0D9C1D8CAD9C1D8CAD9C0D8C28B513885D2 0F8EA7010000DD4318DD55D8DFF10F86B1010000DDD831C0D9EED9E8EB13DD45D8DFF1765FDD D8D9CAD9CCD9C9D9CBD9C9D9C0D8CED9C2D8CEDEE9D8C0D9E8DEC1D9C9D8CDD9CAD8CEDEC2D9 C9D8C0DD45E8DEC4D9CBDEE2D9CBDECCD9CBD8C0DC45E0D9C3D8CCD9C1D8CAD9C0D8C283C001 
39D07CA8DDDEDDD8DDD8DDD8D9C9D9CAEB0CDDDEDDD8DDD8DDD8D9C9D9CA8945F4DB45F4D9CB D8C8D9CAD8C8DEC2D9C9D9FADD5DE8D9FAD9EDD9C1D9F1D9E8DD03D9C1D9C9D9F129D050DA0C 2483C404D9C0D9FCDCE9D9C9D9F0D8C2D9FDDDD9DD4308D9C2D9C9D9F1D9CDDC03DECDD9C4D9 FCDCEDD9CDD9F0D8C2D9C9D9CDD9C9D9FDDDD9D9C3D9C2D9C9D9F1DEC9D9C0D9FCDCE9D9C9D9 F0DEC2D9C9D9FDDDD9D9CADC4B08D9CBDC4B10DEE9DECADC4DE8DEF9D9EED9C9DBF1DAC1DC4D C8DD4130DD4128D9C9D9E1D9C0D8CAD9CAD9E1D9C3D8CAD8C1DD45D0D8C9D9CDD8CCDEC5D9CA DC4DD0D8C3D8C8D9CAD8C8DEC2D9C9D9FADEFBD9C9DFF3DDDA760BDD45D0D9C9DBF1DAC1DDD9 D9E1DD5E68DC7DD0D9E1DC6328D9EED9C9DBF1DAC1DDD9D9EDD9C9D9F1DC4908DC6BBCDD9E08 0100005983C4305B5E5DC3DDDBDDD8DDD8DDD9D9E8DD5DE8D9EED9C931C0E9E2FEFFFFDDDBDD D8DDD8DDD9D9E8DD5DE8D9EED9C931C0E9CAFEFFFF909090 [END]
Warning: Use only one iteration. Mandelbrot set height field dIFS. It is a thin surface. Parameters: "X Center", "Y Center" and "Zoom Factor" are for scaling and translating the heightfield without shapes in other slots. "Slope" is the tangent of the angle of the surface at the edge of the mandelbrot set. "Depth" positive values give plateau like landscape. negative values give lake like landscapes. "OT Scale" and "OT Shift" are for tuning coloring. Use 2nd choice. And the params to test it: Mandelbulb3Dv18{ g.....Y/...g2.../.....6...knIfYS2zeny6PJKHev8LlDcGXA26OsbzPpeqi592nkz8FWU6.pEEhD ................................y68SnHd5932........A./..................y.2...wD ...Uz6........../QU0/.....UE4...8....2E3.....oobIhgQ8bTD/..........m/dkpXm1....U y.UaNadD12..0..........wz.................................U0.....y1...sD...../.. .z1...kDgjsxokIdHsXghLegQoK.yqpB3esPhlOj6rO0P0vTkrfb.1XD7yN.y8pIzgy.eAVjs3NO8Jbh yrno88E9ZxM2ycrWpdI1oBUjU.....oI1.............sD.6....sD..G..................... .............oAnAt1...sD....zw1........................................./....2.. .....Ksulz1.......kz.wzzz5.U..6.P....w/...EB....O....c3...kG....8/...I1....2BR52 821U.ydelyjeYFnzTeOgzf8No.6.5Q1.zzzz..ktPnvka6zD6odIWOor/z1..........6k/8.kXWFH. ..M93P58iz9..mnWK2zwz0.U......../EU0.wzzz1...........s/...................E.2c.. zzzz.............0...................2./8.kzzzD............8.................... /EU0.wzzz1....................................uBZ.U7EgU7zzFoTu2.oQm2oszDnueM.YXB bYHzTt6Ul/k6Ukl6Qs5crI0.a.l0a.bTUSH7.M029M0ly/uBZ.U7EgU7Mw5crI0.a.l0akqTUSH7.M02 9M0kz/uBZ.U7EgU7...crIGJzzFoTuIdyzngi8qdxzZX.4rU................................ E6../6..F2E.....I....U....EHHVYFYZYFH/kPrJaQ.........................6.......... ..................oX./.......U.E..........A........wz.....................E9nAnA nAHaz........................................................................... .....................2.....3....B....A3QcJaQZZYFH/.............................. 
.sU1.sE1BoU.02E......oAnAnAnAF0E........kz1..................................... ...wz.........zD........kz1.............................kz1........wz.........zj ................................} {Titel: mshf01}
Picture (http://www.fractalforums.com/index.php?action=gallery;sa=view;id=17435).
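The build notes in the source comment rely on a bin2hex tool that does not ship with msys or mingw. As a rough stand-in, the conversion itself is only a few lines of C. This is a sketch, not the tool knighty used: `bytes_to_hex` is a hypothetical helper name, and reading the .bin file and writing the .m3f text around the hex digits are left out.

```c
#include <stddef.h>

/* Hypothetical helper: writes the bytes in src as uppercase hex digits
   into dst, followed by a terminating NUL.
   Returns the number of hex digits written (2 * len), or 0 when dst is
   too small to hold them plus the terminator. */
size_t bytes_to_hex(const unsigned char *src, size_t len,
                    char *dst, size_t dst_size)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t i;

    if (dst_size < 2 * len + 1)
        return 0;

    for (i = 0; i < len; i++) {
        dst[2 * i]     = digits[src[i] >> 4];    /* high nibble */
        dst[2 * i + 1] = digits[src[i] & 0x0F];  /* low nibble  */
    }
    dst[2 * len] = '\0';
    return 2 * len;
}
```

Feeding it the first three bytes of a typical prologue, 0x55 0x89 0xE5, yields the string 5589E5, which is how both [CODE] sections in this thread begin.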
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: cKleinhuis on April 30, 2015, 01:04:38 AM
lol, you beautiful brain, binary please! :)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on April 30, 2015, 11:28:29 AM
Sorry, I forgot to say that the name of the formula is MSHFdIFS.m3f. Anyway, attached is the zip containing the C code, the formula and an example parameter.
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: cKleinhuis on April 30, 2015, 11:36:00 AM
cool thank you, uhm that reaaally takes ages to calculate ;) lets see with what kind of video i can come up with
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on April 30, 2015, 03:57:05 PM
Use only one iteration. "mandelbrot hills" took about 5 min to render using the high quality preset on an i7. With the video preset it takes 10 s. :)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on April 30, 2015, 04:35:54 PM
Good Idea :D now what about kleinian sets, I simply can't do them buddy :beer: Johan will send you a golden bracelet :D
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: KRAFTWERK on May 01, 2015, 10:31:28 AM
Quote from DarkBeam: Good Idea :D now what about kleinian sets, I simply can't do them buddy :beer: Johan will send you a golden bracelet :D
;D Definitely! At least a print of the kleinian in White Strong and Flexible... I think bib will join in on this grand prize! :beer: O0
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on May 01, 2015, 08:26:54 PM
Cool renderings would be more than sufficient. :) The main issue with "simple kleinian" is the initialization part: one has to solve a nonlinear system of equations to get the inversion spheres' positions and radii from the angles. It has to be done at each evaluation, which is expensive. In the Fragmentarium script it's not fully solved. A solution would be to forget about the system of equations and let the artist find good parameters (so in general one won't get a real kleinian limit set). The same problem arises in other cool shapes like Doyle spirals...
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on May 02, 2015, 12:53:14 AM
Aye, the old 'no initialization' problem that always arises in mb3d..
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: teeanDy on November 30, 2015, 04:25:40 PM
:) Rocks!
http://teeandy.blogspot.de/2015/11/mandelbulber3d-d.html (http://teeandy.blogspot.de/2015/11/mandelbulber3d-d.html) O0
Thanks for the hints. Does anyone have formulas written in C?
Thanks a lot Andy
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: quaz0r on November 30, 2015, 06:51:09 PM
not a mb3d user so maybe i missed something but if all you need is the assembly you may as well just have gcc give you the assembly instead of assembling it and then disassembling it? with the gcc switch -S
-S Stop after the stage of compilation proper; do not assemble. The output is in the form of an assembler code file for each non-assembler input file specified. By default, the assembler file name for a source file is made by replacing the suffix .c, .i, etc., with .s.
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on December 01, 2015, 02:29:35 PM
We actually need the machine code. The (not yet attained) goal of the game is to not use assembly at all. Tempted? ;D
BTW, making it work depends on the version of the compiler. It worked for me with an old version of gcc (I forgot the exact number). It doesn't work as is with newer versions... but the principles are the same.
But wait! Now MB3D features a JIT compiler!...
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on December 01, 2015, 06:16:56 PM
But for now it is slow :D
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on December 01, 2015, 09:23:24 PM
Yes! Unfortunately there are not a lot of lightweight, fast JIT libraries out there. The JIT library used in MB3D has a lot of features, too many features. MB3D only uses 1% of them. ;D The only other JIT library I could find is jitlib, but it is not maintained anymore... There is also evaldraw, which comes with two JIT compilers: a somewhat old and slow one that is open source, and a faster one that is not available as a library. :-/ But hey! None can beat good old assembly language... So, as quaz0r suggested, a compiler can give (very) good assembly code as a starting point... It is all about coding time vs. code speed. :)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: lycium on December 01, 2015, 09:52:38 PM
Quote from knighty: Yes! Unfortunately there are not a lot of lightweight, fast JIT libraries out there.
From what I've seen of this Pascal JIT library used in next MB3D that Andreas showed, it's extremely weak at optimising, besides being commercial closed source (!!). No idea why anyone would choose this over the legendary LuaJIT, for example.
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: quaz0r on December 01, 2015, 09:59:57 PM
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on December 01, 2015, 10:18:29 PM
LuaJIT is a tracing JIT. AFAIU, one can't call LuaJIT functions directly as one can in C or Pascal. Yes, it is more than excellent. A talented (and experienced) programmer could extract the part that does the actual compilation (SSA, optimizations, register allocation, code generation...) and add support for calling conventions. That would be the PERFECT solution. Writing a simple language adapted for fractal formulas would then be a piece of cake... sort of... Anyone? Dreaming about a "unified" language and a common library for fractal formulas that could be used not only by MB3D but also by Mandelbulber, Chaotica, Apophysis...
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: lycium on December 01, 2015, 10:23:19 PM
Glare Technologies (company making Chaotica and Indigo Renderer, i.e. me and 2 other dudes) plans to opensource our language, Winter, at some point.
You can for example look at the code for the Chaotica 2 transform library here (not final, but it will be opensourced): https://dl.dropboxusercontent.com/u/3038174/standard_transforms.xml
This compiles to code that beats C/C++ in performance, and is *proven* to be safe before compilation. It's possible because it's quite a restricted (pure functional) language, yet it has a lot of power - the whole of Indigo 4 and Chaotica 2 rendering engine are written in it.
It's (unfortunately?) not the case that assembly language beats high level languages, mainly because of the complexity of modern processors, and the ability of optimising compilers to just throw combinatorial optimisation at the problem. You really have to be best-in-the-world at low level optimisation to beat a modern compiler, and even then your chances are not good; plus, that's just no way to live/program :)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on December 01, 2015, 10:41:36 PM
That's what I was dreaming about! I hope it doesn't use LLVM as a back end.
Quote from lycium: It's (unfortunately?) not the case that assembly language beats high level languages, mainly because of the complexity of modern processors, and the ability of optimising compilers to just throw combinatorial optimisation at the problem. You really have to be best-in-the-world at low level optimisation to beat a modern compiler, and even then your chances are not good; plus, that's just no way to live/program :)
Quote from lycium: (...) This compiles to code that beats C/C++ in performance, and is *proven* to be safe before compilation. It's possible because it's quite a restricted (pure functional) language, yet it has a lot of power - the whole of Indigo 4 and Chaotica 2 rendering engine are written in it. (...)
especially:
Quote from lycium: (...) It's possible because it's quite a restricted (pure functional) language, yet it has a lot of power (...)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: lycium on December 01, 2015, 10:44:11 PM
Quote from knighty: I hope it doesn't use LLVM as a back end.
Why not? LLVM is awesome, and exactly the tool for the job if you want fast code for your domain specific language. Winter does use LLVM for codegen. BTW, you can find some more info on my boss' blog if you're curious: http://www.forwardscattering.org/
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on December 01, 2015, 11:06:19 PM
Last time I took a look at LLVM, its JIT wasn't working on Windows. Thank you for the link. :)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: lycium on December 01, 2015, 11:09:23 PM
Yeah that used to be the case a number of years ago (~2012) for Win64 only, Win32 worked fine. But since then they have fixed it and it's 102% awesome, no other way to even begin approaching modern compiler performance without it.
If you'd like to use Winter we should talk :)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: thargor6 on December 01, 2015, 11:44:21 PM
Quote from lycium: No idea why anyone would choose this over the legendary LuaJIT, for example.
Simple answer: appropriateness. It works, is stable and does not need much time to integrate. On the other hand, MB3D is technically based on a dead platform. A large investment in some fancy compiler solution (in the context of the current code base) does not make much sense to me. No doubt this is a different thing for "greenfield projects" like Chaotica 2, but I think this discussion does not help much. Anyone who has the time and power to help with some fancier solution is more than welcome. I'm open to any good ideas and, as stated before, some Lua-based alternative was already on the list. Cheers!
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: lycium on December 01, 2015, 11:51:26 PM
Quote from thargor6: Simple answer: appropriateness.
A closed-source JIT for an open source project? That's really the first thing that struck me as a little inappropriate.
Quote from thargor6: Anyone who has the time and power to help with some more fancy solution, is more than welcome. I'm open for any good ideas, and as stated before, some Lua-based alternative was already on the list.
I know it doesn't count for much, but I actually do wish I could work in opensource; I recently released my first opensource code since early LuxRender contributions (a silly small thing (https://github.com/lycium/nbody)), and the experience was extremely positive for me with comments, suggestions and patches etc. It's almost certainly the future of programming. Maybe, hopefully, one day...
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: thargor6 on December 02, 2015, 12:29:21 AM
Quote from lycium: A closed-source JIT for open source project?
This is not completely fair, because the whole MB3D project is in fact bound to a commercial and expensive closed development environment. Of course, using a commercial JIT does not make this better, but it is in my opinion appropriate as a working solution. Because I also see those problems, the integration of the Pax-Compiler is built in a way that it can simply be turned off by setting a compiler directive (and plugging in other stuff).
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: lycium on December 02, 2015, 12:45:02 AM
Ah, yeah I see your point. Is there no chance to perhaps switch to the Lazarus IDE with Free Pascal Compiler? They recently had a major update: http://www.lazarus-ide.org/
From the feature list I see they support external C code and auto-conversion of .h headers, so maybe LuaJIT can also be integrated without too much pain? (I know it's a little late to make these suggestions/comments...)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: thargor6 on December 02, 2015, 01:15:45 AM
Quote from lycium: Is there no chance to perhaps switch to the Lazarus IDE with Free Pascal Compiler?
Good point. I check the Lazarus project on a regular basis, because I work on other legacy projects which are also aiming to get rid of the Delphi IDE. Even though it is a great project, I was not really "blown away" yet, because it usually takes a lot of work until everything finally works: the "emulated" Delphi components do not behave exactly like the original ones. Additionally, in the case of MB3D, we have a lot of integrated ASM stuff, and I doubt that all of it will translate 1:1 to Free Pascal. I think I would prefer to rewrite the application rather than convert it (but I will not rewrite it, this goes far beyond my resources). But this is only a first guess; when I find the time, I will grab the newest Lazarus and try to convert some simple part to make an estimate.
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: M Benesi on August 21, 2016, 06:09:10 PM
Is there an easy way to grab the binary/hex code from JIT formulas?
Maybe an offset in M3D memspace, so we could grab it from a debugger... or a button to dump the code?
I'd like to write formulas in the JIT, and optimize them outside of the JIT.
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: thargor6 on August 21, 2016, 07:14:48 PM
I think it would be easy to just dump it, but I'm not sure that the result has the quality you really want to optimize. But, I can include such an option in the next build, and you may have a look by yourself :-)
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on August 21, 2016, 07:17:54 PM
Next build?! Yayyy :D
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: M Benesi on August 30, 2016, 07:21:43 AM
Quote from thargor6: I think it would be easy to just dump it, but I'm not sure that the result has the quality you really want to optimize. But, I can include such an option in the next build, and you may have a look by yourself :-)
Thanks! I've looked at some other code made by gcc (to write a formula) and... yeah.. it might not be easy to optimize. :D
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on December 09, 2016, 12:49:57 PM
;D I am trying a superlame no-overhead DIFS implement. Uses macro-like statements with hardcoded horrible stuff, and it surprisingly works: #include <math.h> /* struct TIteration3Dext { double something; // -0x88 ; used in sphereheightmap.m3f double dum0; // -0x80 ; unknown double x; // -0x78 double y; // -0x70 double z; // -0x68 double dum1[8]; //unknown double DE2T; // -0x20 ; Output: distance estimate to current object double dum2[17]; //unknown double accumulatedScale; // +0x70 double dum3; // +0x78; unknown double OTforCol; // +0x80 double dum4[16]; //unknown //void * sphericalMap; // +0x108; pointer to function } MB3D_PACK; */
__CRT_INLINE double __cdecl Hypot(double x,double y) ; __CRT_INLINE double __cdecl length2(double x,double y,double z); __CRT_INLINE double __cdecl length(double x,double y,double z); __CRT_INLINE double __cdecl recip (double _x); __CRT_INLINE double __cdecl max (double _a, double _b); __CRT_INLINE double __cdecl min (double _a, double _b);
__CRT_INLINE double __cdecl loadzero (void) // uses constants, assembly required :o) { double res; __asm__ ("fldz": "=t" (res)); return res; }
__CRT_INLINE double __cdecl load1 (void) // uses constants, assembly required :o) { double res; __asm__ ("fld1": "=t" (res)); return res; }
__CRT_INLINE double __cdecl loadX (void)
{
    double res;
    __asm__ ("fldl -104(%%esi)" // esi-0x068
             : "=t" (res)
             : // nothing in (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "%esi" // using esi (clobbered)
            );
    return res;
}

__CRT_INLINE double __cdecl loadY (void)
{
    double res;
    __asm__ ("fldl -112(%%esi)" // esi-0x070
             : "=t" (res)
             : // nothing in (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "%esi" // using esi (clobbered)
            );
    return res;
}

__CRT_INLINE double __cdecl loadZ (void)
{
    double res;
    __asm__ ("fldl -120(%%esi)" // esi-0x078
             : "=t" (res)
             : // nothing in (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "%esi" // using esi (clobbered)
            );
    return res;
}
// WARNING don't do this normally except if you really have changed actual coords,
// AND you scaled them by a value... works even in nonlinear changes like Tglad's sphere inv
// If you changed coords by scaling them, you must also change this (at the end of your fmla)
__CRT_INLINE void __cdecl multScaleFactorBy (double _ScaleF)
{
    __asm__ ("fldl %0 \n\t"
             "fmull +112(%%esi) \n\t" // esi+0x070
             "fstpl +112(%%esi)"      // esi+0x070
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_ScaleF) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

// WARNING don't do this normally except if you really want to change actual coords
__CRT_INLINE void __cdecl saveInX (double _x)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -104(%%esi)" // esi-0x068
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_x) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

// WARNING don't do this normally except if you really want to change actual coords
__CRT_INLINE void __cdecl saveInY (double _y)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -112(%%esi)" // esi-0x070
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_y) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

// WARNING don't do this normally except if you really want to change actual coords
__CRT_INLINE void __cdecl saveInZ (double _z)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -120(%%esi)" // esi-0x078
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_z) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

__CRT_INLINE void __cdecl saveDE (double _DE)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -32(%%esi)" // esi-0x020
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_DE) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}
__CRT_INLINE double __cdecl Hypot(double x,double y) { return sqrt(x*x+y*y); }
__CRT_INLINE double __cdecl length2(double x,double y,double z) { return (x*x+y*y+z*z); }
__CRT_INLINE double __cdecl length(double x,double y,double z) { return sqrt(length2(x,y,z)); }
__CRT_INLINE double __cdecl recip (double _x)
// uses constants, assembly required :o)
{
    double res;
    __asm__ ("fld1 \n\t"
             "fdivp"
             : "=t" (res)
             : "0" (_x));
    return res;
}
__CRT_INLINE double __cdecl max (double _a, double _b) { return (_a>_b)?_a:_b; }
__CRT_INLINE double __cdecl max3 (double _a, double _b, double _c) { return max(max(_a,_b),_c); }
__CRT_INLINE double __cdecl min (double _a, double _b) { return (_a<_b)?_a:_b; }
__CRT_INLINE double __cdecl min3 (double _a, double _b, double _c) { return min(min(_a,_b),_c); }
// ULTRASLIM since Jesse said Mb3D prepares everything for us already
// We just use those lame macros it should be fine I hope? duh
void __attribute__((fastcall)) Formula(void)
{
    // lamest cube on earth
    double x=fabs(loadX()), y=fabs(loadY()), z=fabs(loadZ());
    saveDE (max3(x,y,z)-load1());
}

Resulting m3f (delete the space in "[CO DE]"; it is only there to fool the forum tags):

[OPTIONS]
.Version = 6
.DEoption = 20
[CONSTANTS]
Double = 1
[CO DE]
56DD4698D9E1DD4690D9E1DD4688D9E1D9CA83EC10DDE1DFE0F6C4457404DDD8
EB02DDD9DDE1DFE0F6C4457404DDD8EB02DDD9D9E8DEE9DD5C2408DD442408DD
5EE083C4105EC3
[END]
Just a lame test. Except I don't know how to access user vars and consts yet; I will try that next! Heheh
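For anyone who wants to sanity-check what the formula above is supposed to compute, here is a minimal portable C sketch of the same cube distance estimate, without any inline asm or MB3D registers. The names `max3_ref` and `cube_de_ref` are made up for this illustration and are not part of MB3D; the half-size of 1 mirrors the `load1()` constant in the post.

```c
#include <math.h>

/* Largest of three doubles; plain C stand-in for the max3 helper in the post. */
static double max3_ref(double a, double b, double c)
{
    double m = (a > b) ? a : b;
    return (m > c) ? m : c;
}

/* Hypothetical reference version of the cube formula:
   largest absolute coordinate minus the half-size 1.
   Negative inside the cube, zero on its faces, positive outside. */
double cube_de_ref(double x, double y, double z)
{
    return max3_ref(fabs(x), fabs(y), fabs(z)) - 1.0;
}
```

Feeding a few known points through `cube_de_ref` (the origin should give -1, a point on a face should give 0) makes it easier to tell a wrong offset in the asm macros from a wrong formula.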
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: knighty on December 09, 2016, 04:12:02 PM
:thumbsup1:
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on December 10, 2016, 12:10:05 PM
;D Thanks again, but now I have confirmed my age-old suspicion: the vars are in reverse order, so I modified the macros accordingly :snore:. Plus lame macros for user vars, which must be manually tweaked if you use mixed types, like one int for applyscale+add and all other doubles ;D ... noes! Using offsets (I always waste at least 2 hours every time)

#include <math.h>
/*
struct TIteration3Dext {
    double something;        // -0x88 ; used in sphereheightmap.m3f
    double dum0;             // -0x80 ; unknown
    double x;                // -0x78
    double y;                // -0x70
    double z;                // -0x68
    double dum1[8];          // unknown
    double DE2T;             // -0x20 ; Output: distance estimate to current object
    double dum2[17];         // unknown
    double accumulatedScale; // +0x70
    double dum3;             // +0x78 ; unknown
    double OTforCol;         // +0x80
    double dum4[16];         // unknown
    //void * sphericalMap;   // +0x108 ; pointer to function
} MB3D_PACK;
*/
const int szof1int = 4; // helps to find out offsets
const int szof1dbl = 8; // helps to find out offsets
__CRT_INLINE double __cdecl Hypot(double x,double y);
__CRT_INLINE double __cdecl length2(double x,double y,double z);
__CRT_INLINE double __cdecl length(double x,double y,double z);
__CRT_INLINE double __cdecl recip (double _x);
__CRT_INLINE double __cdecl max (double _a, double _b);
__CRT_INLINE double __cdecl min (double _a, double _b);
__CRT_INLINE double __cdecl loadzero (void)
// uses constants, assembly required :o)
{
    double res;
    __asm__ ("fldz": "=t" (res));
    return res;
}

__CRT_INLINE double __cdecl load1 (void)
// uses constants, assembly required :o)
{
    double res;
    __asm__ ("fld1": "=t" (res));
    return res;
}

__CRT_INLINE double __cdecl loadZ (void)
{
    double res;
    __asm__ ("fldl -104(%%esi)" // esi-0x068
             : "=t" (res)
             : // nothing in (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "%esi" // using esi (clobbered)
            );
    return res;
}

__CRT_INLINE double __cdecl loadY (void)
{
    double res;
    __asm__ ("fldl -112(%%esi)" // esi-0x070
             : "=t" (res)
             : // nothing in (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "%esi" // using esi (clobbered)
            );
    return res;
}

__CRT_INLINE double __cdecl loadX (void)
{
    double res;
    __asm__ ("fldl -120(%%esi)" // esi-0x078
             : "=t" (res)
             : // nothing in (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "%esi" // using esi (clobbered)
            );
    return res;
}

__CRT_INLINE double __cdecl loadNthDoubleVar (int _nVar, int OFFSET)
// 0 for the 1st ... use only double vars
// ELSE use offset, idk how I go by attempts normally ;(
{
    register int N = -(_nVar*szof1dbl)- 0x010-OFFSET; // EAX
    double res;
    __asm__ ("fldl (%%edi,%1)" // nothing else seems to work
             : "=t" (res)
             : "r"(N) // in
             : "%edi" // using edi (clobbered)
            );
    return res;
}
__CRT_INLINE int __cdecl loadNthIntVar (int _nVar, int OFFSET)
// 0 for the 1st ... use only int vars
// ELSE use offset, idk how I go by attempts normally ;(
// untested! untested! untested!
{
    int N = -(_nVar*szof1int)- 0x0C-OFFSET;
    int res;
    __asm__ ("movl %1,%%eax \n\t"      // eax = N
             "addl %%edi,%%eax \n\t"   // eax = edi+N, the address of the var
             "movl (%%eax),%%eax \n\t" // eax = the int stored at that address
             "movl %%eax,%0"
             : "=m" (res)
             : "m"(N) // in
             : "%edi","%eax" // using edi and eax (clobbered)
            );
    return res;
}
__CRT_INLINE double __cdecl loadNthDoubleCns (int _nVar, int OFFSET)
// 0 for the 1st ... use only double vars
// ELSE use offset, idk how I go by attempts normally ;(
// untested! untested! untested!
{
    register int N = +(_nVar*szof1dbl)+ 0x000+OFFSET; // EAX
    double res;
    __asm__ ("fldl (%%edi,%1)" // nothing else seems to work
             : "=t" (res)
             : "r"(N) // in
             : "%edi" // using edi (clobbered)
            );
    return res;
}
__CRT_INLINE int __cdecl loadNthIntCns (int _nVar, int OFFSET)
// 0 for the 1st ... use only int consts
// ELSE use offset, idk how I go by attempts normally ;(
// untested! untested! untested!
{
    int N = +(_nVar*szof1int)+ 0x000+OFFSET;
    int res;
    __asm__ ("movl %1,%%eax \n\t"      // eax = N
             "addl %%edi,%%eax \n\t"   // eax = edi+N, the address of the const
             "movl (%%eax),%%eax \n\t" // eax = the int stored at that address
             "movl %%eax,%0"
             : "=m" (res)
             : "m"(N) // in
             : "%edi","%eax" // using edi and eax (clobbered)
            );
    return res;
}
// WARNING don't do this normally except if you really have changed actual coords,
// AND you scaled them by a value... works even in nonlinear changes like Tglad's sphere inv
// If you changed coords by scaling them, you must also change this (at the end of your fmla)
__CRT_INLINE void __cdecl multScaleFactorBy (double _ScaleF)
{
    __asm__ ("fldl %0 \n\t"
             "fmull +112(%%esi) \n\t" // esi+0x070
             "fstpl +112(%%esi)"      // esi+0x070
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_ScaleF) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

// WARNING don't do this normally except if you really want to change actual coords
__CRT_INLINE void __cdecl saveInZ (double _z)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -104(%%esi)" // esi-0x068
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_z) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

// WARNING don't do this normally except if you really want to change actual coords
__CRT_INLINE void __cdecl saveInY (double _y)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -112(%%esi)" // esi-0x070
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_y) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

// WARNING don't do this normally except if you really want to change actual coords
__CRT_INLINE void __cdecl saveInX (double _x)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -120(%%esi)" // esi-0x078
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_x) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}

__CRT_INLINE void __cdecl saveDE (double _DE)
{
    __asm__ ("fldl %0 \n\t"
             "fstpl -32(%%esi)" // esi-0x020
             : // no out (well except the memory located into esi+stuff but we won't say that - it's okayish)
             : "m"(_DE) // in
             : "%esi" // using esi (clobbered)
            );
    return;
}
__CRT_INLINE double __cdecl Hypot(double x,double y) { return sqrt(x*x+y*y); }
__CRT_INLINE double __cdecl length2(double x,double y,double z) { return (x*x+y*y+z*z); }
__CRT_INLINE double __cdecl length(double x,double y,double z) { return sqrt(length2(x,y,z)); }
__CRT_INLINE double __cdecl recip (double _x)
// uses constants, assembly required :o)
{
    double res;
    __asm__ ("fld1 \n\t"
             "fdivp"
             : "=t" (res)
             : "0" (_x));
    return res;
}
__CRT_INLINE double __cdecl max (double _a, double _b) { return (_a>_b)?_a:_b; }
__CRT_INLINE double __cdecl max3 (double _a, double _b, double _c) { return max(max(_a,_b),_c); }
__CRT_INLINE double __cdecl min (double _a, double _b) { return (_a<_b)?_a:_b; }
__CRT_INLINE double __cdecl min3 (double _a, double _b, double _c) { return min(min(_a,_b),_c); }
// ULTRASLIM since Jesse said Mb3D prepares everything for us already
// We just use those lame macros it should be fine I hope? duh
void __attribute__((fastcall)) Formula(void)
{
    // should we tell the compiler to NOT touch edi nor esi?! I strongly think so...
    double x=fabs(loadX()), y=fabs(loadY()), z=fabs(loadZ());
    double V0=loadNthDoubleVar(0,0),V1=loadNthDoubleVar(1,0),V2=loadNthDoubleVar(2,0);
    saveDE (max3(x-V0,y-V1,z-V2));
}

Resulting M3F: O0 (DELETE the space in "[CO DE]"!)

[OPTIONS]
.Version = 6
.DEoption = 20
.Double Size X = 1.
.Double Size Y = 1.
.Double Size Z = 1.
[CONSTANTS]
Double = 1
[CO DE]
B8F0FFFFFF5756DD4688DD4690DD4698DD0407B0E883EC14DD0407B0E0DD0407
D9CBD9E1DEE3D9CBD9E1DEE3D9CBD9E1DEE3D9CADDE1DFE0F6C4457404DDD8EB
02DDD9DDE1DFE0F6C4457404DDD8EB02DDD9DD5C2408DD442408DD5EE083C414
5E5FC3
[END]
Just a lame test, but prettier :P
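Since the offsets are the part that eats the hours, here is a small portable C sketch that recomputes them instead of hard-coding them. It assumes the struct layout given in the commented-out block of the post (all members are doubles, so no padding is expected) and that esi points 0x88 bytes past the start of that layout, which is what the "-0x88" comment on the first member implies. The names `TIteration3Dext_sketch`, `ESI_BIAS`, `ESI_DISP` and `double_var_disp` are invented for this example.

```c
#include <stddef.h>

/* Layout copied from the commented-out struct in the post (a partial,
   reverse-engineered view; the "dum" members are unknown fields). */
struct TIteration3Dext_sketch {
    double something;        /* -0x88 */
    double dum0;             /* -0x80 */
    double x;                /* -0x78 */
    double y;                /* -0x70 */
    double z;                /* -0x68 */
    double dum1[8];
    double DE2T;             /* -0x20 */
    double dum2[17];
    double accumulatedScale; /* +0x70 */
    double dum3;             /* +0x78 */
    double OTforCol;         /* +0x80 */
    double dum4[16];
};

/* Assumption: esi points 0x88 bytes past the first member. */
#define ESI_BIAS 0x88

/* Displacement to use in "disp(%esi)" for a given member. */
#define ESI_DISP(member) \
    ((int)offsetof(struct TIteration3Dext_sketch, member) - ESI_BIAS)

/* Displacement from edi of the n-th double user var, using the same
   arithmetic as loadNthDoubleVar in the post: -(n*8) - 0x10 - extra. */
int double_var_disp(int n_var, int extra)
{
    return -(n_var * 8) - 0x10 - extra;
}
```

With this, `ESI_DISP(x)`, `ESI_DISP(y)` and `ESI_DISP(z)` come out as -120, -112 and -104, matching the reordered load macros, and `double_var_disp(0, 0)` gives -16, which agrees with the `B8 F0 FF FF FF` (mov eax, -16) at the start of the posted code bytes.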
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: M Benesi on December 10, 2016, 10:41:49 PM
awesome!@#!@# :D
Title: Re: Writing formula using gcc (Re: Blobby)
Post by: DarkBeam on December 10, 2016, 11:22:22 PM
Ehehe, thanks Matthew. I am a bit scared of this code, but I will soon test it on some crazyyy stuff like weird terrains or whatever :dink: there are plenty of those on Shadertoy. Or some cool 3D models. I am unable to code them in asm :D You are invited to try too :)