You will need to do:

return length(p) * pow(scale, (float)-n-1.f);

That's because you bail out before incrementing n.

Of course you can increment n before testing for bailout instead.

A slight scaling down (by a factor say .99) would be necessary to avoid some overstepping. Another way to avoid that slight overstepping is to subtract the radius of the bounding sphere from p:

return (length(p) - Bounding_sphere_radius) * pow(scale, (float)-n-1.f);

In order to save some CPU cycles:

float t=dot(p,n);

if (t>0.0) p-=2.0*t*n;

or

p-=2.0*max(dot(p,n),0)*n;

are faster than using matrices.

BTW! happy holydays.