You will need to do:
return length(p) * pow(scale, (float)-n-1.f);
That's because you bail out before incrementing n.
Of course you can increment n before testing for bailout instead.
A slight scaling down (by a factor say .99) would be necessary to avoid some overstepping. Another way to avoid that slight overstepping is to subtract the radius of the bounding sphere from p:
return (length(p) - Bounding_sphere_radius) * pow(scale, (float)-n-1.f);
In order to save some CPU cycles:
float t=dot(p,n);
if (t>0.0) p-=2.0*t*n;
or
p-=2.0*max(dot(p,n),0)*n;
are faster than using matrices.
BTW! happy holydays.