I think I may have been confused by your term "gradient" - I assumed you meant the 4 points used to get the surface normall, but I guess not
I basically iterate at step and at step+delta (assuming step is not "inside") and then calculate the smooth iteration value for these using:
smooth = iter + (log(0.5*log(bailout))-log(0.5*log(m)))/log(divergence)
in each case where m is the square of the final magnitude and divergence is either fixed or calculated.
Then I use:
Distance Estimate (DE) = 0.4/(1.0 + abs(smooth1-smooth0)/delta)
That's the value I check against the solid threshold, but is scaled to give a step distance if solid is not found:
step = DE*0.6/accuracy
Where accuracy is a user parameter - 1 for speed, 2 or more for accuracy (2 is normally sufficient but sometimes I use up to 4).
So while stepping I only calculate 2 iterations per step, if solid is hit then I find the boundary more accurately using a binary search and then finally iterate at the adjacent points on adjacent rays at the found distance and found distance + delta to get the DE for those - I assume all the DE values are correct so this gives me 5 surface points to average the normals from 4 triangles.
Note that I actually use the adjacent rays at 0.5 pixel offsets on the viewing plane rather than whole pixel by default, though the offset used is actually a user parameter.