Title: VTune Results; Compute vs. CVector3::IsNotANumber
Post by: mancoast on August 03, 2016, 03:42:16 AM

Greetings,

I bring you this summary of VTune results for Compute vs. CVector3::IsNotANumber. The data focuses on CVector3::IsNotANumber.

(http://i.imgur.com/ujF1izC.png)

This case is interesting because we spend a lot of time in the Compute function itself. Within the scope of Compute, we call CVector3::IsNotANumber, where the majority of instructions retire. Also within the scope of this Compute call, other operations retire a large share of instructions to self.

The Bottom-up view shows that we retire the most instructions in CVector3::IsNotANumber.

(http://i.imgur.com/pusfpiw.png)

The Caller/Callee view reinforces the separate and nested scopes where the majority of CPU instructions retire. It shows that we spend 68.2% of total time in Compute and 22.5% of total time in CVector3::IsNotANumber. However, the report clearly states that CVector3::IsNotANumber retired nearly double the CPU instructions to self compared with the "runner-up", Compute.

(http://i.imgur.com/KcknmZz.png)

There are instances of Compute throughout the RayMarcher, ObjectShader, CalculateNormals, etc…

(http://i.imgur.com/hW8oFZb.png)

The source code view below shows one call stack instance of Compute. Here we call CVector3::IsNotANumber(). This instance alone accounts for 11.2% of total CPU time.

(http://i.imgur.com/gNVoT7b.png)

This initial source view of algebra.hpp pinpoints the critical section.

(http://i.imgur.com/RQ8c1jA.png)

This secondary source view of algebra.hpp breaks down the critical section.

(http://i.imgur.com/jSIbq4F.png)

Any ideas on how to optimize the NaN test?

Thanks,
coast
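To make the NaN question concrete, here is a minimal sketch of three ways such a test could be written. The current body of CVector3::IsNotANumber is only visible in the screenshots, so this is not the existing code: Vec3 is a hypothetical stand-in for CVector3, its components are assumed to be doubles, and all three function names are made up for illustration.

```cpp
#include <cmath>

// Hypothetical stand-in for CVector3 (the real class lives in algebra.hpp).
// Assumption: three double components.
struct Vec3 {
    double x, y, z;
};

// Variant A: per-component check with std::isnan from <cmath> (C++11).
inline bool IsNotANumberStd(const Vec3 &v) {
    return std::isnan(v.x) || std::isnan(v.y) || std::isnan(v.z);
}

// Variant B: self-comparison. Under IEEE 754, NaN is the only value that
// compares unequal to itself. The non-short-circuit '|' is used on purpose
// so that all three comparisons can be evaluated without branching.
inline bool IsNotANumberSelfCompare(const Vec3 &v) {
    return (v.x != v.x) | (v.y != v.y) | (v.z != v.z);
}

// Variant C: one comparison on the sum. NaN propagates through addition,
// so a NaN in any component makes the sum NaN.
// Caveat: (+inf) + (-inf) is also NaN, so a vector holding infinities of
// opposite sign is reported as NaN even though no component is NaN.
inline bool IsNotANumberSum(const Vec3 &v) {
    const double s = v.x + v.y + v.z;
    return s != s;
}
```

Whether any of these beats the current code is something only another VTune run can tell. Note also that with -ffast-math on GCC or Clang the compiler is allowed to assume NaNs never occur, so checks like these may be optimized away entirely.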