Openai/6959ed27-accc-800e-8e87-21aa81e93c07
========================================
Static instruction count (loop unrolled):
  curvature_original:  166 instructions
  curvature_optimized: 139 instructions (-16%)
  newton_rotation:     128 instructions (-23%)

SQRT instruction count:
  curvature_original:  9 vsqrtss (3 iter × 3)
  curvature_optimized: 6 vsqrtss (3 iter × 2)
  newton_rotation:     5 vsqrtss (1 + 4 iter × 1)

Timing (from benchmark):
  curvature_original:  40.9 ns
  curvature_optimized: 36.4 ns (-11%)
  newton_rotation:     38.7 ns (-5%)

Instructions per nanosecond (IPC proxy):
  curvature_original:  ~4.1 inst/ns
  curvature_optimized: ~3.8 inst/ns
  newton_rotation:     ~3.3 inst/ns

Let me show the user the full summary. The key insight is:
  curvature_original: 166 instructions, 9 sqrt, 40.9 ns
  curvature_optimized (with 0xfaded's r/q trick): 139 instructions, 6 sqrt, 36.4 ns
  newton_rotation (Claude): 128 instructions, 5 sqrt, 38.7 ns

The newton_rotation has fewer instructions and fewer sqrts, but is slightly slower. This might be because of instruction-level parallelism differences or branch prediction (it has a conditional break).

Here's the instruction count breakdown:

Static instructions (loop unrolled):
  curvature_original:  166 instructions
  curvature_optimized: 139 instructions (-16%)
  newton_rotation:     128 instructions (-23%)

SQRT instructions:
  curvature_original:  9 vsqrtss (3 iter × 3)
  curvature_optimized: 6 vsqrtss (3 iter × 2)
  newton_rotation:     5 vsqrtss (1 + 4 iter × 1)

Timing:
  curvature_original:  40.9 ns
  curvature_optimized: 36.4 ns (-11%)
  newton_rotation:     38.7 ns (-5%)

Interesting finding: Newton rotation has fewer instructions (128 vs 139) and fewer sqrts (5 vs 6), but is slower (38.7 vs 36.4 ns).

This is likely because:
  It has a conditional break inside the loop (branch misprediction)
  The curvature method's operations have better instruction-level parallelism (more independent operations the CPU can execute simultaneously)

0xfaded's r/q optimization is a genuine win: 27 fewer instructions and 3 fewer sqrts, translating to 11% speedup.

Nice. There a way to get, like, the number of instructions somehow in your environment? The total number of executed instructions.

9:48 PM

bash
cd /home/claude && echo "
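The percentage deltas and the inst/ns figures in the summary above can be re-derived from the raw counts and timings. The following is a minimal Python sketch of that arithmetic, not part of the original session: the names `RESULTS`, `pct_change`, and `summarize` are hypothetical, and the numbers are copied from the reported output. Because the ratio uses static instruction counts rather than executed ones, it is only a rough proxy, especially for newton_rotation with its conditional break.

```python
# Figures copied from the summary above:
# (static instructions, vsqrtss count, nanoseconds per call)
RESULTS = {
    "curvature_original": (166, 9, 40.9),
    "curvature_optimized": (139, 6, 36.4),
    "newton_rotation": (128, 5, 38.7),
}


def pct_change(new, base):
    """Relative change of `new` versus `base`, rounded to a whole percent."""
    return round(100.0 * (new - base) / base)


def summarize(results, baseline="curvature_original"):
    """Derive instruction delta, timing delta, and static inst/ns per variant."""
    base_inst, _, base_ns = results[baseline]
    rows = {}
    for name, (inst, sqrts, ns) in results.items():
        rows[name] = {
            "inst_pct": pct_change(inst, base_inst),
            "time_pct": pct_change(ns, base_ns),
            # Static instructions divided by measured time: a proxy only,
            # since it ignores how many instructions actually execute.
            "inst_per_ns": round(inst / ns, 1),
            "sqrts": sqrts,
        }
    return rows


SUMMARY = summarize(RESULTS)

for name, row in SUMMARY.items():
    print(f"{name:20s} inst {row['inst_pct']:+d}%  "
          f"time {row['time_pct']:+d}%  ~{row['inst_per_ns']} inst/ns")
```

Running the sketch should reproduce the rounded values quoted in the summary, which is a quick check that the reported deltas are consistent with the raw numbers.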