Openai/6959ed27-accc-800e-8e87-21aa81e93c07
  ===========================================
  Total program instructions (10000 calls each):
    curvature_original:   2,210,956
    curvature_optimized:  2,010,972
    newton_rotation:      2,550,958

  Startup overhead (baseline): ~131,000

  Instructions per call:
    curvature_original:   207 inst/call
    curvature_optimized:  187 inst/call
    newton_rotation:      241 inst/call

  Summary:
    original → optimized: -19 inst/call (9% reduction)
    original → rotation:  +34 inst/call (16% increase)

Interesting! Now I have the executed instruction counts. Let me present this cleanly.

Here's the executed instruction count (via valgrind callgrind):

                         Instructions    Time    IPC
                         (per call)      (ns)    (inst/ns)
  ─────────────────────────────────────────────────────────
  curvature_original     207             40.9    5.1
  curvature_optimized    187             36.4    5.1
  newton_rotation        241             38.7    6.2

Key findings:

- 0xfaded's r/q optimization saves 20 instructions (207 → 187, -10%)
- Newton rotation executes MORE instructions (241 vs 187) but is competitive in time: it has a higher IPC (instructions per cycle, reported above as inst/ns), probably because the operations are more parallelizable
- Curvature optimized is the winner: fewest instructions AND fastest time

Updated table for your HN reply if you want:

                 yours    yours+opt   claude
  ------------------------------------------
  Time (ns)      40.5     36.4        38.3
  Instructions   207      187         241
  sqrt/iter      3        2           1
  Accuracy       4.8e-7   4.2e-7      4.8e-7

Make it more compact, add two spaces, and output it as a code block that I can copy-paste.

  Nice that worked. It's even faster.

                    original   optimized   rotation (claude)
  -----------------------------------------------------------
  Time (ns/call)    40.9       36.4        38.7
  Speedup           --         11%         5%

9:52 PM

The user wants me to make the table more compact, add two spaces at the beginning of each line (for HN code formatting), and output it as a code block they can copy-paste. They also want to include the instruction count data. Let me create a compact version with the key metrics.
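The arithmetic behind the per-call figures, and the two-space indentation requested for the HN reply, can be sketched in a few lines of Python. This is only an illustration: the totals and timings are copied from the callgrind summary and timing table above, while the function names (`inst_per_call`, `build_rows`, `hn_table`) and the column layout are hypothetical and do not come from the session. It assumes the per-call count is the total minus the startup baseline, divided by the number of calls and truncated, which reproduces the 207 / 187 / 241 figures.

```python
# Totals and timings copied from the callgrind summary and timing table in this log.
TOTALS = {
    "curvature_original": 2_210_956,
    "curvature_optimized": 2_010_972,
    "newton_rotation": 2_550_958,
}
TIMES_NS = {
    "curvature_original": 40.9,
    "curvature_optimized": 36.4,
    "newton_rotation": 38.7,
}
STARTUP = 131_000  # approximate baseline ("~131,000" in the log)
CALLS = 10_000


def inst_per_call(total, startup=STARTUP, calls=CALLS):
    """Instructions per call after subtracting the startup baseline (truncated)."""
    return (total - startup) // calls


def build_rows():
    """Return (name, inst/call, ns/call, inst/ns) for each variant."""
    rows = []
    for name, total in TOTALS.items():
        inst = inst_per_call(total)
        ns = TIMES_NS[name]
        rows.append((name, inst, ns, round(inst / ns, 1)))
    return rows


def hn_table(rows, indent="  "):
    """Render rows as fixed-width text; every line gets a two-space prefix."""
    header = f"{'':<20}{'inst/call':>10}{'ns/call':>9}{'inst/ns':>9}"
    lines = [header, "-" * len(header)]
    for name, inst, ns, per_ns in rows:
        lines.append(f"{name:<20}{inst:>10}{ns:>9.1f}{per_ns:>9.1f}")
    return "\n".join(indent + line for line in lines)


if __name__ == "__main__":
    print(hn_table(build_rows()))
```

The truncating division is a guess at how the summary was produced; it would also explain why the summary reports -19 inst/call while the rounded per-call figures differ by 20.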
                 yours    yours+opt   claude
  --------------------------------------------
  Time (ns)      40.9     36.4        38.7
  Instructions   207      187         241
  sqrt/iter      3        2           1
  Accuracy       4.8e-7   4.2e-7      4.8e-7

Also include the line I had before, the one starting with "Nice".

9:53 PM

The user wants me to include the previous line they had about it being "nice that worked. It's even faster." as a lead-in to the code block. Let me format it nicely.

  Nice, that worked. It's even faster.

                 yours    yours+opt   claude
  --------------------------------------------
  Time (ns)      40.9     36.4        38.7
  Instructions   207      187         241
  sqrt/iter      3        2           1
  Accuracy       4.8e-7   4.2e-7      4.8e-7

Put in the instruction count, make it compact, and put the table as ASCII. Instead of "Newton rotation", just label the columns: his original one, the new one that got optimized, and the Claude solution.

9:55 PM

The user wants me to restructure the table to show:

- His (0xfaded's) original solution
- His optimized solution (with the r/q trick he suggested)
- Claude's solution

And make it compact with ASCII, ready to copy paste with 2 spaces prefix.

                 yours    optimized   claude
  -------------------------------------------
  Time (ns)      40.9     36.4        38.7
  Instructions   207      187         241
  sqrt/iter      3        2           1
  Accuracy       4.8e-7   4.2e-7      4.8e-7

Valgrind, v-a-l-g-r-i-n-d: does that have the IPC?

9:55 PM

bash

  cd /home/claude &&
  echo "=== curvature_original ===" &&
  valgrind --tool=callgrind --branch-sim=yes --callgrind-out-file=cg1.out ./inst_count2 2>&1 | grep -E "refs|mispred|branches" &&
  echo "=== curvature_optimized ===" &&
  valgrind --tool=callgrind --branch-sim=yes --callgrind-out-file=cg2.out ./inst_count2 x 2>&1 | grep -E "refs|mispred|branches" &&
  echo "=== newton_rotation ===" &&
  valgrind --tool=callgrind --branch-sim=yes --callgrind-out-file=cg3.out ./inst_count2 x x 2>&1 | grep -E "refs|mispred|branches"

Output