=== Assistant: ===

This appendix defines the exact logic the worker uses to run normal tool calls (round-trip via ToolRunner) while also recording exit tools (one-way signals) without altering control flow. The goal is to make the implementation deterministic and testable.

===== Definitions =====
* Normal tools: config.normal_tools (round-trip).
* Exit tools: config.exit_tools (one-way signals).
* Tool budget: tool_iters_remaining (applies to normal tools only).
* Conversation: list of ChatMessage built via build_message_stack():
** BIOS system message
** caller system message
** ongoing messages (user/assistant/tool)

Key rule (locked in): exit tools never change worker behavior; they are recorded only. (If the model chooses to stop output when emitting an exit tool, that is the model's behavior.)

===== Stream events =====
transport.stream_chat(payload) yields events. The worker updates progress timestamps on any bytes received after headers, regardless of event type.

The tool loop is written in terms of these semantic events:
* TextDelta(text_fragment: str) — content tokens/deltas
* AssistantMessageFinal(message_obj: dict) — final assembled assistant message (may include tool_calls)
* StreamDone() — end of stream
* ServerError(err_obj: dict | str) — error surfaced from the stream
* (optional) UsageUpdate(usage_obj: dict) — tokens and timing, if available

The exact parsing is the transport's job; the tool loop only consumes these events.

===== Main request loop =====
Each request task runs this loop until it reaches a terminal state:

# Build the initial conversation (BIOS + caller system + user prompt).
# Dispatch a streaming request.
# Accumulate text output and update the repeated-line detector.
# If a normal tool call is emitted:
#* execute it (via ToolRunner),
#* append the tool result message,
#* decrement the tool budget,
#* regenerate the BIOS,
#* continue generation (next iteration).
# If an exit tool is emitted:
#* record it as a signal,
#* do not execute it,
#* do not decrement the tool budget,
#* do not change control flow.
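To make the event contract concrete, here is a minimal sketch of the semantic stream events as Python dataclasses. The class and field names mirror the event list above; everything else (the use of dataclasses, the Union type) is an illustrative assumption, not the worker's actual types.

<syntaxhighlight lang="python">
from dataclasses import dataclass
from typing import Union


@dataclass
class TextDelta:
    text_fragment: str          # content tokens/deltas


@dataclass
class AssistantMessageFinal:
    message_obj: dict           # final assembled assistant message (may include tool_calls)


@dataclass
class StreamDone:
    pass                        # end of stream


@dataclass
class ServerError:
    err_obj: Union[dict, str]   # error surfaced from the stream


@dataclass
class UsageUpdate:
    usage_obj: dict             # tokens/timing, if the server reports them
</syntaxhighlight>

Modeling the events as plain data objects keeps the tool loop testable: a stub transport can yield hand-built event instances with no network involved.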
===== Tool-call detection =====
Tool calls may appear:
* in structured fields (preferred), or
* in assistant content, in fallback mode.

The worker must support both.

====== Structured tool calls ======
When an assistant message contains tool_calls, parse them into a list of tool-call objects, then partition:
* normal_calls = [c for c in tool_calls if c.name in normal_tool_names]
* exit_calls = [c for c in tool_calls if c.name in exit_tool_names]
* unknown_calls = everything else

Rules:
* If unknown_calls is non-empty → treat as FAILED(reason="tool_parse_error") (preserve partial output).
* Record each exit_call into signals[] immediately (best-effort parsing).
* Proceed with normal_calls via the normal tool loop.

Important detail (prevents "unanswered tool calls"):
* When continuing generation after normal tool execution, the worker should append an assistant message containing only the normal tool calls, not the exit tool calls.
* Exit tool calls are treated as out-of-band signals and are not part of the conversation that requires tool responses.

This keeps exit tools "no round trip" while avoiding confusing the model with unresponded calls.

====== Fallback tool calls ======
If no structured tool_calls are present but the model is expected to call tools, the worker uses a BIOS-enforced convention, for example:
* A tool call is represented as a single JSON object (or JSON line) that includes:
** {"tool": "<name>", "arguments": { ... }}
** optionally a stable prefix/suffix marker, if you choose to enforce one.

Fallback parsing rules:
* Attempt parsing only when the accumulated assistant output contains a clearly delimited tool-call candidate.
* If the parsed tool name is a normal tool:
** treat it as a normal tool call,
** strip the tool-call directive from the user-visible output (so the final result isn't polluted).
* If the tool name is an exit tool:
** record the signal,
** optionally strip the directive from the output (recommended).
* If parsing fails or the tool name is unknown:
** FAILED(reason="tool_parse_error") (preserve the output as-is for debugging).

This fallback logic should be isolated in tooling.py and unit-tested heavily.

===== Normal tool execution =====
For each normal tool call, in order:

Preconditions:
* tool_iters_remaining > 0; otherwise fail the request with FAILED(reason="tool_execution_error", detail="tool budget exhausted"), preserving output.

Execution:
# Parse the tool call's name and arguments:
#* arguments must be a JSON object; if not → tool_parse_error.
# Invoke ToolRunner:
#* await tool_runner.run_tool(name=..., arguments=..., request_id=..., job_name=...)
#* Enforce a per-tool timeout (via asyncio.wait_for).
# Serialize the tool result into a tool message:
#* {"role": "tool", "tool_call_id": <id>, "content": <json-serialized result>}
# Update the conversation:
#* Append the assistant tool-call message (containing the normal tool call(s) only).
#* Append the tool result message(s).
# Decrement the budget:
#* tool_iters_remaining -= 1 (or decrement by the number of executed tool calls if you allow multiple per iteration).
# Regenerate the BIOS and rebuild the message stack for continuation:
#* the new BIOS includes the updated tool_iters_remaining.

Failure handling:
* Tool runner timeout → FAILED(reason="tool_execution_error")
* Tool runner raises an exception → FAILED(reason="tool_execution_error")
* In all cases:
** preserve accumulated output for retrieval via get_result(),
** release the slot promptly.

===== Multiple tool calls per iteration =====
To keep v1 simple and robust:
* The worker may support multiple normal tool calls emitted together by executing them sequentially, in the order given.
* Each executed tool call decrements the normal tool iteration budget by 1 (simple, predictable).
* If the model emits N tool calls but the budget has fewer than N remaining:
** execute up to the remaining budget,
** then fail with tool_execution_error ("budget exhausted"), preserving output and any recorded signals.

This behavior is deterministic and easy to test.

===== Exit tool handling =====
When an exit tool call is detected (structured or fallback):
* Record an ExitSignal:
** tool_name
** arguments (best-effort JSON)
** emitted_at (monotonic timestamp)
* Do not execute it.
* Do not decrement the normal tool budget.
* Do not append it to the conversation used for continuation (if continuation occurs due to normal tools).

If the model emits only exit tool calls and then stops:
* the request completes (likely with empty or partial text), and the signals are returned upward.

===== Output preservation =====
* The request accumulates output text whenever TextDelta events arrive.
* On any terminal state (COMPLETED, FAILED, CANCELED), get_result() returns:
** text: the full accumulated output so far (possibly empty)
** signals: any recorded exit signals
** failure details, if applicable

Even tool failures and restarts must preserve accumulated output until get_result() is called.
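The exit-signal bookkeeping above can be sketched as follows. ExitSignal's fields come straight from the description; the helper name, the list-based storage, and the `{"raw": ...}` wrapping of unparseable arguments are assumptions for illustration.

<syntaxhighlight lang="python">
import time
from dataclasses import dataclass


@dataclass
class ExitSignal:
    tool_name: str
    arguments: dict       # best-effort parsed JSON
    emitted_at: float     # monotonic timestamp


def record_exit_signal(signals: list, tool_name: str, arguments) -> None:
    """Record an exit tool call as an out-of-band signal.

    Never executes the tool, never touches the tool budget, and never
    appends anything to the conversation.
    """
    if not isinstance(arguments, dict):
        # best effort: keep whatever we got so it is still retrievable
        arguments = {"raw": arguments}
    signals.append(ExitSignal(tool_name, arguments, time.monotonic()))
</syntaxhighlight>

Because recording is append-only and side-effect-free, signals survive any later tool failure and can be returned unchanged from get_result().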
===== Reference pseudocode (tool loop) =====
<syntaxhighlight lang="python">
async def run_request(req: RequestRecord) -> None:
    try:
        # Build initial conversation with BIOS + caller system + user
        req.conversation = build_initial_conversation(req)

        while True:
            payload = build_chat_payload(req.conversation, req.params, tools=req.all_tools)
            stream = transport.stream_chat(payload)

            # Per-stream scratch
            assistant_text = ""
            structured_tool_calls = None

            async for event in stream:
                req.last_stream_byte_at = now_monotonic()  # any bytes after headers count as progress

                if event.type == "TextDelta":
                    assistant_text += event.text
                    req.output += event.text
                    loop_detector.feed(event.text)
                    if loop_detector.triggered():
                        req.fail("repeated_line_loop")
                        return
                elif event.type == "AssistantMessageFinal":
                    structured_tool_calls = extract_tool_calls(event.message_obj)
                    # Some servers finalize here; continue to StreamDone
                elif event.type == "ServerError":
                    req.fail("unknown_error", detail=str(event.error))
                    return
                elif event.type == "StreamDone":
                    break

            # Tool detection: structured first, then fallback
            normal_calls, exit_calls, err = partition_structured_calls(structured_tool_calls)
            if err:
                req.fail("tool_parse_error", detail=err)
                return
            record_exit_calls(req, exit_calls)

            if normal_calls:
                if req.tool_iters_remaining <= 0:
                    req.fail("tool_execution_error", detail="tool budget exhausted")
                    return
                # Append assistant tool-call message (NORMAL ONLY), execute tools, append tool results
                ok, err = await execute_normal_tools(req, normal_calls)
                if not ok:
                    req.fail("tool_execution_error", detail=err)
                    return
                req.tool_iters_remaining -= len(normal_calls)
                req.conversation = rebuild_conversation_with_new_bios(req)
                continue  # loop back to generate more

            # No structured normal calls; try fallback parsing on the recent output
            fb = try_fallback_parse_tool_call(assistant_text)
            if fb:
                if fb.name in exit_tool_names:
                    record_exit_signal(req, fb)
                    # optional: strip directive from output
                    req.output = strip_fallback_directive(req.output, fb)
                    # no continuation mandated; allow model to finish naturally
                    # (if you want continuation behavior, keep it explicit and tested)
                elif fb.name in normal_tool_names:
                    # strip directive from output, run normal tool loop, then continue
                    req.output = strip_fallback_directive(req.output, fb)
                    ok, err = await execute_normal_tools(req, [fb])
                    if not ok:
                        req.fail("tool_execution_error", detail=err)
                        return
                    req.tool_iters_remaining -= 1
                    req.conversation = rebuild_conversation_with_new_bios(req)
                    continue
                else:
                    req.fail("tool_parse_error", detail="unknown fallback tool")
                    return

            # Otherwise: no tools detected -> completion
            req.complete(finish_reason=deduce_finish_reason(req.params, req.output))
            return

    except asyncio.CancelledError:
        req.cancel()
        raise
    except Exception as e:
        req.fail("unknown_error", detail=str(e))
    finally:
        release_slot(req)
</syntaxhighlight>

If you want to proceed, the next appendices that are most useful for implementation and testing are:
* Appendix I: Restart policy and crash-loop behavior (explicit algorithm)
* Appendix J: Liveness probing details (/proc/<pid>/stat CPU time delta)
* Appendix L: Test plan and stub-server fixtures (test matrix)
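As a companion to the pseudocode above, the two pure detection helpers it calls can be sketched as below. The object shapes (ToolCall), the explicit name-set parameters, and the JSON-line convention are assumptions drawn from the tool-call detection sections, not a definitive implementation.

<syntaxhighlight lang="python">
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict


def partition_structured_calls(tool_calls, normal_tool_names, exit_tool_names):
    """Split structured calls into (normal, exit, error); unknown names are fatal."""
    tool_calls = tool_calls or []
    unknown = [c.name for c in tool_calls
               if c.name not in normal_tool_names and c.name not in exit_tool_names]
    if unknown:
        return [], [], f"unknown tool(s): {unknown}"   # -> FAILED(tool_parse_error)
    normal = [c for c in tool_calls if c.name in normal_tool_names]
    exits = [c for c in tool_calls if c.name in exit_tool_names]
    return normal, exits, None


def try_fallback_parse_tool_call(text: str) -> Optional[ToolCall]:
    """Find the first clearly delimited JSON-line tool call, per the BIOS convention."""
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("{") and line.endswith("}")):
            continue                      # not a delimited candidate
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue                      # not valid JSON; ignore
        if isinstance(obj, dict) and isinstance(obj.get("tool"), str) \
                and isinstance(obj.get("arguments"), dict):
            return ToolCall(id="fallback-0", name=obj["tool"],
                            arguments=obj["arguments"])
    return None
</syntaxhighlight>

Keeping both helpers pure (no I/O, no worker state) is what makes the heavy unit testing the fallback section calls for cheap.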