=== Assistant: ===

This appendix defines the exact logic the worker uses to run normal tool calls (round-trip via ToolRunner) while also recording exit tools (one-way signals) without altering control flow. The goal is to make the implementation deterministic and testable.

===== Definitions =====
* Normal tools: config.normal_tools (round-trip).
* Exit tools: config.exit_tools (one-way signals).
* Tool budget: tool_iters_remaining (applies to normal tools only).
* Conversation: list of ChatMessage built via build_message_stack():
** BIOS system message
** caller system message
** ongoing messages (user/assistant/tool)

Key rule (locked in): exit tools never change worker behavior; they are recorded only. (If the model chooses to stop output when emitting an exit tool, that is the model's behavior.)

===== Stream events =====
transport.stream_chat(payload) yields events. The worker updates progress timestamps on any bytes received after headers, regardless of event type.

The tool loop is written in terms of these semantic events:
* TextDelta(text_fragment: str) — content tokens/deltas
* AssistantMessageFinal(message_obj: dict) — final assembled assistant message (may include tool_calls)
* StreamDone() — end of stream
* ServerError(err_obj: dict | str) — error surfaced from the stream
* (optional) UsageUpdate(usage_obj: dict) — tokens and timing, if available

The exact parsing is the transport's job; the tool loop only consumes these events.

===== Main request loop =====
Each request task runs this loop until it reaches a terminal state:

# Build the initial conversation (BIOS + caller system + user prompt).
# Dispatch a streaming request.
# Accumulate text output and update the repeated-line detector.
# If a normal tool call is emitted:
#* execute it (via ToolRunner),
#* append the tool result message,
#* decrement the tool budget,
#* regenerate the BIOS,
#* continue generation (next iteration).
# If an exit tool is emitted:
#* record it as a signal,
#* do not execute it,
#* do not decrement the tool budget,
#* do not change control flow.
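To make the event contract concrete, here is a minimal sketch of the semantic stream events as Python dataclasses. The class and field names mirror the event list above; everything else (the use of dataclasses, the Union type) is an illustrative assumption, not the worker's actual types.

<syntaxhighlight lang="python">
from dataclasses import dataclass
from typing import Union


@dataclass
class TextDelta:
    text_fragment: str          # content tokens/deltas


@dataclass
class AssistantMessageFinal:
    message_obj: dict           # final assembled assistant message (may include tool_calls)


@dataclass
class StreamDone:
    pass                        # end of stream


@dataclass
class ServerError:
    err_obj: Union[dict, str]   # error surfaced from the stream


@dataclass
class UsageUpdate:
    usage_obj: dict             # tokens/timing, if the server reports them
</syntaxhighlight>

Modeling the events as plain data objects keeps the tool loop testable: a stub transport can yield hand-built event instances with no network involved.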
===== Tool-call detection =====
Tool calls may appear:
* in structured fields (preferred), or
* in assistant content, in fallback mode.

The worker must support both.

====== Structured tool calls ======
When an assistant message contains tool_calls, parse them into a list of tool-call objects, then partition:
* normal_calls = [c for c in tool_calls if c.name in normal_tool_names]
* exit_calls = [c for c in tool_calls if c.name in exit_tool_names]
* unknown_calls = everything else

Rules:
* If unknown_calls is non-empty → treat as FAILED(reason="tool_parse_error") (preserve partial output).
* Record each exit_call into signals[] immediately (best-effort parsing).
* Proceed with normal_calls via the normal tool loop.

Important detail (prevents "unanswered tool calls"):
* When continuing generation after normal tool execution, the worker should append an assistant message containing only the normal tool calls, not the exit tool calls.
* Exit tool calls are treated as out-of-band signals and are not part of the conversation that requires tool responses.

This keeps exit tools "no round trip" while avoiding confusing the model with unresponded calls.

====== Fallback tool calls ======
If no structured tool_calls are present but the model is expected to call tools, the worker uses a BIOS-enforced convention, for example:
* A tool call is represented as a single JSON object (or JSON line) that includes:
** {"tool": "<name>", "arguments": { ... }}
** optionally a stable prefix/suffix marker, if you choose to enforce one.

Fallback parsing rules:
* Attempt parsing only when the accumulated assistant output contains a clearly delimited tool-call candidate.
* If the parsed tool name is a normal tool:
** treat it as a normal tool call,
** strip the tool-call directive from the user-visible output (so the final result isn't polluted).
* If the tool name is an exit tool:
** record the signal,
** optionally strip the directive from the output (recommended).
* If parsing fails or the tool name is unknown:
** FAILED(reason="tool_parse_error") (preserve the output as-is for debugging).

This fallback logic should be isolated in tooling.py and unit-tested heavily.

===== Normal tool execution =====
For each normal tool call, in order:

Preconditions:
* tool_iters_remaining > 0; otherwise fail the request with FAILED(reason="tool_execution_error", detail="tool budget exhausted"), preserving output.

Execution:
# Parse the tool call's name and arguments:
#* arguments must be a JSON object; if not → tool_parse_error.
# Invoke ToolRunner:
#* await tool_runner.run_tool(name=..., arguments=..., request_id=..., job_name=...)
#* Enforce a per-tool timeout (via asyncio.wait_for).
# Serialize the tool result into a tool message:
#* {"role": "tool", "tool_call_id": <id>, "content": <json-serialized result>}
# Update the conversation:
#* Append the assistant tool-call message (containing the normal tool call(s) only).
#* Append the tool result message(s).
# Decrement the budget:
#* tool_iters_remaining -= 1 (or decrement by the number of executed tool calls if you allow multiple per iteration).
# Regenerate the BIOS and rebuild the message stack for continuation:
#* the new BIOS includes the updated tool_iters_remaining.

Failure handling:
* Tool runner timeout → FAILED(reason="tool_execution_error")
* Tool runner raises an exception → FAILED(reason="tool_execution_error")
* In all cases:
** preserve accumulated output for retrieval via get_result(),
** release the slot promptly.

===== Multiple tool calls per iteration =====
To keep v1 simple and robust:
* The worker may support multiple normal tool calls emitted together by executing them sequentially, in the order given.
* Each executed tool call decrements the normal tool iteration budget by 1 (simple, predictable).
* If the model emits N tool calls but the budget has fewer than N remaining:
** execute up to the remaining budget,
** then fail with tool_execution_error ("budget exhausted"), preserving output and any recorded signals.

This behavior is deterministic and easy to test.

===== Exit tool handling =====
When an exit tool call is detected (structured or fallback):
* Record an ExitSignal:
** tool_name
** arguments (best-effort JSON)
** emitted_at (monotonic timestamp)
* Do not execute it.
* Do not decrement the normal tool budget.
* Do not append it to the conversation used for continuation (if continuation occurs due to normal tools).

If the model emits only exit tool calls and then stops:
* the request completes (likely with empty or partial text), and the signals are returned upward.

===== Output preservation =====
* The request accumulates output text whenever TextDelta events arrive.
* On any terminal state (COMPLETED, FAILED, CANCELED), get_result() returns:
** text: the full accumulated output so far (possibly empty)
** signals: any recorded exit signals
** failure details, if applicable

Even tool failures and restarts must preserve accumulated output until get_result() is called.
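The exit-signal bookkeeping above can be sketched as follows. ExitSignal's fields come straight from the description; the helper name, the list-based storage, and the `{"raw": ...}` wrapping of unparseable arguments are assumptions for illustration.

<syntaxhighlight lang="python">
import time
from dataclasses import dataclass


@dataclass
class ExitSignal:
    tool_name: str
    arguments: dict       # best-effort parsed JSON
    emitted_at: float     # monotonic timestamp


def record_exit_signal(signals: list, tool_name: str, arguments) -> None:
    """Record an exit tool call as an out-of-band signal.

    Never executes the tool, never touches the tool budget, and never
    appends anything to the conversation.
    """
    if not isinstance(arguments, dict):
        # best effort: keep whatever we got so it is still retrievable
        arguments = {"raw": arguments}
    signals.append(ExitSignal(tool_name, arguments, time.monotonic()))
</syntaxhighlight>

Because recording is append-only and side-effect-free, signals survive any later tool failure and can be returned unchanged from get_result().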
===== Reference pseudocode (tool loop) =====
<syntaxhighlight lang="python">
async def run_request(req: RequestRecord) -> None:
    try:
        # Build initial conversation with BIOS + caller system + user
        req.conversation = build_initial_conversation(req)

        while True:
            payload = build_chat_payload(req.conversation, req.params, tools=req.all_tools)
            stream = transport.stream_chat(payload)

            # Per-stream scratch
            assistant_text = ""
            structured_tool_calls = None

            async for event in stream:
                req.last_stream_byte_at = now_monotonic()  # any bytes after headers count as progress

                if event.type == "TextDelta":
                    assistant_text += event.text
                    req.output += event.text
                    loop_detector.feed(event.text)
                    if loop_detector.triggered():
                        req.fail("repeated_line_loop")
                        return
                elif event.type == "AssistantMessageFinal":
                    structured_tool_calls = extract_tool_calls(event.message_obj)
                    # Some servers finalize here; continue to StreamDone
                elif event.type == "ServerError":
                    req.fail("unknown_error", detail=str(event.error))
                    return
                elif event.type == "StreamDone":
                    break

            # Tool detection: structured first, then fallback
            normal_calls, exit_calls, err = partition_structured_calls(structured_tool_calls)
            if err:
                req.fail("tool_parse_error", detail=err)
                return
            record_exit_calls(req, exit_calls)

            if normal_calls:
                if req.tool_iters_remaining <= 0:
                    req.fail("tool_execution_error", detail="tool budget exhausted")
                    return
                # Append assistant tool-call message (NORMAL ONLY), execute tools, append tool results
                ok, err = await execute_normal_tools(req, normal_calls)
                if not ok:
                    req.fail("tool_execution_error", detail=err)
                    return
                req.tool_iters_remaining -= len(normal_calls)
                req.conversation = rebuild_conversation_with_new_bios(req)
                continue  # loop back to generate more

            # No structured normal calls; try fallback parsing on the recent output
            fb = try_fallback_parse_tool_call(assistant_text)
            if fb:
                if fb.name in exit_tool_names:
                    record_exit_signal(req, fb)
                    # optional: strip directive from output
                    req.output = strip_fallback_directive(req.output, fb)
                    # no continuation mandated; allow model to finish naturally
                    # (if you want continuation behavior, keep it explicit and tested)
                elif fb.name in normal_tool_names:
                    # strip directive from output, run normal tool loop, then continue
                    req.output = strip_fallback_directive(req.output, fb)
                    ok, err = await execute_normal_tools(req, [fb])
                    if not ok:
                        req.fail("tool_execution_error", detail=err)
                        return
                    req.tool_iters_remaining -= 1
                    req.conversation = rebuild_conversation_with_new_bios(req)
                    continue
                else:
                    req.fail("tool_parse_error", detail="unknown fallback tool")
                    return

            # Otherwise: no tools detected -> completion
            req.complete(finish_reason=deduce_finish_reason(req.params, req.output))
            return

    except asyncio.CancelledError:
        req.cancel()
        raise
    except Exception as e:
        req.fail("unknown_error", detail=str(e))
    finally:
        release_slot(req)
</syntaxhighlight>

If you want to proceed, the next appendices that are most useful for implementation and testing are:
* Appendix I: Restart policy and crash-loop behavior (explicit algorithm)
* Appendix J: Liveness probing details (/proc/<pid>/stat CPU time delta)
* Appendix L: Test plan and stub-server fixtures (test matrix)
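As a companion to the pseudocode above, the two pure detection helpers it calls can be sketched as below. The object shapes (ToolCall), the explicit name-set parameters, and the JSON-line convention are assumptions drawn from the tool-call detection sections, not a definitive implementation.

<syntaxhighlight lang="python">
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict


def partition_structured_calls(tool_calls, normal_tool_names, exit_tool_names):
    """Split structured calls into (normal, exit, error); unknown names are fatal."""
    tool_calls = tool_calls or []
    unknown = [c.name for c in tool_calls
               if c.name not in normal_tool_names and c.name not in exit_tool_names]
    if unknown:
        return [], [], f"unknown tool(s): {unknown}"   # -> FAILED(tool_parse_error)
    normal = [c for c in tool_calls if c.name in normal_tool_names]
    exits = [c for c in tool_calls if c.name in exit_tool_names]
    return normal, exits, None


def try_fallback_parse_tool_call(text: str) -> Optional[ToolCall]:
    """Find the first clearly delimited JSON-line tool call, per the BIOS convention."""
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("{") and line.endswith("}")):
            continue                      # not a delimited candidate
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue                      # not valid JSON; ignore
        if isinstance(obj, dict) and isinstance(obj.get("tool"), str) \
                and isinstance(obj.get("arguments"), dict):
            return ToolCall(id="fallback-0", name=obj["tool"],
                            arguments=obj["arguments"])
    return None
</syntaxhighlight>

Keeping both helpers pure (no I/O, no worker state) is what makes the heavy unit testing the fallback section calls for cheap.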