=== Appendix D: Concrete defaults (recommended starting values) ===

These are starting points, tuned for long prefill and conservative behavior.

==== D.1 Readiness ====
* Probe: GET /v1/models
* Startup probe interval: 0.5s initially (short burst), then backoff
* Startup max wait: 120s (server bind/readiness, not model TTFT)
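One way to read the readiness defaults is as a short fixed-rate burst followed by capped exponential backoff, all bounded by the startup max wait. The sketch below is illustrative only: <code>probe_delays</code>, <code>wait_until_ready</code>, the burst length, the backoff factor, and the interval cap are assumptions, not values given in this appendix. The <code>probe</code> callable stands in for a GET /v1/models check and is injected so the loop can be exercised without a server.

```python
import time
from typing import Callable, Iterator


def probe_delays(
    initial_s: float = 0.5,        # from D.1
    burst_count: int = 10,         # assumption: length of the short burst
    factor: float = 2.0,           # assumption: backoff multiplier
    max_interval_s: float = 5.0,   # assumption: cap on the probe interval
) -> Iterator[float]:
    """Yield sleep intervals: a fixed-rate burst, then capped exponential backoff."""
    for _ in range(burst_count):
        yield initial_s
    delay = initial_s
    while True:
        delay = min(delay * factor, max_interval_s)
        yield delay


def wait_until_ready(
    probe: Callable[[], bool],
    max_wait_s: float = 120.0,     # from D.1: bind/readiness only, not TTFT
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() until it returns True or max_wait_s has elapsed."""
    deadline = clock() + max_wait_s
    for delay in probe_delays():
        if probe():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
    return False  # not reached: probe_delays() is infinite
```

In real use, <code>probe</code> would wrap an HTTP request bounded by its own short timeout, so a hung connection cannot consume the whole startup budget.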

==== D.2 TimeoutProfile baseline (slow hardware / big contexts) ====
* connect_timeout_s = 3.0
* headers_timeout_s = 30.0
* ttft_timeout_s = None
* prefill_liveness_timeout_s = None (disabled) or very large (e.g., 3600s)
* idle_stream_timeout_s = 300.0
* absolute_timeout_s = None
* liveness_probe_interval_s = 5.0
* restart controls:
** restart_backoff_s = 5.0
** restart_window_s = 120.0
** max_restarts_per_window = 5
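The baseline above could be carried as a single immutable config object. This is a minimal sketch: the field names and defaults come from D.2, but the validation rule and the <code>restart_allowed</code> helper are assumptions. In particular, it assumes the restart controls mean "at most max_restarts_per_window restarts in any trailing restart_window_s, with at least restart_backoff_s between restarts", which the appendix does not spell out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TimeoutProfile:
    connect_timeout_s: float = 3.0
    headers_timeout_s: float = 30.0
    ttft_timeout_s: Optional[float] = None              # None = no limit
    prefill_liveness_timeout_s: Optional[float] = None  # None = disabled
    idle_stream_timeout_s: float = 300.0
    absolute_timeout_s: Optional[float] = None          # None = no limit
    liveness_probe_interval_s: float = 5.0
    restart_backoff_s: float = 5.0
    restart_window_s: float = 120.0
    max_restarts_per_window: int = 5

    def __post_init__(self) -> None:
        # Assumed rule: every configured value must be positive; None disables.
        for name, value in vars(self).items():
            if value is not None and value <= 0:
                raise ValueError(f"{name} must be positive or None, got {value!r}")

    def restart_allowed(self, past_restarts: Sequence[float], now: float) -> bool:
        """Assumed semantics: enforce the window budget, then the backoff gap."""
        recent = [t for t in past_restarts if now - t < self.restart_window_s]
        if len(recent) >= self.max_restarts_per_window:
            return False
        return not recent or now - max(recent) >= self.restart_backoff_s
```

A variant for faster hardware could then be derived with <code>dataclasses.replace(profile, ttft_timeout_s=60.0)</code> rather than by mutating shared state.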

==== D.3 Normal tools ====
* max_tool_iterations = 8 (iterations, not calls)
* per-tool timeout: set by worker/tooling policy (e.g., 10s default), but heavy tools can be handled by ToolRunner configs.
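The "iterations, not calls" distinction matters because one model turn may request several tools at once. The sketch below illustrates one possible accounting. <code>run_tool_loop</code>, <code>model_step</code>, and <code>run_tool</code> are hypothetical names introduced for illustration, and per-tool timeouts are left out.

```python
from typing import Callable, List, Tuple

MAX_TOOL_ITERATIONS = 8  # from D.3


def run_tool_loop(
    model_step: Callable[[List[str]], List[str]],  # hypothetical: results -> requested tools
    run_tool: Callable[[str], str],                # hypothetical: tool name -> result
    max_tool_iterations: int = MAX_TOOL_ITERATIONS,
) -> Tuple[int, int]:
    """Return (iterations_used, total_tool_calls)."""
    results: List[str] = []
    iterations = 0
    calls = 0
    while iterations < max_tool_iterations:
        requested = model_step(results)
        if not requested:
            break  # the model produced a final answer
        iterations += 1  # one iteration, however many tools it requested
        results = [run_tool(name) for name in requested]
        calls += len(requested)
    return iterations, calls
```

Under this accounting, a model that requests three tools on every turn exhausts the budget after 8 iterations but 24 calls.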

==== D.4 Repeated-line detector (updated for ~39-char loops) ====
* Ignore empty/whitespace-only lines.
* Normalize by stripping and (optionally) collapsing internal whitespace.
* Thresholds:
** len(line) >= 64 → trigger at 8 consecutive repeats
** 32 <= len(line) < 64 → trigger at 12 consecutive repeats
** len(line) < 32 → ignore
* Warmup:
** start checking after min_output_chars_before_check = 256
** require at least 2 completed non-empty lines observed
* On trigger:
** cancel request → FAILED(reason="repeated_line_loop")
** preserve output for retrieval
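The detector rules above can be sketched as a small streaming class. This is one possible reading, not a definitive implementation. <code>RepeatedLineDetector</code> and <code>feed</code> are hypothetical names, and two behaviors are assumptions: blank lines are skipped without breaking a run, while lines shorter than 32 characters do break it. The caller is expected to cancel the request and preserve the output when <code>feed</code> returns True.

```python
import re
from typing import Optional


class RepeatedLineDetector:
    def __init__(
        self,
        min_output_chars_before_check: int = 256,  # from D.4
        min_lines_before_check: int = 2,           # from D.4
        collapse_whitespace: bool = True,          # the optional normalization step
    ) -> None:
        self.min_chars = min_output_chars_before_check
        self.min_lines = min_lines_before_check
        self.collapse = collapse_whitespace
        self._buffer = ""      # trailing partial line not yet terminated
        self._chars_seen = 0
        self._lines_seen = 0   # completed non-empty lines
        self._last: Optional[str] = None
        self._run = 0          # consecutive repeats of self._last

    @staticmethod
    def _threshold(length: int) -> Optional[int]:
        if length >= 64:
            return 8
        if length >= 32:
            return 12
        return None  # lines shorter than 32 chars never trigger

    def feed(self, chunk: str) -> bool:
        """Consume streamed text; return True once a loop is detected."""
        self._chars_seen += len(chunk)
        self._buffer += chunk
        *complete, self._buffer = self._buffer.split("\n")
        for raw in complete:
            if self._observe(raw):
                return True
        return False

    def _observe(self, raw: str) -> bool:
        line = raw.strip()
        if not line:
            return False  # assumption: blank lines neither count nor reset
        if self.collapse:
            line = re.sub(r"\s+", " ", line)
        self._lines_seen += 1
        threshold = self._threshold(len(line))
        if threshold is None:
            self._last, self._run = None, 0  # assumption: short lines break a run
            return False
        if line == self._last:
            self._run += 1
        else:
            self._last, self._run = line, 1
        warmed_up = (
            self._chars_seen >= self.min_chars
            and self._lines_seen >= self.min_lines
        )
        return warmed_up and self._run >= threshold
```

With these thresholds, a 39-character line trips the detector on its 12th consecutive occurrence, after roughly 480 characters of output, well past the 256-character warmup.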