feat(log): opt-in JSON format for unit error log (D8)#5
feat(log): opt-in JSON format for unit error log (D8)#5
Conversation
Add planning documents covering the fork's direction and priorities: Roadmap docs: - README.md — index and navigation hub - unit-roadmap.md — cross-cutting platform work, core daemon, governance - unit-maintainer.md — maintainer-facing synthesis, priorities, backlog - unit-php.md — PHP ZTS worker pool, persistent worker, TrueAsync - unit-python.md — free-threaded 3.13t, subinterpreters, ASGI/WSGI - unit-ruby.md — thread pool, Ractors, Fiber scheduler, YJIT - unit-cron.md — scheduler/cron primitive for framework tasks - unit-arm32.md — armv7/armhf SIGBUS/alignment investigation - unit-todos.md — ~90 TODO/FIXME/HACK markers inventory - unit-wasm.md — WASM backends, WASI component model, OCI distribution Core changes: - nxt_conf.h — add new config validation helpers - nxt_conf_validation.c — expand validation for routes, targets, TLS - nxt_controller.c — wire up new validation entry points Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a `--log-format json` command-line flag that switches the unit
error log from the legacy free-form text format to one JSON object per
line. Default behaviour is unchanged: without the flag, log output
stays byte-for-byte compatible with prior releases.
Schema (per roadmap D8):
{"ts","level","pid","app","msg","request_id"}
- ts is ISO-8601 UTC with millisecond precision (cached per second)
- level is the existing nxt_log_levels[] string
- pid is the OS pid as a JSON number
- app is currently the literal "unit"; per-process role is a follow-up
- msg is the formatted message + ctx_handler output, JSON-escaped
- request_id is included only when log->ident is non-zero
Implementation is intentionally minimal:
- nxt_log_json_handler in src/nxt_app_log.c next to the existing
nxt_log_time_handler; same stack-buffer style, no heap, escape
helper duplicates the rules from nxt_conf_json_escape rather than
lifting it across translation units.
- nxt_runtime_t.log_format selects the handler in nxt_runtime_create
after argv parsing, before any fork; children inherit the pointer.
- No changes to nxt_log_t or nxt_main_log struct layout, no new IPC,
no runtime swap, no config schema in v1.
Tests in test/test_log_json.py launch a standalone unitd to exercise:
help text, bad/missing values, default-text smoke, JSON parsing,
multi-process pids, escaping of embedded quotes, and request_id
omission for startup records.
Roadmap: roadmap/unit-roadmap.md D8
https://claude.ai/code/session_017QCJgJWYDJZ9yqHdhHZLyz
There was a problem hiding this comment.
Code Review
This pull request introduces an opt-in JSON format for the unit error log, enabled via the --log-format json flag, and includes a new test suite for verification. The implementation adds ISO 8601 timestamping, JSON escaping, and a dedicated log handler. Review feedback identifies critical safety concerns: a large stack allocation in the JSON handler poses a risk of stack overflow in fiber-based environments, and the lack of boundary checks in the escaping logic and manual pointer increments could lead to buffer overflows.
| * for the framing fields ("ts", "level", "pid", "app", "request_id" | ||
| * keys plus quotes/commas) on top. | ||
| */ | ||
| u_char out[NXT_MAX_ERROR_STR * 6 + 256]; |
There was a problem hiding this comment.
Allocating ~12.5 KiB on the stack for the out buffer is a significant concern in the context of Unit's architecture, particularly when logging from fibers which may have limited stack sizes (often 16-32 KiB). This could lead to stack overflow in deep call stacks or resource-constrained environments. Consider using a thread-local buffer or reducing the buffer size and implementing truncation if the escaped message exceeds a safer limit (e.g., 4 KiB).
There was a problem hiding this comment.
Addressed in bfc2a2b. The out buffer is now NXT_MAX_ERROR_STR + 256 (~2.3 KiB) instead of NXT_MAX_ERROR_STR * 6 + 256 (~12.5 KiB), with the escape pass truncating cleanly when expansion would otherwise overflow.
For context: Unit's fiber stack is 512 KiB (src/nxt_fiber.c:33), so the original 12.5 KiB was actually within the safety margin even on fibers — but the smaller buffer is genuinely better and pairs naturally with the bounded escape from the other thread.
Generated by Claude Code
| * with the access-log JSON formatter. | ||
| */ | ||
| static u_char * | ||
| nxt_log_json_escape(u_char *dst, const u_char *src, size_t size) |
There was a problem hiding this comment.
The nxt_log_json_escape function does not take a destination buffer limit. While the current caller in nxt_log_json_handler provides a buffer sized for the worst-case expansion, it is safer and more consistent with other Unit string utilities to pass a boundary pointer (e.g., u_char *end) to prevent potential overflows if the logic or caller assumptions change.
There was a problem hiding this comment.
Done in bfc2a2b. nxt_log_json_escape now takes an end boundary:
static u_char *
nxt_log_json_escape(u_char *dst, u_char *end, const u_char *src, size_t size)It stops cleanly at end and returns the actual write position, so callers cannot overflow regardless of input.
Generated by Claude Code
| *q++ = '"'; | ||
|
|
||
| if (log->ident != 0) { | ||
| q = nxt_sprintf(q, qend, ",\"request_id\":%uD", log->ident); | ||
| } | ||
|
|
||
| *q++ = '}'; | ||
| *q++ = '\n'; |
There was a problem hiding this comment.
Manual increments of the q pointer (e.g., *q++ = '"') are performed without checking against qend. Although the buffer is sized for the worst case, this bypasses the safety provided by nxt_sprintf used earlier in the function. If any preceding operation were to reach the buffer limit, these manual writes would result in a buffer overflow. It is safer to use nxt_sprintf for these characters or add explicit if (q < qend) checks.
There was a problem hiding this comment.
Addressed in bfc2a2b. The handler now reserves a fixed trailer budget at buffer construction time:
static const size_t trailer_max =
sizeof("\",\"request_id\":4294967295}\n") - 1; /* 27 bytes */
qmax = qend - trailer_max;
...
q = nxt_log_json_escape(q, qmax, raw, msg_len); /* clamped at qmax */
*q++ = '"'; /* in trailer budget */
if (log->ident != 0) {
q = nxt_sprintf(q, qend, ",\"request_id\":%uD", log->ident);
}
*q++ = '}';
*q++ = '\n';Because the escape pass is now capped at qmax = qend - 27, the closing ", optional ,"request_id":<u32> (24 B max), } and \n are all guaranteed to fit in the reserved tail without further checks. The unguarded *q++ writes are now safe by construction.
Generated by Claude Code
The full-suite CI run (test/test_*.py walked under sudo -E pytest-3)
applies conftest.py's autouse `run` fixture to every test, including
the new JSON log ones. After each test the fixture runs
_check_processes() which substring-matches on main_pid against ps -ax
output -- if our subprocess unitd happened to spawn child PIDs
containing main_pid as a digit substring, the assertion would fail.
Restructure test/test_log_json.py:
* Single module-scope fixture starts unitd once, captures the log,
and fully tears down (SIGTERM -> SIGKILL on process group) BEFORE
yielding. By the time any test body runs the log file is static
and no unitd children of ours exist for _check_processes to see.
* Use start_new_session=True + os.killpg so router/controller/
discovery children are guaranteed to die even if SIGTERM
propagation is unreliable under sudo.
* Drop test_default_is_text -- text mode is exercised implicitly by
the entire pre-existing test suite running in default mode.
* Collapse the JSON checks to four targeted tests sharing the static
log: schema validity, multi-pid, escape correctness, and
request_id absence on startup records.
Also keeps the three offline CLI tests (--help, bad value, missing
value) which never launch a unit.
https://claude.ai/code/session_017QCJgJWYDJZ9yqHdhHZLyz
Addresses three review comments on the JSON log handler:
1. Drop ~12.5 KiB stack buffer: the worst-case 6x-expansion buffer
(NXT_MAX_ERROR_STR * 6 + 256) is replaced with NXT_MAX_ERROR_STR
+ 256 (~2.3 KiB). Combined with the existing raw[] body buffer
the total stack footprint is ~4.4 KiB -- safely below the 512 KiB
fiber stack from src/nxt_fiber.c:33.
2. nxt_log_json_escape() now takes an explicit `end` boundary and
stops cleanly at the limit, returning the actual write position.
This matches the rest of unit's string utilities and avoids
relying on implicit buffer-size assumptions in the caller.
3. Manual `*q++` writes after the escape pass are now guaranteed
safe: the handler reserves a fixed trailer budget (closing quote
+ ",\"request_id\":<u32>}\n" = 27 bytes) and clamps the escape
pass at qmax = qend - trailer_max. No write can pass qend.
The trade-off is that pathological messages whose escape would
expand past the buffer get truncated rather than emitted in full.
Real log content (ASCII / UTF-8 with a few quotes) does not trigger
this; the worst-case 6x expansion was a theoretical bound.
Output schema and existing tests are unchanged.
https://claude.ai/code/session_017QCJgJWYDJZ9yqHdhHZLyz
Review feedback addressed in bfc2a2bAll three review comments on 1. Large stack buffer (high) — 2. Escape function lacked a destination boundary (medium) — static u_char *
nxt_log_json_escape(u_char *dst, u_char *end, const u_char *src, size_t size);It stops cleanly at 3. Unguarded static const size_t trailer_max =
sizeof("\",\"request_id\":4294967295}\n") - 1; /* 27 bytes */
qmax = qend - trailer_max;
...
q = nxt_log_json_escape(q, qmax, raw, msg_len); /* clamped at qmax */
*q++ = '"'; /* in trailer budget */
if (log->ident != 0) {
q = nxt_sprintf(q, qend, ",\"request_id\":%uD", log->ident);
}
*q++ = '}';
*q++ = '\n';Because the escape pass cannot exceed Verification
Generated by Claude Code |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces an opt-in JSON format for the unit error log, enabled via the --log-format json command-line flag. It includes the necessary JSON escaping logic, ISO 8601 timestamp formatting, and updates to the runtime configuration to support this new output mode. I have reviewed the implementation and suggest using a defined constant for the trailer size calculation in nxt_app_log.c to ensure consistency if the format string changes, and noted that the manual CLI argument parsing for --log-format should be monitored for maintainability as the CLI surface area grows.
| if (nxt_strcmp(p, "--log-format") == 0) { | ||
| if (*argv == NULL) { | ||
| write(STDERR_FILENO, no_log_format, nxt_length(no_log_format)); | ||
| return NXT_ERROR; | ||
| } | ||
|
|
||
| p = *argv++; | ||
|
|
||
| if (nxt_strcmp(p, "text") == 0) { | ||
| rt->log_format = NXT_LOG_FORMAT_TEXT; | ||
|
|
||
| } else if (nxt_strcmp(p, "json") == 0) { | ||
| rt->log_format = NXT_LOG_FORMAT_JSON; | ||
|
|
||
| } else { | ||
| write(STDERR_FILENO, bad_log_format, nxt_length(bad_log_format)); | ||
| return NXT_ERROR; | ||
| } | ||
|
|
||
| continue; | ||
| } |
There was a problem hiding this comment.
Declining this one as a non-issue. The if (nxt_strcmp(p, "--xxx") == 0) { ... } chain is exactly how every other CLI option in this function is parsed today — --control, --control-mode, --control-user, --control-group, --user, --group, --pid, --log, --modulesdir, --statedir, --tmpdir, --no-daemon, --version, --help (lines 1033, 1046, 1060, 1074, 1088, 1101, 1114, 1127, 1146, 1166, 1186, 1199, 1204, 1209). My new --log-format branch follows the same shape exactly.
Switching this single option to a different parsing pattern would be inconsistent with the rest of the file and harder to read for maintainers reaching for the existing template. If/when the CLI surface grows enough that the sequential-strcmp approach becomes a maintainability problem, a refactor should convert all options at once, not introduce a hybrid mid-file. Out of scope for this PR.
Generated by Claude Code
| static const size_t trailer_max = | ||
| sizeof("\",\"request_id\":4294967295}\n") - 1; |
There was a problem hiding this comment.
Addressed in dbe8734. The field key is now single-sourced and both the format string and the trailer bound derive from it:
#define NXT_LOG_JSON_REQID_KEY ",\"request_id\":"
#define NXT_LOG_JSON_REQID_FMT NXT_LOG_JSON_REQID_KEY "%uD"
#define NXT_LOG_JSON_REQID_MAX (sizeof(NXT_LOG_JSON_REQID_KEY) - 1 + 10)
#define NXT_LOG_JSON_TRAILER_MAX (1 + NXT_LOG_JSON_REQID_MAX + 1 + 1)
/*close" + reqid + } + \n */Renaming the key (or changing its size) now touches a single token; the bound and the nxt_sprintf format update together. Eliminates the mirror-string drift risk.
Generated by Claude Code
Note on the
|
syslog mirror (#if 0) |
JSON handler (this PR) | |
|---|---|---|
| Where it goes | syslog/journald | stderr / unit.log |
| What's emitted | only [alert] records |
every record |
| Format | text | structured |
| Consumer | syslog daemon, then forwarded | log shipper / pipeline |
| Activation | code change | --log-format json |
The JSON path covers the modern shipper-based deployment style (Vector, Fluent Bit, Promtail) and makes the #if 0 syslog mirror less urgent: a shipper can read the JSON unit.log and forward to syslog/journald with full level fidelity, not just [alert]. That's why I parked syslog/journald sink as an explicit follow-up rather than blocking on it. When you do want it, the right shape is probably an opt-in sink type (handler family chosen by config), not a fixed if (level == ALERT) mirror baked into every record path.
If a maintainer prefers, I can also delete the #if 0 blocks in a separate PR — they're effectively rotted code (six lines, never compiled, never referenced) and the rationale for keeping them as a roadmap marker is now captured here in writing.
Generated by Claude Code
Address gemini-code-assist review note on src/nxt_app_log.c: the
trailer-byte calculation used a hardcoded mirror string of the format
literal in the nxt_sprintf call, which would silently drift if the
field key were ever renamed.
Define the field key once and derive the format string and the
worst-case byte budget from the same macro:
#define NXT_LOG_JSON_REQID_KEY ",\"request_id\":"
#define NXT_LOG_JSON_REQID_FMT NXT_LOG_JSON_REQID_KEY "%uD"
#define NXT_LOG_JSON_REQID_MAX (sizeof(NXT_LOG_JSON_REQID_KEY) - 1 + 10)
#define NXT_LOG_JSON_TRAILER_MAX (1 + NXT_LOG_JSON_REQID_MAX + 1 + 1)
/*close" + reqid + } + \n */
Renaming the key now touches a single token; the trailer bound and the
emit-site format both update with it. Behavior is unchanged.
https://claude.ai/code/session_017QCJgJWYDJZ9yqHdhHZLyz
Summary
Lands roadmap item D8 Structured logging (
roadmap/unit-roadmap.md:156-159) as a minimal, fully backward-compatible v1: an opt-in--log-format jsonflag that switchesunit.logto one JSON object per line.Schema (one record per line)
{"ts":"2026-04-29T17:42:55.579Z","level":"info","pid":22628,"app":"unit","msg":"router started"}tslevelalert/error/warn/notice/info/debug.pidapp"unit"for v1; per-process role (router/controller/…) is a follow-up.msgctx_handleroutput, JSON-escaped (RFC 8259).request_idBackward compatibility
Log.check_alerts()regex keep working unchanged.nxt_log_t(src/nxt_log.h) struct is untouched;nxt_main_loglayout is untouched; no log macros changed.src/. Helpers (ISO time cache, JSON escape) live alongside the existing time handler insrc/nxt_app_log.c. JSON escape duplicates the rules fromnxt_conf_json_escape(~25 lines) rather than lifting it across translation units, to avoid risking the access-log JSON path that depends on it.Implementation
src/nxt_log.h: two additive#defines —NXT_LOG_FORMAT_TEXT/_JSON.src/nxt_runtime.{h,c}: newuint8_t log_formatonnxt_runtime_t;--log-formatCLI parser; help text; handler installed at end of CLI parsing insidenxt_runtime_create, before any fork.src/nxt_app_log.c:nxt_log_json_handler,nxt_log_iso_time_cache,nxt_log_json_escape. Stack-only buffers (~12.5 KiB worst case), no heap, singlenxt_write_consoleper record.src/nxt_main.c: unchanged. Earliest startup logs (duringnxt_lib_start) keep using the legacy text handler — preserves today's exact behavior.Tests
test/test_log_json.py(8 tests, all passing) launches a standaloneunitdso it does not interfere with conftest's shared instance:--log-formatadvertised in--helpyaml) and missing value rejected with non-zero exit--log-format jsonevery record parses as JSON, has all required keys with correct types, andtsmatches the ISO-8601 patternjson.loadsrequest_idkey is absent (not null) on startup records wherelog->ident == 0Test plan
pytest test/) still passes against the default text format on a host with all language modules built.--log-format jsonand ingest into your log pipeline of choice (Loki / ELK / Datadog) to verify field stability.Out of scope (deliberate, all additive follow-ups)
settings.log.formatconfig schema (CLI alone is enough for v1; config can land later without breaking v1 users).request_idpropagation across router → app worker (needs new IPC field; currentlog->identis per-process atomic).approle string (main/router/controller/app-name).#if 0blocks insrc/nxt_app_log.cindicate prior intent).Roadmap
roadmap/unit-roadmap.mdD8 marked with v1-landed note + follow-ups.CHANGESupdated for FreeUnit 1.35.4.Draft until you've validated and are ready to promote upstream.
https://claude.ai/code/session_017QCJgJWYDJZ9yqHdhHZLyz
Generated by Claude Code