Skip to content

Commit 4935119

Browse files
committed
Default OPENAI_REASONING_EFFORT to 'low' (not 'minimal')
'minimal' is only valid on older gpt-5 / o-series models and is rejected by gpt-5.4 with a 400 ("Supported values are: 'none', 'low', 'medium', 'high', 'xhigh'"). Our fail-safe silently approved every request when the API errored, which presented as "moderation is fast but broken" in real tests — scam / phishing / misinfo content was getting through. 'low' is the lowest common value accepted by every reasoning model in the family so the default works regardless of which model you point the OPENAI_CHAT_MODEL at. Empirically verified: 8/8 real-world classification accuracy on gpt-5.4-nano with reasoning=low, vs 2/4 with reasoning=none (false negatives on obvious scams).
1 parent b92bdf6 commit 4935119

1 file changed

Lines changed: 11 additions & 6 deletions

File tree

config/config.py

Lines changed: 11 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -24,12 +24,17 @@ class Config:
2424
# AI moderator only returns small JSON responses (~100-200 tokens), so 500 is plenty
2525
OPENAI_MAX_OUTPUT_TOKENS = int(os.environ.get(
2626
'OPENAI_MAX_OUTPUT_TOKENS', '500'))
27-
# Reasoning level for gpt-5 / o-series models. Valid values depend on the
28-
# model; gpt-5.4 supports 'none' | 'low' | 'medium' | 'high' | 'xhigh';
29-
# older gpt-5 / o-series accept 'minimal' | 'low' | 'medium' | 'high'.
30-
# Lower = faster. Set to 'none' for pattern-matching workloads like
31-
# content moderation where reasoning doesn't add signal.
32-
OPENAI_REASONING_EFFORT = os.environ.get('OPENAI_REASONING_EFFORT', 'minimal')
27+
# Reasoning level for gpt-5 / o-series models. Valid values are model-
28+
# specific:
29+
# gpt-5 / o-series: minimal | low | medium | high
30+
# gpt-5.4 (nano etc): none | low | medium | high | xhigh
31+
# 'low' is the only value accepted across every reasoning model in the
32+
# supported family, so it's a safe default. Override per-deployment in
33+
# .env if you've picked a model and want a different point on the
34+
# latency/quality curve. Note that 'none' on gpt-5.4 measurably degrades
35+
# moderation accuracy on borderline content (false negatives on scams,
36+
# phishing, misinformation) — verified empirically on this codebase.
37+
OPENAI_REASONING_EFFORT = os.environ.get('OPENAI_REASONING_EFFORT', 'low')
3338
ADMIN_EMAIL = os.environ.get('ADMIN_EMAIL')
3439
ADMIN_PASSWORD = os.environ.get('ADMIN_PASSWORD')
3540

0 commit comments

Comments
 (0)