Skip to content

Commit e0445db

Browse files
aksOpsclaude
andauthored
block 1: critical ship-blockers (upload cap, auth hardening, bounded workq, scoped search) (#60)
* feat(api): cap multipart upload size via MaxBytesReader New server.max_upload_bytes config (default 100 MiB, env DOCSIQ_SERVER_MAX_UPLOAD_BYTES). Upload handler wraps request body with http.MaxBytesReader before ParseMultipartForm so oversized requests terminate at the transport with 413 instead of allocating memory or temp files. Closes the P2-1 TODO in handlers.go. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(api): address Task 1 code-review feedback - Drop double WriteHeader in *http.MaxBytesError branch (MaxBytesReader already committed the 413) - Test asserts JSON contract + Content-Type header - New TestUploadMaxBytes_UnknownContentLength covers the chunked/unknown-length slow path - Extract writeTooLarge helper (DRY the 413 body) - Comment limit<=0 escape hatch on the config field - slog.Warn on oversize reject (matches repo convention) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(serve): refuse insecure default on non-loopback bind validateServeSecurity fails startup when server.api_key is empty AND server.host is not loopback. Loopback+empty emits a prominent slog warning instead. Closes the "auth-disabled-by-default" ship-blocker. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(serve): address Task 2 code-review feedback - Refuse empty server.host (binds all interfaces, not loopback) - Use net.ParseIP().IsLoopback() to accept [::1], 127.0.0.0/8, and ::ffff:127.0.0.1 as loopback - Table-driven tests cover new hosts + empty-string case - Assert DOCSIQ_SERVER_API_KEY is named in the refusal message - gofmt Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(workq): bounded worker pool with graceful drain New internal/workq package. Pool has fixed worker count and a bounded submission queue; Submit is non-blocking and returns ErrQueueFull when saturated. Close(ctx) cancels the pool context and waits for workers to drain within the caller's deadline. Preparing to replace fire-and-forget upload indexing goroutine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(workq): guard Submit against send-on-closed-channel race RWMutex guards the jobs-channel send in Submit against a concurrent Close that calls close(p.jobs). Previously a Submit that passed the p.closed check could be preempted and then send on an already-closed channel. Add a stress test (50 iterations x 32 concurrent Submits + Close) to exercise the race under `-race`. Modernize the iteration count in TestPool_CloseDrainsInflight to Go 1.22's for-range-N form. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(api): submit upload indexing through workq with graceful drain Upload handler now submits the indexing closure to a bounded workq Pool instead of spawning a detached goroutine. When the queue is full the handler returns 503 with Retry-After: 30. On shutdown, cmd/serve calls pool.Close with a 30s deadline so in-flight jobs honour the cancelled context and partial writes are avoided. Config: server.workq_workers (default NumCPU), server.workq_depth (default 64). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(api): address Task 4 code-review follow-ups - upload(): set progress to rejected state before writing 503 when workq.Submit returns ErrQueueFull or ErrClosed, so progress polling reflects the rejection instead of stale "queued" forever. - upload_workq_test.go: defer close(block) after defer pool.Close so drain completes even if the test panics. Also add started-channel sync and fill both channel slots (capacity = Workers+QueueDepth = 2) so saturation is deterministic under the race detector. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(api): silence unused-parameter lint in upload_workq_test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(api): session cookie path alongside bearer auth New POST /api/session exchanges an Authorization: Bearer key for a docsiq_session httpOnly; Secure; SameSite=Strict cookie (30-day Max-Age). DELETE /api/session clears it. Auth middleware now accepts either the Authorization header or the cookie; /api/session itself is public (it is the auth boundary). Preparing to stop injecting the key into served HTML. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(api): defense-in-depth symmetry for session auth - bearerAuthMiddleware: explicitly reject when server.api_key is unset, matching newSessionHandler's guard. Previously correct only via the no_token branch firing first; fragile under refactor. - extractToken: trim whitespace from the session cookie value for parity with the Authorization header path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(ui): cookie-based auth; stop reading bearer from DOM apiFetch now sends credentials: 'include' on every request; the Authorization header is no longer attached. On boot, initAuth exchanges the legacy meta-tag bearer (if present) for a cookie via POST /api/session exactly once and scrubs the meta tag. Production installs that already ship cookies out-of-band (via `docsiq login`) skip the exchange entirely. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui): surface session-exchange failures, scrub meta tag first - Check res.ok on POST /api/session. Non-2xx rejects sessionReady so the UI sees a single clear auth-boundary failure instead of a cascade of generic 401s on subsequent fetches. - Scrub the legacy meta tag BEFORE awaiting the fetch. The bearer is already captured as a closure param; leaving the tag in the DOM during a slow/hung exchange widens the disclosure window. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(api): stop injecting API key into served HTML The SPA handler no longer rewrites index.html to embed the bearer in a <meta name="docsiq-api-key"> tag. Browser auth now flows through the docsiq_session cookie set by POST /api/session. Closes the "API-key-in-HTML" ship-blocker. The stale spa_meta_test.go asserted the injection was present and is replaced by spa_test.go which asserts the tag is absent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * perf(search): scope entity fetch to top-hit docs in local search Replaces the full-table AllEntities scan with EntitiesForDocs, which joins the entities → relationships tables and filters by relationships.doc_id with a chunked IN-list (chunk size 900, below SQLite's 999 variable limit). Local search now scales sub-linearly with corpus size. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui): drop legacy meta-tag read in MCP hook, use cookie auth useMCP had its own getBearer() reading the docsiq-api-key meta tag and attaching Authorization to /mcp requests — a parallel path that Tasks 5-7 missed. Now the hook sends credentials: 'include' and relies on the docsiq_session cookie, matching apiFetch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 2a7028d commit e0445db

21 files changed

Lines changed: 1117 additions & 99 deletions

cmd/serve.go

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,15 +9,19 @@ import (
99
"os"
1010
"os/signal"
1111
"path/filepath"
12+
"runtime"
13+
"strings"
1214
"syscall"
1315
"time"
1416

1517
"github.com/RandomCodeSpace/docsiq/internal/api"
18+
"github.com/RandomCodeSpace/docsiq/internal/config"
1619
"github.com/RandomCodeSpace/docsiq/internal/embedder"
1720
"github.com/RandomCodeSpace/docsiq/internal/llm"
1821
"github.com/RandomCodeSpace/docsiq/internal/project"
1922
"github.com/RandomCodeSpace/docsiq/internal/sqlitevec"
2023
"github.com/RandomCodeSpace/docsiq/internal/vectorindex"
24+
"github.com/RandomCodeSpace/docsiq/internal/workq"
2125
"github.com/spf13/cobra"
2226
)
2327

@@ -140,11 +144,25 @@ var serveCmd = &cobra.Command{
140144
}
141145
}
142146

147+
workers := cfg.Server.WorkqWorkers
148+
if workers <= 0 {
149+
workers = runtime.NumCPU()
150+
}
151+
depth := cfg.Server.WorkqDepth
152+
if depth <= 0 {
153+
depth = 64
154+
}
155+
pool := workq.New(workq.Config{Workers: workers, QueueDepth: depth})
156+
143157
router := api.NewRouter(prov, emb, cfg, registry,
144158
api.WithProjectStores(stores),
145159
api.WithVectorIndexes(vecIndexes),
160+
api.WithWorkq(pool),
146161
)
147162

163+
if err := validateServeSecurity(cfg); err != nil {
164+
return err
165+
}
148166
addr := fmt.Sprintf("%s:%d", cfg.Server.Host, cfg.Server.Port)
149167
ln, err := net.Listen("tcp", addr)
150168
if err != nil {
@@ -177,6 +195,17 @@ var serveCmd = &cobra.Command{
177195
slog.Error("❌ shutdown error", "err", err)
178196
return err
179197
}
198+
199+
// Drain workq within its own 30s deadline. Server.Shutdown has already
200+
// stopped accepting new HTTP requests, so no new jobs can be submitted;
201+
// all that remains is letting in-flight pipelines finish or honour the
202+
// cancelled ctx.
203+
drainCtx, drainCancel := context.WithTimeout(context.Background(), 30*time.Second)
204+
defer drainCancel()
205+
if err := pool.Close(drainCtx); err != nil {
206+
slog.Warn("⚠️ workq drain timeout; some indexing jobs were cancelled mid-flight", "err", err)
207+
}
208+
180209
slog.Info("✅ shutdown complete")
181210
return nil
182211
},
@@ -187,3 +216,33 @@ func init() {
187216
serveCmd.Flags().StringVar(&serveHost, "host", "", "Server host (overrides config)")
188217
serveCmd.Flags().IntVar(&servePort, "port", 0, "Server port (overrides config)")
189218
}
219+
220+
// validateServeSecurity refuses to start the server when the API key is
221+
// empty AND the bind host is not loopback. An unauthenticated service
222+
// exposed on the network is almost never intentional; make it explicit.
223+
// Loopback with empty key gets a prominent warning at boot instead.
224+
func validateServeSecurity(cfg *config.Config) error {
225+
if cfg.Server.APIKey != "" {
226+
return nil
227+
}
228+
host := strings.ToLower(strings.TrimSpace(cfg.Server.Host))
229+
if host == "" {
230+
return fmt.Errorf(
231+
"server.api_key is empty and server.host is unset (binds all interfaces); refusing to start. " +
232+
"Set DOCSIQ_SERVER_API_KEY or bind to 127.0.0.1/localhost for dev",
233+
)
234+
}
235+
loopback := host == "localhost"
236+
if ip := net.ParseIP(strings.Trim(host, "[]")); ip != nil {
237+
loopback = loopback || ip.IsLoopback()
238+
}
239+
if !loopback {
240+
return fmt.Errorf(
241+
"server.api_key is empty and server.host=%q is not loopback; refusing to start. "+
242+
"Set DOCSIQ_SERVER_API_KEY or bind to 127.0.0.1/localhost for dev",
243+
cfg.Server.Host,
244+
)
245+
}
246+
slog.Warn("⚠️ auth disabled (empty server.api_key); only loopback bind allowed", "host", host)
247+
return nil
248+
}

cmd/serve_test.go

Lines changed: 87 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,87 @@
1+
package cmd
2+
3+
import (
4+
"strings"
5+
"testing"
6+
7+
"github.com/RandomCodeSpace/docsiq/internal/config"
8+
)
9+
10+
func TestValidateServeSecurity_RefusesNonLoopbackWithEmptyKey(t *testing.T) {
11+
t.Parallel()
12+
cases := []struct {
13+
name string
14+
host string
15+
mustContain []string
16+
}{
17+
{
18+
name: "empty host binds all interfaces",
19+
host: "",
20+
mustContain: []string{"binds all interfaces", "DOCSIQ_SERVER_API_KEY"},
21+
},
22+
{
23+
name: "0.0.0.0 is wildcard",
24+
host: "0.0.0.0",
25+
mustContain: []string{"api_key", "DOCSIQ_SERVER_API_KEY"},
26+
},
27+
{
28+
name: "public IPv4",
29+
host: "10.0.0.5",
30+
mustContain: []string{"api_key", "DOCSIQ_SERVER_API_KEY"},
31+
},
32+
}
33+
for _, tc := range cases {
34+
tc := tc
35+
t.Run(tc.name, func(t *testing.T) {
36+
t.Parallel()
37+
cfg := &config.Config{}
38+
cfg.Server.Host = tc.host
39+
cfg.Server.Port = 8080
40+
cfg.Server.APIKey = ""
41+
42+
err := validateServeSecurity(cfg)
43+
if err == nil {
44+
t.Fatalf("host=%q: expected error for empty api_key on non-loopback bind; got nil", tc.host)
45+
}
46+
for _, want := range tc.mustContain {
47+
if !strings.Contains(err.Error(), want) {
48+
t.Errorf("host=%q: error should contain %q; got %v", tc.host, want, err)
49+
}
50+
}
51+
})
52+
}
53+
}
54+
55+
func TestValidateServeSecurity_AllowsLoopbackWithEmptyKey(t *testing.T) {
56+
t.Parallel()
57+
hosts := []string{
58+
"127.0.0.1",
59+
"localhost",
60+
"::1",
61+
"[::1]", // bracketed IPv6
62+
"127.0.0.2", // other address in 127.0.0.0/8
63+
"::ffff:127.0.0.1", // IPv6-mapped IPv4
64+
}
65+
for _, host := range hosts {
66+
host := host
67+
t.Run(host, func(t *testing.T) {
68+
t.Parallel()
69+
cfg := &config.Config{}
70+
cfg.Server.Host = host
71+
cfg.Server.APIKey = ""
72+
if err := validateServeSecurity(cfg); err != nil {
73+
t.Fatalf("host=%s: expected nil; got %v", host, err)
74+
}
75+
})
76+
}
77+
}
78+
79+
func TestValidateServeSecurity_AllowsNonLoopbackWithKey(t *testing.T) {
80+
t.Parallel()
81+
cfg := &config.Config{}
82+
cfg.Server.Host = "0.0.0.0"
83+
cfg.Server.APIKey = "s3cret"
84+
if err := validateServeSecurity(cfg); err != nil {
85+
t.Fatalf("expected nil; got %v", err)
86+
}
87+
}

internal/api/auth.go

Lines changed: 36 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -50,17 +50,31 @@ func bearerAuthMiddleware(apiKey string, next http.Handler) http.Handler {
5050
return
5151
}
5252

53-
raw := strings.TrimSpace(r.Header.Get("Authorization"))
54-
const prefix = "Bearer "
55-
if !strings.HasPrefix(raw, prefix) {
53+
// /api/session is the auth boundary itself — always public.
54+
if path == "/api/session" {
55+
next.ServeHTTP(w, r)
56+
return
57+
}
58+
59+
// Defense-in-depth: reject immediately if the server has no key
60+
// configured. This mirrors newSessionHandler's guard and keeps the
61+
// middleware correct under future refactors (rather than relying on
62+
// the no_token branch firing because keyBytes would also be empty).
63+
if apiKey == "" {
64+
slog.Warn("🔒 auth failure", "path", path, "remote_addr", r.RemoteAddr, "reason", "server_misconfigured")
65+
writeJSON401(w)
66+
return
67+
}
68+
69+
token := extractToken(r)
70+
if token == "" {
5671
slog.Warn("🔒 auth failure",
5772
"path", path,
5873
"remote_addr", r.RemoteAddr,
59-
"reason", "no_bearer_prefix")
74+
"reason", "no_token")
6075
writeJSON401(w)
6176
return
6277
}
63-
token := raw[len(prefix):]
6478
if subtle.ConstantTimeCompare([]byte(token), keyBytes) != 1 {
6579
slog.Warn("🔒 auth failure",
6680
"path", path,
@@ -82,3 +96,20 @@ func writeJSON401(w http.ResponseWriter) {
8296
w.WriteHeader(http.StatusUnauthorized)
8397
_ = json.NewEncoder(w).Encode(map[string]string{"error": "unauthorized"})
8498
}
99+
100+
// extractToken returns the bearer token from either the Authorization
101+
// header (preferred, for machine clients) or the session cookie (for
102+
// browser clients after POST /api/session). Returns "" if neither.
103+
func extractToken(r *http.Request) string {
104+
raw := strings.TrimSpace(r.Header.Get("Authorization"))
105+
const prefix = "Bearer "
106+
if strings.HasPrefix(raw, prefix) {
107+
return raw[len(prefix):]
108+
}
109+
if c, err := r.Cookie(sessionCookieName); err == nil {
110+
if v := strings.TrimSpace(c.Value); v != "" {
111+
return v
112+
}
113+
}
114+
return ""
115+
}

internal/api/auth_test.go

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -515,6 +515,29 @@ func TestBearerAuthMiddleware(t *testing.T) {
515515
})
516516
}
517517

518+
func TestAuth_AcceptsValidCookie(t *testing.T) {
519+
t.Parallel()
520+
h := buildAuthHandler("s3cret")
521+
req := httptest.NewRequest(http.MethodGet, "/api/ping", nil)
522+
req.AddCookie(&http.Cookie{Name: sessionCookieName, Value: "s3cret"})
523+
rr := httptest.NewRecorder()
524+
h.ServeHTTP(rr, req)
525+
if rr.Code != http.StatusOK {
526+
t.Fatalf("want 200 with valid cookie, got %d", rr.Code)
527+
}
528+
}
529+
530+
func TestAuth_RejectsMissingBothHeaderAndCookie(t *testing.T) {
531+
t.Parallel()
532+
h := buildAuthHandler("s3cret")
533+
req := httptest.NewRequest(http.MethodGet, "/api/ping", nil)
534+
rr := httptest.NewRecorder()
535+
h.ServeHTTP(rr, req)
536+
if rr.Code != http.StatusUnauthorized {
537+
t.Fatalf("want 401, got %d", rr.Code)
538+
}
539+
}
540+
518541
// BenchmarkBearerAuth_WrongKey ensures the constant-time compare path runs
519542
// under the benchmark harness. It is NOT a hard timing assertion — if the
520543
// code ever switches to a non-constant-time compare (==, bytes.Equal), the

0 commit comments

Comments
 (0)