Commit b73267c

fix(serve): publish availability events so /actuator/health returns 200
Problem: /actuator/health on a running `code-iq serve` returned HTTP 503 with body {"groups":["liveness","readiness"],"status":"OUT_OF_SERVICE"}, even after the graph was fully loaded and /api/stats was returning real data. This made the health endpoint unusable for K8s/Compose readiness probes and confused baseline smoke-tests.

Root cause: ServeCommand.call() blocks on Thread.currentThread().join() inside Spring Boot's CommandLineRunner.run(). Because the runner never returns, Spring's ApplicationReadyEvent is never published, and neither is the default AvailabilityChangeEvent that normally transitions ReadinessState from REFUSING_TRAFFIC to ACCEPTING_TRAFFIC. With the serving profile's probes enabled, the aggregated health endpoint stays pinned at OUT_OF_SERVICE forever.

Fix: ServeCommand now explicitly publishes the two availability events via a new markReady() method before blocking:

    AvailabilityChangeEvent.publish(events, this, LivenessState.CORRECT);
    AvailabilityChangeEvent.publish(events, this, ReadinessState.ACCEPTING_TRAFFIC);

markReady() is extracted for testability. A new ServeCommandTest case verifies both events are published in the documented order.

Verified end-to-end against both seed repos via the updated pipeline script: /actuator/health now returns HTTP 200 with status "UP".

| seed              | ready | health_http (after fix) | health_http (before fix) |
|-------------------|------:|------------------------:|-------------------------:|
| spring-petclinic  |   13s |                     200 |                      503 |
| realworld-express |   14s |                     200 |                      503 |

Follow-up (out of scope): GraphBootstrapper's @EventListener(ApplicationReadyEvent.class) is effectively dead code for the same reason: the listener never fires because the event never fires. It is not a bug today only because enrich always runs before serve in our pipeline, so the bootstrap fallback never actually needs to trigger.
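The root cause described above can be illustrated with a small model that uses no Spring classes. This is only a sketch under stated assumptions: `StartupModel`, `Readiness`, `start`, and `observe` are hypothetical names, and `start` merely imitates a startup sequence that flips readiness after the runner returns. It is not Spring Boot's actual implementation.

```java
import java.util.concurrent.CountDownLatch;

// Simplified model only: these are NOT Spring Boot classes. StartupModel,
// Readiness, start and observe are hypothetical names for illustration.
final class StartupModel {
    enum Readiness { REFUSING_TRAFFIC, ACCEPTING_TRAFFIC }

    private volatile Readiness readiness = Readiness.REFUSING_TRAFFIC;

    Readiness readiness() { return readiness; }

    // Analogue of ServeCommand.markReady(): flip readiness explicitly.
    void markReady() { readiness = Readiness.ACCEPTING_TRAFFIC; }

    // Analogue of the startup sequence: readiness is flipped only after the
    // runner returns, so a runner that blocks leaves it at REFUSING_TRAFFIC.
    void start(Runnable runner) {
        runner.run();
        markReady();
    }

    // Reports the readiness seen once the runner has reached its blocking call.
    static Readiness observe(boolean publishBeforeBlocking) {
        StartupModel model = new StartupModel();
        CountDownLatch aboutToBlock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> model.start(() -> {
            if (publishBeforeBlocking) {
                model.markReady();
            }
            aboutToBlock.countDown();
            awaitQuietly(release); // stands in for Thread.currentThread().join()
        }));
        worker.setDaemon(true);
        worker.start();
        awaitQuietly(aboutToBlock);
        Readiness seen = model.readiness();
        release.countDown(); // unblock the worker so the demo can finish
        return seen;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this model, `StartupModel.observe(false)` yields `REFUSING_TRAFFIC` (the pre-fix behaviour) and `StartupModel.observe(true)` yields `ACCEPTING_TRAFFIC` (the behaviour after publishing before blocking).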
1 parent 62806c4 commit b73267c

3 files changed

Lines changed: 57 additions & 0 deletions

docs/superpowers/baselines/2026-04-17/BASELINE.md

Lines changed: 1 addition & 0 deletions
@@ -237,6 +237,7 @@ Ordered by severity. Each item cites the raw artifact it was derived from.
 Follow-up split out below.
 
 - **`GraphHealthIndicator` reports `OUT_OF_SERVICE` (503) even when the graph is loaded.** Discovered during the pipeline smoke-test fix. `/actuator/health` body: `{"groups":["liveness","readiness"],"status":"OUT_OF_SERVICE"}`. The server is fully functional (`/api/stats` returns real data) but the health indicator makes `/actuator/health` unusable as a readiness probe for orchestrators (K8s, Compose, CI). Fix in `src/main/java/io/github/randomcodespace/iq/health/GraphHealthIndicator.java`. Low for baseline use; High when we start Dockerizing or targeting K8s.
+- **RESOLVED (2026-04-17, branch `phase-a/fix-graph-health`)**: Root cause was *not* in `GraphHealthIndicator` (which correctly returns UP when nodes>0). It was in `ServeCommand`: the CLI blocks on `Thread.currentThread().join()` inside Spring Boot's `CommandLineRunner.run()`, which prevents `ApplicationReadyEvent` from ever firing. Without that event, Spring's default readiness publisher never flips `ReadinessState` from `REFUSING_TRAFFIC` (503 `OUT_OF_SERVICE`) to `ACCEPTING_TRAFFIC` (200 `UP`). Fix: `ServeCommand` now explicitly publishes `AvailabilityChangeEvent` for `LivenessState.CORRECT` + `ReadinessState.ACCEPTING_TRAFFIC` before blocking, via a new `markReady()` method (unit-tested). Verified end-to-end: `health_http` is now 200 on both seeds (petclinic ready 13s, express ready 14s; status "UP"). Follow-up filed: `GraphBootstrapper`'s `@EventListener(ApplicationReadyEvent.class)` is effectively dead code for the same reason — only noticed because enrich always runs before serve in our pipeline, so the bootstrap fallback never actually needs to fire.
 
 - **SpotBugs: 8 HIGH-priority findings (priority=1) + 1,484 at priority=2.** Total 1,492. HIGH findings must be triaged individually (read `raw/spotbugs.xml`). Noise-dominant rules (`NM_METHOD_NAMING_CONVENTION`=730, `SF_SWITCH_NO_DEFAULT`=448) should be filtered via a SpotBugs exclude file so real signal surfaces; real-concern patterns that deserve review now: `NP_NULL_ON_SOME_PATH_FROM_RETURN_VALUE` (26), `BC_UNCONFIRMED_CAST` (55), `UL_UNRELEASED_LOCK_EXCEPTION_PATH` (1), `WMI_WRONG_MAP_ITERATOR` (2), `ES_COMPARING_STRINGS_WITH_EQ` (2), `MT_CORRECTNESS` category (1).
 - Raw: `raw/spotbugs.xml`, `raw/spotbugs-summary.json`.

src/main/java/io/github/randomcodespace/iq/cli/ServeCommand.java

Lines changed: 26 additions & 0 deletions
@@ -5,6 +5,10 @@
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 import org.springframework.beans.factory.annotation.Autowired;
+import org.springframework.boot.availability.AvailabilityChangeEvent;
+import org.springframework.boot.availability.LivenessState;
+import org.springframework.boot.availability.ReadinessState;
+import org.springframework.context.ApplicationEventPublisher;
 import org.springframework.stereotype.Component;
 import picocli.CommandLine.Command;
 import picocli.CommandLine.Option;
@@ -56,6 +60,9 @@ public class ServeCommand implements Callable<Integer> {
     @Autowired(required = false)
     private GraphStore graphStore;
 
+    @Autowired
+    private ApplicationEventPublisher events;
+
     @Override
     public Integer call() {
         Path root = path.toAbsolutePath().normalize();
@@ -96,6 +103,16 @@ public Integer call() {
         System.out.println();
         CliOutput.info("Press Ctrl+C to stop.");
 
+        // Publish availability transitions so /actuator/health reports UP (200).
+        // This Callable is invoked from a CommandLineRunner that blocks forever
+        // (the Thread.join below), so Spring's ApplicationReadyEvent — which
+        // normally drives ReadinessState to ACCEPTING_TRAFFIC — never fires.
+        // Without this, /actuator/health stays OUT_OF_SERVICE (503) even though
+        // the server is accepting and serving traffic. See also: known gap on
+        // GraphBootstrapper's @EventListener(ApplicationReadyEvent.class) which
+        // is dead for the same reason — out of scope for this fix.
+        markReady();
+
         try {
             Thread.currentThread().join();
         } catch (InterruptedException e) {
@@ -105,6 +122,15 @@ public Integer call() {
         return 0;
     }
 
+    /**
+     * Flip availability state to live + accepting traffic. Extracted for
+     * testability — callers can verify the right events are published.
+     */
+    void markReady() {
+        AvailabilityChangeEvent.publish(events, this, LivenessState.CORRECT);
+        AvailabilityChangeEvent.publish(events, this, ReadinessState.ACCEPTING_TRAFFIC);
+    }
+
     public Path getPath() { return path; }
     public int getPort() { return port; }
     public String getHost() { return host; }
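
Since the point of this change is that `/actuator/health` becomes usable as a readiness probe, a smoke test could poll the endpoint until it answers 200. The following is a minimal sketch using only the JDK's `java.net.http` client; `HealthProbe`, `waitForUp`, and its parameters are hypothetical names and are not part of this commit or of the pipeline script mentioned in the commit message.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of the commit: polls a health URL until it
// returns HTTP 200 or the attempt budget is used up.
final class HealthProbe {

    // Returns the last observed status code, or -1 if no response was received.
    static int waitForUp(URI healthUri, int maxAttempts, Duration pause) {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(healthUri)
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        int last = -1;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                last = client.send(request, HttpResponse.BodyHandlers.discarding())
                        .statusCode();
                if (last == 200) {
                    return last; // readiness reported UP
                }
            } catch (IOException e) {
                last = -1; // server not listening yet, or request timed out
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return last;
            }
            try {
                Thread.sleep(pause.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return last;
            }
        }
        return last; // e.g. 503 while readiness is still REFUSING_TRAFFIC
    }
}
```

A caller would pass the server's own health URL (host and port depend on how `serve` was started) and treat any result other than 200 as "not ready".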

src/test/java/io/github/randomcodespace/iq/cli/ServeCommandTest.java

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,21 @@
11
package io.github.randomcodespace.iq.cli;
22

33
import org.junit.jupiter.api.Test;
4+
import org.mockito.ArgumentCaptor;
5+
import org.mockito.Mockito;
6+
import org.springframework.boot.availability.AvailabilityChangeEvent;
7+
import org.springframework.boot.availability.LivenessState;
8+
import org.springframework.boot.availability.ReadinessState;
9+
import org.springframework.context.ApplicationEventPublisher;
10+
import org.springframework.test.util.ReflectionTestUtils;
411
import picocli.CommandLine;
512

613
import java.nio.file.Path;
714

815
import static org.junit.jupiter.api.Assertions.assertEquals;
916
import static org.junit.jupiter.api.Assertions.assertNotNull;
17+
import static org.mockito.Mockito.times;
18+
import static org.mockito.Mockito.verify;
1019

1120
class ServeCommandTest {
1221

@@ -76,4 +85,25 @@ void pathNotSwallowedWhenNoUiPrecedesPath() {
7685
assertEquals(true, cmd.isNoUi());
7786
assertEquals(Path.of("/some/repo"), cmd.getPath());
7887
}
88+
89+
@Test
90+
void markReadyPublishesLivenessThenReadiness() {
91+
// Regression guard for /actuator/health returning 503 OUT_OF_SERVICE:
92+
// serve's CommandLineRunner blocks forever, so Spring never fires
93+
// ApplicationReadyEvent and readiness stays REFUSING_TRAFFIC.
94+
// ServeCommand must publish LivenessState.CORRECT + ReadinessState.ACCEPTING_TRAFFIC
95+
// before blocking so /actuator/health reports UP (200).
96+
var cmd = new ServeCommand();
97+
var mockEvents = Mockito.mock(ApplicationEventPublisher.class);
98+
ReflectionTestUtils.setField(cmd, "events", mockEvents);
99+
100+
cmd.markReady();
101+
102+
var captor = ArgumentCaptor.forClass(AvailabilityChangeEvent.class);
103+
verify(mockEvents, times(2)).publishEvent(captor.capture());
104+
var published = captor.getAllValues();
105+
// Order matters: liveness first (process is alive), then readiness (serving traffic).
106+
assertEquals(LivenessState.CORRECT, published.get(0).getState());
107+
assertEquals(ReadinessState.ACCEPTING_TRAFFIC, published.get(1).getState());
108+
}
79109
}
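
The ordering check in the new test (capture every published event, then assert the sequence) can also be shown without Mockito or Spring on the classpath. The sketch below is an illustration only: `ReadySignals`, its `State` enum, and the `Consumer`-based sink are hypothetical stand-ins for `ServeCommand.markReady()`, the Spring availability states, and `ApplicationEventPublisher`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-ins, not the project's classes: a plain Consumer plays
// the role of the event publisher and a list records what was published.
final class ReadySignals {
    enum State { LIVENESS_CORRECT, READINESS_ACCEPTING_TRAFFIC }

    // Mirrors the shape of markReady(): liveness first, then readiness.
    static void markReady(Consumer<State> sink) {
        sink.accept(State.LIVENESS_CORRECT);
        sink.accept(State.READINESS_ACCEPTING_TRAFFIC);
    }

    // Recording fake: returns the states in the order they were published.
    static List<State> record() {
        List<State> seen = new ArrayList<>();
        markReady(seen::add);
        return seen;
    }
}
```

Comparing `ReadySignals.record()` against the expected two-element list checks both the count and the order in one assertion, which is the same property the `ArgumentCaptor` test checks with two indexed assertions.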
