Commit 9de20eb

branchseer and claude committed
docs: add research on output restoration compatibility with real build tools
Documents why output restoration fails with Vite 8 and tsdown in auto-detection mode: build tools read from dist/ (for cleanup and size reporting), causing fspy to track dist/ as an inferred input whose deletion triggers a cache miss before restoration can fire.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent b62b6a2 commit 9de20eb

1 file changed

Lines changed: 117 additions & 0 deletions

@@ -0,0 +1,117 @@
# Output Restoration: Compatibility with Real Build Tools

## Background

Output restoration automatically archives files produced by a cached task and restores them on cache hit.
When a task runs and creates `dist/`, those files are saved as a `tar.zst` archive in the cache directory.
On subsequent cache hits the archive is extracted, skipping the need to re-execute the task.

The feature works end-to-end when tested with simple write-only commands (`vtt write-file dist/out.txt built`),
but **fails with real build tools** (Vite 8, tsdown) in the default auto-detection mode.

## The Problem

Deleting the output directory (`dist/`) between runs causes a **cache miss** instead of triggering output restoration.

```
~/packages/vite-app$ vite build
...
✓ built in 34ms

# delete dist, run again:

~/packages/vite-app$ vite build ○ cache miss: 'assets' removed from 'packages/vite-app/dist', executing
```

The archived output files exist in the cache and could be restored, but the cache validation rejects the entry before restoration has a chance to run.

## Root Cause

Build tools **read from their output directory** during execution. fspy captures these reads and records them as inferred inputs. When the output directory is later deleted, the inferred input fingerprint no longer matches, producing a cache miss.

### How the cache validation pipeline works

1. **Cache entry lookup**: match by `CacheEntryKey` (spawn fingerprint + input config + output config)
2. **Globbed input validation**: compare stored file hashes against current state for explicit input globs
3. **Post-run fingerprint validation**: compare stored fspy-inferred input fingerprints against current filesystem state
4. **If all pass** → cache hit → replay terminal output → **extract output archive**

The failure occurs at step 3. The stored fingerprint records `packages/app/dist` as `Folder(Some({assets: Dir, index.html: File}))`. After deletion the current fingerprint is `NotFound`. The mismatch is reported before output restoration at step 4 can execute.
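
The ordering problem in steps 3 and 4 can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `PathFingerprint`, `EntryKind`, `Outcome`, and `validate_then_restore` are hypothetical names modeled on the `Folder(...)` and `NotFound` values quoted above, not the project's actual types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-ins for the fingerprint values quoted above.
#[derive(Debug, Clone, PartialEq, Eq)]
enum EntryKind {
    Dir,
    File,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum PathFingerprint {
    NotFound,
    Folder(Option<BTreeMap<String, EntryKind>>),
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// Validation passed; only now would the output archive be extracted.
    HitAndRestore,
    /// Validation failed on an inferred input; restoration is never reached.
    Miss { path: String },
}

/// Step 3 followed by step 4: compare every stored inferred-input fingerprint
/// with the current one, and restore outputs only if all of them match.
fn validate_then_restore(
    stored: &BTreeMap<String, PathFingerprint>,
    current: impl Fn(&str) -> PathFingerprint,
) -> Outcome {
    for (path, expected) in stored {
        if current(path) != *expected {
            return Outcome::Miss { path: path.clone() };
        }
    }
    Outcome::HitAndRestore
}

/// Builds the stored state from the example: `dist` containing `assets`
/// (a directory) and `index.html` (a file).
fn example_stored() -> BTreeMap<String, PathFingerprint> {
    let mut entries = BTreeMap::new();
    entries.insert("assets".to_string(), EntryKind::Dir);
    entries.insert("index.html".to_string(), EntryKind::File);
    let mut stored = BTreeMap::new();
    stored.insert(
        "packages/app/dist".to_string(),
        PathFingerprint::Folder(Some(entries)),
    );
    stored
}
```

Calling `validate_then_restore(&example_stored(), |_| PathFingerprint::NotFound)` models the deleted `dist/` case and returns a `Miss` for `packages/app/dist`, so the restore branch is never taken.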

### Why build tools read from the output directory

Both Vite 8 and tsdown follow the same pattern: **clean the output directory before writing new files**, then **report compressed sizes after writing**.

#### Vite 8 (`vite build`)

1. **Directory cleanup** (`emptyDir()`, called from `prepareOutDirPlugin` during `renderStart`)
   - `fs.readdirSync(outDir)` enumerates entries so they can be deleted before the new build
   - Controlled by `emptyOutDir` (default: `true`)
   - This is the primary read that causes `packages/app/dist` to appear in fspy's `path_reads`

2. **Compressed size reporting** (rolldown's builtin `viteReporterPlugin`, Rust-native)
   - After writing output files, the reporter reads each file back to compute gzip/brotli sizes
   - Produces the `dist/index.html 0.15 kB │ gzip: 0.14 kB` lines
   - Controlled by `reportCompressedSize` (default: `true`)
   - This causes reads on individual output files like `dist/index.html`, `dist/assets/index-xxx.js`

3. **Public directory copy** (`copyDir()`)
   - If `copyPublicDir` is enabled (default: `true`), reads the public dir to copy into outDir

#### tsdown

1. **Directory cleanup** (`cleanOutDir()` via `tinyglobby`'s `glob()`)
   - Enumerates all files in the output directory with `onlyFiles: false` before deleting them
   - Default `clean: true` triggers this
   - This is the primary read that causes `packages/lib/dist` to appear in fspy's `path_reads`

2. **Shebang permission check** (`ShebangPlugin`'s `writeBundle` hook)
   - Calls `access()` on output files to check existence before setting execute permissions
   - Only applies to entry chunks with shebang directives
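
The shared "clean, then write" pattern can be reproduced with the Rust standard library alone. The sketch below is not how Vite or tsdown are implemented; `Access` and `clean_then_write` are hypothetical names, and the returned trace only approximates what a tracer such as fspy might observe (one read of the directory itself, then writes to individual files).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical record of a filesystem access made by the build step.
#[derive(Debug, PartialEq, Eq)]
enum Access {
    Read(PathBuf),
    Write(PathBuf),
}

/// Empties `out_dir`, writes `files` into it, and returns the accesses made.
fn clean_then_write(out_dir: &Path, files: &[(&str, &str)]) -> io::Result<Vec<Access>> {
    let mut trace = Vec::new();
    fs::create_dir_all(out_dir)?;

    // Cleanup: listing the directory is a read of the directory path itself.
    trace.push(Access::Read(out_dir.to_path_buf()));
    for entry in fs::read_dir(out_dir)? {
        let path = entry?.path();
        if path.is_dir() {
            fs::remove_dir_all(&path)?;
        } else {
            fs::remove_file(&path)?;
        }
    }

    // Output: each file is written under the directory that was just listed.
    for (name, contents) in files {
        let path = out_dir.join(name);
        fs::write(&path, contents)?;
        trace.push(Access::Write(path));
    }
    Ok(trace)
}
```

The trace for a single output file is a `Read` of the output directory followed by a `Write` of a path beneath it, which is the shape of access pattern discussed in this section.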

### Why the read-write overlap check doesn't catch this

The overlap check in `execute_spawn` (mod.rs:486-488) looks for exact path matches between `path_reads` and `path_writes`:

```rust
pa.path_reads.keys().find(|p| pa.path_writes.contains(*p))
```

fspy reports these as separate paths:

- **Read**: `packages/app/dist` (the directory itself, via `readdirSync`)
- **Write**: `packages/app/dist/index.html`, `packages/app/dist/assets/index-xxx.js` (individual files)

Since `packages/app/dist` ≠ `packages/app/dist/index.html`, no overlap is detected, and caching proceeds.
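
To make the difference concrete, the following sketch contrasts the exact-match check with a hypothetical ancestor-aware variant. The container types (`BTreeMap<PathBuf, ()>` for reads, `BTreeSet<PathBuf>` for writes) are assumptions chosen so that `.keys()` and `.contains()` work as in the quoted line; `ancestor_overlap` is an illustration, not code from the project and not a proposed fix.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::path::PathBuf;

/// Mirrors the exact-match check quoted above: a read overlaps only if the
/// very same path was also written.
fn exact_overlap<'a>(
    reads: &'a BTreeMap<PathBuf, ()>,
    writes: &BTreeSet<PathBuf>,
) -> Option<&'a PathBuf> {
    reads.keys().find(|p| writes.contains(*p))
}

/// Hypothetical variant: a read also overlaps when some written path lies at
/// or underneath it (component-wise, via `Path::starts_with`).
fn ancestor_overlap<'a>(
    reads: &'a BTreeMap<PathBuf, ()>,
    writes: &BTreeSet<PathBuf>,
) -> Option<&'a PathBuf> {
    reads
        .keys()
        .find(|read| writes.iter().any(|write| write.starts_with(read.as_path())))
}

/// The read and write sets from the example above.
fn example() -> (BTreeMap<PathBuf, ()>, BTreeSet<PathBuf>) {
    let reads = BTreeMap::from([(PathBuf::from("packages/app/dist"), ())]);
    let writes = BTreeSet::from([
        PathBuf::from("packages/app/dist/index.html"),
        PathBuf::from("packages/app/dist/assets/index-xxx.js"),
    ]);
    (reads, writes)
}
```

With the example data, `exact_overlap` finds nothing while `ancestor_overlap` reports `packages/app/dist`. Whether treating that as an overlap is desirable is a separate question, since it would change which tasks are considered cacheable at all.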

### Why `should_ignore_entry` doesn't help

`fingerprint.rs:185-187` filters `dist` when listed as a directory entry of a **parent**:

```rust
fn should_ignore_entry(name: &[u8]) -> bool {
    matches!(name, b"." | b".." | b".DS_Store") || name.eq_ignore_ascii_case(b"dist")
}
```

This prevents the fingerprint of `packages/app/` from changing when `dist` appears or disappears inside it. But it does not help when `packages/app/dist` itself is a direct key in `inferred_inputs`: that path is fingerprinted independently, and its transition from `Folder(...)` to `NotFound` is a mismatch.
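
The distinction between "entry of a parent" and "direct key" can be shown with a simplified model. `should_ignore_entry` is copied from the snippet above; `folder_fingerprint` is a hypothetical simplification that reduces a directory fingerprint to its filtered set of entry names, with `None` standing in for `NotFound`.

```rust
use std::collections::BTreeSet;

/// Copied from the snippet above.
fn should_ignore_entry(name: &[u8]) -> bool {
    matches!(name, b"." | b".." | b".DS_Store") || name.eq_ignore_ascii_case(b"dist")
}

/// Hypothetical simplification: a directory's fingerprint is the set of entry
/// names that survive the filter, or `None` if the directory does not exist.
fn folder_fingerprint(entries: Option<Vec<&str>>) -> Option<BTreeSet<String>> {
    entries.map(|names| {
        names
            .into_iter()
            .filter(|name| !should_ignore_entry(name.as_bytes()))
            .map(str::to_string)
            .collect()
    })
}
```

In this model, `folder_fingerprint(Some(vec!["src", "dist"]))` equals `folder_fingerprint(Some(vec!["src"]))`, so the parent is stable. By contrast, `folder_fingerprint(Some(vec!["assets", "index.html"]))` differs from `folder_fingerprint(None)`, which is the mismatch seen when `dist` itself is the tracked path.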

### Why the existing e2e test passes

The `output-cache-test` fixture uses `vtt write-file dist/out.txt built`, a simple write-only operation. `vtt write-file` never reads from `dist/`, so fspy only records it as a write. The directory never appears in `path_reads`, the post-run fingerprint doesn't include it, and cache validation succeeds after deletion.

This does not reflect real build tool behavior.

## Behavior Matrix

All tests were performed with Vite 8.0.8 and tsdown 0.12.9. "Cache hit after deleting dist" means output restoration can work.

| `input`        | `output`        | fspy enabled | Cache hit after deleting dist |
| -------------- | --------------- | ------------ | ----------------------------- |
| auto (default) | auto (default)  | yes          | **no**                        |
| auto (default) | `["dist/**"]`   | yes          | **no**                        |
| `["src/**"]`   | auto (default)  | yes          | **no**                        |
| `["src/**"]`   | `["dist/**"]`   | no           | **yes**                       |
| `["src/**"]`   | `[]` (disabled) | no           | yes (but no restoration)      |

The only configuration that works requires **both** explicit input globs **and** explicit output globs, which disables fspy entirely. Any configuration that enables fspy (for either auto-input or auto-output detection) causes the output directory reads to pollute the inferred input set.
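
The matrix reduces to a small rule, sketched below. `Globs`, `fspy_enabled`, `cache_hit_after_deleting_dist`, and `restores_outputs` are hypothetical names that encode only the five observed rows; they are not the project's configuration types.

```rust
/// Hypothetical model of a task's `input` or `output` setting.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Globs {
    Auto,
    Explicit(Vec<String>),
}

/// Per the matrix, fspy runs whenever either side is left on auto-detection.
fn fspy_enabled(input: &Globs, output: &Globs) -> bool {
    matches!(input, Globs::Auto) || matches!(output, Globs::Auto)
}

/// Validation survives deleting `dist/` only when fspy is off, because then
/// no inferred input points into the output directory.
fn cache_hit_after_deleting_dist(input: &Globs, output: &Globs) -> bool {
    !fspy_enabled(input, output)
}

/// Restoration additionally needs at least one output glob to archive.
fn restores_outputs(input: &Globs, output: &Globs) -> bool {
    cache_hit_after_deleting_dist(input, output)
        && matches!(output, Globs::Explicit(globs) if !globs.is_empty())
}

/// Convenience constructor for explicit glob lists.
fn explicit(globs: &[&str]) -> Globs {
    Globs::Explicit(globs.iter().map(|g| g.to_string()).collect())
}
```

For example, `restores_outputs(&explicit(&["src/**"]), &explicit(&["dist/**"]))` is the only combination from the table that evaluates to `true`.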
