mount import directory for pyodide sandbox#2073
Conversation
It turns out the sandbox used for importing files is reused for a given document, meaning that a change to copying vs mounting the import doesn't work after the first import. This reverts that change, and drops read permissions selectively.
fflorent
left a comment
There was a problem hiding this comment.
It looks good to me. I have asked you some questions, so I am sure to have understood the logic behind your PR
Thank you! :)
| // everything else. See --allow-read in SandboxPyodide.ts | ||
| const readDir = fs.realpathSync(__dirname); | ||
| const gristDir = fs.realpathSync(path.join(__dirname, "..", "grist")); | ||
| const reqFile = fs.realpathSync(path.join(__dirname, "..", "requirements.txt")); |
There was a problem hiding this comment.
For my own understanding, why is --allow-read=/PATH/TO/sandbox/requirements.txt added in a first place in SandboxPyodide.ts?
I think I see the reason for adding --allow-read=/PATH/TO/sandbox/grist: it is used to make a copy in loadCode. Deno.permissions.revoke() allows here to revoke permissions previously granted by the --allow-read arguments, right?
(Thanks for the comment BTW, it is very helpful to understand a bit more the context!)
There was a problem hiding this comment.
For requirements.txt, because that is how the code figures out the libraries Grist needs
This could be done a different way, but is what is there right now.
There was a problem hiding this comment.
Correct about the rest.
| this.log("Setting up import from", process.env.IMPORTDIR); | ||
| // Could mount directly, but instead copy so we can drop read perms early. | ||
| await this.copyFiles(process.env.IMPORTDIR, "/import_src", "/import"); | ||
| // All imports for a given doc live in the same root directory. |
There was a problem hiding this comment.
Alright, IMPORTDIR is a subdirectory to a temporary directory attributed exclusively to the ActiveDoc?
Also after adding some logs, I found that a sandbox is spawned during the import, am I right? It is fine here to share the same import directory because it is scoped to the ActiveDoc? (in which case it would make sense to me too)
There was a problem hiding this comment.
Yes that is what @Spoffy found when working on a Grist Desktop issue.
There was a problem hiding this comment.
That was my understanding / findings on how this all worked, yeah. The logic connecting parseFile in DocPluginManager to the sandbox is convoluted, but I'll do my best to summarise:
- File parsers (e.g. CSV, JSON, Excel) are defined as plugins in
plugins/core/manifest.yml. - These are instantiated as PluginInstance objects in
DocPluginManager(which is created on a per-doc basis) whenDocPluginManageris created.DocPluginManageris created one-per-doc. - When DocPluginManager initialises, it creates a new temporary directory (on a per-doc basis) that's passed to each plugin instance.
- When a plugin instance is first used (more specifically - when it first tries to make an RPC call to the sandbox, e.g. to parse a file), it's activated, which spawns a new sandbox
- Plugin instances are shut down when the document is, or after 5 minutes of inactivity. This shuts down that plugin's sandbox.
In short: yes, the temp directory is per active document. 🙂
fflorent
left a comment
There was a problem hiding this comment.
Thanks for your explanation, LGTM!
| this.log("Setting up import from", process.env.IMPORTDIR); | ||
| // Could mount directly, but instead copy so we can drop read perms early. | ||
| await this.copyFiles(process.env.IMPORTDIR, "/import_src", "/import"); | ||
| // All imports for a given doc live in the same root directory. |
There was a problem hiding this comment.
Is it worth adding to this comment, something like:
Copying import files in and dropping read permissions isn't possible, due to the same sandbox being re-used for subsequent imports. The files would only copied once on startup, meaning the files for these later imports won't be available to Pyodide, resulting in errors.
It turns out the sandbox used for importing files is reused for a given document, meaning that a change to copying vs mounting the import doesn't work after the first import.
This reverts that change, and drops read permissions selectively.