ffi: add V8 fast-call path #63140
Draft
bengl wants to merge 1 commit into nodejs:main from
Collaborator
Add a parallel dispatch path that uses V8 fast API calls instead of libffi for eligible native calls. At DynamicLibrary.getFunction time, generate a per-function JIT'd trampoline that strips V8's receiver argument and tail-calls the target. Signatures with callbacks, unsupported argument types, or more register-passed args than the platform ABI permits transparently fall back to libffi.

Stub emitters cover Linux/macOS/FreeBSD on x86_64 and AArch64, Windows on x86_64 and AArch64, and Linux on AArch32. JIT memory is allocated per isolate via direct mmap with MAP_JIT on macOS and W^X enforcement elsewhere. The JS wrapper validates each argument per declared type, mirroring the libffi slow callback so the contract is identical across both paths and across V8 optimization tiers.

The path is gated behind --experimental-ffi and can be disabled at build time with --without-ffi-fastcall. The previous shared-buffer JS fast path is removed, replaced by this fast-call path.

Signed-off-by: Bryan English <bryan@bryanenglish.com>
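The fallback rule described above can be illustrated with a small sketch. This is not code from the PR: the type names, the `chooseDispatchPath` helper, the signature shape, and the register limits are all hypothetical, chosen only to show how a signature might be classified as eligible for the fast-call path or routed to libffi.

```javascript
// Hypothetical sketch; none of these names or limits are taken from the PR.
// Types assumed to travel in integer registers vs. floating-point registers.
const INT_TYPES = new Set(['i8', 'u8', 'i16', 'u16', 'i32', 'u32', 'i64', 'u64', 'pointer']);
const FLOAT_TYPES = new Set(['f32', 'f64']);

// Decide which dispatch path a signature would take.
// `limits` is supplied by the caller and stands in for the platform ABI's
// register budget; the real budget may also have to account for V8's receiver.
function chooseDispatchPath(signature, limits) {
  const { parameters, result } = signature;
  let intArgs = 0;
  let floatArgs = 0;
  for (const type of parameters) {
    if (INT_TYPES.has(type)) intArgs++;
    else if (FLOAT_TYPES.has(type)) floatArgs++;
    else return 'libffi'; // callbacks and any other unsupported type
  }
  const resultOk = result === 'void' || INT_TYPES.has(result) || FLOAT_TYPES.has(result);
  if (!resultOk) return 'libffi';
  // Too many register-passed arguments: fall back rather than spill to the stack.
  if (intArgs > limits.intRegs || floatArgs > limits.floatRegs) return 'libffi';
  return 'fast-call';
}

// Illustrative limits only (6 integer and 8 floating-point argument registers).
const exampleLimits = { intRegs: 6, floatRegs: 8 };
console.log(chooseDispatchPath({ parameters: ['i32', 'f64'], result: 'i32' }, exampleLimits));
console.log(chooseDispatchPath({ parameters: ['callback'], result: 'void' }, exampleLimits));
```

In this sketch the first call is classified as `fast-call` and the second as `libffi`. Because the decision depends only on the declared signature, it can be made once when the function is looked up rather than on every call, which matches the description of doing the work at DynamicLibrary.getFunction time.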
Member
With this, how does ffi compare to napi? Do we have any guidelines on when to use which?
Contributor
@ronag On my PR (which yields similar results to Bryan's) we're beating it by like 100x.
Member
So ffi is 100x faster than napi?
Contributor
Sorry, I thought you meant https://github.com/node-ffi-napi/node-ffi-napi. We probably have to measure that once we agree on the direction.
node-ffi-napi isn't a good benchmark for performance, but koffi is. Version 3 performs even better than version 2; it's still in beta (but can already be used for testing).
HEADS UP: This likely isn't done yet. It's mainly here to compare approaches with @ShogunPanda.
Some benchmarks: