Add Rust ABI support for WASM subgraphs #6462
cargopete wants to merge 14 commits into graphprotocol:master
- Add Ethereum ToRustBytes impl for Log/Call/Block triggers
- Add NEAR ToRustBytes stub (unimplemented, Ethereum-only for now)
- Propagate ToRustBytes trait bounds through instance_manager
- Skip parity_wasm gas injection for Rust modules (can't parse bulk-memory opcodes)
- Skip AS-specific exports (id_of_type, _start) for Rust modules
- Add handle_trigger_rust and invoke_handler_rust calling convention
- Add Rust host function wrappers in module/context.rs
|
Can you explain a bit what the motivation for this is? Another ABI is a huge commitment in terms of maintenance etc. If another ABI is called for, it would be good to also see if we can avoid some of the mistakes the current ASC ABI makes. |
|
@lutter Thanks for the feedback!

Motivation

Meanwhile, The Graph has already validated Rust→WASM as a first-class path with Substreams. A Rust mapping language is the natural extension of that investment to the subgraph layer: same WASM runtime, same graph-node infrastructure, just a different serialization boundary.

Maintenance argument

What we tried to learn from the ASC ABI's mistakes

Happy to discuss any of these design choices further, or to write a formal docs/rust-abi-spec.md or a forum post if that would help the review. |
- P1: Replace unimplemented!() panic in NearTrigger::to_rust_bytes() with Vec::new()
- P2: Propagate PossibleReorg errors from ipfs_cat host fn instead of swallowing them
- P2: Fix useless .into() conversions on anyhow::Error (clippy useless_conversion)
- P3: Apply rustfmt to all changed files
- P3: Clarify comment on byte-scan Rust detection heuristic in ValidModule::new()

Build: cargo build clean
Tests: 14 rust_abi + existing wasm tests all pass
Lint: cargo clippy --deny warnings clean
Format: cargo fmt --check clean
Review Summary
Findings
Verification
Commit
|
Motivation
The AS ABI has structural problems that cannot be fixed without breaking existing subgraphs:
- `AscPtr<T>` encodes type information in Rust phantom generics with no compile-time enforcement on the guest side.
- […] (`AscNullableString`, sentinel pointers, separate flags), leading to per-type special-casing in the host.
- […] (`__alloc`, `__new`, `__pin`) to pass data to the mapping. An AS compiler change can silently break graph-node.
- […] `failed to read AscPtr` traps with no field or type context.
- `apiVersion` describes the AS class layout, not a wire protocol. Adding a host function requires ad-hoc compatibility code scattered across `asc_abi/`.

Rust targeting `wasm32-unknown-unknown` is already production-proven in the Substreams ecosystem with the same `wasmtime` runtime family and the same class of host-imported functions used here. There is no novel compiler or runtime risk.

What this PR does

Adds a parallel `rust_abi/` serialization layer (~1,450 LOC) that sits next to `asc_abi/` and is selected by manifest. The runtime (`HostExports`, store, chain ingestion, gas accounting) is unchanged.

The protocol in one sentence: the host serializes the trigger to a flat byte buffer, calls `allocate(len)` on the mapping, copies the bytes in, invokes `handler(ptr, len)`, and calls `reset_arena()` afterwards. The mapping owns its heap; the host never touches an AS-style allocator.

Spec: `docs/rust-abi-spec.md` in this PR is the authoritative protocol reference: wire formats, TLV tag table, trigger layouts, host function signatures, versioning rules, and maintenance model.

Implementation summary

New files (`runtime/wasm/src/rust_abi/`):
- `mod.rs`: `MappingLanguage` enum; `from_kind("wasm/rust")` parser
- `types.rs`: `ToRustWasm`/`FromRustWasm` traits; `ValueTag` enum; impls for all scalar types
- `entity.rs`: TLV `serialize_entity`/`deserialize_entity_data` covering all `graph-core::Value` variants including `Timestamp`
- `trigger.rs`: `ToRustBytes` trait; fixed-layout `RustLogTrigger`, `RustCallTrigger`, `RustBlockTrigger`
- `host.rs`: wasmtime linker wrappers for `store_set/get/remove`, `crypto_keccak256`, `log_log`, `data_source_address/network/create`, `ipfs_cat`, `ethereum_call`, `abort`; `is_rust_module()` namespace detection

Modified files:
- `runtime/wasm/src/mapping.rs`: skip `parity_wasm` gas-injection pipeline for Rust modules (`parity_wasm` cannot parse bulk-memory opcodes emitted by current Rust toolchains); configure wasmtime fuel metering instead
- `runtime/wasm/src/module/mod.rs`: `build_linker()` dispatches on `MappingLanguage`; Rust path skips `id_of_type`, skips `_start`
- `runtime/wasm/src/module/instance.rs`: `handle_trigger_rust()` (allocate → write → call → reset_arena); `invoke_handler_rust()` with trap/timeout/reorg/out-of-fuel handling
- `chain/ethereum/src/trigger.rs`: `ToRustBytes` for all three Ethereum trigger types
- `chain/near/src/trigger.rs`: stub `ToRustBytes` (unimplemented)
- `core/src/subgraph/instance_manager.rs`: `ToRustBytes` trait bound propagation
- `wasm/rust` recognised as a distinct `mapping.kind`

Performance

Benchmarks run against a Rust ERC20 Transfer indexer compiled with the Graphite SDK.
Binary size (`wasm32-unknown-unknown --release`):

[table not recovered; surviving cell: `wasm-opt -Oz`]

The size delta is dominated by `num_bigint` and `dlmalloc` being statically linked into the Rust binary; both provide functions that AS delegates to host calls. This is reducible; it is not a fundamental property of the ABI.

Handler throughput (Rust, isolation benchmark):

A wasmtime harness links no-op host stubs and loops `allocate → memcpy → handle_transfer → reset_arena` with an identical 212-byte Transfer payload. No gas metering, no store I/O.

Steady state: ~617k Transfer events/sec, ~1.62 µs/event on Apple Silicon under wasmtime 29, no fuel metering.

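For readers who want the shape of that loop without a WASM toolchain, here is a minimal sketch in plain Rust. It is not the benchmark harness from this PR: `MockGuest` is a hypothetical stand-in for the mapping's linear memory and bump arena, `run_trigger` is an illustrative name for the host side, and the mock `handle_transfer` only sums the payload bytes so that the call has an observable result.

```rust
/// Hypothetical stand-in for a guest module: a byte buffer playing the role of
/// linear memory plus a bump pointer playing the role of the arena.
struct MockGuest {
    memory: Vec<u8>,
    arena_top: usize,
}

impl MockGuest {
    fn new(size: usize) -> Self {
        MockGuest { memory: vec![0; size], arena_top: 0 }
    }

    /// Mimics `allocate(len)`: reserve `len` bytes and return their offset,
    /// or `None` when the arena is exhausted.
    fn allocate(&mut self, len: usize) -> Option<usize> {
        let end = self.arena_top.checked_add(len)?;
        if end > self.memory.len() {
            return None;
        }
        let ptr = self.arena_top;
        self.arena_top = end;
        Some(ptr)
    }

    /// Mimics `handle_transfer(ptr, len)`: read the trigger bytes in place.
    /// A real handler would decode the trigger; this one just sums the bytes.
    fn handle_transfer(&self, ptr: usize, len: usize) -> u64 {
        self.memory[ptr..ptr + len].iter().map(|&b| u64::from(b)).sum()
    }

    /// Mimics `reset_arena()`: release everything allocated for this trigger.
    fn reset_arena(&mut self) {
        self.arena_top = 0;
    }
}

/// Host side of one trigger: allocate, copy the bytes in, call, reset.
fn run_trigger(guest: &mut MockGuest, trigger: &[u8]) -> Option<u64> {
    let ptr = guest.allocate(trigger.len())?;
    guest.memory[ptr..ptr + trigger.len()].copy_from_slice(trigger);
    let result = guest.handle_transfer(ptr, trigger.len());
    guest.reset_arena();
    Some(result)
}

fn main() {
    let mut guest = MockGuest::new(1024);
    let payload = [1u8; 212]; // same size as the benchmark payload, dummy contents
    for _ in 0..10 {
        let sum = run_trigger(&mut guest, &payload).expect("arena has room after reset");
        assert_eq!(sum, 212);
    }
    println!("processed 10 triggers, arena_top = {}", guest.arena_top);
}
```

In this mock, the reset is what lets a 1,024-byte arena serve ten 212-byte triggers; without it the fifth allocation would fail (5 × 212 = 1,060 bytes).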
Honest caveat: we do not benchmark the AS handler in isolation because invoking it outside graph-node requires reconstructing the entire `asc_abi/` encoder (AscPtr object graph, managed-class headers, type IDs). The right end-to-end comparison is deploying both subgraphs against the same chain head and reading `subgraph_indexing_handler_execution_time` from graph-node's Prometheus metrics. The Rust side of that comparison is covered by the live integration test below.

Testing

Unit tests (14): Entity TLV round-trips for every `Value` variant; each trigger type; BigInt/BigDecimal/String/Bytes/Address primitives.

WASM integration test (`tests/integration/tests/wasm_handler.rs`): loads the compiled Rust ERC20 WASM into wasmtime, serializes a `RustLogTrigger` using the exact production binary format, invokes `handle_transfer(ptr, len)`, and asserts that the resulting `store_set` call carries the expected entity fields (from, to, value, blockNumber, timestamp, transactionHash, id).

Live mainnet test (`scripts/live-test.sh`): deployed the ERC20 subgraph to a running fork of this graph-node, indexed real USDC `Transfer` events from Ethereum mainnet starting at block 24756400, and verified correct GraphQL query results for all entity fields.

Maintenance model

`HostExports` is already language-agnostic. Adding a new host function is:
- […] `HostExports` (shared, as today).
- […] `rust_abi/host.rs`: read `(ptr, len)` args, deserialize, call `HostExports`, write output back. Typically 20–40 lines, no heap manipulation, no `AscPtr<T>` juggling.

The serialization layer (`rust_abi/`) is ~1,450 lines total and touches nothing outside the `runtime/wasm` crate boundary.

Out of scope (follow-ups)

- […] `ToRustBytes`: stub only; requires a per-chain serialization impl
- `graphite-abi` crate: `ValueTag` constants are currently duplicated between graph-node and the SDK; a `no_std` types crate would eliminate drift risk
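
To make the drift risk in the last item concrete, here is a rough sketch of a tag definition that host and SDK could both import. Everything specific in it is an assumption for illustration: the tag values, the three variants shown, and the record layout (1-byte tag, 4-byte little-endian length, payload) are hypothetical, and the real tag table and wire format are whatever `docs/rust-abi-spec.md` defines. The sketch uses `std` for brevity; a `no_std` crate would presumably use `alloc::vec::Vec` instead.

```rust
use std::convert::TryFrom;

/// Hypothetical tag values, defined once so host and guest cannot disagree.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValueTag {
    String = 1,
    Int = 2,
    Bytes = 3,
}

impl TryFrom<u8> for ValueTag {
    type Error = u8;

    /// Reject unknown tags instead of guessing; the raw byte is the error.
    fn try_from(raw: u8) -> Result<Self, u8> {
        match raw {
            1 => Ok(ValueTag::String),
            2 => Ok(ValueTag::Int),
            3 => Ok(ValueTag::Bytes),
            other => Err(other),
        }
    }
}

/// Append one record: 1-byte tag, 4-byte little-endian length, payload.
/// (Assumed layout, not taken from the spec.)
pub fn write_tlv(out: &mut Vec<u8>, tag: ValueTag, payload: &[u8]) {
    let len = u32::try_from(payload.len()).expect("payload fits in u32");
    out.push(tag as u8);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(payload);
}

/// Read one record from the front of `buf`.
/// Returns the tag, the payload, and the unread remainder, or `None` if the
/// tag is unknown or the buffer is truncated.
pub fn read_tlv(buf: &[u8]) -> Option<(ValueTag, &[u8], &[u8])> {
    let (&raw_tag, rest) = buf.split_first()?;
    let tag = ValueTag::try_from(raw_tag).ok()?;
    if rest.len() < 4 {
        return None;
    }
    let (len_bytes, rest) = rest.split_at(4);
    let len = u32::from_le_bytes(<[u8; 4]>::try_from(len_bytes).ok()?) as usize;
    if rest.len() < len {
        return None;
    }
    let (payload, rest) = rest.split_at(len);
    Some((tag, payload, rest))
}

fn main() {
    let mut buf = Vec::new();
    write_tlv(&mut buf, ValueTag::String, b"alice");
    write_tlv(&mut buf, ValueTag::Int, &42i32.to_le_bytes());

    let (tag, payload, rest) = read_tlv(&buf).expect("first record");
    println!("{:?} {:?}", tag, payload);
    let (tag, payload, rest) = read_tlv(rest).expect("second record");
    println!("{:?} {:?} remaining={}", tag, payload, rest.len());
}
```

With a single definition like this, adding a variant on one side without the other becomes a compile error in the shared crate's `match` rather than a silent decode mismatch at runtime.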