[Discussion] Shaping the Future of `try-runtime` and `try-runtime-cli`

I would like to initiate discussion and share some suggestions regarding the future direction of try-runtime and try-runtime-cli.

The objective of the discussion is to help establish goals for try-runtime and the CLI, and ultimately help to determine which features and fixes should be prioritised.

Subtopics

I suggest dividing this discussion into 3 sub-topics:

  1. Explicit product goals of try-runtime and its fit in the broader test tooling ecosystem
  2. Shortcomings of the existing try-runtime functionality
  3. The next evolution of try-runtime

1. Explicit goals & test tooling ecosystem fit

Should try-runtime complement other tools like chopsticks, or be an all-in-one testing tool?

The try-runtime-cli docs currently describe try-runtime as:

Substrate’s ultimate testing framework for power users.

A question I have is: should we continue developing try-runtime as an all-in-one “ultimate” testing framework, or view it as a part of the broader testing ecosystem complemented by other tools like chopsticks? I would appreciate input from @xlc, @kianenigma, and others familiar with both tools about:

  1. Aspects in which chopsticks excels and try-runtime might struggle
  2. Aspects in which try-runtime excels and chopsticks might struggle

Both approaches have merits:

  • Continuing to develop try-runtime as an all-in-one testing tool offers redundancy in the test tooling ecosystem and encourages healthy competition, ultimately leading to better tooling.
  • Clearly defined and segmented responsibilities reduce duplicated work, could lessen confusion for new developers choosing testing tools, and allow each tool to optimise for its strengths.

2. Shortcomings of existing try-runtime functionality

Which improvements to existing functionality should be prioritized?

try-runtime currently has several developer-experience issues:

2a. Long storage download times

Running try-runtime can be time-consuming due to the need to download & construct a test-externalities environment each time it’s run.

I have discussed some possible improvements with @kianenigma:

Lazy download

Persistent data cache

  • Provide try-runtime its own persistent data cache. The data fetching flow could be: 1. attempt to get data from in-memory, 2. attempt to get data from persistent data store, 3. get data from the node
  • This could be useful for developers who want to repeatedly run try-runtime against the same block state (e.g., when developing a runtime upgrade), significantly speeding up the development process
  • Implementation requires careful consideration; developers would likely need a simple way to manage creating/reviewing/removing storage for specific blocks of interest.
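The fetching flow described above (memory first, then persistent store, then node) could be sketched roughly as follows. This is only an illustration under stated assumptions: `StorageSource`, `MockNode`, and `TieredStorage` are hypothetical names that do not exist in try-runtime, and the "persistent" tier is a plain `HashMap` standing in for an on-disk store so that the example stays self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over anything that can return the raw value
/// stored under a key. Not part of try-runtime.
trait StorageSource {
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>>;
}

/// Stand-in for a remote node. It counts requests so the effect of the
/// caches is visible.
struct MockNode {
    state: HashMap<Vec<u8>, Vec<u8>>,
    requests: usize,
}

impl StorageSource for MockNode {
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        self.requests += 1;
        self.state.get(key).cloned()
    }
}

/// Three-tier lookup: in-memory cache, persistent store, then the node.
struct TieredStorage<N: StorageSource> {
    memory: HashMap<Vec<u8>, Vec<u8>>,
    /// Assumption: a real implementation would use a database or files
    /// keyed by block hash; a HashMap is used here for brevity.
    persistent: HashMap<Vec<u8>, Vec<u8>>,
    node: N,
}

impl<N: StorageSource> TieredStorage<N> {
    fn new(node: N) -> Self {
        Self { memory: HashMap::new(), persistent: HashMap::new(), node }
    }

    /// Fetch a key lazily, populating the faster tiers on the way back.
    fn get(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        // 1. In-memory cache.
        if let Some(value) = self.memory.get(key) {
            return Some(value.clone());
        }
        // 2. Persistent store.
        if let Some(value) = self.persistent.get(key).cloned() {
            self.memory.insert(key.to_vec(), value.clone());
            return Some(value);
        }
        // 3. Remote node; only keys that exist are cached.
        let value = self.node.get(key)?;
        self.persistent.insert(key.to_vec(), value.clone());
        self.memory.insert(key.to_vec(), value.clone());
        Some(value)
    }

    /// Simulate a new run of the tool: memory is lost, the persistent
    /// store survives.
    fn restart(&mut self) {
        self.memory.clear();
    }
}
```

With this shape, a second run against the same block state would be served from the persistent tier and would not need to contact the node again for keys it has already seen. How entries are created, reviewed, and removed per block is deliberately left out of the sketch.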

Lazy download seems to be a sensible initial improvement. Once stable, we can consider adding a persistent cache.

2b. Monitoring try_state invariants feels cumbersome

Checking try_state invariants can be somewhat involved; it would be helpful to simplify the process.

@kianenigma and I briefly discussed the idea of allowing a node to run with something like a --try-state-checks-I-KNOW-WHAT-IM-DOING flag that would enable the checks during regular block processing. try-runtime enabled runtimes will be available in S3 shortly, so they could be automatically downloaded by the node and executed alongside the regular runtime without the user needing to follow any complicated steps.

2c. Inherents (fast-forward sub-command)

We currently lack a good pattern to determine which inherents should be included at the beginning of each block when running fast-forward.

This makes the fast-forward command cumbersome and requires a manual code change for each chain it’s run against.

The optimal solution here is unclear and requires further thought and discussion.

2d. Easier usage of try-runtime APIs by other tools

The try-runtime APIs are important, and could provide value to many testing tools, including chopsticks.

Creating a try-runtime enabled runtime is currently a manual process, which makes it a bit cumbersome for other projects to integrate with the API.

The Parity release engineering team is working on automating the publishing of the latest try-runtime enabled runtime, which should help with this.

2e. Multiple sets of docs & hard-to-find quality docs

Currently, if you search for “try runtime docs”, the first result is the substrate.io docs, which are not as informative as the docs at paritytech.github.io. It’s unfortunate that the more comprehensive documentation exists but is harder to find. The SEO on the paritytech.github.io docs is also lacking: the site title and page title don’t even mention docs, so I think most people would completely overlook them.

Hopefully, @kianenigma can find a solution to this issue (which likely exists for other tools in Parity) in his efforts to improve documentation this year.

Please comment if you can think of anything I may have missed here

3. What does the next evolution of try-runtime look like?

What are the highest impact new features that should be prioritized, aligning with the product goals for try-runtime?

3a. Standalone CLI

There are plans to pull the CLI out of substrate into a separate Rust project, making it generally more chain-agnostic and easier to use across the substrate chain ecosystem.

3b. Open RPC after fast-forward

It would be convenient if, after fast-forward is run, the CLI exposed an RPC allowing users to query state in the Polkadot.js apps UI and ensure everything is as expected.

3c. Allow forking chain state and mining new transactions on top of it

Similar to what is already possible with chopsticks, it would be useful to allow forking the chain with a new runtime and opening an RPC to it allowing developers to view the chain’s state, craft and instantly mine transactions, fast-forward blocks, etc., all from the Polkadot.js apps.

Please comment if you have ideas for any other capabilities that could be added to try-runtime

Final thoughts & next steps

Lazy download seems to be the most obvious place to start with improvements: it’s high impact and relatively straightforward.

By the time lazy download is shipped, we’ll hopefully have a rough roadmap laid out, and it’ll be clear what to prioritise next.

4 Likes

Happy to see we finally have some proper discussion post before writing code.

Let me highlight some of the reasons why I built Chopsticks in its current form:

  • Out-of-the-box support for many (if not all) parachains. People can use Chopsticks with their parachain without any extra work, compared to try-runtime-cli, which requires some non-trivial code integration.
  • Multichain & XCM support. I don’t know how to make try-runtime-cli support such a use case. Dependency management would be impossible when dealing with multiple chains.
  • Compatibility. Most of Chopsticks is built with TypeScript, which has a weakly typed runtime. This is super useful because it is trivial to write code that’s compatible with many variants. Feature detection and duck typing require no effort. Chopsticks is able to handle the differences between BABE and Aura, relay chain and parachain, different XCM versions, etc. No change was required to make it support XCM v3. This is just not really possible with Rust without defeating the type system.
  • Runs in the browser. It is not impossible to make try-runtime-cli run in the browser, but I wouldn’t want to give it a try myself.

Chopsticks has also had lazy downloading from day 1 and a persistent data cache from around day 7.

One thing try-runtime-cli can achieve that is not possible for Chopsticks: actually debugging the Rust runtime code with a debugger.

3 Likes

Thanks for opening this important discussion!

For me, try-runtime has always been a convenient yet powerful dev tool for working with (a subset of) the runtime API. This has obvious limits, and hence I wouldn’t aim at pushing it as a fully universal (all-in-one) testing tool. @xlc already raised some valid points that, imho, strengthen the case for a suite of complementary tools. Also, I strongly believe that the objectives of try-runtime should be revisited and clearly defined. This has already been a problem in the past: people had expectations that didn’t fit the current paradigm at all.

Regarding 2c and similar - I’m pretty convinced that if we want to be chain-agnostic, we have to accept that some functionality must be adjusted manually by the end user. The only thing we can do is provide a clear and convenient API with basic examples and defaults.
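As one possible shape for such an API with defaults, here is a rough Rust sketch for the inherents case from 2c. Everything in it is an assumption made for illustration: `Inherent`, `InherentProvider`, `DefaultChain`, and `ParachainLike` are hypothetical names, and the identifiers and payloads are placeholders rather than real Substrate inherent identifiers or encodings.

```rust
/// Simplified stand-in for an inherent: a label plus an opaque payload.
/// Hypothetical type; real inherents are encoded differently.
#[derive(Debug, PartialEq)]
struct Inherent {
    id: &'static str,
    data: Vec<u8>,
}

/// Hypothetical per-chain hook. The default covers the simplest case and
/// chains override it only when they need more.
trait InherentProvider {
    fn inherents(&self, block_number: u32, timestamp_ms: u64) -> Vec<Inherent> {
        // The default ignores the block number and emits a timestamp only.
        let _ = block_number;
        vec![Inherent {
            id: "timestamp",
            data: timestamp_ms.to_le_bytes().to_vec(),
        }]
    }
}

/// A chain that is happy with the defaults needs no code at all.
struct DefaultChain;
impl InherentProvider for DefaultChain {}

/// A chain with an extra requirement extends the default list.
struct ParachainLike;
impl InherentProvider for ParachainLike {
    fn inherents(&self, block_number: u32, timestamp_ms: u64) -> Vec<Inherent> {
        let mut out = DefaultChain.inherents(block_number, timestamp_ms);
        // Placeholder payload; a real chain would build proper data here.
        out.push(Inherent {
            id: "validation-data",
            data: block_number.to_le_bytes().to_vec(),
        });
        out
    }
}
```

The point of the sketch is only the division of labour: the tool ships a sensible default, and the end user overrides a single method where their chain differs.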

As for point 3 - I’ve already had some discussions with Kian about something I like to call a translation gadget. In brief: in its current form, try-runtime works well under the assumption that the upstream runtime and the one used by the tool share a block format, i.e. every tx that is valid in one runtime is also valid in the other. This covers some use cases, but fails for any scenario that e.g. removes a pallet or changes a pallet’s API. In that case, follow-chain/execute-block may fail simply because of decoding/interpreting problems. To handle these cases, I would like to let users plug in a special gadget that takes a block/extrinsic from the live chain as input, converts it freely, and outputs a block/extrinsic appropriate for the local runtime.

This could then be used in several useful ways:

  • translating live traffic in the old format into a form the new runtime version understands
  • filtering out / adding transactions
  • hooking an additional source of traffic to fast-forward - instead of producing a lot of empty blocks, we can produce a lot of blocks with arbitrary content
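
A translation gadget along these lines might be sketched as a small trait. This is a hedged illustration, not an existing try-runtime interface: `TranslationGadget`, `RemoveAndReindex`, and the simplified `Block` and `Extrinsic` types are all hypothetical, and the pallet indices are made up for the example.

```rust
/// Simplified stand-in for an extrinsic: pallet index, call index, args.
#[derive(Clone, Debug, PartialEq)]
struct Extrinsic {
    pallet: u8,
    call: u8,
    args: Vec<u8>,
}

/// Simplified stand-in for a block.
#[derive(Clone, Debug, PartialEq)]
struct Block {
    number: u32,
    extrinsics: Vec<Extrinsic>,
}

/// Hypothetical trait: converts live-chain data into a form the local
/// runtime accepts. Returning `None` drops the extrinsic.
trait TranslationGadget {
    fn translate_extrinsic(&self, xt: Extrinsic) -> Option<Extrinsic>;

    /// Default block translation: translate each extrinsic, keep the rest.
    fn translate_block(&self, block: Block) -> Block {
        Block {
            number: block.number,
            extrinsics: block
                .extrinsics
                .into_iter()
                .filter_map(|xt| self.translate_extrinsic(xt))
                .collect(),
        }
    }
}

/// Example scenario (made up): the local runtime removed pallet 7 and
/// moved pallet 9 to index 8.
struct RemoveAndReindex;

impl TranslationGadget for RemoveAndReindex {
    fn translate_extrinsic(&self, xt: Extrinsic) -> Option<Extrinsic> {
        match xt.pallet {
            7 => None,                                // pallet removed: filter out
            9 => Some(Extrinsic { pallet: 8, ..xt }), // pallet moved: reindex
            _ => Some(xt),                            // everything else unchanged
        }
    }
}
```

Filtering, rewriting, and injecting traffic all fit the same interface; injecting extra transactions would mean overriding `translate_block` instead of relying on the default.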
3 Likes

As the original author of “Ultimate Testing Tool” I must confess that I would like to retract that statement; indeed, what I have learned over the months is that try-runtime-cli is sub-optimal in many aspects. That said, the fact that it resembles the normal substrate client is also useful for debuggability purposes.

All in all, the part of this discussion that I want to highlight is that #[cfg(feature = "try-runtime")] is much more valuable than try-runtime-cli. The full API is listed here.

I think the idea of a FRAME-based runtime being able to expose additional runtime APIs that let its internals be deeply tested is awesome, and gives me peace of mind. I would love to see more clients (including the substrate client) integrate these APIs. Moreover, we should make more pallets expose these APIs.

I thought I had opened an issue for this, but someday I would love to see each Polkadot release contain a try-runtime-enabled wasm blob, and tools like Chopsticks and Fudge, alongside try-runtime-cli, capable of calling it.
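
To illustrate the kind of invariant such an API can check, here is a plain-Rust analogy. It is not FRAME code: the `Balances` struct and its methods are hypothetical, and in a real pallet the check would live in the `try_state` hook behind the try-runtime feature gate rather than being an ordinary method.

```rust
use std::collections::BTreeMap;

/// Toy model of a balances pallet's storage. Hypothetical, for
/// illustration only.
struct Balances {
    accounts: BTreeMap<u32, u128>,
    total_issuance: u128,
}

impl Balances {
    /// Move funds between accounts; total issuance must not change.
    fn transfer(&mut self, from: u32, to: u32, amount: u128) -> Result<(), &'static str> {
        let from_balance = self.accounts.get(&from).copied().unwrap_or(0);
        if from_balance < amount {
            return Err("insufficient balance");
        }
        self.accounts.insert(from, from_balance - amount);
        *self.accounts.entry(to).or_insert(0) += amount;
        Ok(())
    }

    /// Invariant check in the spirit of a try_state hook: the sum of all
    /// balances must equal the recorded total issuance.
    /// In a real pallet this would be compiled only with the try-runtime
    /// feature enabled; the gate is omitted so the sketch runs as-is.
    fn try_state(&self) -> Result<(), &'static str> {
        let sum: u128 = self.accounts.values().sum();
        if sum != self.total_issuance {
            return Err("sum of balances does not match total issuance");
        }
        Ok(())
    }
}
```

A client that can call such a check after every block, or after a migration, gets an early signal when storage has drifted from what the pallet expects.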

Yes, I would really like to see a road-map (or feature-map; it does not need to be timed). I think most of the questions should be resolved once some high-level goals are defined for the CLI. Then, once these goals are achieved, it can be optimized.
Currently it feels to me like we are optimizing something without knowing what to optimize for, since it does not have stated goals.
For example the lazy downloads are nice, but it does not matter to me as long as the CLI is not reliably preventing us from breaking DotSama by a runtime upgrade.

Ideas 3b and 3c are already implemented in Chopsticks and are probably much easier to implement over there. Maybe we should help there instead for these things.

Another runtime API that would be helpful is “Runtime API to dispatch call and return events” (Issue #13165, paritytech/substrate).

Then we can simulate calls from any origin without cumbersome setup.

For example the lazy downloads are nice, but it does not matter to me as long as the CLI is not reliably preventing us from breaking DotSama by a runtime upgrade.

Are you saying there are some classes of bug the on_runtime_upgrade subcommand is not very good at detecting, or just that we need to finish implementing try-state hooks across all pallets?

Yes, but not so much the CLI itself, but more the process of how we use it. For example it was just discovered last week that multiple migrations are missing from the relay runtimes. This is pretty horrible and embarrassing.

I think we first need to find a bullet-proof path from pallet development to runtime upgrade such that storage cannot be corrupted anymore. Somewhere along that path there will be a spot for try-runtime-cli. Just focusing on the CLI itself won’t solve this.

Some issues for that include:

1 Like