New JSON-RPC API mega Q&A

Polkadot consists of a peer-to-peer network of so-called clients that talk to each other. In order for the owner of a node to know what happens on Polkadot, they must communicate with their node. This is done with what we call the JSON-RPC API. For example, PolkadotJS talks to a node using the JSON-RPC API.

Unfortunately, this JSON-RPC API has many flaws (more details below), and since the end of 2021 we have been trying to fix these flaws by creating a new JSON-RPC API. The most important part of this API, the chainHead_-prefixed functions, is implemented in Substrate (and thus in the official Polkadot client) and in smoldot.

Website: https://paritytech.github.io/json-rpc-interface-spec/
Repository: https://github.com/paritytech/json-rpc-interface-spec/

This post aims to be a reminder that this is currently in progress, and also to share information about this API. I have done this in the form of a Q&A, as I think that this is a good way to organize information.

If you are a JSON-RPC-client-side developer, for example if you are creating a UI, I strongly invite you to read this Q&A and the details of this new API (link to the website above), and maybe to try using the new JSON-RPC API.

Q&A

General questions

Does this thing concern me?

If you are maintaining code that manually calls JSON-RPC functions, then you are concerned and you should read this information.

If you are a UI/frontend developer or parachain developer, then you are only indirectly concerned and you probably have to do nothing. However, I would still encourage you to get interested in this topic.

Is the new JSON-RPC API stable?

The new JSON-RPC API is not stable, and the details might still change. However, it is unlikely to change in major ways, and it should be possible to look at the list of commits of https://github.com/paritytech/json-rpc-interface-spec/ to follow the changes that happen.

Is this new JSON-RPC API usable?

More or less! It is implemented in Substrate and smoldot, and maybe in other implementations as well. At the time of publication of this text, however, Substrate hasn’t backported some recent changes to the JSON-RPC specification and doesn’t implement the archive API. Similarly, smoldot doesn’t implement everything.

Since the code is relatively new, there might be accidental mismatches between the specification and what Substrate and smoldot actually do.
For this reason, it would be appreciated if those who want to try the new JSON-RPC API wrote their code according to the specification rather than by reverse-engineering the data returned by Substrate or smoldot, as doing so would increase the chances of finding a mismatch.
If you notice such a mismatch, please open an issue in the Substrate or smoldot repository.

Since the functions are still marked as unstable, should JSON-RPC client developers wait for them to be stable before upgrading?

The functions are marked as unstable because they are still in the phase where JSON-RPC-client-side developers should try using them and give feedback. I have personally tried to communicate this over the past year, but I haven’t received much help in this communication, and no feedback has yet been received from any client-side developer.

There is a bit of a chicken-and-egg problem, which is that JSON-RPC client developers wait for things to be stable before looking into them, while JSON-RPC spec and server developers would appreciate feedback from client developers before stabilizing the API and code.

This is why this forum post tries to raise awareness about this new API.

If no feedback were to be received, the plan would be to stabilize all the unstable JSON-RPC functions as-is despite their possible (unknown) flaws, and gather feedback for a version 2 instead.

Why is the specification repository owned by the paritytech GitHub organization? Who is responsible for this JSON-RPC API?

The design of this new JSON-RPC API is essentially a one-man job (by me, the author of this post), with the support and benediction of the core engineering of Parity. It has received internal feedback and is approved by the Parity core engineering. This API was later implemented in Substrate by @lexnv, another Parity developer. I left Parity at the end of 2021 but I am still working on this aspect (as you can see by the fact that this post exists) and I am still collaborating with Parity.

The JSON-RPC API is not technically part of the Polkadot specification, as there is no necessity to implement it in order to connect to Polkadot. A Polkadot client could for example implement a completely different JSON-RPC API if they wanted, or could directly integrate a UI without the need for a JSON-RPC layer.
For this reason, there is no straightforward answer as to where this specification belongs and who owns it. That’s just a consequence of decentralization.

The specification was created in the paritytech GitHub organization for pragmatic reasons, but this shouldn’t be read as Parity “owning” this specification. I would personally like it to be moved to https://github.com/polkadot-fellows, but it is also unclear who, in polkadot-fellows, would have the rights to merge pull requests and in general accept changes.

When will the legacy API be deprecated and removed?

My answer to this would be “as soon as realistically possible”.

Due to the law of maximum laziness, JSON-RPC-client-side developers will likely never update their code unless they are given the fear that their code will break if they don’t. If the Substrate/Polkadot client waits for all JSON-RPC clients to have updated before deprecating the legacy functions, then this will basically never happen. This paragraph might sound snarky and sarcastic, but pragmatically speaking it is true.

My opinion is that, as soon as possible, the node should start printing a warning in the logs whenever a legacy JSON-RPC function is called, which is something that smoldot already does. Once the new JSON-RPC API is fully stabilized, the legacy JSON-RPC functions should be throttled by adding some kind of sleep(5 seconds) at the beginning of them. A few months later, this should move to something like 30 seconds. Finally, the functions should be removed.
Given that I am no longer at Parity, I can only suggest this path and not actually enforce it.

Why do we need a new JSON-RPC API?

The legacy/current JSON-RPC API suffers from several issues:

  • Many of the functions assume that the implementation is a full node, and light clients weren’t taken into account when the legacy JSON-RPC API was designed. A good example of this is the state_getKeysPaged function. On a full node, this function is implemented by simply reading from the database and returning the result. On a light client, this function is implemented by querying all the keys of the given prefix from the network, then filtering out the keys that aren’t in the requested page. If a JSON-RPC client calls this function multiple times, like you are supposed to do, the light client downloads all the keys multiple times, which is incredibly wasteful. Several other functions such as state_queryStorage simply can’t be implemented by a light client at all.

  • The parameters and return value of some functions depend on the details of the runtime of the chain. For example, the payment_queryInfo function (which is used to determine the fees of a transaction) is implemented by calling a function from the runtime then decoding what the function returns and returning a JSON object containing the various fields. If the runtime was modified and this runtime function disappeared or returned something else, the JSON-RPC function implementation would break. Because of this, the Substrate/Polkadot client and the runtime must always use the same version, which defeats the point of having an upgradable runtime.

  • Several legacy JSON-RPC functions aren’t anti-DoS-friendly. Writing a JSON-RPC server that resists DoS attacks can be tricky, and, because we make use of the so-called subscriptions, cannot be done unless the JSON-RPC functions are designed appropriately. Unfortunately this isn’t the case, and the Substrate JSON-RPC server has to rely on hacks such as disconnecting clients that seem too slow, which leads to bad user experience. You can read more here: https://paritytech.github.io/json-rpc-interface-spec/dos-attacks-resilience.html.

  • Most functions are badly documented, in particular when it comes to corner cases. For example, when asking for some information about a specific block, and the server doesn’t know about this block, some functions return null while some others return an error.

  • Many of the functions aren’t load-balancer-friendly. Due to the logic of some of the JSON-RPC functions, load balancers currently have to rely on “pinning” clients to specific servers. In other words, once a client has connected to the load balancer, all the requests that it sends have to be directed to the same server, otherwise the information that the client receives could be contradictory. Due to this pinning system, it is not possible to auto-downscale JSON-RPC servers without disconnecting clients. The new JSON-RPC API has been designed so that a load balancer can move a client from one server to another and thus shut down servers that it doesn’t need anymore.

When Polkadot/Substrate was first started, the set of JSON-RPC functions was simply copy-pasted from Ethereum (given that their implementations already existed in the Parity Ethereum client), then expanded in a cowboy-y way without being given proper thought.
The new JSON-RPC API aims at cleaning up this aspect.

The new JSON-RPC API is very strict. Can I no longer add custom JSON-RPC functions to my node?

From what I’ve noticed in the wild, custom JSON-RPC functions can always be put in one of two categories:

  • Functions that are custom to the logic of the runtime (for example interacting with contracts on a contracts chain). I would encourage you to replace these JSON-RPC functions with chainHead_unstable_call or archive_unstable_call (or state_call in the legacy API).
  • Functions that are used internally for debugging. It is completely okay to leave these functions on the node, provided that they are only used internally by the development team and not part of a publicly-available UI. I would however encourage you to use the new naming scheme and put unstable in their name in order to convey that they aren’t part of the API.

If you have a custom JSON-RPC function that doesn’t belong to one of these two categories, please raise an issue with your use case in the spec repo: https://github.com/paritytech/json-rpc-interface-spec/.

While nothing forces you to collaborate with this specification, keep in mind that creating this new JSON-RPC API isn’t so much about solving a technical problem as it is about solving a social problem. Technically you can add any function to the server, but doing so is pointless if there’s no client that calls it. And clients can’t call a function if not all servers implement it or if they implement it in different ways.
If you distribute a server with custom functions, and distribute a client that calls these custom functions, you put onto the end user the burden of figuring out whether their client and server are compatible with each other in what might end up being what is commonly named “a clusterfuck”. Creating a standard for the JSON-RPC API aims at avoiding this problem.

Should I learn the details of this JSON-RPC API?

If you are a UI developer, then you are not expected to directly use the JSON-RPC API. Instead, you are expected to use an intermediary-level library between your UI and the JSON-RPC server. This is already the case right now, as most UIs use PolkadotJS rather than manually send JSON-RPC requests.
This intermediary library is responsible for performing all the complicated aspects such as retrieving the metadata or watching storage items.

Contrary to the legacy JSON-RPC API, the new JSON-RPC API has been designed to be implementable by the node in an unopinionated way. When a request asks the node to do X, the node does precisely X. There is for instance no necessity for the server to cache information that it thinks might be requested later.
This means that all the JSON-RPC functions of the legacy API that were designed to be easy to use in a certain opinionated way have now disappeared. Instead, all these opinionated decisions are now found on the client side.
It is not possible to write a universal intermediary library that fits every use case (unless you turn that library into a huge overcomplicated monster, which isn’t desirable either). For example, if you are writing a UI that simply follows some items in the storage, you might use a different intermediary library than if you query historical data or if you are watching the status of your own nodes. For this reason, there is room for several different client-side libraries.

If you are a node operator, you can use the JSON-RPC API in order to investigate and control your node. While some aspects of the API can be a bit complicated (such as following blocks, retrieving storage items, etc.), none of these complicated aspects concern the use case of a node operator. Functions such as sudo_sessionKeys_unstable_generate or transaction_unstable_submitAndWatch are rather simple to use.

Does PolkadotJS support the new JSON-RPC API?

Unfortunately no.

Given that PolkadotJS is built closely on top of the legacy JSON-RPC API, it is unlikely to ever fully transition to the new JSON-RPC API. For this reason, I would generally recommend against using PolkadotJS as an intermediary library when creating an application.

New light-client-friendly intermediary libraries (such as capi) need to be developed, and this Q&A is also targeted at potential library developers.

Unfortunately, and much to my disappointment, as far as I know no high-level library has yet committed to fully embracing the new JSON-RPC API.

I’m reading the specification and I don’t understand something/I’ve found something ambiguous

Please open an issue in the repository.

The specification is meant to be easy to understand, clear, and precise. If it is not the case, then it must be fixed.

Questions about the details of the new API

Can you explain the naming scheme of the new JSON-RPC functions?

All the new JSON-RPC functions are named like this: namespace_function.
For example, in chainHead_unstable_follow, the namespace is chainHead_unstable, and the function name is follow.

The namespaces currently in the API are: archive_unstable, chainHead_unstable, chainSpec_unstable, rpc, sudo_unstable, sudo_sessionKeys_unstable, and transaction_unstable.
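
To make the naming scheme concrete, here is a minimal Python sketch that splits a function name into its namespace and its function name. The helper name split_method is made up for this example, and it assumes that the function part never contains an underscore, which holds for every name mentioned in this post.

```python
# Namespaces listed in this post at the time of writing.
KNOWN_NAMESPACES = {
    "archive_unstable", "chainHead_unstable", "chainSpec_unstable", "rpc",
    "sudo_unstable", "sudo_sessionKeys_unstable", "transaction_unstable",
}


def split_method(name):
    """Split 'namespace_function' at the last underscore (hypothetical helper)."""
    namespace, _, function = name.rpartition("_")
    if not namespace or not function:
        raise ValueError("not a namespaced function name: %r" % name)
    return namespace, function


print(split_method("chainHead_unstable_follow"))
print(split_method("sudo_sessionKeys_unstable_generate"))
```

Note that the version (unstable, v1, ...) ends up in the namespace part, which matches the idea that chainHead_v1 and chainHead_v2 would be two separate namespaces.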

The word unstable represents the version of the namespace. unstable means that the functions’ implementation might change at any time and thus can’t be relied upon. All the functions of the chainHead_unstable namespace for example will be renamed to chainHead_v1 once we’re happy with their design. After that, their API can’t be modified ever again.

All the functions within the same namespace can be grouped together and are isolatable as a group. For example, all the functions of chainHead_unstable together serve a specific purpose (following the head of the chain) that doesn’t require calling any function from a different namespace.
This is the reason why, for example, there exists chainSpec_unstable_genesisHash but also chainHead_unstable_genesisHash and archive_unstable_genesisHash that all do the same thing.

Note that the namespace is chainHead_unstable and not just chainHead. This means that for example chainHead_v1 and chainHead_v2 (assuming there exists a v2 some day) don’t interact with each other. A JSON-RPC client should either use only chainHead_v1 or only chainHead_v2 depending on what the server supports, and not both at the same time.

What does it mean when a JSON-RPC function is prefixed with sudo_?

The namespaces that start with sudo indicate that they contain functions that operate on one specific node and/or are about administering a node.

All JSON-RPC functions of the new API have been designed so that they can be implemented by a load balancer that distributes requests between various load-balanced nodes, except for the ones that start with sudo_.
For example, sudo_unstable_version returns the version of the node that is being queried. From the point of view of a load balancer, this creates an ambiguity: should it return its own version? Should it return the version of one of the nodes? In that case, do all the nodes have to use the same version? What if they don’t? In order to solve this ambiguity, the load balancer simply shouldn’t expose the sudo_unstable_version function.

Furthermore, functions such as sudo_sessionKeys_unstable_generate modify the node and thus shouldn’t be callable by anyone but the owner of that node. For this reason, they also shouldn’t be exposed by proxies and load balancers.

Which functions are safe and which are unsafe?

The legacy JSON-RPC API has a concept of “safe function” and “unsafe function”. Unsafe functions are the ones that shouldn’t be available to the public but only to the owner of a node.
Unfortunately, it is currently not clearly documented which functions are safe and which are unsafe.

In the new JSON-RPC API, all the functions that start with sudo_ are unsafe. All the others are safe.
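
Because the rule is purely based on the prefix, a proxy or load balancer could apply it mechanically. The following Python sketch shows the idea; the function names is_unsafe and publicly_exposable are invented for this example and are not part of any existing server.

```python
def is_unsafe(method):
    """In the new API, the functions that start with sudo_ are the unsafe ones."""
    return method.startswith("sudo_")


def publicly_exposable(methods):
    """Keep only the functions that a public-facing server may expose."""
    return [m for m in methods if not is_unsafe(m)]


all_methods = [
    "chainHead_unstable_follow",
    "sudo_unstable_version",
    "sudo_sessionKeys_unstable_generate",
    "transaction_unstable_submitAndWatch",
]
print(publicly_exposable(all_methods))
```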

Do all JSON-RPC servers support the entire JSON-RPC API?

No.

Some servers might support only some functions. However, the functions of a specific namespace_* must either be all supported or not be supported at all. For example, it is forbidden for a server to support the chainHead_v1_follow function but not the chainHead_v1_header function. This is the reason why functions have been split into namespaces.

In practice:

  • Light clients will not support the archive_ functions, as they can’t be implemented reliably.
  • Public-facing JSON-RPC servers and/or servers behind a load balancer will not support the sudo_ functions, as these functions act upon specific servers.
  • While no such thing exists yet, it is possible to imagine an “archive-only” server that implements only the archive_ functions but not the chainHead_ functions by reading from a database without being connected to the live chain.

At initialization, JSON-RPC clients should call the rpc_methods function in order to determine the list of JSON-RPC functions that are supported. All servers must support the rpc_methods function. In practice, JSON-RPC clients should most likely determine whether they are compatible or not and fail to initialize if they aren’t, rather than try to adjust to every possible server type.
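
As a rough sketch of this initialization check, the Python code below compares the list of function names reported by the server with the functions that a client relies on. How the list is extracted from the rpc_methods response is left out, since the exact response format should be taken from the specification. The REQUIRED list and both helper names are hypothetical.

```python
# Hypothetical list of the functions that this particular client relies on.
REQUIRED = [
    "chainHead_unstable_follow",
    "chainHead_unstable_header",
    "chainHead_unstable_unpin",
]


def missing_functions(supported, required):
    """Return the required functions that the server does not advertise."""
    available = set(supported)
    return [name for name in required if name not in available]


def check_compatible(supported):
    """Fail to initialize rather than try to adapt to every server type."""
    missing = missing_functions(supported, REQUIRED)
    if missing:
        raise RuntimeError("incompatible server, missing: " + ", ".join(missing))
```

A client would call check_compatible once, right after connecting, with the names obtained from rpc_methods.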

How do I do the equivalent of chain_subscribeNewHeads, chain_subscribeAllHeads, chain_subscribeFinalizedHeads, or state_subscribeRuntimeVersion in the new API?

These four functions have been grouped into one: chainHead_unstable_follow.

The chainHead_unstable_follow function starts a subscription, similarly to the other subscription-based JSON-RPC functions of the legacy API. Instead of yielding just one piece of information (e.g. just the list of new blocks), this subscription yields everything important about the head of the chain together.

Rather than go into details here, I invite you to look at the documentation: https://paritytech.github.io/json-rpc-interface-spec/api/chainHead_unstable_follow.html#notifications-format.

If you are interested only in specific information, then simply discard what you are not interested in.
For example, if you are interested only in finalized blocks, then what you need is the finalizedBlockHash field in the initialized event and the finalized events. Anything else can simply be ignored.
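
Here is a minimal Python sketch of that filtering, operating on already-decoded notification objects. The finalizedBlockHash field of the initialized event is the one mentioned above. The name finalizedBlockHashes for the list carried by finalized events is an assumption based on my reading of the specification, so please double-check it against the documentation.

```python
def finalized_hashes(events):
    """Yield finalized block hashes from a stream of follow notifications."""
    for event in events:
        kind = event.get("event")
        if kind == "initialized":
            yield event["finalizedBlockHash"]
        elif kind == "finalized":
            # Assumed field name: a list of the newly finalized block hashes.
            for block_hash in event["finalizedBlockHashes"]:
                yield block_hash
        # Everything else (new blocks, best block changes, ...) is ignored.


sample = [
    {"event": "initialized", "finalizedBlockHash": "0x01"},
    {"event": "newBlock", "blockHash": "0x02", "parentBlockHash": "0x01"},
    {"event": "bestBlockChanged", "bestBlockHash": "0x02"},
    {"event": "finalized", "finalizedBlockHashes": ["0x02"]},
]
print(list(finalized_hashes(sample)))
```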

By yielding everything together, chainHead_unstable_follow makes it possible for the JSON-RPC client to track, if desired, which runtime version is associated with which block or which block might potentially be finalized. This is something that isn’t really possible to do properly with the legacy API.

How do I get the current best block, the current finalized block, or the current runtime version in the new API?

This can be done with chainHead_unstable_follow.
If you want to know the current finalized block: subscribe, then wait for the first initialized event, then unsubscribe (using chainHead_unstable_unfollow).
If you want to know the current best block: subscribe, then wait for the first bestBlockChanged event, then unsubscribe.
If you want to know the runtime version: subscribe with true as parameter, then wait for the first initialized event, then unsubscribe.
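
On the wire, the subscribe-then-unsubscribe sequence could look like the Python sketch below, which only builds the JSON-RPC requests and leaves the transport (WebSocket or other) out. The subscription identifier "sub-1" is a placeholder for whatever value the server returns, and I am assuming that chainHead_unstable_unfollow takes that identifier as its only parameter.

```python
import json


def request(request_id, method, params):
    """Serialize a JSON-RPC 2.0 request."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


# 1. Subscribe. Passing true asks for runtime information as well.
follow = request(1, "chainHead_unstable_follow", [True])

# 2. Wait for the first "initialized" (or "bestBlockChanged") notification.

# 3. Unsubscribe, using the identifier returned in the response to step 1.
unfollow = request(2, "chainHead_unstable_unfollow", ["sub-1"])

print(follow)
print(unfollow)
```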

There is no direct equivalent to the legacy chain_getBlockHash(null), chain_getFinalizedHead, and state_getRuntimeVersion functions.
It would be correct to point out that if all you need is knowing the current best or finalized block or the runtime version, then subscribing and unsubscribing is less efficient compared to calling a JSON-RPC function that simply returns this information. This would be a problem if a client needs to repeatedly know the best or finalized block or the runtime version, however in that case they should instead simply stay subscribed.

What is this stop event that chainHead_unstable_follow can generate?

If a subscription to chainHead_unstable_follow sends a notification that contains {"event":"stop"}, it indicates that the subscription can no longer continue. The subscription is now dead and the JSON-RPC client must resubscribe.

This event is generated by the server in a variety of situations, such as a sudden influx of blocks on the node (which can happen if you were disconnected from the Internet for a long time then reconnect), the syncing subsystem of the node crashing and being restarted, or a load balancer killing all active subscriptions before shutting down one of the load-balanced nodes.

A subscription to chainHead_unstable_follow always generates a consistent view of the chain. This means that JSON-RPC clients can (if desired) build a tree of the blocks that the subscription yields. This stop event indicates that this consistent view can’t be guaranteed anymore by the server, and thus the client must resubscribe and re-build this view from scratch.
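
The following Python sketch illustrates what building such a tree and throwing it away on stop could look like on the client side. The class ChainView is invented for this example, and the field names blockHash and parentBlockHash of the newBlock event are assumptions that should be checked against the specification.

```python
class ChainView:
    """Client-side tree of the blocks reported by one follow subscription."""

    def __init__(self):
        self.parent_of = {}  # block hash -> parent block hash (None for the root)
        self.needs_resubscribe = False

    def on_event(self, event):
        kind = event["event"]
        if kind == "initialized":
            # A fresh subscription starts from the current finalized block.
            self.parent_of = {event["finalizedBlockHash"]: None}
            self.needs_resubscribe = False
        elif kind == "newBlock":
            self.parent_of[event["blockHash"]] = event["parentBlockHash"]
        elif kind == "stop":
            # The consistent view is lost: drop everything and start over.
            self.parent_of = {}
            self.needs_resubscribe = True


view = ChainView()
view.on_event({"event": "initialized", "finalizedBlockHash": "0x01"})
view.on_event({"event": "newBlock", "blockHash": "0x02", "parentBlockHash": "0x01"})
view.on_event({"event": "stop"})
print(view.needs_resubscribe, view.parent_of)
```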

Can you explain the parameter that is passed to chainHead_unstable_follow?

The chainHead_unstable_follow function accepts a boolean parameter indicating whether information about the runtime should be provided in the notifications in addition to the blocks.

If false is provided, then it is not possible to call chainHead_unstable_call with that subscription, and no information about the runtime is provided in the notifications.

From a purely logical point of view, passing true only gives advantages.
On light clients, however, the initialization of the subscription will take up to a few seconds if true is passed but only up to a few milliseconds if false is passed. This is the case because light clients need to download additional data in order to provide information about the runtime.

For the best user experience, a UI can subscribe twice: once with false and once with true. Once the subscription with false is initialized (which takes at most a few milliseconds) it can start displaying some information. Then, once the subscription with true is initialized (at most a few seconds later), it can display the rest and unsubscribe from the first one.
While this is complicated to implement, UIs are expected to use some kind of library to communicate with a node rather than perform JSON-RPC calls manually.

What is the difference between the archive_-prefixed JSON-RPC functions and the chainHead_-prefixed JSON-RPC functions? Can you explain the blocks pinning system of chainHead_unstable_follow?

Every function that allows querying information about a certain block exists in two versions: one prefixed with archive_ and one prefixed with chainHead_. For example, to query the header of a block you can use either archive_unstable_header or chainHead_unstable_header.

Which one to use depends on the specific use case.

If you want to query information about blocks that are near the head of the chain, then use the chainHead_-prefixed function. In order to use a chainHead_-prefixed function, you must first be subscribed through chainHead_unstable_follow. Only blocks that have been reported through notifications can be queried using one of the other chainHead_ functions. When you call the chainHead_-prefixed function, you must also pass as parameter the identifier of the follow subscription (the value that chainHead_unstable_follow has returned).

In other words, the chainHead_ functions can be seen as an extension to chainHead_unstable_follow. When a block is reported through a follow subscription, the client can then query what it wants from this block.

Once a JSON-RPC client no longer needs to query anything about a block that was reported through chainHead_unstable_follow, it must unpin the block by calling chainHead_unstable_unpin. This indicates to the server that it can liberate the resources associated with that block.
Failing to unpin blocks can lead to the follow subscription terminating itself, in which case it must be reopened.

This API design, while a bit complicated, makes it possible for the JSON-RPC server to keep in memory the blocks that haven’t been unpinned yet, and throw away entirely the blocks that have been unpinned. It removes the necessity for the server to try to magically guess which blocks to keep in its cache.
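
On the client side, the bookkeeping could be sketched as follows in Python. The class PinnedBlocks is hypothetical, and the parameter order of chainHead_unstable_unpin (subscription identifier first, then block hash) is an assumption to verify against the specification.

```python
import json


class PinnedBlocks:
    """Tracks which blocks reported by a follow subscription are still pinned."""

    def __init__(self, follow_subscription):
        self.follow_subscription = follow_subscription
        self.pinned = set()
        self.next_id = 1

    def on_block_reported(self, block_hash):
        # Every block reported through a notification starts out pinned.
        self.pinned.add(block_hash)

    def unpin(self, block_hash):
        """Forget the block and return the JSON-RPC request to send."""
        self.pinned.remove(block_hash)  # KeyError if the block isn't pinned
        request = {
            "jsonrpc": "2.0",
            "id": self.next_id,
            "method": "chainHead_unstable_unpin",
            # Assumed parameter order: subscription identifier, then hash.
            "params": [self.follow_subscription, block_hash],
        }
        self.next_id += 1
        return json.dumps(request)
```

A library would typically call unpin as soon as the last consumer of a block is done with it, so that the set of pinned blocks stays small.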

If instead you want to query information about any block, even for example blocks from years ago, then you must use the archive_-prefixed functions.
The archive_-prefixed functions are slower than their chainHead_ counterparts, as they most likely have to load information from the disk. Furthermore, and importantly, they are not available on light clients.

How do I use the archive_-prefixed JSON-RPC functions?

The archive_-prefixed functions allow querying information about a specific block.

The node has a “current finalized block height” which can be retrieved by calling archive_unstable_finalizedHeight. Any block whose height is lower than or equal to the number returned by this function can be queried in an idempotent way.

Any block whose height is higher than this value, in other words a block that hasn’t been finalized yet, can maybe be queried. Querying works but might unexpectedly stop working as the block might disappear from the node’s storage.

Keep in mind that archive_-prefixed functions aren’t meant to be used for querying blocks that are near the head of the chain. It is still allowed to query recent blocks, but this is expected to be useful mostly for manual debugging and in situations where finality is stuck rather than under normal operation. Please use chainHead_-prefixed functions if you are interested in the head of the chain.

Note that archive_unstable_finalizedHeight isn’t a subscription but a simple function that returns a value. A JSON-RPC client is expected to call this function once at initialization and/or every 5 minutes or so, rather than continuously. Again, the use case of the archive_-prefixed functions is to query the archive of the chain rather than anything recent.


If archive_ isn’t always available, how do I get the hash of the genesis block?

The hash of the genesis block is a rather important piece of information, as it needs to be included when building a transaction.

In the legacy API, it can be retrieved using chain_getBlockHash(0).
The equivalent of chain_getBlockHash in the new API is archive_unstable_hashByHeight. However, given that the archive_-prefixed functions aren’t always available, you are encouraged to not use archive_unstable_hashByHeight but rather one of archive_unstable_genesisHash, chainHead_unstable_genesisHash, or chainSpec_unstable_genesisHash depending on which namespaces your UI depends on. For example, if you use chainHead_*-prefixed functions in your application, then use chainHead_unstable_genesisHash.
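
In code, this choice could be expressed as in the Python sketch below. The helper pick_genesis_hash_function and the preference order of the list are made up for this example.

```python
# The three equivalent functions named above, in an arbitrary preference order.
GENESIS_HASH_FUNCTIONS = [
    "chainHead_unstable_genesisHash",
    "chainSpec_unstable_genesisHash",
    "archive_unstable_genesisHash",
]


def pick_genesis_hash_function(used_namespaces):
    """Pick the genesisHash function from a namespace the application already uses."""
    for name in GENESIS_HASH_FUNCTIONS:
        namespace = name.rpartition("_")[0]
        if namespace in used_namespaces:
            return name
    raise LookupError("none of the used namespaces provides genesisHash")


print(pick_genesis_hash_function({"chainHead_unstable", "transaction_unstable"}))
```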

How do I access the chain’s storage with the new API?

This is done using either the archive_unstable_storage or the chainHead_unstable_storage function, depending on whether the block is recent.

See the documentation here:

Contrary to the legacy API, querying the storage in the new API creates a subscription which later returns the storage values through notifications.

If a waiting-for-continue event is generated, that means that the server has sent a lot of data and is waiting for the client to have received it before sending more. The client can indicate this by calling archive_unstable_storageContinue or chainHead_unstable_storageContinue in response to this event.
This mechanism is similar to what the legacy state_getKeysPaged function provides and is used to avoid head-of-line blocking issues and to avoid transferring a lot of data when the client is no longer interested in it.
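
A client-side handler for this flow control could be sketched like this in Python, here for the chainHead_ variant. The function handle_storage_event is hypothetical. It only decides which JSON-RPC function to call next and leaves sending the request to the caller.

```python
def handle_storage_event(event, items, still_interested):
    """Process one storage notification.

    Returns the name of the JSON-RPC function to call in response,
    or None if nothing needs to be sent.
    """
    kind = event["event"]
    if kind == "item":
        items.append(event)
        return None
    if kind == "waiting-for-continue":
        if still_interested:
            return "chainHead_unstable_storageContinue"
        # No need to transfer the rest of the data.
        return "chainHead_unstable_stopStorage"
    return None


received = []
print(handle_storage_event({"event": "item", "key": "0x01"}, received, True))
print(handle_storage_event({"event": "waiting-for-continue"}, received, True))
```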

Why are chainHead_unstable_storage, chainHead_unstable_call, etc. subscriptions? Why is it so complicated?

Retrieving information about the blocks at the head of the chain is done by starting a subscription then waiting for notifications.

The reason for this design is so that the operation can be cancelled by the JSON-RPC client. For example, you can call chainHead_unstable_stopStorage to interrupt a chainHead_unstable_storage subscription before the operation has finished.

While on full nodes accessing a block’s information is a straightforward database read, on light clients it is instead a networking download. Interrupting an operation when it is no longer needed makes it possible to save resources.

As already explained, UIs are expected to use some kind of library to communicate with a node, and this library handles this complexity for them.

How can I watch when the value of a storage item changes?

There is no equivalent in the new API to the legacy state_subscribeStorage JSON-RPC function.
Instead, the client side must read from the storage at every block and manually determine which items have been modified. While this method seems inefficient, it is how state_subscribeStorage is implemented in Substrate. Doing this on the client side rather than the server side just moves code that already exists somewhere else in the stack.

While the logic of reading from the storage at every block is a bit complicated to implement, UIs do not need to do so and are instead meant to use a library that does it for them.
Depending on exactly what is needed, the implementation strategy might be different. For example, if a storage item changes from A to B in a block then back to A in the next block, some UIs will prefer detecting the change while some others only care about the latest value. The legacy state_subscribeStorage JSON-RPC function is in an ambiguous in-between regarding this situation: it watches every block in order to detect modifications, but doesn’t actually guarantee that behaviour.
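
To illustrate the "detect every change" strategy, here is a small Python sketch. It assumes that the library has already read the storage item at each block of a chain of blocks, in order, and only shows the comparison step. The function name changes is made up.

```python
def changes(values_per_block):
    """Yield (block_hash, value) each time the value differs from the previous block.

    values_per_block is an iterable of (block_hash, value) pairs in chain order.
    """
    iterator = iter(values_per_block)
    try:
        _, previous = next(iterator)
    except StopIteration:
        return
    for block_hash, value in iterator:
        if value != previous:
            yield block_hash, value
        previous = value


# The item goes from A to B, then back to A.
history = [("block1", "A"), ("block2", "B"), ("block3", "A"), ("block4", "A")]
print(list(changes(history)))
```

A UI that only cares about the latest value would instead simply keep the value read at the most recent block and skip this comparison entirely.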

Additionally, please note that chainHead_unstable_storage has a type parameter. If this type parameter is set to value, then the plain value is returned. If however this type parameter is set to hash, then the hash of the value is returned. When the JSON-RPC server is a light client, this request directly translates to a network request towards a full node. If the type parameter is value, then the light client will download the value from the full node. If the type parameter is hash, then the light client will only download the hash of the value from the full node.
When a value is large, downloading its hash is significantly faster and uses less bandwidth. For this reason, when the value is large, rather than download the value at each block, you are encouraged to download the hash of the value and compare it with a known hash, then download the value only if they differ.
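
This hash-first strategy can be sketched as follows in Python. The two callbacks fetch_hash and fetch_value are stand-ins for storage queries with the type parameter set to hash and value respectively. They are assumptions of this example, as is the class name CachedValue.

```python
class CachedValue:
    """Re-downloads a large storage value only when its hash changes."""

    def __init__(self, fetch_hash, fetch_value):
        self.fetch_hash = fetch_hash    # block hash -> hash of the value
        self.fetch_value = fetch_value  # block hash -> the value itself
        self.known_hash = None
        self.value = None

    def refresh(self, block_hash):
        """Return the value at the given block, downloading it only if needed."""
        new_hash = self.fetch_hash(block_hash)
        if new_hash != self.known_hash:
            self.value = self.fetch_value(block_hash)
            self.known_hash = new_hash
        return self.value
```

Calling refresh at every new block then costs one small hash download in the common case where nothing has changed.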

How can I get the list of items in a map, and watch when the list of items in a map changes?

When a runtime item is of type Map, there isn’t actually any item at the location of the Map. Instead, the location of the Map is a prefix, and every map item starts with this prefix.

However, the chainHead_unstable_storage JSON-RPC function can still be used in that situation. For the key parameter, pass the location of the Map, and for the type parameter, pass descendants-values or descendants-hashes. Rather than returning only one notification of type "event": "item", the query will generate one notification of type "event": "item" for each item in the Map.

In order to know when the list of items in a map changes, you can query descendants-values or descendants-hashes at every block, similar to watching individual storage items.
However, when the list of items is large, a more efficient way to do so is to pass a type of closest-ancestor-merkle-value. This will generate one notification (one and only one notification is always generated, even if the map is empty or the key corresponds to nothing) of type "event": "item" that contains a hash of the whole content of the map. You can query the closest-ancestor-merkle-value at each block, check if it has changed compared to the previous block, and only if so query descendants-values.

Note: At the time of writing, Substrate and smoldot don’t support this yet.

How do I get the metadata, the account nonce, or the payment fees with the new API?

The new JSON-RPC API has no direct equivalent to legacy JSON-RPC functions such as payment_queryInfo, state_getMetadata, or system_accountNextIndex.
It does, however, have indirect equivalents through the new chainHead_unstable_call and archive_unstable_call functions (see a previous question for the difference between chainHead and archive). The examples below use chainHead_unstable_call, but the same applies with archive_unstable_call.

  • In order to obtain the metadata (equivalent to state_getMetadata), use chainHead_unstable_call(.., .., "Metadata_metadata", "").

  • In order to obtain the payment information (equivalent to payment_queryInfo), use chainHead_unstable_call(.., .., "TransactionPaymentApi_query_info", "0x" + balanceTransferHex + balanceTransferLenHex) where balanceTransferHex is the hexadecimal-encoded balance transfer transaction and balanceTransferLenHex is four hexadecimal-encoded bytes containing the length of the balance transfer in little endian. The fact that the length of the balance transfer needs to be provided in addition to the balance transfer itself is most likely an accidental wart in the runtime API.

  • In order to obtain the account next nonce (equivalent to system_accountNextIndex), use chainHead_unstable_call(.., .., "AccountNonceApi_account_nonce", "0x" + accountPublicKeyHex) where accountPublicKeyHex is the hexadecimal-encoded 32 bytes public key of the account.
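
The parameters described in the list above can be built with a few lines of code. The following is only a sketch: the helper names (toHex, queryInfoParameters, accountNonceParameters) are made up for this example, and the transaction and public key are assumed to be available as an Array or Uint8Array of bytes.

```javascript
// Hexadecimal-encodes a list of bytes, without the "0x" prefix.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
}

// Parameter of `TransactionPaymentApi_query_info`: the transaction followed
// by its length as four little-endian bytes.
function queryInfoParameters(transaction) {
  const length = new Uint8Array(4)
  new DataView(length.buffer).setUint32(0, transaction.length, true)
  return '0x' + toHex(transaction) + toHex(length)
}

// Parameter of `AccountNonceApi_account_nonce`: the 32 bytes public key.
function accountNonceParameters(publicKey) {
  if (publicKey.length !== 32) {
    throw new Error('expected a 32 bytes public key')
  }
  return '0x' + toHex(publicKey)
}
```

The resulting strings are what gets passed as the last parameter of chainHead_unstable_call or archive_unstable_call.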

Any other function that similarly requires some knowledge of runtime-specific concepts (such as accounts, referendums, contracts, etc.) has no equivalent in the new JSON-RPC API and chainHead_unstable_call should instead be used in order to directly call the appropriate runtime function.

Removing runtime-specific functions from the JSON-RPC API ensures that the JSON-RPC server doesn’t need to know anything about the runtime and can be runtime-agnostic. The logic of which runtime function exists and needs to be called is instead moved to the UI, which already has to be runtime-specific anyway.

In order to be future-proof, the JSON-RPC client is also encouraged to check before using chainHead_unstable_call whether the runtime of the desired block supports the given API. This is done through the apis field of the runtime specification provided by chainHead_unstable_follow. For example, the Metadata_metadata function is available only if the list of apis contains an entry named 0xd2bc9897eed08f15 (which is the 64 bits blake2 hash of the string Metadata) whose value is equal to 1 or 2.
Performing this check is rather complicated, and any production-grade UI is expected to use a library to connect to the JSON-RPC server rather than calling JSON-RPC functions directly. This library should contain the logic for tracking which APIs are supported by which block.
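
To give an idea of what such a check looks like, here is a minimal sketch. It assumes that the runtime specification has been parsed into an object whose apis field maps hexadecimal-encoded API names to version numbers; the function names are made up for this example.

```javascript
// Hexadecimal-encoded 64 bits blake2 hash of the string "Metadata",
// as given in the text above.
const METADATA_API = '0xd2bc9897eed08f15'

// `runtimeSpec` is assumed to look like `{ apis: { "0x...": version, ... } }`.
function supportsApi(runtimeSpec, apiNameHash, acceptedVersions) {
  const apis = (runtimeSpec && runtimeSpec.apis) || {}
  if (!Object.prototype.hasOwnProperty.call(apis, apiNameHash)) {
    return false
  }
  return acceptedVersions.includes(apis[apiNameHash])
}

// `Metadata_metadata` can be called only if version 1 or 2 is supported.
function canCallMetadataMetadata(runtimeSpec) {
  return supportsApi(runtimeSpec, METADATA_API, [1, 2])
}
```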

How do I get the number of a block?

Before answering this question, first ask yourself: do you really need to know the number of a block?

The end user doesn’t care whether it is at block one million or ten million, as these are meaningless numbers to them. I’m sure that even Polkadot developers couldn’t give you even a broad estimate of the current block number of Polkadot.

The reason why the block number is typically being shown in a UI is to show that the chain is still working and progressing as intended. But in order to convey this fact, what you probably want to show instead is the timestamp of the latest block, which is much more useful information. The timestamp is a UNIX timestamp available under the storage item now of the pallet timestamp.
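
Once the storage value has been obtained, decoding it is short. The sketch below assumes that the value is an 8 bytes little-endian integer expressed in milliseconds, which to my knowledge is the case on Polkadot and Kusama but might differ on other chains; the function name is made up for this example.

```javascript
// Decodes the hexadecimal-encoded storage value of the timestamp into a `Date`.
// Assumption: the value is an 8 bytes little-endian number of milliseconds.
function decodeTimestamp(hexValue) {
  const hex = hexValue.startsWith('0x') ? hexValue.slice(2) : hexValue
  if (!/^[0-9a-fA-F]{16}$/.test(hex)) {
    throw new Error('expected exactly 8 hexadecimal-encoded bytes')
  }
  const bytes = Uint8Array.from(hex.match(/../g), (h) => parseInt(h, 16))
  const millis = new DataView(bytes.buffer).getBigUint64(0, true)
  return new Date(Number(millis))
}
```

A UI could then display something like "latest block: 12 seconds ago" by comparing the returned Date with the current time.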

Due to its questionable utility, the new JSON-RPC API doesn’t provide any direct way of knowing the number of a block.

That being said, the block number is still useful in some situations, such as building a transaction. For these situations, I would recommend finding the number of a block by reading the value of number in the system pallet. This is done the same way as reading any storage value.

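If you don’t want to perform storage queries, it is also still possible to know the number of a block by querying its header using the chainHead_unstable_header function. The header starts with the 32 bytes hash of the parent block, and the block number follows at byte 32, encoded as a SCALE-compact unsigned number: the two lowest bits of byte 32 indicate whether the encoding occupies 1, 2, or 4 bytes, and the remaining bits contain the number in little endian.

In JavaScript, it can be done like this:

// Assuming that `header` is an `Array` or `Uint8Array`
const length = [1, 2, 4][header[32] & 3] // `undefined` in the big-integer mode (numbers of 2^30 and above)
const raw = (header[32] | (header[33] << 8) | (header[34] << 16)) + (header[35] * 0x1000000)
const blockNumber = Math.floor((raw % 2 ** (8 * length)) / 4)

Some chains built using Substrate (other than Kusama/Polkadot/Westend/Rococo) use an 8 bytes block number, but because the compact encoding doesn’t depend on the size of the number type, the code above will also work as expected as long as the number is inferior to 2^30 (around one billion). Above this limit, the code above yields NaN.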

It might seem dangerous to assume the layout of the bytes of the header of the block. However, there has never been any breaking change to the format of the block header in the entire history of the development of Substrate/Polkadot, and it is extremely unlikely that there will ever be one.

How do I submit a transaction with the new API?

The equivalent to the legacy author_submitAndWatchExtrinsic JSON-RPC function is transaction_unstable_submitAndWatch.

Compared to author_submitAndWatchExtrinsic, the behavior of the new function is more precisely documented, even in corner cases such as the write queue of JSON-RPC notifications being full.

There is no equivalent to the legacy author_submitExtrinsic JSON-RPC function, as it can be simulated by calling transaction_unstable_submitAndWatch then immediately unwatching with transaction_unstable_unwatch.
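
A minimal sketch of this "submit then immediately unwatch" approach, using the unwatch function of the specification (transaction_unstable_unwatch), could look like the code below. The rpc callback is hypothetical: it is assumed to send a JSON-RPC request with the given method name and parameters and to resolve to the result. The shape of the parameters is also an assumption and should be checked against the specification.

```javascript
// Simulates the legacy `author_submitExtrinsic`.
// `rpc(method, params)` is a hypothetical async function that performs a
// JSON-RPC call and resolves to its result.
async function submitWithoutWatching(rpc, transactionHex) {
  // Assumption: the result is the identifier of the subscription.
  const subscription = await rpc('transaction_unstable_submitAndWatch', [transactionHex])
  // Assumption: the unwatch function takes this identifier as its only parameter.
  await rpc('transaction_unstable_unwatch', [subscription])
}
```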

Why does transaction_unstable_submitAndWatch no longer return the list of peers the transaction has been broadcasted to?

The legacy author_submitAndWatchExtrinsic JSON-RPC function generates broadcast events that contain the peer IDs the transaction has been broadcasted to. The new transaction_unstable_submitAndWatch JSON-RPC function no longer does that and instead simply indicates the number of peers the transaction has been broadcasted to.

The reason is that in the case of author_submitAndWatchExtrinsic, if a JSON-RPC client is overloaded and doesn’t read from its socket, the JSON-RPC server has no choice but to buffer the list of peers potentially indefinitely. In the case of transaction_unstable_submitAndWatch, the JSON-RPC server can simply stop sending broadcasted events as long as the JSON-RPC client isn’t reading from its socket.

While the list of peers is unlikely to occupy a significant amount of memory, every little bit helps.

Similarly, the other events generated by transaction_unstable_submitAndWatch have been adjusted in subtle ways so that they can be merged together while still preserving the logic that is attached to them. This is not the case with author_submitAndWatchExtrinsic.

How do I manage the keystore of my node with the new API?

The functions (such as author_rotateKeys) of the legacy API that don’t need a lot of changes have simply been renamed and documented properly.

The new JSON-RPC API doesn’t aim at changing everything. However, enforcing the new naming scheme consistently is in my opinion a good idea.

In particular, the legacy author_rotateKeys function is now sudo_sessionKeys_unstable_generate.

What is the equivalent of consensus-related information queries such as babe_epochAuthorship, beefy_getFinalizedHead, or grandpa_roundState?

These functions belong to the category of functions that are used by programmers to debug a chain. They are not meant to be used by a UI that end users will use.

These functions are always called either manually or automatically by unstable tools. They are “hacks” in the noble sense of the word.

Because of this, they will intentionally never have any stable equivalent in the new JSON-RPC API. Notice the word “stable”.
These functions will not exist in the specification, as they shouldn’t be used by any stable tooling, but will still be available on nodes.
They will be renamed to conform to the new naming scheme and will forever remain _unstable_. Their return type will not be documented.

What are the equivalents of system_name and system_version in the new API?

These two functions have been merged into one called sudo_unstable_version.

What is the equivalent of system_properties in the new API?

It is chainSpec_unstable_properties.

What is the equivalent of system_chain in the new API?

It is chainSpec_unstable_chainName.

What is the equivalent of system_localListenAddresses in the new API?

There is none.

The system_localListenAddresses function returns the list of addresses the node is listening on.
This function has a valid theoretical use case, which is to start a node then query its address in order to provide this address to other nodes.
In practice, however, the IP address that is returned is typically 0.0.0.0 or [::], which aren’t very useful.
Even if a valid IP address is returned, it is generally the IP address of the host interface of the machine (e.g. 192.168.1.2), which again isn’t very useful.

Substrate/Polkadot nodes try to determine their public-facing IP address, but this doesn’t always work and isn’t instantaneous.

While it is possible that we add back in the future a JSON-RPC function that would provide the theoretical use-case described above, doing so is pretty complicated and would first require some deep thinking.

What is the equivalent of system_syncState in the new API?

Use chainHead_unstable_follow in order to find the current best block.
There is no equivalent to knowing the starting block, as this information isn’t deemed useful. If you disagree, please raise an issue with your use case in the spec repo: https://github.com/paritytech/json-rpc-interface-spec/.
There is no equivalent to knowing the highest known block, as this value is very easily hijackable by malicious peers.

What are the equivalents of system_addReservedPeer, system_removeReservedPeer, and system_reservedPeers in the new API?

The “reserved peers” system of the Substrate client makes it possible to provide a list of nodes that the node must be connected to.
The Substrate client uses this list to determine who to download blocks from and gossip block announces, transactions, and finality messages with.

However, the discovery system and parachains system are completely orthogonal to the list of reserved peers and are free to connect to anyone and receive connections from anyone. It is not possible to restrict the discovery and parachains networking to only reserved peers without completely breaking them.

Many people use the reserved peers system expecting their node to only establish connections to the given peers, then are surprised when the node actually connects to other peers.

Because it is not very useful and because it is confusing, I have decided to remove the “reserved peers” system from the JSON-RPC API. This might be controversial, and if people disagree with this decision I would like to understand what their use case is. If a similar feature was to be added back, it would first need some deep re-thinking.

However, the new JSON-RPC API provides the sudo_unstable_p2pDiscover function, which gives the possibility to provide the address of a peer to the node. The node might then connect to this address. This is useful in testing situations or when creating a private network.

What is the equivalent of system_peers in the new API?

There is none, as this function was not deemed useful.

Similar to the reserved peers system (see question above), the definition of “peer” is ambiguous and confusing in light of the discovery system and the parachains system.
Contrary to what one might expect, the legacy system_peers function does not return the list of all the peers the node is connected to, but only the list of peers it is receiving block announces from and downloading blocks from.

If an equivalent was to be added back, it would need some re-thinking.

What is the equivalent of system_nodeRoles in the new API?

There is none, as this function was not deemed useful. If you disagree, please raise an issue with your use case in the spec repo: https://github.com/paritytech/json-rpc-interface-spec/.

What is the equivalent of system_localPeerId in the new API?

There is none, as this function was not deemed useful. If you disagree, please raise an issue with your use case in the spec repo: https://github.com/paritytech/json-rpc-interface-spec/.

What is the equivalent of system_unstable_networkState in the new API?

This legacy JSON-RPC function was initially named system_networkState, then was renamed in order to indicate that it is not supposed to be used and might break at any time.
This function already accidentally uses the new naming scheme and can stay implemented on the server, but not documented. See also the question about babe_epochAuthorship, beefy_getFinalizedHead and grandpa_roundState above.

What is the equivalent of system_health in the new API?

There is none, as this function was not deemed useful.

The meaning of the is_syncing field has always been very ambiguous and has no place in the new JSON-RPC API. It is not possible to implement is_syncing without guessing what the client is going to do with this value.
Given that both full nodes and light clients are now using warp syncing in order to warp directly to the head of the chain, is_syncing is more or less obsolete anyway.

If you were using system_health as a node administrator in order to watch the number of peers (the peers field), using Prometheus metrics is a more appropriate way of doing this.

Why was JSON-RPC chosen? Why not gRPC for example?

The JSON-RPC protocol was chosen over more modern alternatives because:

  • It is very significantly more simple to implement than modern alternatives.
  • It is the protocol already in use by the legacy API, meaning that all the concerned developers are already familiar with it and with the concept of subscriptions.
  • It is possible to manually send JSON-RPC requests using curl for example, but not with gRPC, Cap’n Proto, or similar.
  • It has stood the test of time and is unlikely to ever be abandoned by the tech industry. It is for example not impossible for gRPC to be deprecated by Google and forgotten about (especially considering that companies which have adopted gRPC are typically the ones who embrace the “move fast and break things” ideology), in which case we would be stuck with an unmaintained technology. Even if JSON-RPC was deprecated/forgotten, it is so simple that this isn’t a problem.

I am extremely wary about the cargo cult of using a modern alternative to JSON-RPC just because everyone else does it.
The main advantage of modern alternatives over JSON-RPC is performance. In particular, JSON-RPC suffers from the head of line blocking problem.
However, I am personally convinced that these potential performance gains are heavily outweighed by the drawbacks outlined above. Additionally, keep in mind that making JSON-RPC calls over the Internet is fundamentally a hack. Everyone is in principle supposed to run their own node (i.e. JSON-RPC server) on the same machine as their UI (i.e. JSON-RPC client), in which case performance is a non-issue.

Any other question?

Feel free to ask other questions as a reply to this post, and I will include them in this Q&A.


(reserving in case the Q&A expands)

system_health is usually used to indicate that the node is healthy (up & running). For example, we have configured monitoring on it to alert if it ever returns an unexpected result. Also, warp sync is still going to take some amount of time to download the necessary state and sync to head, so we still need an API to notify that the node is synced to the tip and ready for usage.

I would encourage you to look into Prometheus. It is very specifically designed for alerting and can alert you much better than what repeatedly calling system_health can ever achieve.
It can for example alert you when no block has been authored for ten minutes, or when you have only outgoing connections (meaning that your port is most likely closed).

As for the warp syncing: if the node is ready for usage, then chainHead_unstable_follow will generate an initialized event. As long as the node isn’t ready, no event is generated.
It is a much more straightforward design: to use your node, just start using it and it will tell you when it’s ready.
It doesn’t expose the concept of “syncing” which is fundamentally ambiguous.

Is there something that is suitable for Health checks for your target groups - Elastic Load Balancing?

Original Substrate issue: readiness and liveness endpoints for service health monitoring · Issue #1017 · paritytech/substrate · GitHub

The main use case is to dynamically update targets for the load balancer so that it only exposes healthy/ready RPC nodes. It needs to speak something that common load balancers understand.

Ah I understand the problem now. Specifically, when bringing up new nodes behind a load balancer there’s a period of time during which the node isn’t “ready” yet.

Independently of the new API, it is fundamentally impossible to do that in a clean way. For example, maybe node A is at block 10000 and node B is at block 50 (because it has only managed to connect to one other peer and this other peer is at block 50). They will both report that they’re healthy, but you probably want the load balancer to redirect clients to node A.

Even with warp syncing, you have no way to actually know whether you’re at the head of the chain or not. It’s actually even worse with warp syncing, because nodes currently can’t switch back to warp syncing after they’ve finished it the first time. So you might end up accidentally warp syncing to a very old block, and your node will then sync very slowly from there.

The only clean solution is to have a custom load balancer specifically for Substrate/Polkadot that will compare the latest block of all the nodes that it is load balancing.
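
To illustrate the idea, the selection logic of such a load balancer could be sketched as below. This is not an existing tool: the function name, the shape of the nodes array, and the maxLag threshold are all made up for this example, and how the latest block number of each node is obtained is left out.

```javascript
// Keeps only the nodes whose latest block is close to the most advanced node.
// `nodes` is an array of `{ name, bestBlockNumber }`, where `bestBlockNumber`
// is `null` if the node hasn't reported any block yet.
function selectUpToDateNodes(nodes, maxLag) {
  const known = nodes.filter((node) => Number.isFinite(node.bestBlockNumber))
  if (known.length === 0) {
    return []
  }
  const highest = Math.max(...known.map((node) => node.bestBlockNumber))
  return known.filter((node) => highest - node.bestBlockNumber <= maxLag)
}
```

With the example above (node A at block 10000 and node B at block 50), only node A would be kept for any reasonable value of maxLag.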

This is a typical example where web2 conflicts with web3, because in the web2 world you never have that problem. Your web server is either started or not, it is never in a “started but can’t answer yet” phase.

In any case, Health checks for your target groups - Elastic Load Balancing isn’t related to the JSON-RPC API, as AWS can’t do live checks by sending JSON-RPC requests.

My opinion, for the sake of pragmatism, would indeed be to add some endpoint (like readiness and liveness endpoints for service health monitoring · Issue #1017 · paritytech/substrate · GitHub suggests), but implement it on top of the Prometheus server rather than use system_health.

If this is done by an external tool, the format in which Prometheus exposes metrics is insanely simple, and arguably even easier to obtain through a script than with a JSON-RPC request.
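
As an illustration of how simple the format is, here is a sketch of a parser for the text that a Prometheus endpoint exposes. It only handles the common `name{label="value"} number` lines and skips comments; the function name is made up for this example, and downloading the text from the node is left out.

```javascript
// Parses the Prometheus text format into `{ name, labels, value }` samples.
function parsePrometheusText(text) {
  const samples = []
  for (const line of text.split('\n')) {
    const trimmed = line.trim()
    // Empty lines and `# HELP` / `# TYPE` comments carry no sample.
    if (trimmed === '' || trimmed.startsWith('#')) continue
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/.exec(trimmed)
    if (!match) continue
    const labels = {}
    const labelRegex = /([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"/g
    for (const [, key, value] of (match[2] || '').matchAll(labelRegex)) {
      labels[key] = value
    }
    samples.push({ name: match[1], labels, value: Number(match[3]) })
  }
  return samples
}
```

A readiness script could then look for the sample that contains the height of the best block and compare it against other nodes or against its previous value.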

Thanks for the Q&A; it’s been a good read!

Unfortunately, and much to my disappointment, as far as I know no high-level library has yet committed to fully embracing the new JSON-RPC API at the moment.

Subxt has plans to do so (and has had for a while; we’d very much like to get to a point where we can remove all of the other methods and support only these).

We currently have this milestone to add trial support (while the methods are still unstable): Parity Roadmap · GitHub

Alex is up for taking on the job of implementing the recent spec changes in Substrate (including the archive_ methods) in probably the next quarter, and with our trial support in Subxt (also next quarter) we hope that we’ll be able to spot any holes and push for any changes that we need or think will improve the interface.

Of course, the more teams that try out the API, the better the feedback we’ll ultimately get for it will be, but we’re in a good position to help push this forwards I think :)

What is the equivalent of consensus-related information queries such as babe_epochAuthorship, beefy_getFinalizedHead, or grandpa_roundState?

These functions will not exist in the specification, as they shouldn’t be used by any stable tooling, but will still be available on nodes. They will be renamed to conform to the new naming scheme and will forever remain unstable. Their return type will not be documented.

Is there an issue already to do this somewhere? We should make sure that one exists which lists the methods that should be renamed, so that when we do eventually get to removing the old methods we don’t go too far!

Thank you for your efforts!
I am glad that now we have a place to follow, discuss and propose new JSON RPC interfaces rather than constantly checking if any new important RPC was added to Substrate.
The KAGOME client team will support the implementation of the new JSON-RPC API in future versions.


Would it be possible to get scale binary returned as a possibility rather than returning Json encoded hex encoded scale? For the odd call it doesn’t make much difference, but if you’re interrogating a node for indexing purposes then all that encoding and decoding has to add up (not to mention the additional code required to decode it).

How would you do that, practically speaking?

If you want to download all the data of an entire chain, I’d recommend doing libp2p network requests instead, as they’re more suitable. That’s what the warp syncing algorithm does, for instance.

The ws request can indicate that it accepts application/scale mime type in the http headers. If found in the request the response would be Binary rather than Text websocket. The json result shapes would have to have corresponding scale encoded types - I’m hoping that metadata v16 will include these ‘json’ result types. Potentially we can do the same kind of thing so that the request can be specified as scale in a bin ws message - it might be a bit strange if we only supported bin responses and not requests?

On the libp2p side, can libp2p request the same kinds of thing as json rpc does? Is it a stable interface to talk with polkadot with few breaking changes? I’ve not seen many rust programs interact with polkadot at this level.

In general not necessarily, but for specific things yes. The client is nothing more than a calculator that takes as input the libp2p networking messages, and the JSON-RPC API lets you query the result of the calculations that the client does.

Yes

That’s because it’s non trivial to talk to nodes.

The stack of protocols is TCP or WebSocket + multistream-select + Yamux + Noise + Substrate-specific messages.
Yamux and Noise are not specific to Substrate/Polkadot, but they’re not widely implemented (especially Yamux is pretty niche). Multistream-select is specific to libp2p. The Substrate-specific messages are obviously specific to Substrate but are pretty simple.

See 4. Networking | Polkadot Protocol Specification (not a super well written document, but basically nobody in charge really seems to give a fuck about the spec)

So basically if you want to talk to other nodes on the networking level, either you use rust-libp2p or you have to re-implement a lot yourself. And unfortunately rust-libp2p isn’t the most user-friendly library.
Smoldot has gone the path of reimplementing a lot manually and might eventually provide some easy-to-use API to talk to other nodes.

I’ve opened pull requests to stabilize the chainSpec-prefixed functions and the transaction-prefixed functions.
This is the last call for comments on these functions (although I must mention that, for the sake of shipping things, I will probably be against large-scale changes now).

While we’re still tweaking the chainHead-prefixed functions, I think that they’re nearing stabilization as well.