All metrics are imperfect, but many are useful. Let's make them more available

sourabhniyogi · February 18, 2023, 2:39pm

Marketing people, “investors”, competing ecosystems, governments monitoring Web2, and others all use DAU and MAU vocabulary, so I strongly believe it’s important to retain the notion of “active” rather than abandon it because it’s difficult to define. In particular, I think we should arrive at a definition that conforms to human intuition for “active”: something like “clicked the sign button on a wallet” with only one private key, or a bot doing so via equivalent code (which we can’t distinguish). That corresponds to signed extrinsics (or signed EVM transactions). I don’t think it’s healthy to count accounts that haven’t passed the “is there a signature?” test as “active.”

Let’s consider four cases:

1. If P is a proxy account for X, and there is an extrinsic signed by P to transfer assets from account X to account Y, only P is active, while both X and Y are passive.
2. If there is a 2-of-3 multisig account X controlled by S_1, S_2 and S_3, and there are two extrinsics signed by S_1 and S_2 to transfer an asset from account X to account Y, only S_1 and S_2 are active, while X and Y are passive. S_3 has not performed any on-chain action, so it is neither active nor passive.
3. If account S_4 on origin chain C1 sends an XCM transfer of amount A to beneficiary Y on destination chain C2, only S_4 is active, while Y on the destination chain is passive.
4. If account S_5 on origin chain C1 triggers remote execution (a Transact operation) on destination chain C2, where a keyless derivative account Y transfers some amount A to account Z, only S_5 is active, while Y and Z are passive.

The border between active and passive is guided by the simple “is there a signature?” test, where signing is the fundamental operation that drives our human intuition of “active”. I recommend refining (or coming to consensus) by explicitly considering these 4 cases, and adding additional cases to refine further. If we wanted to artificially inflate the number of accounts, we could consider X + Y + S_3 + Z as “active” accounts. However, this would be overzealous and potentially misleading, and it goes against human intuition. These accounts should remain “passive.” I understand the marketer within us may want to claim bigger numbers of active accounts, but honest analytics should not be driven by these insecurities.

The Polkadot ecosystem is led by people who consistently do the right thing and take the long-term view. We shouldn’t twist “active” based on any other psychology. I would like the industry to conduct cross-ecosystem analytics in an objective, multichain way, with Polkadot doing the right thing. We should not suddenly go against human intuition unless CeFi entities (Coinbase, Binance, OpenSea, etc.) publicly start using “active” in this manner.

Therefore, we keep the definition of “active” simple and faithful to human intuition:

Active Accounts: (Substrate)
Accounts that have signed an extrinsic

Passive Accounts: (Substrate)
Accounts that aren’t active but have any balance changes for any asset (native or non-native asset, within chain or cross-chain). This can include proxy and multisig (X, Y, S_3 in the above examples), remote execution triggered by other accounts, transfer recipients (including crowdloan and staking rewards distribution), and balance deductions (e.g., pre-authorized transferFrom/proxy called by other active accounts).
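To make the two definitions concrete, here is a minimal Python sketch that applies the “is there a signature?” test to toy versions of the four cases above. The `Event` record, its field names, and the `classify` helper are hypothetical illustrations, not the substrate-etl schema or its BigQuery queries, and fee payments are ignored for brevity.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical, simplified on-chain record (not the substrate-etl schema)."""
    signer: Optional[str]        # account whose signature is on the extrinsic, else None
    balance_changed: frozenset   # accounts whose balance of any asset changed


def classify(events):
    """Apply the "is there a signature?" test to a list of events.

    Active: signed at least one extrinsic.
    Passive: not active, but had a balance change.
    """
    active = {e.signer for e in events if e.signer is not None}
    touched = set().union(*(e.balance_changed for e in events))
    return active, touched - active


# Toy encodings of the four cases; fee payments are ignored for brevity.
CASES = {
    "proxy": [Event("P", frozenset({"X", "Y"}))],
    "multisig": [
        Event("S_1", frozenset()),            # first approval, no funds move yet
        Event("S_2", frozenset({"X", "Y"})),  # second approval executes the transfer
    ],
    "xcm_transfer": [
        Event("S_4", frozenset({"S_4"})),     # signed on origin chain C1
        Event(None, frozenset({"Y"})),        # unsigned deposit on destination chain C2
    ],
    "remote_execution": [
        Event("S_5", frozenset()),            # signed on origin chain C1
        Event(None, frozenset({"Y", "Z"})),   # Transact on C2: Y pays Z, nobody signs
    ],
}

for name, events in CASES.items():
    active, passive = classify(events)
    print(name, "active:", sorted(active), "passive:", sorted(passive))
```

In this toy data S_3 never signs and never has a balance change, so it lands in neither set, as in the multisig case. An account that both signs and has a balance change (S_4 here) counts only as active, since passive is defined as “not active.”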

We have mechanized the above definitions as exact BigQuery queries on the substrate-etl datasets here:

https://github.com/colorfulnotion/substrate-etl/blob/ff793176f5c49a39ed5e760409fe0b16a497e65a/DEFINITIONS.md

# Substrate-etl Definitions

_Note: These are tentative, and may be revised based on community feedback._

To support precise transparent definitions of all data summarized within substrate-etl report summaries (and used in polkaholic.io)
(see [All metrics are imperfect, but many are useful. Let’s make them more available](https://forum.polkadot.network/t/all-metrics-are-imperfect-but-many-are-useful-lets-make-them-more-available/1858/4)),
we attempt to define terms used in this repo both in English and through BigQuery on the substrate-etl datasets.

The open source approach taken here is of _transparency_ and _reproducibility_: with exact BigQuery computations, anyone can reproduce any datapoint and improve any definition with adjustments to query form.  

## Account Metrics (Substrate)

* _Active Accounts_ (Substrate): Accounts that have signed an extrinsic on a Substrate chain 
* _System Accounts_ (Substrate): Accounts that have participated in consensus and produced a block
* _Passive Accounts_ (Substrate): Accounts that aren't active but have any balance changes for any asset (native or non-native asset, within chain or cross-chain). This can include proxy and multisig accounts, remote execution triggered by other accounts, transfer recipients (including crowdloan and staking rewards distribution), and balance deductions (e.g., pre-authorized transferFrom/proxy called by other active accounts).

The above definitions are mechanized in `substrate-etl` BigQuery below.  The following computes _Active Accounts_, _System Accounts_ and _Passive Accounts_ for the Kusama relay chain for February 1, 2023 using the following public data:

* `substrate-etl.kusama.extrinsics0`
* `substrate-etl.kusama.blocks0`

This file has been truncated.

The active/passive distinctions in the four cases are fully covered by the above definitions, but we imagine being able to refine both definitions precisely with open source, transparent code as new forms of active/passive evolve.

The open source approach taken here is of transparency and reproducibility: with exact BigQuery computations, anyone can reproduce any datapoint and improve any definition with simple code adjustments to the query form.
