Experiment: P2P Network Visualization

:wave: Greetings,

We’ve been prototyping a P2P scanner for Substrate networks. Our little experiment involved scanning the Kusama network at the P2P-level and creating a geospatial visualization of the discovered nodes with enriched information.

:mag: Take a peek at the overall network of discovered nodes (~2k) in the image below.

And a screenshot showing some node information.

Curious about the data? Here’s a snippet of the information for each peer in JSON format:

{
  "_id": "12D3KooWSueCPH3puP2PcvqPJdNaDNF3jMZjtJtDiSy35pWrbt5h",
  "type": "Ed25519",
  "addresses": [
    {
      "family": 4,
      "host": "51.77.66.187",
      "transport": "tcp",
      "port": 30333
    },
    {
      "family": 4,
      "host": "51.77.66.187",
      "transport": "tcp",
      "port": 30334
    },
    {
      "family": 6,
      "host": "a29:a25:3520:6b75:7361:6d61:2d62:6f6f",
      "transport": "tcp",
      "port": 30334
    }
  ],
  "agent": {
    "version": "Parity Polkadot/v1.3.0-7c9fd83805c (kusama-bootnode-0)",
    "protocol": "/substrate/1.0"
  },
  "protocols": [
  "/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/beefy/2",
  "/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/beefy/justifications/1",
  "/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/block-announces/1",
  "/b0a8d493285c2df73290dfb7e61f870f17b41801197a149ca93654499ea3dafe/grandpa/1",
  ...more protocols
  ],
  "metadata": {
    "AgentVersion": {
      "$binary": {
        "base64": "UGFyaXR5IFBvbGthZG90L3YxLjMuMC03YzlmZDgzODA1YyAoa3VzYW1hLWJvb3Rub2RlLTAp",
        "subType": "00"
      }
    },
    "ProtocolVersion": {
      "$binary": {
        "base64": "L3N1YnN0cmF0ZS8xLjA=",
        "subType": "00"
      }
    }
  },
  "tags": {
    "bootstrap": {
      "value": 50,
      "expiry": 1699622067589
    }
  },
  "ipInfos": [
    {
      "ip": "51.77.66.187",
      "hostname": "ns3136711.ip-51-77-66.eu",
      "city": "Frankfurt am Main",
      "region": "Hesse",
      "country": "DE",
      "country_name": "Germany",
      "country_currency": {
        "code": "EUR",
        "symbol": "€"
      },
      "continent": {
        "code": "EU",
        "name": "Europe"
      },
      "isEU": true,
      "loc": "50.1155,8.6842",
      "org": "AS16276 OVH SAS",
      "postal": "60306",
      "timezone": "Europe/Berlin"
    }
  ]
}
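For readers who want to work with a record like this, here is a minimal sketch of decoding the base64 `metadata` values and turning the `loc` string into coordinates for plotting. The helper names `decode_metadata` and `coordinates` are made up for illustration, and the `peer` dictionary is a trimmed copy of the record above, not the scanner's actual code.

```python
import base64

# Trimmed copy of the peer record above (only the fields used here).
peer = {
    "metadata": {
        "AgentVersion": {
            "$binary": {
                "base64": "UGFyaXR5IFBvbGthZG90L3YxLjMuMC03YzlmZDgzODA1YyAoa3VzYW1hLWJvb3Rub2RlLTAp",
                "subType": "00",
            }
        },
        "ProtocolVersion": {
            "$binary": {"base64": "L3N1YnN0cmF0ZS8xLjA=", "subType": "00"}
        },
    },
    "ipInfos": [{"ip": "51.77.66.187", "loc": "50.1155,8.6842"}],
}


def decode_metadata(record):
    """Decode every base64 `$binary` metadata value into a UTF-8 string."""
    return {
        key: base64.b64decode(value["$binary"]["base64"]).decode("utf-8")
        for key, value in record.get("metadata", {}).items()
    }


def coordinates(record):
    """Return one (latitude, longitude) float pair per `ipInfos` entry."""
    points = []
    for info in record.get("ipInfos", []):
        lat, lon = info["loc"].split(",")
        points.append((float(lat), float(lon)))
    return points


print(decode_metadata(peer))
print(coordinates(peer))
```

The decoded values match the human-readable `agent` fields in the same record, so the binary metadata appears to be the raw form of that information.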

Cheers!

13 Likes

As a heads up, you can also determine which nodes are validators using the so-called “authority discovery” system: each validator publishes its network identity to the DHT.

I believe that a tool like that is much better than the existing telemetry system, as it is not centralized and uses the existing network protocol rather than a separate one.

3 Likes

Amazing interface!

  • Is the P2P network scanned entirely from the browser, and is it written in Rust :smiley: ?
  • Did you find a reliable database for translating IP addresses to geolocations?
  • Scanning the network usually takes a while. Did you implement any backoff or heuristics to speed up data collection?

That reminds me of a little tool I wrote a while back that can scan any Substrate-based P2P network and submit transactions at the low-level protocol. Its limitation at the moment is that it is not Wasm-compatible.

@tomaka Could the “authority” also be derived while scanning the network from the NodeRole::authority part of the handshake? :thinking:

2 Likes

It seems that what is put into the DHT is keyed by the hash256 of the AuthorityId, with a SignedAuthorityRecord containing the multiaddresses and the signature as the corresponding value. Unfortunately, the AuthorityId is a session key, so we cannot find the authority records by deriving anything from the information contained in the node’s peer ID. Are we overlooking something?
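To make the key derivation concrete, here is a minimal sketch that assumes “hash256” means SHA-256 over the raw 32-byte public key and that the bare digest is used as the Kademlia key. Both points are assumptions, and `authority_dht_key` is a hypothetical helper name.

```python
import hashlib


def authority_dht_key(authority_id: bytes) -> bytes:
    """Derive the DHT key for an authority record.

    Assumption: the key is the plain SHA-256 digest of the 32-byte
    AuthorityId (a session public key), with no extra prefix or encoding.
    """
    if len(authority_id) != 32:
        raise ValueError("expected a 32-byte authority public key")
    return hashlib.sha256(authority_id).digest()


# Placeholder key for illustration only; a real AuthorityId comes from the chain.
example_authority_id = bytes(32)
print(authority_dht_key(example_authority_id).hex())
```

Since the input is a session key, nothing in a peer ID lets you compute this value, which is why the authority keys have to be obtained from the chain first.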

A potential solution is to utilize the light client protocol for querying the state to get the session.validators or even the staking.validators. Then we can try to map the addresses to the node by using session.nextKeys to obtain the authority id and subsequently query the DHT for the SignedAuthorityRecord. However, this approach feels somewhat overcomplicated, and not very convincing.

Using the AuthorityDiscoveryApi to query the runtime could be another way…

As for the sync handshake approach using the role field (@lexnv), it seems to abuse the protocol semantics and lacks security guarantees. Its upside is that it is easy to implement.
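For reference, reading the role from the handshake could look roughly like the sketch below. The layout (one role byte, a little-endian u32 best block number, then the 32-byte best and genesis hashes) and the flag values are assumptions based on our reading of Substrate’s block-announces handshake, not something guaranteed by the protocol. The role is self-reported, so it only tells you what the peer claims.

```python
import struct
from dataclasses import dataclass

# Assumed role bit flags.
ROLE_FULL = 0b001
ROLE_LIGHT = 0b010
ROLE_AUTHORITY = 0b100

HANDSHAKE_LEN = 1 + 4 + 32 + 32  # roles + best number + best hash + genesis hash


@dataclass
class Handshake:
    roles: int
    best_number: int
    best_hash: bytes
    genesis_hash: bytes

    @property
    def claims_authority(self) -> bool:
        """True if the peer advertises the authority role (unverified claim)."""
        return bool(self.roles & ROLE_AUTHORITY)


def decode_handshake(raw: bytes) -> Handshake:
    """Decode a handshake under the assumed fixed-size layout."""
    if len(raw) != HANDSHAKE_LEN:
        raise ValueError(f"expected {HANDSHAKE_LEN} bytes, got {len(raw)}")
    roles, best_number = struct.unpack_from("<BI", raw, 0)
    return Handshake(roles, best_number, raw[5:37], raw[37:69])


# Synthetic handshake for illustration; the hashes are placeholders.
sample = (
    bytes([ROLE_AUTHORITY])
    + (20_000_000).to_bytes(4, "little")
    + bytes(32)
    + bytes(32)
)
print(decode_handshake(sample).claims_authority)
```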

Thanks for the feedback! :smiley_cat:

  • No, it is not. We are using Node.js and libp2p. Running from the browser has serious limitations regarding supported protocols and secure origins, and since this is a scanner we want to support as many protocols as possible.
  • We are using https://ipinfo.io/, which is pretty nice but has limitations on the free tier. We are thinking of integrating Shodan as well, and maybe a Substrate NSE library for nmap.
  • Yes, we wrote a custom kad handler focused on quick discovery instead of using the standard Kademlia discovery implementation.
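To illustrate the general idea of discovery-focused crawling (not the actual handler, which is not shown in this thread), here is a simplified breadth-first sketch. `find_neighbors` is a hypothetical callback standing in for a Kademlia FIND_NODE exchange with one peer, and the in-memory `graph` replaces the real network.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set


def crawl(bootnodes: Iterable[str],
          find_neighbors: Callable[[str], List[str]]) -> Set[str]:
    """Breadth-first crawl: ask every discovered peer for the peers it knows."""
    seen: Set[str] = set(bootnodes)
    queue = deque(seen)
    while queue:
        peer = queue.popleft()
        try:
            neighbors = find_neighbors(peer)
        except ConnectionError:
            # Unreachable peers remain in `seen` but contribute no new peers.
            continue
        for neighbor in neighbors:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Toy routing tables; peer "D" is known to others but cannot be dialed.
graph: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["E"],
    "E": [],
}


def fake_find_neighbors(peer: str) -> List[str]:
    if peer not in graph:
        raise ConnectionError(f"cannot dial {peer}")
    return graph[peer]


discovered = crawl(["A"], fake_find_neighbors)
print(sorted(discovered))
```

A real crawler would dial peers concurrently and apply timeouts; the sequential loop here only shows the traversal logic.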
1 Like

Yeah that’s how I had in mind you would do it. The AuthorityDiscoveryApi_authorities runtime call returns the list of all current and next authorities.

The other drawback of this approach is that knowing the role of a peer requires opening a gossip channel with it, and that peer could refuse the opening.

In practice, due to spaghetti code issues, Substrate never straight-up refuses opening a gossip channel and instead implements refusing by immediately closing any gossip channel that it doesn’t want. Consequently you’re always able to know the role of a peer.
But in theory, at least as the networking protocol was conceived, an implementation could straight up deny the opening and not send back its role.

1 Like

This will be fixed in the relatively-near future and I wouldn’t write any code that relies on that specific behavior.

2 Likes

We’ve added authority discovery support to the prototype: it uses the AuthorityDiscoveryApi to get the authority keys, queries the DHT for each authority record, and finally matches the record with the peer ID.
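The three steps can be sketched as follows. This is an illustration under stated assumptions, not the prototype’s code: `lookup_record` is a hypothetical callback that returns the multiaddresses from an already verified authority record (or `None`), the DHT key is assumed to be the SHA-256 digest of the authority key, and the authority keys and peer IDs below are placeholders.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Optional, Set


def peer_id_of(multiaddr: str) -> Optional[str]:
    """Return the last /p2p/<peer id> component of a multiaddress, if any."""
    parts = multiaddr.strip("/").split("/")
    peer_id = None
    for i in range(len(parts) - 1):
        if parts[i] == "p2p":
            peer_id = parts[i + 1]
    return peer_id


def map_authorities(
    authority_ids: Iterable[bytes],
    lookup_record: Callable[[bytes], Optional[List[str]]],
    known_peers: Set[str],
) -> Dict[str, List[str]]:
    """Map each authority key (hex) to the crawled peer IDs that it advertises."""
    result: Dict[str, List[str]] = {}
    for authority_id in authority_ids:
        # Assumed key derivation: SHA-256 of the raw authority public key.
        addrs = lookup_record(hashlib.sha256(authority_id).digest())
        if not addrs:
            continue
        advertised = {pid for pid in map(peer_id_of, addrs) if pid}
        matches = sorted(advertised & known_peers)
        if matches:
            result[authority_id.hex()] = matches
    return result


# Stand-ins for the runtime call result, the DHT, and the crawler output.
authority_a, authority_b = b"\x01" * 32, b"\x02" * 32
fake_dht = {
    hashlib.sha256(authority_a).digest(): ["/ip4/192.0.2.1/tcp/30333/p2p/PeerA"],
}
authority_peers = map_authorities(
    [authority_a, authority_b], fake_dht.get, {"PeerA", "PeerB"}
)
print(authority_peers)
```

Signature verification of the record is deliberately left out here; a real implementation would need to check it before trusting the addresses.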

Authority nodes are indicated with a different color:

12 Likes

That’s really cool! It’s fascinating to see how the Substrate ecosystem is evolving, and this project sounds like a valuable tool for understanding the network’s topology and performance. I’m particularly interested in the geospatial visualization aspect, as it could provide insights into the global distribution of nodes and potential bottlenecks.

I like the direction of this project. Wonder if you have the codebase open source somewhere? Would love to learn more about how you implement the system.

Nice tool! Where can we find this? :smiley:

Since it seems the experiment has attracted some interest, we will publish it in a public repo once we have some spare time to clean it up, as it is currently a side project. :smiley_cat:

Do you have specific questions? We are happy to help. As a note, the discovery mechanism is the same as the one described in “Exploring alternatives for Substrate Telemetry - #4 by dennis-tra” (surprisingly, because we were not aware of the Nebula project).

Regarding the source code, see the reply above. As for a deployment, we have only been running it locally.

1 Like

As far as I can see, Nebula currently doesn’t include AuthorityDiscovery, and this is quite a useful feature. So it would be nice if you could put the code somewhere, even if the code isn’t that nice :wink:

1 Like

I’ve uploaded a gist with the steps of the P2P authority discovery process.

Here you go: Polkadot P2P Authority Discovery · GitHub

Have a good day! :smiley:

1 Like

Any chance the backend will be open sourced anywhere anytime soon? I think this would be very useful to have available.

3 Likes

As soon as we have a little bit of spare time, we will use it to open source the project :smile::flamingo:

I suppose the most interesting parts are the DHT crawler and the peer ID mapping. Or are other parts of interest to you?

2 Likes

This is very interesting. Have you stopped working on it?

Hey! Glad that you found it interesting.

We’ve explored the possibilities of a production-ready P2P-level network explorer, incorporating features like real-time tracking; Polkadot-specific elements such as detecting parachain collators and stake delegations; topology insights into node availability, exploring resilience through uptime, connection spans, and autonomous system clusters; and advanced visualizations. There are many interesting possibilities. We even considered implementing it in Nim.

However, while the technical foundation and conceptualization are solid, the lack of funding and our current focus on other priorities have shifted our efforts elsewhere for the time being.

1 Like