Distributed Hash Tables for nuWeb

I’ve recently been referring to Distributed Hash Tables, or DHTs (no relation to the chemical that balds heads), in my entries and on the nuWeb project. Since I never introduced them directly, I’ll do so now.

Descriptions: * Wikipedia * infoAnarchy * nuWeb

A few weeks ago I finally began to understand how DHTs work. Interestingly, they were different from what I thought they were. My goal for the DHT network in the nuWeb project is to provide put(data) and get(key) functions, with the key being the hash of the data. Most DHT networks instead provide put(key,value) and get(key) functions, where the hash is of the key and the data is arbitrary, which quickly opens the possibility of key collisions.

So, I began to feel at a loss. If DHTs don’t use the hash of the data and instead use the hash of the key, how could I use it for the P2P aspect of the nuWeb project? And then it hit me…

Just do it as a subset of the DHT functionality: use the data for both the key and the value! The key is hashed and used externally, while the value is stored safely in the DHT alongside any other values that collide on the same key. When we look up the key and get the bogus values back as well, we can tell which one was intended because it is the one that hashes properly to the key we just used.
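Here’s a rough sketch of that trick in Python. Everything in it is an assumption for illustration: ToyDHT is an in-memory stand-in for whatever real DHT ends up being used, SHA-256 is just one possible hash, and put_data/get_data are names I made up for the put(data) and get(key) functions.

```python
import hashlib


class ToyDHT:
    """Hypothetical in-memory stand-in for a put(key, value)/get(key) DHT.

    Several values may end up under one key, like collisions in a real network.
    """

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store.setdefault(key, []).append(value)

    def get(self, key):
        return list(self._store.get(key, []))


def content_key(data: bytes) -> str:
    # Assumption: SHA-256 as the content hash; any strong hash would do.
    return hashlib.sha256(data).hexdigest()


def put_data(dht, data: bytes) -> str:
    """put(data): store the data under the hash of itself, return that key."""
    key = content_key(data)
    dht.put(key, data)
    return key


def get_data(dht, key: str):
    """get(key): return the stored value that hashes back to the key, if any."""
    for candidate in dht.get(key):
        if content_key(candidate) == key:
            return candidate
    return None  # only bogus values (or nothing) under this key
```

A bogus value that somebody else stored under the same key is simply skipped, because it won’t hash back to the key we asked for.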

And, because of the separation of keys and values, we can also develop new keying schemes that aren’t based on the hash of the data, such as hashing the URL or another attribute of a document hosted on a host that doesn’t support nuWeb. We rely on header characteristics in the values to check that it’s the same document as on the original host, and then download it from the DHT in addition to the progressive download from the host.
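A minimal sketch of how the URL-keyed case might look, again in Python. The entry layout (a dict with "headers" and "data") and the choice of Content-Length, ETag and Last-Modified as the "header characteristics" are my assumptions here, not something the DHT dictates; url_key, matches_origin and usable_entries are hypothetical helper names.

```python
import hashlib

# Assumption: these HTTP headers are enough to recognise "the same document".
COMPARE_FIELDS = ("Content-Length", "ETag", "Last-Modified")


def url_key(url: str) -> str:
    """Key for a document that lives on an ordinary web host: hash of its URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def matches_origin(entry_headers: dict, origin_headers: dict) -> bool:
    """True if every header present on both sides agrees (and at least one is)."""
    shared = [f for f in COMPARE_FIELDS
              if f in entry_headers and f in origin_headers]
    return bool(shared) and all(
        entry_headers[f] == origin_headers[f] for f in shared
    )


def usable_entries(entries: list, origin_headers: dict) -> list:
    """Keep only the DHT entries that look like the document the host serves."""
    return [e for e in entries if matches_origin(e["headers"], origin_headers)]
```

The idea is that a client would fetch the headers from the original host first, look up url_key(url) in the DHT, and only pull data from the entries that usable_entries lets through. Unlike the content-hash case, this check is only as trustworthy as the headers are.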

So we can still do the put(data) and get(key) functions, and the room to expand to the full put(key,value) and get(key) functions is still there.

All we need now is a DHT network that’s:

  • Open (Standardized)
  • Stable (Works)
  • Reliable (Fast)

Overnet and eDonkey2000 are closed source, Freenet is too slow, Mnet won’t even work for me, and the rest of the developmental DHTs are too complex or ‘researchy’. I’d also like to do splitfiles, so that we don’t put too much data under a single key: a large file gets split into smaller entries, and a final manifest entry represents them all.
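To make the splitfile idea concrete, here’s a small Python sketch. It assumes a plain dict standing in for the DHT, SHA-256 content keys, a JSON manifest, and a tiny chunk size so the example stays readable; none of those choices are settled, and put_split/get_split are made-up names.

```python
import hashlib
import json


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def put_split(store: dict, data: bytes, chunk_size: int = 4) -> str:
    """Store data as content-keyed chunks plus a manifest; return manifest key."""
    chunk_keys = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        key = sha(chunk)
        store[key] = chunk
        chunk_keys.append(key)
    # The manifest is itself just another content-keyed entry.
    manifest = json.dumps({"size": len(data), "chunks": chunk_keys}).encode()
    manifest_key = sha(manifest)
    store[manifest_key] = manifest
    return manifest_key


def get_split(store: dict, manifest_key: str) -> bytes:
    """Fetch the manifest, then every chunk it lists, verifying each hash."""
    manifest = store[manifest_key]
    if sha(manifest) != manifest_key:
        raise ValueError("manifest does not match its key")
    info = json.loads(manifest)
    parts = []
    for key in info["chunks"]:
        chunk = store[key]
        if sha(chunk) != key:
            raise ValueError("chunk does not match its key")
        parts.append(chunk)
    data = b"".join(parts)
    if len(data) != info["size"]:
        raise ValueError("reassembled size does not match manifest")
    return data
```

Only the manifest key needs to be passed around; everything else can be found and verified from it.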

We could possibly even use forward error correction (FEC). At the same time, I’ve always wondered whether, in theory, FEC actually helps or merely creates its own problems, since it increases the amount of data the DHT network needs to store.
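To get a feel for that trade-off, here’s about the simplest erasure code there is, sketched in Python: one XOR parity block over k equal-length chunks. This is only an illustration under my own assumptions (equal-length chunks, at most one lost block); a real system would more likely use something stronger such as Reed-Solomon, and add_parity/recover are hypothetical names.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(chunks: list) -> list:
    """Return the chunks plus one parity block (XOR of all chunks)."""
    if len({len(c) for c in chunks}) != 1:
        raise ValueError("chunks must be non-empty and of equal length")
    return list(chunks) + [reduce(xor_bytes, chunks)]


def recover(blocks: list) -> bytes:
    """Rebuild the single missing block (marked None) from the others."""
    present = [b for b in blocks if b is not None]
    if len(present) != len(blocks) - 1:
        raise ValueError("exactly one block must be missing")
    return reduce(xor_bytes, present)
```

With k chunks this stores k + 1 blocks, so the extra storage is 1/k of the original, and in exchange any one block can go missing without losing the file. Whether that is a good deal for a DHT depends on how often entries actually disappear, which is exactly the open question.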

In the end, this clearly seems to be the future of P2P technology. With that and my hunch that P2P will be the most profitable technology since VHS, I think there’s a lot of potential here. This, in combination with my idea to provide live updates via instant messaging, forms the hybrid that will recreate the web - a nuWeb!
