Currently, when you think of P2P file sharing, you think of a program where you search, locate, and retrieve files all within the P2P network. The problem is that P2P is not currently the most efficient or secure way of searching for and locating files. Many users will find misnamed files and retrieve files they did not intend to. They will also find hundreds of similar files, differing in ways unknown to the user unless they download each and every one of them for comparison.
I believe the days of P2P search are ending, and that P2P file sharing will instead be used primarily by content publishers as a means of cheap and efficient data distribution and propagation. The publisher links to a Magnet URI or eDonkey URI from their website, and the user or their browser uses that to download the exact file in question. The URI contains the hash of the file and the name it should be saved as.
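To make the idea of a link that carries a hash and a name concrete, here is a minimal sketch of pulling both out of a Magnet URI. It assumes the common layout where the `xt` parameter holds `urn:<hash type>:<digest>` and `dn` holds the file name; the function name `parse_magnet` and the example link are illustrative only, and real clients handle more parameters and hash formats than this.

```python
from urllib.parse import urlparse, parse_qs


def parse_magnet(uri):
    """Extract the hash type, digest, and suggested file name from a Magnet URI.

    Assumes xt has the simple form "urn:<hash type>:<digest>".
    """
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet URI")

    params = parse_qs(parsed.query)
    xt = params.get("xt", [None])[0]
    if xt is None:
        raise ValueError("magnet URI has no xt (exact topic) parameter")

    parts = xt.split(":", 2)
    if len(parts) != 3 or parts[0] != "urn":
        raise ValueError("unexpected xt format: " + xt)

    _, algorithm, digest = parts
    name = params.get("dn", [None])[0]  # the name is optional
    return {"algorithm": algorithm, "digest": digest, "name": name}


if __name__ == "__main__":
    # Illustrative link only; the digest is a placeholder value.
    link = "magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C&dn=example.iso"
    print(parse_magnet(link))
```

The point is that the name is only a convenience for saving the file; the digest is what actually identifies it.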
How can we guarantee that the exact file the publisher is referring to is provided to the user and not some other file with a similar name? Well, instead of having a search done based on a file’s name, a search is done on its hash.
A hash, also known as a message digest, is a short value computed from a file by a mathematical checksum algorithm, and for practical purposes it is unique to each and every file. The idea is that it is easy to create a hash from a file but almost impossible to create a file to match a given hash. Given a file and a hash, one can determine whether they match. Hashes are usually much smaller than the original file, and are the same length as other hashes made using the same algorithm.
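As a rough illustration of those properties, the sketch below uses Python's standard `hashlib` module to hash data in chunks and to check whether a given hash matches. SHA-1 is used here only as an assumption for the example; the helper names `digest_of` and `matches` are made up for this sketch.

```python
import hashlib
import io


def digest_of(stream, algorithm="sha1", chunk_size=64 * 1024):
    """Hash a file-like object chunk by chunk and return the hex digest."""
    h = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def matches(stream, expected_hex, algorithm="sha1"):
    """Return True if the stream's contents hash to the expected digest."""
    return digest_of(stream, algorithm) == expected_hex.lower()


if __name__ == "__main__":
    small = io.BytesIO(b"a short file")
    large = io.BytesIO(b"x" * 1_000_000)
    # Both digests have the same length, regardless of the input size.
    print(len(digest_of(small)), len(digest_of(large)))
```

With a real file you would pass `open(path, "rb")` instead of a `BytesIO` object.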
In the case of hash based P2P file sharing networks, the user or their P2P client, provided a hash by the content publisher, searches for, locates, and downloads a file using said hash. Other clients, given the hash, can search their data stores for a file that matches that hash and return that file. Once the file has been retrieved, the user can verify that the file is the one intended by the content publisher by checking whether the hash matches the file.
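The search and verify steps described above can be sketched with a toy, in-memory model. Everything here is hypothetical: the `Peer` and `LyingPeer` classes, the `fetch` function, and the choice of SHA-1 stand in for whatever a real network uses, and no actual networking takes place.

```python
import hashlib


def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()


class Peer:
    """A peer that indexes the files in its data store by hash."""

    def __init__(self, files):
        self.index = {sha1_hex(data): data for data in files}

    def lookup(self, digest):
        # Returns the file's bytes, or None if this peer does not have it.
        return self.index.get(digest)


class LyingPeer(Peer):
    """A misbehaving peer that answers every request with the wrong data."""

    def lookup(self, digest):
        return b"not the file you asked for"


def fetch(peers, digest):
    """Ask each peer for the hash and accept only data that verifies."""
    for peer in peers:
        data = peer.lookup(digest)
        if data is not None and sha1_hex(data) == digest:
            return data
    return None


if __name__ == "__main__":
    wanted = b"the publisher's release"
    peers = [LyingPeer([]), Peer([b"something else"]), Peer([wanted])]
    print(fetch(peers, sha1_hex(wanted)) == wanted)
```

Because the downloader re-checks the hash itself, it does not need to trust any individual peer, only the hash it got from the publisher.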
Since the network trades problematic file name searching for file hash searching, it is less burdened, and there is more bandwidth available for actual data distribution. This lack of name searching, combined with other file sharing techniques such as swarming, provides an optimal network that is faster than more traditional ones.
The best example of hash based P2P file sharing is BitTorrent. While it is not a network per se, it is a P2P protocol that uses hashes. In it, files are split into blocks of a set size, and each block’s hash is recorded in a .torrent file, along with the address of a tracker which coordinates the peers in the P2P swarm. Because there is a larger amount of hash (ha ha), it is even more secure: a bad block can be detected and discarded on its own, without waiting for the whole file. Some hash based P2P file sharing networks also share this characteristic of block hashing.
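Block hashing can be sketched in a few lines. This is not the actual .torrent encoding, just an illustration of the idea under the assumption of SHA-1 and a fixed block size; the function names `block_hashes` and `corrupted_blocks` are invented for the example.

```python
import hashlib


def block_hashes(data, block_size):
    """Split data into fixed-size blocks and hash each one (last may be short)."""
    return [
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def corrupted_blocks(data, block_size, expected):
    """Return the indexes of blocks whose hash differs from the published list."""
    actual = block_hashes(data, block_size)
    if len(actual) != len(expected):
        raise ValueError("block count does not match the published hashes")
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


if __name__ == "__main__":
    original = bytes(range(256)) * 4          # 1024 bytes, four 256-byte blocks
    published = block_hashes(original, 256)   # what the publisher would record

    damaged = bytearray(original)
    damaged[300] ^= 0xFF                      # flip one byte inside the second block
    print(corrupted_blocks(bytes(damaged), 256, published))
```

Only the damaged block needs to be fetched again, which is what makes this approach a good fit for swarming.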
However, it would be ideal to have a P2P file sharing network itself that was hash based. The good news is that some already exist. The bad news is that not one has been fully adopted.
The current leading network regarding hashes is the eDonkey2000 network, with KaZaA and Gnutella coming in second and third, respectively. The problem is that the speed and reliability of these networks are often horrible compared to a functioning BitTorrent tracker.
We’ve also got hash based P2P file sharing networks that cache data automatically, such as Freenet and Mnet. While these are better at keeping data available on the network, they seem to fare even worse on speed and reliability.
The goal is to have a cached hash based P2P file sharing network: one where content publishers can link to their files and users can retrieve and redistribute them, all using short hashes to refer to the files in question. The next step toward this goal is to improve speed and reliability. As to when this will happen, it depends on those designing and developing these networks.
Personally, I am voting for a modified Gnutella network with a caching mechanism, because I find Gnutella to be the fastest and most reliable hash based P2P file sharing network available today. I would vote for Mnet as well; however, I have yet to see it in action.