Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications – Part I: Motivation and System overview

0 0
Read Time:4 Minute, 31 Second

In the very first post on All Things Distributed, we will be reviewing the paper: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications available here. We will discuss the paper in segments and this edition covers the motivation behind Chord and provides a system overview.

Figure 1: A 16-node Chord network.

Introduction

A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. Chord distributed protocol attempts to address this problem. The idea behind this is quite simple: Given a key, it maps the key onto a node. Data location can now be easily implemented on top of Chord by associating a key with each data item, and storing the key/data pair at the node to which the key maps.

Chord also takes into account the dynamic nature of a peer-to-peer network with nodes joining and leaving the network constantly. Chord is designed to be scalable and experiments have shown that Chord scales logarithmically with the number of Chord nodes.

Peer-to-peer networks can be built with a very rich set of features such as anonymity, redundant storage, permanence, search, authentication, hierarchical naming etc. However, a core operation of most peer-to-peer networks is efficient location of data items and to identify which node has a given piece of data. Chord is a scalable protocol for lookup in a dynamic peer-to-peer system with frequent node arrivals and departures.

The Chord protocol supports only one operation: Given a key, it maps the key onto a node. Depending on the application using Chord, that node might be responsible for storing a value associated with the key. Chord uses consistent hashing to assign keys to Chord nodes. Consistent hashing tends to balance load, since each node receives roughly the same number of keys, and requires relatively little movement of keys when nodes join and leave the system.

Usually in Consistent hashing, it is assumed that each node is aware of most of the other nodes in the system. However, this has some scalability issues for large number of nodes. Chord attempts to solve this problem by maintaining a routing table at each node. Each node only needs to maintain routing information about only a few nodes in the system. Since this routing table is distributed and maintained at each node, a Chord node communicates with other nodes to look up data efficiently.

In the steady state, in an N-node system, each node maintains information about only O(log N) other nodes, and resolves all lookups via O(log N) messages to other nodes. This routing table is kept up-to-date by the Chord system as nodes join and leave the network.

Note that a Chord node requires information about O(logN) other nodes for efficient routing, but performance degrades gracefully when that information is out of date. This is particularly important as in a large-scale peer-to-peer network nodes may be constantly joining and leaving the network and it may be hard to maintain information about O(log N) state.

The Chord protocol has been evaluated as a tool to serve DNS and also in a scheme to store public keys in a distributed fashion for secure name resolution. For additional work around Chord and related work, we encourage the reader to check Section II. Related work in the above paper.

The Chord System

Chord addresses these problems:

  • Load Balancing: Using a hash function, Chord distributes keys evenly over nodes.
  • Decentralization: Chord is fully distributed; all nodes are considered equal and perform functionally the same.
  • Scalability: Lookups in Chord system grows logarithmically with the number of Nodes. This is automatic; no parameter tuning is necessary
  • Availability: Chord continuously adjusts its routing tables as new nodes join and leave the network; This ensures that a node carrying the requested data can always be found.
  • Flexible Naming: Chord key space is flat; there is no special naming or other artificial constraints imposed in the system.


A typical usage of Chord is to link an application with the Chord library. The Chord library provides an API lookup(key) that returns the IP address of the node that is responsible for this key. Additionally, Chord software on each node notifies the application of changes in the set of keys that the node is responsible for. This allows the application software to, for example, move corresponding values to their new homes when a new node joins.

Chord can be used to build applications like Cooperative mirroring, Time-shared storage, Distributed Indexes, Large-scale Combinatorial Search etc. A distributed hash table can take advantage of Chord to identify the node for a given key and communicate with it.

In the next post, we will see the Chord Protocol in action.

References

  • Chord on Wikipedia
  • KARGER, D., LEHMAN, E., LEIGHTON, F., LEVINE, M., LEWIN, D.,AND PANIGRAHY, R. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (El Paso, TX, May 1997), pp. 654–663.
  • LEWIN, D. Consistent hashing and random trees: Algorithms for caching in distributed networks. Master’s thesis, Department of EECS, MIT, 1998. Available at the MIT Library, http://thesis.mit.edu/.
  • Medium post on Chord (with implementation)
Happy
0 0 %
Sad
0 0 %
Excited
0 0 %
Sleepy
0 0 %
Angry
0 0 %
Surprise
0 0 %

Average Rating

5 Star
0%
4 Star
0%
3 Star
0%
2 Star
0%
1 Star
0%

Leave a Comment

Your email address will not be published. Required fields are marked *

Exit mobile version