Eyes Clouded by Distributed Systems

You are probably reading this article on a machine with a dual- or quad-core processor, and perhaps with even more cores. Your computer is already a distributed system, with multiple computing components (cores) communicating with each other via main memory and other channels such as the physical buses, or wires, between them. As you browse multiple web pages you are interacting with the largest distributed system ever created: the Internet. We recently celebrated IPv6 Day [0]. IPv6 is a new form of addressing for devices connected to the Internet, needed because the Internet’s sheer scale has outgrown the address list of the previous standard, IPv4, with all 4+ billion of its addresses. Every Internet company depends on distributed systems, and, by extension, the economies of the world are now tied to them.

Companies such as Google, Facebook, and Amazon are all interested in building highly efficient, large-scale distributed systems to power their businesses. Over the previous decade, Google has described the Google File System (GFS) [1], a file system that spans thousands of computers to store more data than any single computer could hold, and MapReduce [2], a technology that has shaped almost every form of large-scale computing since its publication. MapReduce is distributed computing for the masses because it distills everything down to two functions, Map and Reduce; once they are specified, it handles all other aspects of coordinating thousands of computers on behalf of the programmer. Facebook has released open source projects such as Thrift [3] for implementing communication between programs written in different programming languages. Amazon built the first, and largest, public cloud, EC2 [4], by inventing new distributed systems designed to bring datacenter scale to the masses: with EC2 you can easily start 100 servers within minutes. Amazon has offered many other services to enhance its overall cloud, such as a storage substrate called S3 [5] (think of it as a building block for a GFS) and CloudFront [6], a content distribution network (CDN) designed to distribute data around the world for low-latency, high-bandwidth access. Akamai [7] also helps deliver the web’s content with one of the largest CDNs in the world. Netflix has its own distributed CDN [8], as it outgrew the solutions provided by Akamai and Amazon.
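To make the two-function idea concrete, here is a minimal, single-process sketch of the Map and Reduce style applied to word counting. It is not Google's implementation: the names `map_words`, `reduce_counts`, and `run_mapreduce` are made up for illustration, and the toy driver stands in for everything a real framework would do across thousands of machines (scheduling, data movement, failure handling).

```python
from collections import defaultdict

def map_words(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def reduce_counts(word, counts):
    """Reduce: combine all values emitted for one key into a single result."""
    return word, sum(counts)

def run_mapreduce(documents, map_fn, reduce_fn):
    """Toy driver (an assumption for illustration): runs both phases locally."""
    groups = defaultdict(list)
    for doc in documents:                  # map phase, one call per input
        for key, value in map_fn(doc):
            groups[key].append(value)      # group intermediate values by key
    # reduce phase, one call per distinct key
    return dict(reduce_fn(key, values) for key, values in groups.items())

word_counts = run_mapreduce(
    ["the quick brown fox", "the lazy dog", "The end"],
    map_words,
    reduce_counts,
)
print(word_counts["the"])
```

The programmer writes only `map_words` and `reduce_counts`; everything inside `run_mapreduce` is the part a framework would take over.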

The Domain Name System (DNS) [9] is a large distributed system everyone is familiar with, either directly or indirectly. You may have registered a DNS name in the past, providing you with your own customized domain name such as www.wolfgangrichter.com (not registered by the author). DNS is built as a multi-tier distributed architecture for load balancing and efficiency. DNS is one of the earliest examples of a distributed key-value store, sometimes called a dictionary: just something that maps arbitrary input keys to arbitrary output values. In DNS, your input key is a human-readable domain name, and the output value your computer expects DNS to return is a numeric IP address (IPv4 or IPv6) meant for machine consumption. The highest tier redirects to lower tiers, and so on, which reduces load and forces those responsible for domain names to host their own mapping DNS servers. You can imagine how slow the Internet would become if all domain name mappings had to be stored on a single small set of computers. The IP address is used by your web browser or other network-enabled applications to contact a server representing the human-readable domain name provided by you.
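The tiered key-value lookup can be sketched with a few nested dictionaries. This is a toy model, not the real DNS protocol: the server names (`tld-com`, `ns-example`), the three-tier split, and the address (taken from a range reserved for documentation) are all assumptions chosen for illustration.

```python
# Hypothetical zone data. Each tier only knows where to send you next.
ROOT = {"com": "tld-com"}                                  # top tier: one entry per top-level domain
TLD_SERVERS = {"tld-com": {"example.com": "ns-example"}}   # middle tier: who is responsible for each domain
AUTHORITATIVE = {"ns-example": {"www.example.com": "192.0.2.10"}}  # lowest tier: the actual name-to-address map

def resolve(name):
    """Follow referrals from the highest tier down to the final answer."""
    labels = name.split(".")
    tld = labels[-1]                     # e.g. "com"
    domain = ".".join(labels[-2:])       # e.g. "example.com"
    tld_server = ROOT[tld]                          # root refers us to a lower tier
    auth_server = TLD_SERVERS[tld_server][domain]   # which refers us to the domain owner's server
    return AUTHORITATIVE[auth_server][name]         # which holds the key-value mapping

print(resolve("www.example.com"))
```

No single dictionary holds every mapping; each tier stores only a small slice, which is the load-spreading property described above.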

With world economies tied to distributed systems, it is no accident that the study of distributed systems is paramount to the future of computing, and research reflects this with efforts such as the Exascale [10] project. The Exascale project explores what future distributed systems might look like beyond the largest scale imaginable today. No problem moving forward will be able to avoid the challenges of distributed computing, which are often messy but ultimately satisfying to overcome. The future of computing depends upon our ability to develop, deploy, and maintain distributed systems.

As the readership and an author of this blog, we must come to an agreement on the definition of ‘distributed system.’ Of course, as a distributed systems researcher, my view is clouded by a lens through which I see everything as a distributed system. You may not agree with me, and we encourage discourse, so please feel free to comment with your criticism. You might wonder, “Why don’t we just use the definition in Merriam-Webster’s [not in Merriam-Webster’s!]? Or Wikipedia [11]?” Well, everyone in academia likes to make up their own definitions for things, and occasionally their own words. I have, of course, saved the best for last. I hope that the definition below crisply defines what a distributed system is in your mind, as I hope to dissect many of the most interesting developments in distributed systems research in future articles:

In Computer Science, a distributed system is any set of entities capable of computation which also have the capability of communicating via a set of mechanisms such that computation may be organized among them.

Examples of distributed systems:

Car – multiple embedded microprocessors
Single core computer with graphics card – two discrete computation entities communicating via shared buses
Multicore computer – clearly a distributed system with multiple cores
Networked computers – at a minimum they cooperate via network protocols; in the limit they could be architected together for high performance or scientific computing
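The definition above can be illustrated in miniature. In this sketch, which is only one of many possible readings, the "entities capable of computation" are two Python threads and the "set of mechanisms" for communication is a pair of queues; the `worker` function and the squaring task are arbitrary choices made for the example.

```python
import queue
import threading

def worker(inbox, outbox):
    """A computing entity: receive numbers, send back their squares."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel value: no more work is coming
            break
        outbox.put(item * item)

requests = queue.Queue()          # channel from the main entity to the worker
replies = queue.Queue()           # channel from the worker back to the main entity

entity = threading.Thread(target=worker, args=(requests, replies))
entity.start()

for n in [1, 2, 3]:               # organize the computation: hand out the work
    requests.put(n)
requests.put(None)
entity.join()

results = [replies.get() for _ in range(3)]
print(results)
```

Swap the threads for cores, microprocessors, or networked machines, and the queues for buses or network protocols, and the same shape covers each example in the list above.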

3 thoughts on “Eyes Clouded by Distributed Systems”

Choi on July 22, 2012 at 5:22 pm said:

I don’t think that managing 10,000 servers is a good use of resources. If virtual servers exist and can handle the same amount of data or more, it would likely save company resources. Potentially using less energy to keep them cool and less manpower to maintain them. More likely there needs to be development of smaller servers with more capacity to reduce the footprint freeing up more space.

- Wolfgang Richter on August 2, 2012 at 6:52 pm said:
  
  Well, the virtual servers have to run on real servers right?
  
  So, if a company has say 70,000 servers, and they pack 7 to a single real server, they still need 10,000 servers.
  
  Fundamentally, it is good to study and understand how to manage tens of thousands of servers.
  
- Tran on August 24, 2012 at 12:12 pm said:
  
  I wish to express my appreciation to this writer for bailing me out of such a challenge. Just after browsing through the search engines and finding principles which are not powerful, I was thinking my entire life was well over. Living devoid of the strategies to the issues you have fixed through this posting is a serious case, and those which may have adversely affected my career if I hadn’t encountered your web site. That capability and kindness in handling all areas was crucial. I’m not sure what I would’ve done if I hadn’t discovered such a subject like this. I can also at this time look ahead to my future. Thank you so much for the skilled and results-oriented guide. I will not be reluctant to endorse your web site to any individual who needs to have guidelines on this subject matter.
  

XRDS

Crossroads – The ACM Magazine for Students
