Distributed computing with Linux and Hadoop

Posted by jmalasko on Dec 11, 2008 2:43 AM EDT
IBM/developerWorks

Every day, people rely on search engines to find specific content in the many terabytes of data that exist on the Internet, but have you ever wondered how this search is actually performed? One approach is Hadoop, a software framework that enables distributed manipulation of vast amounts of data. One application of Hadoop is the parallel indexing of Internet Web pages. Hadoop is an Apache project with support from Yahoo!, Google, IBM, and others. This article introduces the Hadoop framework and shows you why it's one of the most important Linux-based distributed computing frameworks.
