xxv, 727 pages : 24 cm
Previews available in: English
Subjects
Electronic data processing, Distributed processing, Cloud computing, File organization (Computer science), Apache Hadoop (Computer file), Apache Hadoop, Computer software, Apache (computer program), Data processing, COMPUTERS, Database Management, General, Desktop Applications, Databases, System Administration, Storage & Retrieval, Hadoop, Parallel, Java, Games, Com051220, Cs.cmp_sc.app_sw, Cs.cmp_sc.prog_lang, Open source
Edition | Availability |
---|---|
1 | |
2. Hadoop: The Definitive Guide. 3rd edition. May 19, 2012, O'Reilly Media, Inc. Paperback. ISBN 1449311520, 9781449311520 | |
3. Hadoop: The Definitive Guide. 2nd rev. and updated ed. 2010, O'Reilly. Electronic resource, in English. ISBN 1449398642, 9781449398644 | |
4 | |
Book Details
Table of Contents
1. Meet Hadoop
Data!
Data Storage and Analysis
Comparison with Other Systems
A Brief History of Hadoop
Apache Hadoop and the Hadoop Ecosystem
Hadoop Releases
2. MapReduce
A Weather Dataset
Analyzing the Data with Unix Tools
Analyzing the Data with Hadoop
Scaling Out
Hadoop Streaming
Hadoop Pipes
3. The Hadoop Distributed Filesystem
The Design of HDFS
HDFS Concepts
The Command-Line Interface
Hadoop Filesystems
The Java Interface
Data Flow
Data Ingest with Flume and Sqoop
Parallel Copying with distcp
Hadoop Archives
4. Hadoop I/O
Data Integrity
Compression
Serialization
Avro
File-Based Data Structures
5. Developing a MapReduce Application
The Configuration API
Setting Up the Development Environment
Writing a Unit Test with MRUnit
Running Locally on Test Data
Running on a Cluster
Tuning a Job
MapReduce Workflows
6. How MapReduce Works
Anatomy of a MapReduce Job Run
Failures
Job Scheduling
Shuffle and Sort
Task Execution
7. MapReduce Types and Formats
MapReduce Types
Input Formats
Output Formats
8. MapReduce Features
Counters
Sorting
Joins
Side Data Distribution
MapReduce Library Classes
9. Setting Up a Hadoop Cluster
Cluster Specification
Cluster Setup and Installation
SSH Configuration
Hadoop Configuration
YARN Configuration
Security
Benchmarking a Hadoop Cluster
Hadoop in the Cloud
10. Administering Hadoop
HDFS
Monitoring
Maintenance
11. Pig
Installing and Running Pig
An Example
Comparison with Databases
Pig Latin
User-Defined Functions
Data Processing Operators
Pig in Practice
12. Hive
Installing Hive
An Example
Running Hive
Comparison with Traditional Databases
HiveQL
Tables
Querying Data
User-Defined Functions
13. HBase
HBasics
Concepts
Installation
Clients
Example
HBase Versus RDBMS
Praxis
14. ZooKeeper
Installing and Running ZooKeeper
An Example
The ZooKeeper Service
Building Applications with ZooKeeper
ZooKeeper in Production
15. Sqoop
Getting Sqoop
Sqoop Connectors
A Sample Import
Generated Code
Imports: A Deeper Look
Working with Imported Data
Importing Large Objects
Performing an Export
Exports: A Deeper Look
16. Case Studies
Hadoop Usage at Last.fm
Hadoop and Hive at Facebook
Nutch Search Engine
Log Processing at Rackspace
Cascading
TeraByte Sort on Apache Hadoop
Using Pig and Wukong to Explore Billion-edge Network Graphs
History
- Created May 22, 2014
- 14 revisions
- March 28, 2025: edited by ImportBot (Redacting ocaids)
- December 20, 2023: edited by ImportBot (import existing book)
- December 21, 2022: edited by MARC Bot (import existing book)
- December 13, 2022: edited by MARC Bot (import existing book)
- May 22, 2014: created by Kar Vi (Added new book.)