- Overview
- Quick Links
- Quick Start
- Prerequisites
- Getting Started
- Getting Help
- Docker Image Management
- Interact with Hadoop
Quick and easy way to get Hadoop running in pseudo-distributed mode using Docker.
See Hadoop docs for more information.
Impatient and just want Hadoop quickly? Run:
docker run --rm -ti --name hadoop-pseudo loum/hadoop-pseudo:latest
NOTE: More at https://hub.docker.com/r/loum/hadoop-pseudo
Prerequisites:
- Docker
- GNU make
- Python 3 Interpreter. We recommend installing pyenv.
Get the code and change into the top-level git project directory:
git clone https://github.com/loum/hadoop-pseudo.git && cd hadoop-pseudo
NOTE: Run all commands from the top-level directory of the git repository.
For first-time setup, prime the Makester project:
git submodule update --init
Keep the Makester project up to date with:
make submodule-update
Set up the environment:
make init
There should be a make target to get most things done. Check the help for more information:
make help
NOTE: See Makester's docker subsystem for more detailed container image operations.
Build the container image locally:
make image-build
Search for built container image:
make image-search
Delete the container image:
make image-rm
Every Hadoop configuration setting can be overridden at container startup. Take the setting name and prefix it with the context of the configuration file it belongs to, as follows:
- Hadoop core-default.xml: override with CORE_SITE__<setting>
- Hadoop hdfs-default.xml: override with HDFS_SITE__<setting>
- Hadoop mapred-default.xml: override with MAPRED_SITE__<setting>
- Hadoop yarn-default.xml: override with YARN_SITE__<setting>
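The naming scheme above can be sketched in shell. This is only an illustration: `override_name` is a hypothetical helper, and it assumes the setting name is upper-cased with dots replaced by underscores and that overrides are passed to the container as environment variables. Neither detail is confirmed here, so check the image documentation before relying on it.

```shell
# Hypothetical helper: build an override name from a context prefix and a
# Hadoop property name.
# ASSUMPTION: the property name is upper-cased and dots become underscores.
# $1: context prefix (for example HDFS_SITE), $2: Hadoop property name
override_name() {
    printf '%s__%s\n' "$1" "$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]' | tr '.' '_')"
}

name="$(override_name HDFS_SITE dfs.replication)"
echo "$name"

# Print (rather than run) a command that would pass the override.
# ASSUMPTION: overrides are supplied as container environment variables.
echo "docker run --rm -ti --env ${name}=1 --name hadoop-pseudo loum/hadoop-pseudo:latest"
```

Under these assumptions, `dfs.replication` from hdfs-default.xml maps to `HDFS_SITE__DFS_REPLICATION`.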
Start the container and wait for all Hadoop services to initialize:
make controlled-run
Get the Hadoop version:
make hadoop-version
To drop into the container runtime's shell and interact with hdfs:
make container-bash
NOTE: The Hadoop Command Reference details the full command suite.
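Once inside the container shell, a short session such as the following can confirm that HDFS accepts writes and reads. It is a minimal sketch: the function name `hdfs_smoke_test`, the `/tmp/demo` directory and the `hello.txt` file are illustrative names, not part of this project.

```shell
# Hypothetical smoke test; /tmp/demo and hello.txt are illustrative names.
hdfs_smoke_test() {
    # Bail out early when the hdfs CLI is not on the PATH,
    # for example when run outside the container.
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs command not found; run this inside the container" >&2
        return 1
    fi
    # Create a small local file, copy it into HDFS, read it back, clean up.
    echo "hello hadoop" > /tmp/hello.txt
    hdfs dfs -mkdir -p /tmp/demo &&
    hdfs dfs -put -f /tmp/hello.txt /tmp/demo/ &&
    hdfs dfs -ls /tmp/demo &&
    hdfs dfs -cat /tmp/demo/hello.txt &&
    hdfs dfs -rm -r /tmp/demo
}
```

Call `hdfs_smoke_test` from the container shell. Each step runs only if the previous one succeeded, so a failure stops the sequence at the offending command.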
Stop the running container:
make container-stop
The following web interfaces are available to view configurations and logs:
- Hadoop NameNode web UI: http://localhost:9870
- YARN ResourceManager web UI: http://localhost:8088
- MapReduce JobHistory Server web UI: http://localhost:19888
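A quick way to check that the interfaces listed above are responding is to probe each URL from the host. The sketch below assumes `curl` is installed and that the container's ports are published on localhost. `check_ui` is a hypothetical helper name.

```shell
# Report whether a web UI answers HTTP requests.
# ASSUMPTION: curl is available and the ports are published on localhost.
# $1: label, $2: URL
check_ui() {
    if curl -fsS -o /dev/null --max-time 5 "$2" 2>/dev/null; then
        echo "$1: reachable"
    else
        echo "$1: not reachable"
    fi
}

check_ui "NameNode" "http://localhost:9870"
check_ui "ResourceManager" "http://localhost:8088"
check_ui "JobHistory" "http://localhost:19888"
```

If a service reports "not reachable" shortly after startup, it may still be initializing, so retrying after a few seconds is reasonable.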