Administering Hadoop: Questions and Answers
Question 1. Point out the wrong statement:
If you set the HBase service into maintenance mode, then its roles (HBase Master and all Region Servers) are put into effective maintenance mode
If you set a host into maintenance mode, then any roles running on that host are put into effective maintenance mode
Putting a component into maintenance mode prevents events from being logged
None of the mentioned
Explanation:-
Answer: Option C. -> Putting a component into maintenance mode prevents events from being logged
Events are still logged; maintenance mode only suppresses the alerts that those events would otherwise generate.
Question 2. Point out the wrong statement:
classNAME displays the class name needed to get the Hadoop jar
balancer runs a cluster balancing utility
An administrator can simply press Ctrl-C to stop the rebalancing process
None of the mentioned
Explanation:-
Answer: Option A. -> classNAME displays the class name needed to get the Hadoop jar
classpath prints the class path needed to get the Hadoop jar and the required libraries.
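As a quick illustration (the threshold value below is only an example), both utilities are run from the shell:

    # Print the class path needed to get the Hadoop jar and the required libraries
    hadoop classpath

    # Run the cluster balancing utility; -threshold 10 is an illustrative value
    # (allowed percentage deviation in DataNode disk usage)
    hdfs balancer -threshold 10
    # Pressing Ctrl-C stops the rebalancing process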
Question 3. __________ mode is a Namenode state in which it does not accept changes to the name space.
Recover
Safe
Rollback
None of the mentioned
Explanation:-
Answer: Option B. -> Safe
In safe mode the NameNode does not accept changes to the namespace and does not replicate or delete blocks; the state is inspected and toggled with the dfsadmin -safemode command.
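A minimal sketch of working with safe mode through the dfsadmin client (assumes a running NameNode):

    # Report whether the NameNode is currently in safe mode
    hdfs dfsadmin -safemode get

    # Enter safe mode manually; the namespace becomes read-only
    hdfs dfsadmin -safemode enter

    # Leave safe mode and resume accepting namespace changes
    hdfs dfsadmin -safemode leave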
Question 4. _________ command is used to copy files or directories recursively.
dtcp
distcp
dcp
distc
Explanation:-
Answer: Option B. -> distcp
Usage of the distcp command: hadoop distcp <source> <destination>.
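For example (the cluster host names and paths below are placeholders):

    # Recursively copy /data from one cluster to another; the copy runs as a MapReduce job
    hadoop distcp hdfs://nn1.example.com:8020/data hdfs://nn2.example.com:8020/backup/data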
Question 5. Which of the following is a common reason to restart a Hadoop process?
Upgrade Hadoop
React to incidents
Remove worker nodes
All of the mentioned
Explanation:-
Answer: Option D. -> All of the mentioned
The most common reason administrators restart Hadoop processes is to enact configuration changes.
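As an illustrative sketch using the Hadoop 3.x daemon syntax (older releases use the hadoop-daemon.sh and yarn-daemon.sh scripts instead), a daemon is typically restarted so it picks up edited configuration files:

    # Restart the NameNode after changing hdfs-site.xml
    hdfs --daemon stop namenode
    hdfs --daemon start namenode

    # Restart the ResourceManager after changing yarn-site.xml
    yarn --daemon stop resourcemanager
    yarn --daemon start resourcemanager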
Question 6. __________ Manager's Service feature monitors dozens of service health and performance metrics about the services and role instances running on your cluster.
Microsoft
Cloudera
Amazon
None of the mentioned
Explanation:-
Answer: Option B. -> Cloudera
Cloudera Manager's Service Monitoring feature presents health and performance data in a variety of formats.
Question 7. Point out the correct statement:
All hadoop commands are invoked by the bin/hadoop script
Hadoop has an option parsing framework that employs only parsing generic options
archive command creates a hadoop archive
All of the mentioned
Explanation:-
Answer: Option A. -> All hadoop commands are invoked by the bin/hadoop script
Running the hadoop script without any arguments prints the description for all commands.
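For example (the archive name and paths below are illustrative):

    # Invoking the script with no arguments prints the description for all commands
    bin/hadoop

    # Create a Hadoop archive (.har) from the input directory under /user/alice
    hadoop archive -archiveName files.har -p /user/alice input /user/alice/archives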
Question 8. Which of the following scenarios may not be a good fit for HDFS?
HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
HDFS is suitable for storing data related to applications requiring low latency data access
None of the mentioned
Explanation:-
Answer: Option A. -> HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
HDFS does not support multiple or simultaneous writers to the same file. It can, however, be used for storing archive data, since HDFS stores data on low-cost commodity hardware while ensuring a high degree of fault tolerance.
Question 9. Point out the wrong statement:
Replication Factor can be configured at a cluster level (Default is set to 3) and also at a file level
Block Report from each DataNode contains a list of all the blocks that are stored on that DataNode
User data is stored on the local file system of DataNodes
DataNode is aware of the files to which the blocks stored on it belong to
Explanation:-
Answer: Option D. -> DataNode is aware of the files to which the blocks stored on it belong to
It is the NameNode, not the DataNode, that is aware of the files to which the stored blocks belong.
Question 10. The need for data replication can arise in various scenarios like:
Replication Factor is changed
DataNode goes down
Data Blocks get corrupted
All of the mentioned
Explanation:-
Answer: Option D. -> All of the mentioned
Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
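A minimal sketch of the two usual ways to set the replication factor (the path and value below are illustrative): the cluster-wide default comes from the dfs.replication property in hdfs-site.xml, and an individual file can be changed with setrep:

    # Set the replication factor of an existing file to 3 and wait for replication to complete
    hdfs dfs -setrep -w 3 /user/alice/data/part-00000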