HDFS (Engineering > Computer Science and Engineering > Hadoop) Questions and Answers
Question 1. InputFormat class calls the ________ function and computes splits for each file and then sends them to the jobtracker.
puts
gets
getSplits
All of the mentioned
Explanation:-
Answer: Option C. -> getSplits
Having computed the splits, the client sends them to the jobtracker, which uses their storage locations to schedule map tasks to process them on the tasktrackers.
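The split computation can be sketched as a toy function (plain Python, not the Hadoop API; one split per block, with the 128 MB default block size assumed for illustration):

```python
# Toy model of how getSplits() divides a file into block-sized splits.
# Not the Hadoop API: real splits also carry the hosts storing each block.
def get_splits(file_length, block_size=128 * 1024 * 1024):
    """Return (offset, length) pairs covering the file, one per block."""
    splits = []
    offset = 0
    while offset < file_length:
        length = min(block_size, file_length - offset)
        splits.append((offset, length))
        offset += length
    return splits

# A 300 MB file yields two full 128 MB splits plus a 44 MB remainder.
splits = get_splits(300 * 1024 * 1024)
```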
Question 2. _________ identifies filesystem pathnames which work as usual with regular expressions.
-archiveName
source
destination
None of the mentioned
Explanation:-
Answer: Option B. -> source
source identifies the filesystem pathnames, which work as usual with regular expressions (path globs); destination identifies the destination directory which would contain the archive.
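The pathname patterns accepted for archive sources behave like shell globs. As a loose analogy (plain Python `fnmatch`, not Hadoop's own glob engine), matching a pattern against a set of paths looks like this:

```python
# Illustration only: glob-style matching of filesystem pathnames,
# similar in spirit to how Hadoop expands source path patterns.
from fnmatch import fnmatch

paths = [
    "/user/logs/2020-01.log",
    "/user/logs/2020-02.log",
    "/user/data/a.csv",
]
matched = [p for p in paths if fnmatch(p, "/user/logs/*.log")]
```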
Question 3. On a tasktracker, the map task passes the split to the createRecordReader() method on InputFormat to obtain a _________ for that split.
InputReader
RecordReader
OutputReader
None of the mentioned
Explanation:-
Answer: Option B. -> RecordReader
The RecordReader loads data from its source and converts it into key/value pairs suitable for reading by the Mapper.
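A simplified stand-in (not the Hadoop RecordReader API) shows the idea: for a text split, each record is a (byte offset, line) key/value pair, as produced by TextInputFormat's line reader:

```python
# Toy record reader: turn the bytes of a split into
# (byte offset, line) key/value pairs for the mapper.
def record_reader(split_bytes):
    offset = 0
    for line in split_bytes.splitlines(keepends=True):
        yield offset, line.rstrip(b"\n")   # key = offset, value = line text
        offset += len(line)                # advance past the newline too

records = list(record_reader(b"alpha\nbeta\n"))
```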
Question 4. _________ is a pluggable Map/Reduce scheduler for Hadoop which provides a way to share large clusters.
Flow Scheduler
Data Scheduler
Capacity Scheduler
None of the mentioned
Explanation:-
Answer: Option C. -> Capacity Scheduler
The Capacity Scheduler supports multiple queues, where a job is submitted to a queue.
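The core idea can be sketched in a few lines (a toy model, not the real scheduler): each queue is guaranteed a configured fraction of the cluster's capacity, so organizations sharing a cluster each get a minimum share.

```python
# Toy sketch of capacity allocation: each queue gets a configured
# fraction of the total task slots. The queue names are hypothetical.
def slots_per_queue(total_slots, capacities):
    """capacities: dict of queue name -> fractional capacity (sums to 1.0)."""
    return {queue: int(total_slots * frac) for queue, frac in capacities.items()}

alloc = slots_per_queue(100, {"prod": 0.7, "research": 0.3})
```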
Question 5. Which of the following parameters identifies the destination directory which would contain the archive?
-archiveName
source
destination
None of the mentioned
Explanation:-
Answer: Option C. -> destination
-archiveName is the name of the archive to be created.
Question 6. Which of the following scenario may not be a good fit for HDFS ?
HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
HDFS is suitable for storing data related to applications requiring low latency data access
None of the mentioned
Explanation:-
Answer: Option A. -> HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
HDFS follows a write-once, read-many model, so multiple or simultaneous writers to the same file are not supported. HDFS can, however, be used for storing archive data cheaply, since it stores data on low-cost commodity hardware while ensuring a high degree of fault tolerance.
Question 7. The need for data replication can arise in various scenarios like :
Replication Factor is changed
DataNode goes down
Data Blocks get corrupted
All of the mentioned
Explanation:-
Answer: Option D. -> All of the mentioned
Data is replicated across different DataNodes to ensure a high degree of fault-tolerance.
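A toy sketch (not HDFS internals) of the common trigger: the NameNode schedules re-replication for any block whose live replica count falls below the replication factor, for example when a DataNode goes down. The node and block names are made up for illustration.

```python
# Toy model: find blocks whose live replica count is below the
# replication factor, so new copies must be scheduled.
def under_replicated(block_replicas, live_nodes, replication_factor=3):
    """block_replicas: dict of block id -> set of DataNodes holding it."""
    return {
        block for block, nodes in block_replicas.items()
        if len(nodes & live_nodes) < replication_factor
    }

blocks = {"blk_1": {"dn1", "dn2", "dn3"}, "blk_2": {"dn1", "dn2", "dn4"}}
# With dn4 down, blk_2 has only two live replicas left:
needs_copy = under_replicated(blocks, live_nodes={"dn1", "dn2", "dn3"})
```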
Question 8. ________ is the slave/worker node and holds the user data in the form of Data Blocks.
DataNode
NameNode
Data block
Replication
Explanation:-
Answer: Option A. -> DataNode
A DataNode stores data in the Hadoop file system. A functional filesystem has more than one DataNode, with data replicated across them.
Question 9. The Reducer takes as input the grouped output of a:
Mapper
Reducer
Writable
Readable
Explanation:-
Answer: Option A. -> Mapper
In the shuffle phase, the framework fetches, for each Reducer, the relevant partition of the output of all the Mappers via HTTP.
Question 10. Interface ____________ reduces a set of intermediate values which share a key to a smaller set of values.
Mapper
Reducer
Writable
Readable
Explanation:-
Answer: Option B. -> Reducer
Reducer implementations can access the JobConf for the job.
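The shuffle/group/reduce pipeline from Questions 9 and 10 can be modeled in a few lines (a simplified sketch, not the Hadoop API): mapper output pairs are sorted and grouped by key, then each group is passed to a reduce function that emits a smaller set of values.

```python
# Toy model of shuffle + reduce: group (key, value) pairs from the
# mappers by key, then apply a reduce function to each group.
from itertools import groupby
from operator import itemgetter

def shuffle_and_reduce(mapper_output, reduce_fn):
    ordered = sorted(mapper_output, key=itemgetter(0))      # sort phase
    grouped = groupby(ordered, key=itemgetter(0))           # group by key
    return {key: reduce_fn(v for _, v in pairs) for key, pairs in grouped}

# Classic word-count reduce step: sum the 1s emitted per word.
counts = shuffle_and_reduce([("a", 1), ("b", 1), ("a", 1)], sum)
```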