Friday, December 31, 2010

Hadoop tips

  • Change logging level
    • Each daemon exposes a service at http://daemon_address:port/logLevel (port is the daemon's HTTP port) through which you can get and set the logging level.
    • Use the command line
      hadoop daemonlog -getlevel daemon_address:port fullyQualifiedClassName
      hadoop daemonlog -setlevel daemon_address:port fullyQualifiedClassName logLevel
    • Permanent change
      Edit log4j.properties and restart the daemon. For example:
          log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
          log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
          log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem=DEBUG
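
The logLevel service mentioned above can also be called from a script. Below is a minimal sketch that only builds the URL. It assumes the servlet reads the query parameters "log" and "level". The host name and port in the usage lines are placeholders, and the helper name log_level_url is made up for illustration.

```python
from urllib.parse import urlencode


def log_level_url(daemon_http_address, logger_name, level=None):
    """Build a URL for a daemon's /logLevel service.

    daemon_http_address is "host:port" of the daemon's HTTP server.
    With level=None the URL only queries the current level; with a
    level (for example "DEBUG") it asks the daemon to change it.
    The parameter names "log" and "level" are an assumption here.
    """
    params = {"log": logger_name}
    if level is not None:
        params["level"] = level
    return "http://%s/logLevel?%s" % (daemon_http_address, urlencode(params))


# Placeholder address; substitute your own JobTracker host and HTTP port.
print(log_level_url("jobtracker.example.com:50030",
                    "org.apache.hadoop.mapred.JobTracker"))
print(log_level_url("jobtracker.example.com:50030",
                    "org.apache.hadoop.mapred.JobTracker", "DEBUG"))
```

Opening either URL in a browser or with any HTTP client should have the same effect as the matching daemonlog command. A level set this way lasts only until the daemon restarts.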

  • Commission and decommission nodes
    The following four configuration parameters are relevant:
    dfs.hosts
    dfs.hosts.exclude
    mapreduce.jobtracker.hosts.filename (mapred.hosts for old version)
    mapreduce.jobtracker.hosts.exclude.filename (mapred.hosts.exclude for old version)

    For HDFS, run "hadoop dfsadmin -refreshNodes" after you change the include file or the exclude file.
    According to the mailing list, "mradmin -refreshNodes" was added in 0.21. So for MapReduce, you can run "hadoop mradmin -refreshNodes" after you change the include file (to commission a node) or the exclude file (to decommission a node).
    To add or remove a node permanently, you also need to update the slaves file, conf/slaves.
  • Block scanner report
    http://datanode_address:50075/blockScannerReport
  • To check the blocks and block locations of a specific file, use the following command:
      hadoop fsck file_to_check -files -blocks -locations -racks
    Note: you should run it on the master node.
    Use "hadoop fsck /" to check the health of the whole file system.
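
For the decommissioning tip above, the manual steps are to add the host to the exclude file and then refresh the nodes. Here is a minimal sketch of the file-editing step, assuming the exclude file is plain text with one host name per line. The function name, the file path and the host name are hypothetical. The refresh commands are the ones quoted in the tip and are only listed here, not executed.

```python
import os


def add_host_to_exclude_file(exclude_path, hostname):
    """Add hostname to the exclude file unless it is already listed.

    Assumes one host name per line. Returns True if the file was
    changed, False if the host was already present.
    """
    hosts = []
    if os.path.exists(exclude_path):
        with open(exclude_path) as f:
            hosts = [line.strip() for line in f if line.strip()]
    if hostname in hosts:
        return False
    hosts.append(hostname)
    # Rewrite the whole file so every entry ends with a newline.
    with open(exclude_path, "w") as f:
        f.write("\n".join(hosts) + "\n")
    return True


# Commands to run by hand (or via subprocess) after the file changes.
REFRESH_COMMANDS = [
    ["hadoop", "dfsadmin", "-refreshNodes"],  # HDFS
    ["hadoop", "mradmin", "-refreshNodes"],   # MapReduce, 0.21 and later
]
```

A call such as add_host_to_exclude_file("/path/to/exclude", "worker17.example.com") would be followed by running each entry of REFRESH_COMMANDS on the master node. The path must match whatever dfs.hosts.exclude points to in your configuration.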
