Monday, September 05, 2011

How to get network information in Linux

The following files and directories can be read directly:

/proc/net/dev
/sys/class/net/<if_name>/
/sys/class/net/<if_name>/statistics
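As a minimal sketch of reading those counters directly (assuming a Linux sysfs layout; the root directory is parameterized here only for illustration and testing):

```shell
# Print RX/TX byte counters for every interface found under a sysfs-style tree.
# The root argument defaults to /sys/class/net.
net_bytes() {
    root="${1:-/sys/class/net}"
    for dev in "$root"/*; do
        [ -d "$dev/statistics" ] || continue
        printf '%s rx=%s tx=%s\n' "$(basename "$dev")" \
            "$(cat "$dev/statistics/rx_bytes")" \
            "$(cat "$dev/statistics/tx_bytes")"
    done
}

net_bytes
```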

The netstat tool is your friend for network-related information:

netstat -i  # display interface info
netstat -s  # display per-protocol statistics
netstat -r  # display the routing table
netstat -tlnp  # display listening TCP sockets (and owning processes; needs root to see all PIDs)

Other tools: sar, ifconfig, iftop

Friday, July 08, 2011

How to see network card and disk speed

Sometimes, you may want to know the hardware speed limit for network interface cards and disks.

Network Interface Card

I use the commands:

dmesg | grep -i ethernet
dmesg | grep -i infiniband

The output looks like:

Intel(R) Gigabit Ethernet Network Driver - version 2.1.0-k2-1
igb 0000:0b:00.0: Intel(R) Gigabit Ethernet Network Connection
igb 0000:0b:00.1: Intel(R) Gigabit Ethernet Network Connection

and

mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (April 4, 2008)
Registered RDS/infiniband transport

I also use the command:

/sbin/lspci
The output contains the following useful information:
0b:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
0b:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
10:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR / 10GigE] (rev a0)

Disk

/sbin/lspci | grep -i ata

Other useful tools

mii-tool
hdparm

They usually require root privileges.
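For the NIC side, many drivers also expose the negotiated link speed in sysfs, readable without root. A hedged sketch (eth0 is an assumed interface name; the path is parameterized only so the logic can be illustrated):

```shell
# Report the negotiated link speed (in Mb/s) from sysfs, if the driver exposes it.
link_speed() {
    f="${1:-/sys/class/net/eth0/speed}"   # eth0 is an assumed interface name
    if [ -r "$f" ]; then
        echo "$(cat "$f") Mb/s"
    else
        echo "unknown (driver does not report speed here)"
    fi
}

link_speed
```

For disks, a quick throughput measurement is `sudo hdparm -tT /dev/sda` (device name assumed).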

Thursday, May 12, 2011

java.lang.LinkageError in Tomcat 6

Usually the cause of this error is that you unintentionally bundled the following two jars into your WAR file:

    el-api.jar
    jasper-el.jar

They are already provided by Tomcat 6.

In Tomcat 6, you will get the following error:

java.lang.LinkageError: loader constraint violation: loader …

You can solve the problem by adding the following snippets to your POM file, which mark both jars as provided so they are not packaged into the WAR:

<dependency>
        <groupId>org.apache.tomcat</groupId>
        <artifactId>el-api</artifactId>
        <version>[1,)</version>
        <scope>provided</scope>
</dependency>
<dependency>
        <groupId>org.apache.tomcat</groupId>
        <artifactId>jasper-el</artifactId>
        <version>[1,)</version>
        <scope>provided</scope>
</dependency>
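After rebuilding, you can double-check that the WAR no longer bundles the Tomcat-provided jars (the WAR path below is an assumed example):

```shell
# Should print nothing once the jars are correctly scoped as "provided".
unzip -l target/your-app.war | grep -E 'el-api|jasper-el'
```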

Read this post for more information.

Hadoop datanode version control

Sometimes, when you upgrade Hadoop, you may get the following error in your namenode log:

Incompatible build versions: namenode BV = Unknown; datanode BV =

./common-0.21.0/src/saveVersion.sh generates package-info.java, which includes version information. The content looks like:

@HadoopVersionAnnotation(version="0.21.1-SNAPSHOT", revision="1", branch="",
                         user="username", date="Mon Nov 15 12:28:49 EST 2010",
                         url="your_domain/path",
                         srcChecksum="a1aeb15b4854808d152989ba76f90fac")

saveVersion.sh is executed when you build Hadoop with Ant; it is invoked from build.xml (target "init").

In Java code, the class org.apache.hadoop.util.VersionInfo manages version information. It reads the version data from the package-info.java generated by saveVersion.sh.

In the class org.apache.hadoop.hdfs.server.datanode.DataNode, the handshake method checks whether the build versions match. The build version is computed as shown below.

public static String getBuildVersion() {
  return VersionInfo.getVersion() +
      " from " + VersionInfo.getRevision() +
      " by " + VersionInfo.getUser() +
      " source checksum " + VersionInfo.getSrcChecksum();
}

So the quick fix is to upgrade the Hadoop installation on every node, so that all nodes run the same build.
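A small sketch of the condition you effectively need to satisfy: every node must report an identical build string. The version strings below are hypothetical; in practice each would come from running bin/hadoop version on a node (e.g. over ssh):

```shell
# Succeeds only if all given build-version strings are identical.
same_build() {
    first="$1"
    for v in "$@"; do
        if [ "$v" != "$first" ]; then
            echo "mismatch: '$v' != '$first'"
            return 1
        fi
    done
    echo "all nodes match: $first"
}

same_build "Hadoop 0.21.0" "Hadoop 0.21.0"
```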

How to install user-provided jars to Hadoop

If you write a MapReduce program and compile it to a jar, you usually run it with the following command:

./bin/hadoop jar your_jar_name

If you want your jar to be loaded when Hadoop starts (e.g., you add a new service that should be initialized and started by Hadoop), follow the steps below.

In the file bin/hadoop-config.sh, you can find the following snippet:

for f in $HADOOP_COMMON_HOME/hadoop-*.jar; do
  CLASSPATH=${CLASSPATH}:$f;
done

So only jars whose names start with "hadoop-" are loaded by default.

Drop your jar into the Hadoop installation directory, and edit bin/hadoop-config.sh to add:

CLASSPATH=${CLASSPATH}:$HADOOP_COMMON_HOME/your_jar_name
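A minimal sketch of what that loop plus the extra line effectively computes (the directory and jar names are placeholders):

```shell
# Mimics hadoop-config.sh: collect hadoop-*.jar from a directory, then append
# one user-provided jar, like the extra CLASSPATH line you add by hand.
build_classpath() {
    dir="$1"; extra="$2"; cp=""
    for f in "$dir"/hadoop-*.jar; do
        [ -e "$f" ] && cp="${cp}:$f"
    done
    cp="${cp}:$dir/$extra"
    echo "${cp#:}"   # strip the leading ':'
}
```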

Friday, April 15, 2011

Debug/profile heap/gc in Java

HPROF

A heap/CPU profiling agent that ships with the JDK.
Examples:
java -agentlib:hprof=help
java -agentlib:hprof=heap=sites
java -agentlib:hprof=heap=dump
java -agentlib:hprof=cpu=samples

" By default, heap profiling information (sites and dump) is written out to java.hprof.txt (in ASCII) in the current working directory.

The output is normally generated when the VM exits, although this can be disabled by setting the “dump on exit” option to “n” ( doe=n). In addition, a profile is generated when Ctrl-\ or Ctrl-Break (depending on platform) is pressed. On Solaris OS and Linux a profile is also generated when a QUIT signal is received ( kill -QUIT pid). If Ctrl-\ or Ctrl-Break is pressed multiple times, multiple profiles are generated to the one file.  "

jmap

The jmap command-line utility prints memory-related statistics for a running VM or a core file.

Commands:

jmap -histo <pid>                        # show a histogram of objects
jmap -dump:format=b,file=<file> <pid>    # dump the heap in HPROF binary format (can be processed by jhat)

jstat

"The jstat utility uses the built-in instrumentation in the HotSpot VM to provide information on performance and resource consumption of running applications. "

It shows garbage collection, class loading, and JIT compilation statistics, among others.
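For example (pid and interval are placeholders; -gcutil and -class are standard jstat options):

```shell
jstat -gcutil <pid> 1000   # GC utilization per generation, sampled every 1000 ms
jstat -class <pid>         # class loading/unloading statistics
```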

visualgc

A GUI that visualizes the data jstat collects.

Java VisualVM

http://download.oracle.com/javase/6/docs/technotes/guides/visualvm/index.html
Command: jvisualvm

"Java VisualVM is a tool that provides a visual interface for viewing detailed information about Java applications while they are running on a Java Virtual Machine (JVM), and for troubleshooting and profiling these applications."

JConsole

"This tool is compliant with Java Management Extensions (JMX). The tool uses the built-in JMX instrumentation in the Java Virtual Machine to provide information on the performance and resource consumption of running applications."

jhat (Java heap analysis tool)

"The jhat tool provides a convenient means to browse the object topology in a heap snapshot. This tool was introduced in the Java SE 6 release to replace the Heap Analysis Tool (HAT). "

Command:

jhat <hprof_file_name>

Eclipse MAT

The Eclipse Memory Analyzer, a GUI tool for browsing heap dumps and finding memory-leak suspects.

jdb

The command-line Java debugger that ships with the JDK.

Misc.

"As of Java SE 5.0 update 7, the -XX:+HeapDumpOnOutOfMemoryError command-line option tells the HotSpot VM to generate a heap dump when an OutOfMemoryError occurs (see section 1.9). As of Java SE 5.0 update 14, the -XX:+HeapDumpOnCtrlBreak command-line option tells the HotSpot VM to generate a heap dump when a Ctrl-Break or SIGQUIT signal is received (see section 1.10)."
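For example, enabling the out-of-memory heap dump looks like this (the jar name and dump path are assumed placeholders; -XX:HeapDumpPath controls where the dump is written):

```shell
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps -jar your-app.jar
```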

Resources

http://www.oracle.com/technetwork/java/javase/index-137495.html

Friday, April 08, 2011

How to decommission nodes/blacklist nodes

HDFS

Put the following config in conf/hdfs-site.xml:
<property>
  <name>dfs.hosts.exclude</name>
  <value>/full/path/of/host/exclude/file</value>
</property>

Use the following command to ask HDFS to re-read the exclude file and decommission the listed nodes:

./bin/hadoop dfsadmin -refreshNodes
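A typical end-to-end decommissioning sequence looks like this (the hostname is an assumed example):

```shell
echo "datanode3.example.com" >> /full/path/of/host/exclude/file
./bin/hadoop dfsadmin -refreshNodes
./bin/hadoop dfsadmin -report    # the node should now be listed as decommissioning
```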

MapReduce

Put the following config in conf/mapred-site.xml:

<property>
  <name>mapred.hosts.exclude</name>
  <value>/full/path/of/host/exclude/file</value>
</property>

Use the following command to ask Hadoop MapReduce to reconfigure its nodes:

./bin/hadoop mradmin -refreshNodes

Whitelist/Recommission

You can also "whitelist" nodes, i.e., specify which nodes are allowed to connect to the namenode/jobtracker.

HDFS

Put the following config in conf/hdfs-site.xml:
<property>
  <name>dfs.hosts</name>
  <value>/full/path/to/whitelisted/node/file</value>
</property>

Use the following command to ask Hadoop to refresh node status based on the configuration:

./bin/hadoop dfsadmin -refreshNodes

MapReduce

Put the following config in conf/mapred-site.xml:

<property>
  <name>mapred.hosts</name>
  <value>/full/path/to/whitelisted/node/file</value>
</property>

Use the following command to ask Hadoop MapReduce to reconfigure its nodes:

./bin/hadoop mradmin -refreshNodes

Support for mradmin was added in 0.21.0. See JIRA issue https://issues.apache.org/jira/browse/HADOOP-5643 for details.

Saturday, March 19, 2011

Japan earthquake GPS data visualization gadget

I made a gadget version of QuakeSim Japan earthquake data visualization portal. It shows data (longitude, latitude and height) collected by GPS stations during Japan earthquake.

You can click http://www.google.com/ig/adde?synd=open&source=ggyp&moduleurl=hosting.gmodules.com%2Fig%2Fgadgets%2Ffile%2F105322631994749779353%2Fquakesim-japan.xml to add it to your iGoogle. After it is added, maximize it by clicking the icon near the top-right corner of the gadget.

The link for the gadget is:

http://www.google.com/ig/directory?url=hosting.gmodules.com%2Fig%2Fgadgets%2Ffile%2F105322631994749779353%2Fquakesim-japan.xml

Thanks to Xiaoming Gao for providing the service pages.

Wednesday, March 02, 2011

Install ns2 (ns-2.33) on Ubuntu Maverick

Install the prerequisites:

sudo apt-get install \
    tcl tcl-dev \
    libotcl1 libotcl1-dev \
    tclcl tclcl-dev \
    tk tk-dev

./configure failed, complaining that some Tcl/Tk-related files could not be found. It turns out those packages were installed, but the file locations are different from what the configure script expects. The following is a fix.

You need to change two variables in the configure file: TCL_TCL_PLACES and TK_TCL_PLACES.

Add
    /usr/share/tcltk/tcl$TCL_VERS \
    /usr/share/tcltk/tcl$TCL_HI_VERS
to variable TCL_TCL_PLACES.

Add
    /usr/share/tcltk/tk$TK_HI_VERS \
    /usr/share/tcltk/tk$TK_VERS
to variable TK_TCL_PLACES.

Then execute ./configure again.
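Once configure succeeds, the build itself is the usual sequence (a sketch; run from the ns-2.33 source directory):

```shell
./configure
make
sudo make install    # optional: installs the ns binary system-wide
```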

Official page: http://www.isi.edu/nsnam/ns/ns-build.html