- Change logging level
- For each daemon, there is a service at http://daemon_address:port/logLevel through which you can get and set the logging level.
- Use command line
hadoop daemonlog -getLevel daemon_address:port fullQualifiedClassName
hadoop daemonlog -setLevel daemon_address:port fullQualifiedClassName logLevel
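For example (hypothetical host and port; the class names are the same fully qualified names used in log4j.properties below):
hadoop daemonlog -getLevel jt.example.com:50030 org.apache.hadoop.mapred.JobTracker
hadoop daemonlog -setLevel jt.example.com:50030 org.apache.hadoop.mapred.JobTracker DEBUG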
- Permanent change
Change the log4j.properties file. Example:
log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG
- Commission and decommission nodes
The following four config parameters are relevant:
dfs.hosts
dfs.hosts.exclude
mapreduce.jobtracker.hosts.filename (mapred.hosts for old version)
mapreduce.jobtracker.hosts.exclude.filename (mapred.hosts.exclude for old version)
For HDFS, execute "hadoop dfsadmin -refreshNodes" after you change the include file or exclude file.
From the mailing list, I know that "mradmin -refreshNodes was added in 0.21". So for MapReduce, you can run "hadoop mradmin -refreshNodes" after you change the include or exclude file to commission or decommission a node, respectively.
To permanently add or remove a node, you also need to update the slaves file, conf/slaves.
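For example, to decommission a DataNode (hypothetical hostname and exclude-file path; the exclude file is whatever dfs.hosts.exclude points to):
echo "dn3.example.com" >> /etc/hadoop/conf/dfs.exclude   # file named by dfs.hosts.exclude
hadoop dfsadmin -refreshNodes
hadoop dfsadmin -report   # the node should show "Decommission in progress"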
- Block scanner report
http://datanode_address:50075/blockScannerReport
- If you want to check the blocks and block locations of a specific file, use the following command:
hadoop fsck file_to_check -files -blocks -locations -racks
Note: you should execute it on the master node.
Use "hadoop fsck /" to check health of the whole file system.
Friday, December 31, 2010
Hadoop tips
Thursday, December 23, 2010
Extract some continuous lines from a file
Sometimes, I want to extract a range of contiguous lines from a file, e.g. line 10 to line 100. I wondered whether there is a Linux command to do that; unfortunately, I did not find one in vanilla Linux distros. Then I realized it can be achieved by combining the commands head and tail.
Let's say you want to extract line min to line max, both inclusive.
Calculate nlines=(max-min+1). Then use the following command:
cat <filename>|head -n <max>|tail -n <nlines>
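For example, to extract lines 10 through 100 of file.txt, nlines = 100 - 10 + 1 = 91:
cat file.txt|head -n 100|tail -n 91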
Friday, December 17, 2010
vim quickfix and location list
Commands | Description |
copen | open quickfix window |
cclose | close quickfix window |
cwindow | open quickfix window if its content is not empty. |
cc [nr] | display error [nr] |
cr | display the first error. |
cfirst | display the first error. |
clast | display the last error. |
[count]cn | display [count] next error |
[count]cp | display [count] previous error |
[count]cnf | display first error in the [count] next file |
[count]cpf | display first error in the [count] previous file |
For commands related to location list, just replace first 'c' with 'l' in above commands.
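One common way to fill the quickfix list is from compiler output. A minimal sketch, assuming gcc and a file main.c:
gcc -Wall main.c 2> errors.txt   # capture the compiler diagnostics
vim -q errors.txt                # open vim with the quickfix list pre-loaded; use :cn and :cp to navigate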
Thursday, December 16, 2010
Compile and Install vim on Linux
Commands
wget ftp://ftp.vim.org/pub/vim/unix/vim-7.3.tar.bz2
tar jvxf vim-7.3.tar.bz2
cd vim73
./configure --enable-gui=no \
--enable-multibyte \
--enable-cscope \
--disable-netbeans \
--prefix=<your_desired_vim_home>
make
make test
make install
Change your environment variable PATH to add vim's bin path. The following commands are for bash.
Temporary change
export PATH="<your_vim_home>/bin:$PATH"
Permanent change
echo -e '\nexport PATH="<your_vim_home>/bin:$PATH"' >> ~/.bash_profile
Wednesday, December 15, 2010
Delete executables and object (.o) files
The following two commands delete all ELF executables and ELF object (.o) files.
find ./ -executable -type f|xargs -I{} file {} | grep ELF|cut -d ':' -f 1|xargs -I{} rm {}
find ./ -name "*.o" -type f|xargs -I{} file {} | grep ELF|cut -d ':' -f 1|xargs -I{} rm {}
Some executables may not be found because their "x" bit is not set. Use the following command to find and delete them:
find ./ -type f|xargs -I{} file {} | grep ELF|cut -d ':' -f 1|xargs -I{} rm {}
Note:
-executable is not supported in old versions of find.
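If your find lacks -executable, a rough substitute is to test the owner-execute permission bit (octal 100) instead; note this is a sketch that checks mode bits, not real executability:
find ./ -type f -perm -100|xargs -I{} file {} | grep ELF|cut -d ':' -f 1|xargs -I{} rm {}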
Monday, December 13, 2010
How to change hostname in Ubuntu
Temporary change
hostname <new_host_name>
Permanent change
- Edit /etc/hostname to specify your new hostname
sudoedit /etc/hostname
- sudo service hostname start
Ubuntu init scripts and upstart jobs
Init script
Those scripts are located in the directory /etc/init.d. Note: some of them have been converted to upstart jobs (see next section) and should not be invoked directly. To check whether a given job is an upstart job, check whether the file /etc/init/<job>.conf exists.
upstart jobs
http://upstart.ubuntu.com/ Use "man 5 init" to see the syntax of the conf file.
Each upstart job has a conf file /etc/init/<job>.conf. You should not directly invoke the init script to start/stop the job; use the initctl command instead, e.g. initctl restart hostname.
initctl list will list upstart jobs that are running.
service
It can be used to interact with init scripts, whether they are upstart jobs or regular init scripts. For upstart jobs, it does not run /etc/init.d/<job>; instead it runs "start <job>" directly.
E.g. service hostname status
invoke-rc.d
Another tool to start/stop init jobs. In my opinion you should use the service command, because invoke-rc.d does NOT detect whether the job is an upstart job or a regular init job. Usually this is not a big deal, because an upstart job's init-script wrapper automatically calls the initctl-related commands (start, stop, reload, etc.).
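For instance, on an upstart-based Ubuntu the following are roughly equivalent ways to run the hostname job:
sudo service hostname start    # service detects the upstart job and runs "start hostname"
sudo initctl start hostname    # talk to the init daemon directly
sudo start hostname            # shorthand for initctl start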
Example - network
I will give an example of how network interfaces are managed by the init daemon.
As you may know, ifup and ifdown can be used to bring up or down network interfaces.
/etc/network/interfaces is used by ifup and ifdown to know how you want your system to connect to the network.
Sample file
# interfaces lo and eth0 should be started when ifup -a is invoked.
auto lo eth0
# eth1 is allowed to be brought up by subsystem hotplug.
allow-hotplug eth1
# For interface lo, it should use internet protocol and it is a loopback device.
iface lo inet loopback
# Interface eth1 uses internet protocol and dhcp for configuration
iface eth1 inet dhcp
- For upstart job networking, its config file is /etc/init/networking.conf:
description "configure virtual network devices"
start on (local-filesystems
and stopped udevtrigger)
task
pre-start exec mkdir -p /var/run/network
exec ifup -a
Notice the last line? Yes, it invokes ifup to bring up the interfaces that are marked "auto" in the file /etc/network/interfaces.
- Job network-interface is used when a network interface is added or removed. Its config file is /etc/init/network-interface.conf
description "configure network device"
start on net-device-added
stop on net-device-removed INTERFACE=$INTERFACE
instance $INTERFACE
pre-start script
if [ "$INTERFACE" = lo ]; then
# bring this up even if /etc/network/interfaces is broken
ifconfig lo 127.0.0.1 up || true
initctl emit -n net-device-up \
IFACE=lo LOGICAL=lo ADDRFAM=inet METHOD=loopback || true
fi
mkdir -p /var/run/network
exec ifup --allow auto $INTERFACE
end script
post-stop exec ifdown --allow auto $INTERFACE
Line "exec ifup --allow auto $INTERFACE" bring up the newly added interface if it is set to be brought up automatically. The trigger event is "net-device-added" or "net-device-removed" which is sent by upstart-udev-bridge. It basically forwards events received from udev to init daemon. When your network interface (e.g. eth0) is detected by udev, finally a net-device-added event is sent to network-interface upstart job which runs ifup to bring it up.
- upstart job hostname (/etc/init/hostname.conf). It includes the following line:
exec hostname -b -F /etc/hostname
Now you should know how to change hostname.
Edit the file /etc/hostname, then run "sudo service hostname start", "sudo start hostname", or "sudo initctl start hostname".
Recover corrupted partition table
Partition table of my linux drive was corrupted recently. I could not start up my Ubuntu.
I burned an Ubuntu CD, but when I tried to boot into the live CD, it always gave me errors; it seems to be a CD burning/CD drive problem. Then I made a live USB drive, which worked great. The following two tools can be used to "guess" the partition table.
- gpart
This program is kind of old and is no longer maintained. It recognizes only some file systems (ext3, ext4, etc. are not recognized correctly).
- testdisk
This is a great tool with a text UI. You can find information here. Just follow the instructions, and check that the "guessed" partition table matches your real partition table (if you have a backup, you are lucky).
Run fsck to check the integrity of your file system.
Afterthoughts:
- Back up your partition table!
- USB drive is more stable than CD in this case.
Sunday, November 28, 2010
printf cheatsheet
Format string
% [flag]* [minimum_field_width] [precision] [length_modifier] <conversion_specifier>
Flags
Flag | Description | |
# | The value should be converted to an "alternate form". | |
0 | zero padded. By default, blank padded | |
- | left adjusted | |
' ' | (A single space) A blank should be left before a positive number (or empty string) produced by a signed conversion. | |
+ | A sign is always placed before a number produced by a signed conversion. |
Minimum Field Width
"a decimal digit string (with non-zero first digit). If the value has fewer characters, it will be padded. In no case does a nonexistent or small field width cause truncation of a field."
Precision
a period ('.') followed by an optional decimal digit string. It has different meanings for different conversions.
Precision | Description | |
d, i, o, u, x, and X | minimum number of digits to appear | printf("%.2d", 1) ==> "01" |
a, A, e, E, f, and F | number of digits to appear after the radix character | printf("%.2f",0.1) ==> "0.10" |
g and G | maximum number of significant digits | |
s and S | maximum number of characters to be printed from a string | printf("%.2s","hello") ==> "he" |
Length Modifier
For each conversion specifier, there is an expected argument type; for example, for the conversions d and i, the argument type is int. Length modifiers can be used to specify an argument type other than the default expected type.
Modifier | Conversion | Argument types |
hh | d, i, o, u, x, or X | signed char or unsigned char |
n | signed char* | |
h | d, i, o, u, x, or X | short int or unsigned short int |
n | short int* |
l | d, i, o, u, x, or X | long int, or unsigned long int |
n | long int* | |
c | wint_t | |
s | wchar_t | |
ll | d, i, o, u, x, or X | long long int, or unsigned long long int |
n | long long int* | |
L | a, A, e, E, f, F, g, or G | long double |
j | d, i, o, u, x, or X | intmax_t, uintmax_t |
z | d, i, o, u, x, or X | size_t, ssize_t |
t | d, i, o, u, x, or X | ptrdiff_t |
Conversion specifier
conversion | arguments | notation | |
d, i | int argument | signed decimal notation | |
o, u, x, X | unsigned int | unsigned octal, unsigned decimal, unsigned hexadecimal notation | abcdef are used for x. ABCDEF are used for X. |
e, E | double | rounded and converted in the style [-]d.ddde±dd | precision is 6 by default. |
f, F | double | rounded and converted to decimal notation in the style [-]ddd.ddd | precision is 6 by default. |
g, G | double | converted in style f or e (or F or E for G conversions). Style e is used if the exponent from its conversion is less than -4 or greater than or equal to the precision. | |
a, A | double | converted to hexadecimal notation. for a, (using the letters abcdef) in the style [-]0xh.hhhhp±d; | C99; not in SUSv2 |
c | int | converted to an unsigned char, and the resulting character is written | |
s | const char* | Characters from the array are written. | |
p | void * | printed in hexadecimal (as if by %#x or %#lx) | |
n | int * | The number of characters written so far is stored into the integer indicated by the int * (or variant) pointer argument. |
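Most of the flags and precisions above can be tried quickly with bash's printf builtin, which follows the C semantics for these conversions (a sketch; %n and the length modifiers are C-only):
printf '%05d\n' 42        # zero padded: 00042
printf '%-5d|\n' 42       # left adjusted: "42   |"
printf '%+.2f\n' 3.14159  # sign flag plus precision: +3.14
printf '%#x\n' 255        # alternate form: 0xff
printf '%.3s\n' hello     # string precision: hel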
Some Hard Drive and File System benchmark tools
IOMeter
iozone
Document: http://www.iozone.org/docs/IOzone_msword_98.pdf
Manual: http://linux.die.net/man/1/iozone
Read, write, re-read, re-write, read backwards, read strided, fread, fwrite, random read/write, pread/pwrite variants
iozone -a | tee result.txt
iozone supports a bunch of command line options. I summarized them in the following table.
Category | Options | Note |
Auto mode | -a: record size 4k - 16M, file size 64k - 512M. -z: Used in conjunction with -a to test all possible record sizes. (Normally Iozone omits testing of small record sizes for very large files when used in full automatic mode. ) -A: more coverage | |
Test file | -f filename: the name for temporary file under test. -F fn1 fn2: # of files should be equal to # of processors/threads | |
Output | -b filename: output of an Excel compatible file -R: Generate Excel report. | |
Record size | -r #: record size -y #: minimum record size for auto mode -q #: maximum record size for auto mode | |
File size | -s #: size of the file to test -g #: maximum file size for auto mode -n #: minimum file size for auto mode | |
tests | -i #: specifies which test to run 0=write/rewrite, 1=read/re-read, 2=random-read/write, 3=Read-backwards, 4=Re-write-record, 5=stride-read, 6=fwrite/re-fwrite, 7=fread/Re-fread, 8=mixed workload, 9=pwrite/Re-pwrite, 10=pread/Re-pread, 11=pwritev/Re-pwritev, 12=preadv/Re-preadv | One will always need to specify 0 so that any of the following tests will have a file to measure. This means -i 0 creates the files used by the following tests. -i # -i # -i # is also supported so that one may select more than one test. |
-+p percent_reads: the percentage of threads/processes that will perform read testing in the mixed workload test case | ||
-+B: sequential mixed workload testing | ||
throughput tests | -t #: Run Iozone in a throughput mode. -T: Use POSIX pthreads for throughput tests | This option allows the user to specify how many threads or processes to have active during the measurement. |
processes/threads | -l #: lower limit on number of processes to run -u #: upper limit on number of processes to run | |
Timing | -c: include close() in timing calculation -e: include fflush(), fsync() in timing calculation | |
Other control | -H #: Use POSIX async I/O with # async operations -k #: Use POSIX async I/O (no bcopy) with # async operations. -I: use direct I/O if possible -m: use multiple buffers internally -o: Writes are synchronously written to disk -p: purges the processor cache before each file operation. -W Lock file when reading or writing. -K: Inject some random accesses in the testing. | |
Examples
It's important to use the -I option to turn on direct I/O. Otherwise, Linux's page cache (buffer cache) may give you ridiculously fast read speeds.
Auto Mode
- iozone -a -n 512m -g 1g
Single Test
- Sequential writes, sequential reads
iozone -r 64k -s 1g -b excel.xls -R -i 0 -i 1 -I
- Use two processes to test sequential writes, sequential reads
iozone -r 64k -s 1g -b excel.xls -R -i 0 -i 1 -I -l 2 -u 2
- Use one and two processes (two runs: in the first run, one process is created; in the second run, two processes are created)
iozone -r 64k -s 1g -b excel.xls -R -i 0 -i 1 -I -l 1 -u 2
- Random writes, random reads
iozone -r 64k -s 1g -b excel.xls -R -i 0 -i 1 -K -I
Throughput Test
- iozone -t 2
roughly equivalent to "iozone -l 2 -u 2"
Mixed Workload Test
- iozone -r 64k -s 1g -b excel.xls -R -i 0 -i 8 -+p 50
Result visualization
./Generate_Graphs result.txt
It will create a directory for each operation (e.g. read, write, fread, fwrite). In each directory, two files are generated - iozone_gen_out.gnuplot and <operation>.ps (the latter is generated after you view the corresponding result using gnuplot). Under the hood, it uses gnu3d.dem to render the data with Gnuplot after the data files are generated for all tested operations. You can also call gnuplot directly without regenerating the data files:
gnuplot gnu3d.dem
There are two more scripts that can be used for visualization - report.pl and iozone_visualizer.pl.
When I tried to run them on Linux (using the command ./report.pl result.txt), I got the following error:
-bash: ./iozone_visualizer.pl: /usr/bin/perl^M: bad interpreter: No such file or directory
You can work around it by running the scripts as "perl report.pl result.txt".
The real fix is to convert the two files from DOS format to Unix format (mainly the newline character conversion):
sudo aptitude install tofrodos
fromdos report.pl
fromdos iozone_visualizer.pl
Now you should be able to run report.pl and iozone_visualizer.pl directly. When they run, a directory named 'report_result' is created. In the directory are Gnuplot scripts (*.do files) and PNG images for all tested operations; the PNG images are generated by running the Gnuplot scripts. For example, read.png is generated by running "gnuplot read.do". iozone_visualizer.pl also generates an HTML page (index.html) that contains all of the PNG images on one page.
Bonnie
http://www.textuality.com/bonnie/
Bonnie64
Bonnie++
After you download the source, just execute the following commands to build and run.
./configure --prefix=<installation_dir>
make
make install
cd <installation_dir>/sbin/
./bonnie++
http://www.coker.com.au/bonnie++/
Manual: http://linux.die.net/man/8/bonnie++
Resources
Thursday, November 25, 2010
Permission of authorized_keys and private key file
~/.ssh/authorized_keys: should be 600
~/.ssh/id_dsa, ~/.ssh/id_rsa: must be 600
If permissions are not set correctly, the login process may fail without giving any useful information!
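A minimal fix (the ~/.ssh directory itself is usually kept at 700; adjust the list to the key files you actually have):
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_rsa ~/.ssh/id_dsa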
awk/gawk notes
RS | Record Separator | single character | That character separates the records. |
regular expression | Text in the input that matches the regular expression separates records. ||
null string | Records are separated by blank lines. The newline character always acts as a field separator, in addition to whatever value FS may have. | ||
NR | number of records seen so far | ||
FNR | The input record number in the current input file | ||
ORS | output record separator | ||
FS | Field separator | single character | Fields are separated by that character |
single space | fields are separated by runs of spaces and/or tabs and/or newlines. | ||
Null string | each individual character becomes a separate field. | ||
regular expression | |||
OFS | output field separator | ||
FIELDWIDTHS | a space separated list of numbers | each field is expected to have fixed width. The value of FS is ignored. Assigning a new value to FS overrides the use of FIELDWIDTHS, and restores the default behavior. | |
NF | number of fields | Decrementing NF causes the values of fields past the new value to be lost, and the value of $0 to be recomputed | |
IGNORECASE | Controls the case-sensitivity of all regular expression and string operations. | non-zero or zero | non-zero: ignore case |
Field Reference | How to reference a field | $1, $2, … $NF | Access a field. Assigning a value to an existing field causes the whole record to be rebuilt when $0 is referenced. |
$-1, $-2 | fatal error | ||
non-existent fields | For read, produce null-string. For write, 1)increase NF 2) create intervening fields with null string 3) $0 is recomputed | ||
$0 | whole record. assigning a value to $0 causes the record to be resplit, creating new values for the fields. | ||
CONVFMT | A number is converted to a string by using the value of CONVFMT as a format string for sprintf(3), with the numeric value of the variable as the argument. However, even though all numbers in AWK are floating-point, integral values are always converted as integers. | ||
OFMT |
i = "A"; j = "B"; k = "C"
x[i, j, k] = "hello, world\n"
key is "A\034B\034C" and value is "hello, world\n". Key test: val in array. for(val in array)…
Sunday, November 14, 2010
XLink
Can be transformed to RDF?
One RDF use case is to include other RDFs.
XLink and HLink: http://www.xml.com/lpt/a/1038
Enhancements to html link
- When to actuate the link
In the current HTML link implementation, the link is actuated when it is clicked. In XLink and HLink, a link can also be actuated when the page containing the link is loaded.
- More effects when a link is actuated.
embed, new, replace, etc.
- HLink supports creation of arbitrary link elements.
<hlink namespace="http://www.example.com/markup" element="home" locator="/" effect="replace" actuate="onRequest"/>
<hlink namespace="http://www.example.com/markup" element="home" locator="/icons/home.png" effect="embed" actuate="onLoad"/>
<home/>
- XLink supports creation of links among more than two resources.
- Add more metadata to links
- Links can be specified outside the linked resources.
In HTML, users can only specify links within the source resource.
When you write <a href="destination.resource">source</a>
this piece of code must be located in the source html. In other words, the user cannot specify links among external resources.
XLink adds this support.
METS, DIDL, ORE
http://www.oreillynet.com/xml/blog/2008/06/oaiore_compound_documents_draf.html
http://www.oreillynet.com/xml/blog/2008/05/bad_xml.html
http://www.dehora.net/journal/2008/06/18/dates-in-atom/
http://dret.net/netdret/docs/wilde-cacm2008-xml-fever.html
http://www.tbray.org/ongoing/When/200x/2006/01/09/On-XML-Language-Design
Google AppEngine mail test
How to test: http://aralbalkan.com/1311
Two bugs:
http://code.google.com/p/googleappengine/issues/detail?id=626
For this bug, you can upgrade your python to a newer version (2.5.4 and up).
http://code.google.com/p/googleappengine/issues/detail?id=1061
http://groups.google.com/group/app-engine-patch/browse_thread/thread/1662f95d9cacee24
Deb package manipulation notes (deb make, view, install, etc)
Directory /var/lib/dpkg/info/ contains package related files. For each package, its conffiles, md5sums, preinst, postinst, prerm, postrm, list of installed files, etc are kept there.
dpkg-dev
debian/files: "The list of generated files which are part of the upload being prepared."
.changes: upload control file
dpkg-buildpackage
build binary or source packages from sources
dpkg-architecture: set and determine the architecture for package building
dpkg-checkbuilddeps: check build dependencies and conflicts. By default, debian/control is read.
dpkg-distaddfile: adds an entry for a named file to debian/files.
dpkg-genchanges:
dpkg-gencontrol: generate Debian control files
dpkg-gensymbols:
dpkg-name
dpkg-scanpackages: create Packages index files
dpkg-scansources: create Sources index files
dpkg-shlibdeps
dpkg-source: packs and unpacks Debian source archives.
dpkg-vendor: query vendor information
dpkg-parsechangelog: get changelog information
Vendor
/etc/dpkg/origins/default
devscripts
debchange
debhelper
dh-make
This package is useful when you have a regular source package (not debian source package) and want to debianlize it.
dh_make must be invoked within a directory containing the source code, which must be named <packagename>-<version>. The <packagename> must consist only of lowercase letters, digits, and dashes.
As I mentioned, there are two types of debian source packages – native and non-native.
For a non-native package, obviously you need the original source tree; it is needed by the deb tools to generate the diff. dh_make makes sure the original source tarball (<packagename>_<version>.orig.tar.gz) exists.
Option -f can be used to specify the location of the tarball.
If -f is not given, dh_make searches the parent directory for the file <packagename>_<version>.orig.tar.gz and the directory <packagename>_<version>.orig. If either of them exists, it will be fine. If neither exists, dh_make will complain and exit.
If you want to create an original source tarball based on the code in the current directory, use the option "--createorig". The current directory is then copied to <packagename>_<version>.orig in the parent directory.
key: public key
secret: private key
Trusted pub keys are stored in file /etc/apt/trusted.gpg (not /etc/apt/trustdb.gpg)
apt-key list
gpg --recv-keys --keyserver keyserver.ubuntu.com key_ID_here;
gpg --export --armor key_ID_here | sudo apt-key add -
http://wiki.debian.org/SecureApt
https://help.ubuntu.com/community/SecureApt
Downloaded deb packages are stored at /var/cache/apt/archives/ and /var/cache/apt/archives/partial/.
Low-level understanding
Deb binary package format
man deb
The manual describes the Debian binary package format.
A deb package is an ar archive, so you can list the content of a deb package using the command:
ar tf pkg_name.deb
On my machine, the output is
debian-binary control.tar.gz data.tar.gz
Extract the content of a deb pkg using the command:
ar xof pkg_name.deb
Deb control
control.tar.gz contains the control files. Their format is described in deb-control.
"It is a gzipped tar archive containing the package control information, as a series of plain files, of which the file control is mandatory and contains the core control information."
Use the command tar zvxf control.tar.gz to extract the control files. The most important file is control; its format is described in man deb-control.
conffiles: this file lists all configuration files used by this package.
control:
md5sums
postinst
postrm
preinst
prerm
Deb data
"It contains the filesystem as a tar archive, either not compressed".
High-level understanding
Ubuntu provides some tools to make it more convenient to manipulate deb package so that users don't need to use ar, tar, etc to extract files/information manually.
First, the command dpkg-deb comes in really handy:
dpkg-deb -I: provides information about a deb pkg (extracts info from the file control)
dpkg-deb -c: lists the content of the package (extracts info from data.tar.gz)
dpkg-deb -x: extracts a deb archive
dpkg-deb -X: extracts a deb archive and prints the list of extracted files
dpkg-deb -e: extracts control information, to a DEBIAN directory if none is specified (extracts files from control.tar.gz)
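A quick run against a hypothetical package file (the filename is made up):
dpkg-deb -I nano_2.2.4-1_i386.deb                  # show control information
dpkg-deb -c nano_2.2.4-1_i386.deb                  # list the files in data.tar.gz
dpkg-deb -x nano_2.2.4-1_i386.deb nano-extracted   # unpack the filesystem tree into ./nano-extracted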
Deb Source Package
Format of source package is described in section "SOURCE PACKAGE FORMATS" within manual "man dpkg-source".
Also read http://www.debian.org/doc/debian-policy/ch-source.html for more info.
There are two types of source packages: native and non-native.
Layout of a native package:
.dsc: includes package info and md5 checksums for the package content.
.tar.gz: the source code
Layout of a non-native package:
.dsc: debian source control
.orig.tar.gz: original source code
.diff.gz: 1) patches applied to the source code; 2) debian packaging (the debian/ dir)
Download a source package instead of a binary package:
apt-get source pkg_name #Download and unpack
apt-get source --download-only pkg_name #only download
Then the command dpkg-source comes in handy to manipulate source packages.
dpkg-source -x pkg_name.dsc # Extract a source package.
If you use the command "apt-get source pkg_name", the package has already been downloaded and extracted, so you don't need to execute this command. If you use "apt-get source --download-only pkg_name", you can use this command to extract the downloaded package and apply the patch.
If you don't want the patch to be applied, add option --skip-debianization.
If the directory where you execute "dpkg-source -x" is different from the directory where the downloaded source package is stored, the options -su, -sp, and -sn control whether the original source tarball is copied to the current directory.
In all cases any existing original source tree will be removed! So be sure to backup your code if it is in current directory.
dpkg-source -sn -x pkg_name.dsc # the original source tarball is not copied to the current directory; the source tree is unpacked into the current dir and the patch is applied
dpkg-source -sp -x pkg_name.dsc # the source tarball is copied to the current directory, unpacked, and the patch is applied
dpkg-source -su -x pkg_name.dsc # copy the source tarball to the current directory; both the original source tree and the patched source tree are extracted
In short, if you want the original source extracted as well, use "dpkg-source -su -x pkg_name.dsc".
When I extracted the source package, I got the following warning:
gpgv: Can't check signature: public key not found
dpkg-source: warning: failed to verify signature on ./pkg_name.dsc
This means the public key needed to verify the signature of the package was not found. The dpkg-source manual tells you more:
--no-check: Do not check signatures and checksums before unpacking.
--require-valid-signature: Refuse to unpack the source package if it doesn't contain an OpenPGP signature that can be verified either with the user's trustedkeys.gpg keyring, one of the vendor-specific keyrings, or one of the official Debian keyrings (/usr/share/keyrings/debian-keyring.gpg and /usr/share/keyrings/debian-maintainers.gpg).
debian/ directory
Version:
https://wiki.ubuntu.com/PackagingGuide/Complete#changelog
changelog
Default file is located at debian/changelog. Changelog contains a list of changes. Note: it has a specific format. Command debchange can be used to edit the file.
debchange -a # append a changelog entry at the current version
debchange -i # increase the release number for non-native packages (2.4-1ubuntu1 -> 2.4-1ubuntu2)
debchange -v <version> # create a changelog entry for an arbitrary new version
debchange --create # create a new changelog file
debchange -c changelogfile # edit the specified changelog file instead of the default one
Read http://www.debian.org/doc/debian-policy/ch-source.html#s-dpkgchangelog for more info.
dpkg-source -b # build a source package
man deb-version
Debian package version number format
Export:
gpg --export-secret-keys keyID
gpg --export keyID #export public key
gpg --gen-key
gpg -k # list public keys
gpg -K # list secret keys
copyright
Read this: https://wiki.ubuntu.com/PackagingGuide/Basic#Copyright%20Information
control
https://wiki.ubuntu.com/PackagingGuide/Complete#control
rules
It specifies how to compile, install the app and create the .deb package.
https://wiki.ubuntu.com/PackagingGuide/Complete#rules
DEBFULLNAME:
DEBEMAIL:
Package build
Binary package
dpkg-buildpackage
debuild: wraps dpkg-buildpackage and some other tools.
Use debuild -kKEYID to specify the key used to sign the package, or set the variable DEBSIGN_KEYID to the key ID.
If you want to pass parameters to dpkg-buildpackage, set the variable DEBUILD_DPKG_BUILDPACKAGE_OPTS.
debsign -kkeyID
debsign -m'LastName FirstName (Comment) <email_address>'
Source package: debuild -S
lintian
Debian package checker
lintian -Ivai *.dsc
sudo pbuilder build pkg_name.dsc
dpkg-query -s pkg_name # conf files are listed
https://wiki.ubuntu.com/PackagingGuide/Complete
https://wiki.ubuntu.com/DebootstrapChroot
https://wiki.ubuntu.com/PackagingGuide/Basic
https://wiki.ubuntu.com/PbuilderHowto
https://help.ubuntu.com/community/GnuPrivacyGuardHowto
http://www.debian.org/doc/FAQ/ch-pkg_basics.en.html
How to build Hadoop in Eclipse
Subclipse
- Install SVN
I installed "Slik-SVN" because it provides JavaHL lib for 64bit Windows. - Install Subclipse plugin to Eclipse.
- Change eclipse.ini to add following parameters after -vmargs:
-Djava.library.path=/usr/share/jni/lib (For linux)
-Djava.library.path=<svn_install_dir>/bin (For Windows)
- Start eclipse
- Go to Window --> Preferences --> Team --> SVN
In section "SVN Interface", it should say something like "JavaHL (JNI) … SlikSvn". If it says "JavaHL(JNI) not available", it means subclipse cannot find JavaHL library. Check step 3).
Code checkout
Example: svn co http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.21.0/
You can also check out code within Eclipse using Subclipse plugin.
Install Ant and Ivy
Read http://ant.apache.org/ for how to install Ant.
Download ivy jar and put it into directory ANT_HOME/lib/. If ANT_HOME is not specified explicitly, it is the installation directory.
IvyDE
Hadoop's dependencies are managed by Ivy. You need the IvyDE Eclipse plugin; read http://ant.apache.org/ivy/ivyde/ for more info. IvyDE includes the ivy jar file itself, so it does not use the ivy jar you installed in the last step. Also, it seems that the ANT_HOME variable is set to <eclipse_dir>\plugins\org.apache.ant_1.7.1.v20090120-1145 (the version number may vary for you).
Shell and Unix commands on Windows
Hadoop's build file invokes sh and some other Linux commands such as tr and sed to build the project. Of course those commands don't exist on Windows.
The following two projects port Linux tools to Windows:
I use the first one. You just download the tarball and decompress it to a directory. This directory must be passed to Ant; the usual way is to put it into the environment variable "PATH", and Ant will pick it up automatically. That works for command-line use of Ant, but it does not work well for Ant within Eclipse. The following sections include instructions on how to pass PATH to Ant in Eclipse.
For command line use, try command "ant compile".
Create Eclipse Project for Hadoop
- New --> Java Project
Select "Create project from existing source". Then select the directory where code is located.
Click "Next" - CHANGE OUTPUT DIRECTORY TO <workspace_name>/build.
The default directory bin is used by Hadoop for different purposes.
Click "Finish" - Add JDK's tools.jar to build path. It is not included in JRE.
- Change source directories to tell Eclipse which directories include Java source code.
Right click project name --> Build Path --> Configure Build Path… --> Source
- Make IvyDE manage the ivy dependencies.
Right click project name --> Build Path --> Configure Build Path… --> Libraries --> Add Library --> IvyDE Managed Dependencies --> Next --> (couple of IvyDE setting steps) --> OK.
It may take some time for IvyDE to resolve dependencies.
Create Run Configuration
- Right click "build.xml" --> Run As --> Ant Build … (not "Ant Build")
- A dialog should pop up
- Switch to "Targets" tab: select corresponding target (e.g. compile) you want to execute.
- Switch to "JRE" tab: select "separate JRE"
- This step is for Windows users.
Switch to Environment Tab: set PATH. (to include where those linux tools are included on Window)
You can click "Select" and choose variable "Path". But in my case, its value does NOT include all of the content of the variable (use "path" command in command line). probably, Eclipse has some restriction about length of value of environment variable. If it's too long, it will be truncated. - Click "Run"
- See Console for messages.
Customize project builder
It's more convenient to use a "Builder" than right click "build.xml" --> Run As --> Ant Build … --> Run. The following steps tell you how to use Ant as the default builder. Then you can use "Project --> Build Project" to build your project (same as any regular native Eclipse Java project).
- Right click project name --> Properties --> Builder --> New --> Ant Builder
- Select the build file (usually "build.xml").
- Switch to "Targets" tab. Specify which targets are executed when the project is built or cleaned.
- This step is for Windows users.
Switch to "Environment" tab. Add PATH environment variable if needed. (to include where those linux tools are included on Window) - Deselect default java builder.
Wednesday, November 10, 2010
package installation log on Ubuntu (dpkg, apt-get, aptitude)
Dpkg log
All deb package operations go through the dpkg system. So no matter whether you use apt-get install or dpkg -i, the operation will be logged in /var/log/dpkg.log.
lesspipe can show the content of .gz files directly, but it cannot show normal text files.
Show install and upgrade history for the dpkg.log.#.gz files:
ls /var/log/dpkg.log*|sort -r|xargs -I{} lesspipe {}|egrep "^[0-9\-]+[[:space:]][0-9:]+[[:space:]](install|upgrade)[[:space:]]"
Show install and upgrade history for the dpkg.log and dpkg.log.# files:
ls /var/log/dpkg.log*|sort -r|grep -v ".gz"|xargs -I{} cat {}|egrep "^[0-9\-]+[[:space:]][0-9:]+[[:space:]](install|upgrade)[[:space:]]"
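Since zgrep (via gzip -dcf) passes uncompressed files through unchanged, a shorter sketch covering all rotations at once is:
zgrep -hE "^[0-9-]+ [0-9:]+ (install|upgrade) " /var/log/dpkg.log*|sort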
apt-get log
apt-get logs to /var/log/apt/term.log
aptitude
/var/log/aptitude
Resources
http://superuser.com/questions/6338/how-do-you-track-which-packages-were-installed-on-ubuntu-linux
As far as logs: apt-get notoriously doesn't have one; dpkg does (at /var/log/dpkg.log) but it's famously hard to parse and can only be read with root privileges; aptitude has one at /var/log/aptitude and you can page through it with regular user privileges.
paratrac on Ubuntu
Compile ftrac
Dependencies
Depends on: fuse-dev, glib-dev, gthread-dev
sudo apt-get install libfuse-dev libglib2.0-dev
Use following commands to check whether they are installed successfully
pkg-config --libs --cflags glib-2.0
pkg-config --libs --cflags fuse
Build
cd fuse/ftrac/
./configure --prefix=your_prefix
make
make install
Add ftrac to PATH:
export PATH=your_prefix/bin:$PATH
which ftrac
Use fusetrac.py
add the parent directory of paratrac to PYTHONPATH, e.g. export PYTHONPATH=/path/to/parent:$PYTHONPATH
python fusetrac.py -t /tmp/fuse/
The FUSE FS is mounted and a monitoring page is shown. From now on, when you access /tmp/fuse, the data on the monitoring page changes.
cd /tmp/fuse
Tuesday, November 09, 2010
Python in Ubuntu/Debian
http://www.debian.org/doc/packaging-manuals/python-policy/ch-python.html
site module
http://docs.python.org/library/site.html
I assume sys.prefix and sys.exec_prefix are /usr. It may be different on your machine.
/usr/lib/pythonX.Y/site-packages
/usr/lib/site-python
"It sees if it refers to an existing directory, and if so, adds it to sys.path and also inspects the newly added path for configuration files."
Note: sub-directories are not added.
If .pth files exist in those directories, their contents are additional items (one per line) to be added to sys.path.
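A minimal sketch with a hypothetical module directory /opt/mylibs/python (assumes python2.6 and a writable dist-packages):
echo "/opt/mylibs/python" | sudo tee /usr/lib/python2.6/dist-packages/mylibs.pth
python -c 'import sys; print "\n".join(sys.path)' | grep mylibs   # verify it is on sys.path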
local admin
A special directory is dedicated to public Python modules installed by the local administrator, /usr/local/lib/pythonX.Y/dist-packages for python2.6 and later, and /usr/local/lib/pythonX.Y/site-packages for python2.5 and earlier. For a local installation by the administrator of python2.6 and later, a special directory is reserved to Python modules which should only be available to this Python, /usr/local/lib/pythonX.Y/site-packages. Unfortunately, for python2.5 and earlier this directory is also visible to the system Python. Additional information on appending site-specific paths to the module search path is available in the official documentation of the site module.
Central repository
It seems all (not all?) modules are installed into the directory /usr/share/pyshared. It's a central module repository for Python; Python modules in other system directories are symbolic links to files in this directory.
/usr/lib/pyshared/python2.6/: .so python extensions?
python-central
python-central is a tool for Python module management.
pycentral: register and build utility for Python packages. It manages python modules you installed.
pyversions: prints information about installed and supported python runtimes.
py_compilefiles: compiles Python .py source files into .pyc or .pyo bytecode format.
It adds hooks for runtime change:
/usr/share/python/runtime.d/pycentral.rtinstall
/usr/share/python/runtime.d/pycentral.rtremove
/usr/share/python/runtime.d/pycentral.rtupdate
python-support
python-support is another tool for Python module management.
modules managed by python-support are installed in another directory which is added to the sys.path using the .pth mechanism. The .pth mechanism is documented in the Python documentation of the site module.
During installation, it adds a file /usr/lib/python2.6/dist-packages/python-support.pth which contains /usr/lib/pymodules/pythonX.Y/. This directory also contains the byte-compiled modules for version pythonX.Y.
update-python-modules can be used to rebuild those modules.
It also adds hooks for runtime change:
/usr/share/python/runtime.d/python-support.rtinstall
/usr/share/python/runtime.d/python-support.rtremove
/usr/share/python/runtime.d/python-support.rtupdate
Saturday, November 06, 2010
euca2ools
Resources:
http://wiki.debian.org/euca2ools
http://open.eucalyptus.com/wiki/Euca2oolsUsing
If you choose to use the REST APIs, the following options are necessary:
-U: endpoint to which requests are sent
-a: access key ID
-s: secret key
Note: for some tools, access key ID is specified via option "-A" instead of "-a", "-S" instead of "-s".
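For example, listing instances against a hypothetical endpoint (flag names as described above):
euca-describe-instances -U http://cloud.example.com:8773/services/Eucalyptus -a YOUR_ACCESS_KEY_ID -s YOUR_SECRET_KEY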
Install a package from Debian repository for Ubuntu
You can use a Debian repository directly, but Debian packages may or may not be compatible with Ubuntu, so you do so at your own risk. Another way is to download the source and build the package, which is described below.
1) Add a line to /etc/apt/sources.list
deb-src repo-url
2) Update package index
sudo apt-get update
3) Install dependencies: (These dependencies are downloaded from Ubuntu repository, not Debian repository)
sudo apt-get build-dep pkg_name
4) Download source and build package
apt-get -b source pkg_name
Now you should have a pkg_name.deb file generated in current directory.
5) Install the package
sudo dpkg -i pkg_name.deb
6) Revert /etc/apt/sources.list file by removing the line added in step 1).
7) Rebuild package index
sudo apt-get update
You are done!
Saturday, October 30, 2010
Arrays in Bash (Cheatsheet)
Description | Indexed array | Associative array |
Declare an array | declare -a array | declare -A array |
Assignment | array=(value1 value2 … valuen) | array=([sub1]=value1 … [subN]=valueN) Note: you must declare it first, or combine the two steps: declare -A array=([sub1]=value1 … [subN]=valueN) |
Expand to all values | ${array[@]} or ${array[*]} | ${array[@]} or ${array[*]} |
Expand to all keys | ${!array[@]} or ${!array[*]} | ${!array[@]} or ${!array[*]} |
length of an array | ${#array[@]} or ${#array[*]} | ${#array[@]} or ${#array[*]} |
Remove entire array | unset array or unset array[@] or unset array[*] | unset array or unset array[@] or unset array[*] |
Remove an element | unset array[index] | unset array[index] |
Array join | array1+=(${array2[@]}) or array1+=(v1 v2 … vN) | array1+=([sub1]=v1 [sub2]=v2 … [subN]=vN) |
Length of an element: | ${#array[index]} | ${#array[index]} |
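A short demonstration of the associative-array column (requires bash 4+):
declare -A ages=([alice]=30 [bob]=25)   # declare and assign in one step
ages[carol]=41                          # add an element
for name in "${!ages[@]}"; do echo "$name is ${ages[$name]}"; done   # expand to all keys
echo "${#ages[@]} entries"              # length of the array
unset 'ages[bob]'                       # remove one element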
Resources
Friday, October 29, 2010
Bash config file loading order
interactive | login | files sourced | Related option |
Y | Y | /etc/profile, then the first of ~/.bash_profile, ~/.bash_login, ~/.profile | --noprofile
N | Y | same as above | --noprofile
Y | N | /etc/bash.bashrc, ~/.bashrc | --norc |
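The options in the last column make the table easy to verify; add an echo marker to each file and try (a sketch):
bash -i -c 'true'              # interactive non-login: sources /etc/bash.bashrc and ~/.bashrc
bash -l -c 'true'              # login non-interactive: sources /etc/profile and the first of ~/.bash_profile, ~/.bash_login, ~/.profile
bash --norc -i -c 'true'       # rc files suppressed
bash --noprofile -l -c 'true'  # profile files suppressed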
Kindle 3rd generation
I ran into problems when I tried to read some books in Chinese.
I will basically investigate the following questions:
- What encodings are supported by Kindle
- What fonts are supported
- If the fonts used in a PDF cannot be found, how does Kindle handle it?
- What formats are supported?
Wednesday, October 27, 2010
GNOME notes
GDM
/etc/init.d/gdm, /etc/init/gdm.conf: these two files are used to start gdm (actually gdm-binary).
Default display manager is stored here: /etc/X11/default-display-manager
GDM configuration: http://library.gnome.org/admin/gdm/2.32/gdm.html#configuration
Tools: gdmsetup
/etc/gdm
/user/share/gdm/
After the user logs in, gnome-session is invoked.
http://library.gnome.org/admin/system-admin-guide/stable/sessions-1.html.en
man gnome-session
Default session is specified at /desktop/gnome/session/default_session.
Use gconf-editor to view value of /desktop/gnome/session/default_session. In my case the value is "gnome-settings-daemon".
/usr/lib/gnome-settings-daemon/gnome-settings-daemon --gconf-prefix=/apps/gdm/simple-greeter/settings-manager-plugins. The description of option "--gconf-prefix" is "GConf prefix from which to load plugin settings"
Some resources:
http://live.gnome.org/SessionManagement/GnomeSession
http://standards.freedesktop.org/desktop-entry-spec/latest/
Autostart-spec: http://standards.freedesktop.org/autostart-spec/autostart-spec-latest.html
http://lists.freedesktop.org/archives/xdg/2007-January/007436.html
Autostart apps are specified in directory /etc/xdg/autostart/, ~/.config/autostart/
Use /usr/bin/gnome-session-properties to set session properties (e.g. autostart programs). Those changes are specific to a user.
Tuesday, October 26, 2010
JAASRealm + SSL
http://tomcat.apache.org/tomcat-6.0-doc/realm-howto.html#JAASRealm
This is a good article on the topic: http://blog.frankel.ch/custom-loginmodule-in-tomcat. Some of the following code is borrowed from that article.
The following steps are listed in the official Tomcat doc. I will elaborate on each of them.
- "Write your own LoginModule, User and Role classes based on JAAS (see the JAAS Authentication Tutorial and the JAAS Login Module Developer's Guide) to be managed by the JAAS Login Context (javax.security.auth.login.LoginContext) When developing your LoginModule, note that JAASRealm's built-in CallbackHandler only recognizes the NameCallback and PasswordCallback at present. "
package test;

import java.io.IOException;
import java.util.Map;

import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.auth.login.LoginException;
import javax.security.auth.spi.LoginModule;

/**
 * Login module that simply matches name and password to perform authentication.
 * If successful, set principal to name and credential to "AuthorizedUser".
 *
 * @author Nicolas Fränkel. Modified by Gerald Guo.
 * @since 2 avr. 2009
 */
public class PlainLoginModule implements LoginModule {

    /** Callback handler to store between initialization and authentication. */
    private CallbackHandler handler;

    /** Subject to store. */
    private Subject subject;

    /** Login name. */
    private String login;

    /**
     * This implementation always return false.
     *
     * @see javax.security.auth.spi.LoginModule#abort()
     */
    @Override
    public boolean abort() throws LoginException {
        return false;
    }

    /**
     * This is where, should the entire authentication process succeeds,
     * principal would be set.
     *
     * @see javax.security.auth.spi.LoginModule#commit()
     */
    @Override
    public boolean commit() throws LoginException {
        try {
            PlainUserPrincipal user = new PlainUserPrincipal(login);
            PlainRolePrincipal role = new PlainRolePrincipal("AuthorizedUser");
            subject.getPrincipals().add(user);
            subject.getPrincipals().add(role);
            return true;
        } catch (Exception e) {
            throw new LoginException(e.getMessage());
        }
    }

    /**
     * This implementation ignores both state and options.
     *
     * @see javax.security.auth.spi.LoginModule#initialize(javax.security.auth.Subject,
     *      javax.security.auth.callback.CallbackHandler, java.util.Map, java.util.Map)
     */
    @Override
    public void initialize(Subject aSubject, CallbackHandler aCallbackHandler,
            Map aSharedState, Map aOptions) {
        handler = aCallbackHandler;
        subject = aSubject;
    }

    /**
     * This method checks whether the name and the password are the same.
     *
     * @see javax.security.auth.spi.LoginModule#login()
     */
    @Override
    public boolean login() throws LoginException {
        Callback[] callbacks = new Callback[2];
        callbacks[0] = new NameCallback("login");
        callbacks[1] = new PasswordCallback("password", true);
        try {
            handler.handle(callbacks);
            String name = ((NameCallback) callbacks[0]).getName();
            String password = String.valueOf(((PasswordCallback) callbacks[1]).getPassword());
            if (!name.equals(password)) {
                throw new LoginException("Authentication failed");
            }
            login = name;
            return true;
        } catch (IOException e) {
            throw new LoginException(e.getMessage());
        } catch (UnsupportedCallbackException e) {
            throw new LoginException(e.getMessage());
        }
    }

    /**
     * Clears subject from principal and credentials.
     *
     * @see javax.security.auth.spi.LoginModule#logout()
     */
    @Override
    public boolean logout() throws LoginException {
        try {
            PlainUserPrincipal user = new PlainUserPrincipal(login);
            PlainRolePrincipal role = new PlainRolePrincipal("admin");
            subject.getPrincipals().remove(user);
            subject.getPrincipals().remove(role);
            return true;
        } catch (Exception e) {
            throw new LoginException(e.getMessage());
        }
    }
}
- "Although not specified in JAAS, you should create seperate classes to distinguish between users and roles, extending javax.security.Principal, so that Tomcat can tell which Principals returned from your login module are users and which are roles (see org.apache.catalina.realm.JAASRealm). Regardless, the first Principal returned is always treated as the user Principal. "
Also read the API doc http://tomcat.apache.org/tomcat-5.5-doc/catalina/docs/api/org/apache/catalina/realm/JAASRealm.html. If authentication succeeds, your LoginModule must attach at least a user principal and a user role to the subject.

package test;

import java.security.Principal;

public class PlainRolePrincipal implements Principal {

    String roleName;

    public PlainRolePrincipal(String name) {
        roleName = name;
    }

    public String getName() {
        return roleName;
    }

    public String toString() {
        return ("RolePrincipal: " + roleName);
    }

    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj instanceof PlainRolePrincipal) {
            PlainRolePrincipal other = (PlainRolePrincipal) obj;
            return roleName.equals(other.roleName);
        }
        return false;
    }

    public int hashCode() {
        return roleName.hashCode();
    }
}
You get the idea; you can implement the class PlainUserPrincipal in a similar way.
- "Place the compiled classes on Tomcat's classpath"
- "Set up a login.config file for Java (see JAAS LoginConfig file) and tell Tomcat where to find it by specifying its location to the JVM, for instance by setting the environment variable: JAVA_OPTS=$JAVA_OPTS -Djava.security.auth.login.config==$CATALINA_BASE/conf/jaas.config "
Create a JAAS config file:
-----------------------------
CertBasedCustomLogin { test.CertBasedLoginModule sufficient; };
-----------------------------
When you launch tomcat, use -Djava.security.auth.login.config= to specify where the config file is stored.
- "Configure your security-constraints in your web.xml for the resources you want to protect"
The goal of the whole process is to protect some resources. This step specifies which resources should be protected.

<security-constraint>
  <web-resource-collection>
    <web-resource-name>Secure Content</web-resource-name>
    <url-pattern>/cert-protected-users/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>AuthorizedUser</role-name>
  </auth-constraint>
  <user-data-constraint>
    <transport-guarantee>NONE</transport-guarantee>
  </user-data-constraint>
</security-constraint>
<!-- ... -->
<login-config>
  <auth-method>CLIENT-CERT</auth-method>
  <realm-name>The Restricted Zone</realm-name>
</login-config>
<!-- ... -->
<security-role>
  <description>The role required to access restricted content</description>
  <role-name>AuthorizedUser</role-name>
</security-role>
Basically, it says only users with role "AuthorizedUser" can access the resources cert-protected-users/*.
Note: role-name must match the role attached to the subject in step 1) ("AuthorizedUser" in our case) for successful access.
- "Configure the JAASRealm module in your server.xml"
Actually, putting web-app-specific context config into server.xml is not recommended. Instead, I put a file named context.xml under the directory META-INF.

<Context>
  <Realm className="org.apache.catalina.realm.JAASRealm"
         appName="CertBasedCustomLogin"
         userClassNames="test.PlainUserPrincipal"
         roleClassNames="test.PlainRolePrincipal">
  </Realm>
</Context>
The value of appName must match the name specified in step 4.
- Add "-Dsun.security.ssl.allowUnsafeRenegotiation=true" for renegotiation support. (Read http://java.sun.com/javase/javaseforbusiness/docs/TLSReadme.html for more information)
Note
Some versions of Tomcat have problems supporting JAASRealm + SSL mutual auth (https://issues.apache.org/bugzilla/show_bug.cgi?id=45576). I tried 6.0.18, 6.0.20 and 6.0.29; only 6.0.20 worked for me. 6.0.29 gave errors when I tried.
More resources:
http://tomcat.apache.org/tomcat-6.0-doc/realm-howto.html#JAASRealm
http://tomcat.apache.org/tomcat-5.5-doc/catalina/docs/api/org/apache/catalina/realm/JAASRealm.html
http://wiki.metawerx.net/wiki/Web.xml.AuthConstraint
https://issues.apache.org/bugzilla/show_bug.cgi?id=45576
http://java.sun.com/javase/javaseforbusiness/docs/TLSReadme.html
Monday, October 25, 2010
web.xml schema (simplified)
<xsd:complexType name="web-appType">
  <xsd:choice minOccurs="0" maxOccurs="unbounded">
    <xsd:group ref="j2ee:descriptionGroup"/>
    <xsd:element name="distributable" type="j2ee:emptyType"/>
    <xsd:element name="context-param" type="j2ee:param-valueType">
      <xsd:annotation>
        <xsd:documentation>
          The context-param element contains the declaration of a web application's
          servlet context initialization parameters.
        </xsd:documentation>
      </xsd:annotation>
    </xsd:element>
    <xsd:element name="filter" type="j2ee:filterType"/>
    <xsd:element name="filter-mapping" type="j2ee:filter-mappingType"/>
    <xsd:element name="listener" type="j2ee:listenerType"/>
    <xsd:element name="servlet" type="j2ee:servletType"/>
    <xsd:element name="servlet-mapping" type="j2ee:servlet-mappingType"/>
    <xsd:element name="session-config" type="j2ee:session-configType"/>
    <xsd:element name="mime-mapping" type="j2ee:mime-mappingType"/>
    <xsd:element name="welcome-file-list" type="j2ee:welcome-file-listType"/>
    <xsd:element name="error-page" type="j2ee:error-pageType"/>
    <xsd:element name="jsp-config" type="j2ee:jsp-configType"/>
    <xsd:element name="security-constraint" type="j2ee:security-constraintType"/>
    <xsd:element name="login-config" type="j2ee:login-configType"/>
    <xsd:element name="security-role" type="j2ee:security-roleType"/>
    <xsd:group ref="j2ee:jndiEnvironmentRefsGroup"/>
    <xsd:element name="message-destination" type="j2ee:message-destinationType"/>
    <xsd:element name="locale-encoding-mapping-list" type="j2ee:locale-encoding-mapping-listType"/>
  </xsd:choice>
  <xsd:attribute name="version" type="j2ee:web-app-versionType" use="required"/>
  <xsd:attribute name="id" type="xsd:ID"/>
</xsd:complexType>
Friday, October 22, 2010
Apache SSL + SVN notes
Recently I started to use the Apache HTTP server again. I am trying to build an SVN repository which can be accessed through HTTPS.
Environment
Ubuntu
Apache 2.2.17 source
Doc: http://httpd.apache.org/docs/2.2/
Build
./configure --prefix=/home/gerald/servers/httpd-2.2.17 --enable-ssl --enable-dav --enable-so
make
make install
add bin directory to your PATH
add man pages:

function addManPath() {
    if (($# != 1)); then
        return 0
    fi
    path="$1"
    if [ "x$MANPATH" == "x" ]; then
        export MANPATH="$(manpath):$path"
    else
        export MANPATH="${MANPATH}:$path"
    fi
}
addManPath "$HOME/servers/httpd-2.2.17/bin/man"
start up apache server: apachectl start
benchmarking: ab -n 10000 -c 100 http://localhost:80/
Show modules:
httpd -M //show all loaded modules
httpd -S // show parsed virtual host settings
httpd -l //list compiled-in modules
httpd -L //list available configuration directives
httpd -V //show compile settings (not settings for compiling the whole package, the settings for compiling the server - httpd).
Configure SSL
Prepare your certificate and private key.
Uncomment line "Include conf/extra/httpd-ssl.conf" in httpd.conf.
Change file "conf/extra/httpd-ssl.conf". The most important directives are SSLCertificateFile and SSLCertificateKeyFile.
Test whether you can access your website through HTTPS.
SSL + SVN
Get modules dav_svn and authz_svn
dpkg-deb -x libapache2-svn_1.6.5dfsg-1ubuntu1_i386.deb <extract_dir>
copy the two modules (.so files) to the apache modules directory.
Configure modules
Edit file <Apache>/conf/extra/dav_svn.load
LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so
Edit the file <Apache>/conf/httpd.conf and add the following two lines:
Include conf/extra/dav_svn.load
Include conf/extra/dav_svn.conf
Edit file <Apache>/conf/extra/dav_svn.conf
<Location /svn/> <!-- trailing / is necessary!! -->
    DAV svn
    SSLRequireSSL # enforce use of HTTPS
    #SVNPath /var/lib/svn
    SVNParentPath /home/svn/projects
    SVNListParentPath on
    AuthType Basic
    AuthName "Subversion Repository"
    AuthUserFile Apache_Dir/conf/dav_svn.passwd
    # To enable authorization via mod_authz_svn
    AuthzSVNAccessFile Apache_Dir/conf/dav_svn.authz
    Require valid-user
</Location>
Create authentication and authorization files
Create the password file: htpasswd -cm <Apache>/conf/dav_svn.passwd gerald
Edit file <Apache>/conf/dav_svn.authz
[groups]
admin=gerald
guests=guest

[/]
@admin=rw

[repository_name:/directory]
@admin=rw
Test
Restart Apache httpd server.
Go to https://your_ip/svn/ (note: the trailing / is necessary!)
Permission Problem
If you see following error when you try to commit some code:
svn: Commit failed (details follow):
svn: Can't open file '/path/to/your/repo/db/txn-current-lock': Permission denied
follow these steps:
- Execute command: ps -wwf $(pgrep httpd)
You should see that one of the processes is run as root; all other processes are run as daemon (in my case).
- To make httpd able to access (read/write) your svn repository, you should set the file permissions of the svn repository correctly.
chown -R gerald:daemon /path/to/svn/repo
chmod -R 770 /path/to/svn/repo
Friday, October 15, 2010
Replace token using ant in Maven 2
<plugin>
  <artifactId>maven-antrun-plugin</artifactId>
  <executions>
    <execution>
      <id>Copy and filter a file</id>
      <goals><goal>run</goal></goals>
      <phase>prepare-package</phase>
      <configuration>
        <tasks>
          <copy file="source_file" filtering="true" failonerror="true" overwrite="true" tofile="dest_file">
            <filterset>
              <filter value="${variable_name}" token="token_to_be_replaced"/>
              <filter value="value" token="token_to_be_replaced"/>
            </filterset>
          </copy>
        </tasks>
      </configuration>
    </execution>
  </executions>
</plugin>
Wednesday, September 15, 2010
XAuth!
Meebo and some other supporters just released XAuth. The video on the page http://xauth.org/info/ is really informative. XAuth provides a front-end registry of the web service sessions a user holds.
- If you are authenticated to a service, the service puts a registry entry into XAuth local storage.
- Other mashup apps/publisher websites can ask XAuth for a list of web services that the user has been authenticated to. Then the app can adjust UI according to the retrieved data.
In the current reference implementation, it requires the HTML5 features postMessage and local storage.
Javascript code: http://github.com/xauth/xauth
Official web site: http://xauth.org/