Monday, December 12, 2011

RabbIT

Enable the filter rabbit.filter.ReverseProxy by adding it to the front of the httpinfilters value.
Configure values for the [rabbit.filter.ReverseProxy] section.
httpinfilters=rabbit.filter.ReverseProxy,rabbit.filter.HttpBaseFilter,rabbit.filter.DontFilterFilter,rabbit.filter.BlockFilter,rabbit.filter.RevalidateFilter
......
[rabbit.filter.ReverseProxy]
# This filter is not enabled by default, add it to httpinfilters if you want it.
# This Filter makes rabbit work as an accelerator for one web site.
# Change requests
transformMatch=^/(.*)
transformTo=http://<target_host>:<port>/$1
# Deny proxy requests, you probably want this.
# deny=^http(s?)://.*
deny=
# If we want to allow admin access.
allowMeta=true
Run RabbIT:
java -jar jars/rabbit4.jar -f conf/rabbit.conf
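
With the filter enabled, a request sent to the proxy for a local path is rewritten to the target site according to transformMatch/transformTo. A quick check (a sketch, assuming RabbIT is listening on its default port 9666, that <target_host> and <port> have been filled in above, and using a hypothetical path /index.html):

curl -I http://localhost:9666/index.html

The ReverseProxy filter should rewrite /index.html to http://<target_host>:<port>/index.html and return that page's headers.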
Resource: http://www.khelekore.org/rabbit/

Friday, December 02, 2011

Memory allocation settings in Hadoop

Edit file conf/mapred-site.xml to change the amount of memory (in MB) allocated to the sort buffer:

<property> 
    <name>io.sort.mb</name>
    <value>300</value>
</property>

Edit file conf/mapred-site.xml to change the maximum heap size for each map/reduce task JVM:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx800m</value>
</property>
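
Both snippets go inside the <configuration> element of conf/mapred-site.xml; a minimal sketch of the file with the two settings combined:

<?xml version="1.0"?>
<configuration>
    <property>
        <name>io.sort.mb</name>
        <value>300</value>
    </property>
    <property>
        <name>mapred.child.java.opts</name>
        <value>-Xmx800m</value>
    </property>
</configuration>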

Edit file conf/hadoop-env.sh to change the amount of memory (in MB) allocated to each Hadoop daemon:

export HADOOP_HEAPSIZE=1000
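
HADOOP_HEAPSIZE applies to every daemon started by the stock scripts. If a single daemon needs more, a per-daemon override in conf/hadoop-env.sh is a common approach (a sketch; assumes the 0.20/1.x launch scripts, where the daemon-specific -Xmx is passed after the default one and therefore takes precedence):

# sketch: give only the NameNode a larger heap, other daemons keep HADOOP_HEAPSIZE
export HADOOP_NAMENODE_OPTS="-Xmx2000m $HADOOP_NAMENODE_OPTS"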

Change ports used by Hadoop

Edit file conf/hdfs-site.xml to change the ports used by HDFS:

    <property>
        <name>dfs.secondary.http.address</name>
        <value>0.0.0.0:51090</value>
    </property>
    <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:51010</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:51075</value>
    </property>
    <property>
        <name>dfs.datanode.https.address</name>
        <value>0.0.0.0:51475</value>
    </property>
    <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:51020</value>
    </property>
    <property>
        <name>dfs.http.address</name>
        <value>0.0.0.0:51070</value>
    </property>
    <property>
        <name>dfs.https.address</name>
        <value>0.0.0.0:51470</value>
    </property>

Edit file conf/mapred-site.xml to change the ports used by MapReduce:

    <property>
        <name>mapred.job.tracker.http.address</name>
        <value>0.0.0.0:51030</value>
    </property>

    <property>
        <name>mapred.task.tracker.http.address</name>
        <value>0.0.0.0:51060</value>
    </property>
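
After restarting the daemons, something like the following confirms they are listening on the new ports (a sketch for Linux; the -p flag may require root, and the exact netstat options vary by platform):

netstat -tlnp | grep java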

Exclude directories when using GNU tar

tar zvcf name.tar.gz --exclude path/to/dir1 --exclude path/to/dir2 path/to/tar

Note:

  1. Do not include a trailing '/' in the path of an excluded directory; otherwise the pattern will not match and the directory will still be included.
  2. Put the --exclude options before the directory/file to be tarred (see the example below).
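
An illustration of both notes, with hypothetical directory names:

# works: --exclude comes first and the paths have no trailing '/'
tar zvcf project.tar.gz --exclude project/logs --exclude project/tmp project

# does NOT exclude project/logs because of the trailing '/'
tar zvcf project.tar.gz --exclude project/logs/ project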