Archive for the ‘Systems’ Category.

Move NetApp Root Volume (vol0) to a New Aggregate

By default, vol0 is the root volume on a NetApp storage device and is stored on aggregate aggr0. After accidentally assigning too many disks to aggr0, I needed to decrease the size of the aggregate. Unfortunately, this is not possible. I had to create a new aggregate, copy vol0 to a new volume on it, and then make the new volume the root volume.

Getting Hadoop MapReduce 0.20.2 Running On Ubuntu

I decided to set up a Hadoop cluster and write a MapReduce job for my distributed systems final project. I had done this before with an earlier release and it was fairly straightforward. It turns out it is still straightforward with Hadoop 0.20.2, but the process is not well documented and the configuration has changed. Hopefully I can clear up the process here.

What is Hadoop MapReduce?

MapReduce is a powerful distributed computation technique pioneered by Google. Hadoop MapReduce is an open source implementation written in Java that is maintained by the Apache Software Foundation. Hadoop MapReduce consists of two main parts: the Hadoop Distributed File System (HDFS) and the MapReduce system.
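
As a rough single-machine analogy (this is not Hadoop code), the classic word count can be sketched as a shell pipeline. The first stage plays the role of the map step, sort stands in for the shuffle, and uniq -c acts as the reduce step. The function name wordcount is only for illustration.

```shell
# Word count as a pipeline: map -> shuffle -> reduce.
wordcount() {
  tr -cs '[:alpha:]' '\n' |        # map: emit one word per line
  tr '[:upper:]' '[:lower:]' |     # normalize case so "The" and "the" match
  sort |                           # shuffle: bring identical keys together
  uniq -c |                        # reduce: count each group of identical words
  awk '{print $2, $1}'             # print "word count"
}

printf 'the cat and the hat\n' | wordcount
```

Hadoop does conceptually the same thing, except that the map and reduce stages run in parallel on many nodes and read from and write to HDFS.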

Getting Hadoop

The first step is to download Hadoop. Go to http://hadoop.apache.org/mapreduce. It is worthwhile to read up on how Hadoop and MapReduce work before you move on to the installation and configuration.

Plan The Installation

Before the actual installation there is a bit of planning to be done. Hadoop works best when run from a local file system. However, for convenience it is also nice to have a common NFS file share for configuration and log files. Below is an image of what I set up. For the distributed setup at least two nodes are required.

Initial Setup

Before doing any setup of the actual Hadoop system there is some initial setup that needs to be completed, namely the creation of a directory on each node and a shared ssh key. The first step is the easiest. A Hadoop install directory needs to be created on each node that is going to be part of the system. The directory must have the same name and location on each node. It is recommended not to use an NFS file share for the installation directory as it can affect performance.

After the install directory has been created, a shared ssh key needs to be generated on each node and added to the authorized_keys file. This allows for passwordless ssh login, which is required by the Hadoop cluster startup scripts.
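
A minimal sketch of the key setup follows, assuming OpenSSH and an RSA key with an empty passphrase. The function name and the optional directory argument are my own for illustration; by default it works on ~/.ssh.

```shell
# Sketch: create a passphrase-less key pair and authorize it for login.
# The directory argument is an assumption; it defaults to ~/.ssh.
setup_passwordless_ssh() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  # Generate a key only if one does not already exist.
  [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa"
  # Authorize the public key for logins to this account.
  cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}
```

If the home directory is shared over NFS, running this once should cover every node. Otherwise the public key would also have to be appended to authorized_keys on each of the other nodes. Afterwards, ssh to a node should log in without a password prompt.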

Open Firewall Ports

Hadoop requires a number of ports to be open for the system to work.

Port Function
50010 DataNode Port
50020 DataNode IPC Port
50030 MapReduce Administrative Page
50105 Backup/Checkpoint node
54310 HDFS File System
54311 JobTracker Service
50060 TaskTracker Port
50070 DFS Administrative Webpage (namenode)
50075 DataNode Port
50090 SecondaryNameNode Port
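
How the ports get opened depends on the firewall in use. The sketch below assumes ufw, which is common on Ubuntu but is my assumption and not something the setup above depends on. It only prints the commands so they can be reviewed first.

```shell
# Ports taken from the table above.
HADOOP_PORTS="50010 50020 50030 50060 50070 50075 50090 50105 54310 54311"

# Print one "ufw allow" command per port (dry run; nothing is changed).
print_ufw_rules() {
  for port in $HADOOP_PORTS; do
    echo "sudo ufw allow ${port}/tcp"
  done
}

print_ufw_rules
```

Piping the output to sh (print_ufw_rules | sh) would apply the rules on the node where it is run.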

Configuration Files

There are three main configuration files that need to be edited: hdfs-site.xml, mapred-site.xml, and core-site.xml. Each file resides in the conf folder of the directory where Hadoop was extracted. There are a lot of parameters that can go into each file but only a few basic ones need to be set. I have provided my configuration files below. The final file that needs to be edited is hadoop-env.sh, which is a shell script that sets up Hadoop environment variables. At the very least the $JAVA_HOME variable needs to be uncommented and properly set.

core-site.xml
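
This is not my original file, just a minimal sketch of what a core-site.xml for 0.20.2 might contain, written out with a shell here-document. The hostname master and the path /opt/hadoop/tmp are assumptions; port 54310 is the HDFS port from the table above.

```shell
# Write a minimal core-site.xml (hostname and tmp path are assumed values).
CONF_DIR="${CONF_DIR:-hadoop_dir/conf}"
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- URI of the NameNode; "master" is a placeholder hostname -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
  <!-- Base directory for Hadoop's working files on the local disk -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/tmp</value>
  </property>
</configuration>
EOF
```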

hdfs-site.xml
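
Again a sketch and not my original file: a minimal hdfs-site.xml that only sets the block replication factor. The value 2 is an assumption that suits a small cluster with at least two DataNodes.

```shell
# Write a minimal hdfs-site.xml (replication factor is an assumed value).
CONF_DIR="${CONF_DIR:-hadoop_dir/conf}"
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Number of copies HDFS keeps of each block -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF
```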

mapred-site.xml
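
A minimal mapred-site.xml might look like the sketch below (not my original file). The hostname master is an assumption; port 54311 is the JobTracker port from the table above.

```shell
# Write a minimal mapred-site.xml ("master" is a placeholder hostname).
CONF_DIR="${CONF_DIR:-hadoop_dir/conf}"
mkdir -p "$CONF_DIR"
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Host and port of the JobTracker -->
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
</configuration>
EOF
```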

hadoop-env.sh
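
The only required change is the JAVA_HOME line. The path below is an assumption for illustration; it should point at whichever JDK is actually installed on the nodes.

```shell
# conf/hadoop-env.sh (excerpt)
# Assumed JDK location; replace with the real path on your nodes.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```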

Set the Slaves and Master

The master node needs to be defined in the hadoop_dir/conf/masters file. Each slave node needs to be listed in the hadoop_dir/conf/slaves file, one machine name/IP address per line.
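
For example, with one master and three slaves the two files could be written as follows. The hostnames master, node1, node2, and node3 are placeholders, not the names of my machines.

```shell
# Write the masters and slaves files (all hostnames are placeholders).
CONF_DIR="${CONF_DIR:-hadoop_dir/conf}"
mkdir -p "$CONF_DIR"
echo "master" > "$CONF_DIR/masters"
# One slave per line.
printf '%s\n' node1 node2 node3 > "$CONF_DIR/slaves"
```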

Deploy the Installation and Configuration Files

The installation and configuration files need to be deployed to each node in the cluster. The easiest way to do this is through scp. I wrote the script below so that I could run a command on each node in my cluster. Another alternative is the Cluster SSH program (cssh). Either approach is preferable to logging onto each node to run a command.

Using my run_comm.sh script I ran scp on each node in the cluster:

./run_comm.sh "scp -r ~/hadoop /opt/hadoop/hadoop"

This runs the command in quotes on each node in the cluster. In this case I copied the Hadoop installation from the NFS share (my home directory) to a local directory on each node.

run_comm.sh
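
The original script is not reproduced here, so what follows is a sketch of how such a script could work, not the script I actually used. It assumes the node names are listed one per line in a file (the NODES_FILE variable, defaulting to the slaves file) and that passwordless ssh is already working. The SSH variable is there only so a different command can be substituted.

```shell
#!/bin/sh
# run_comm.sh (sketch): run one command on every node listed in a hosts file.
# NODES_FILE and SSH are assumptions for illustration.
NODES_FILE="${NODES_FILE:-hadoop_dir/conf/slaves}"
SSH="${SSH:-ssh}"

run_on_all() {
  while read -r node; do
    # Skip blank lines in the hosts file.
    [ -z "$node" ] && continue
    echo "== $node =="
    # -n keeps ssh from reading the rest of the hosts file from stdin.
    $SSH -n "$node" "$1"
  done < "$NODES_FILE"
}

# When called with an argument, run it on every node.
if [ $# -gt 0 ]; then
  run_on_all "$1"
fi
```

Because the nodes are visited one after another, a slow node delays the rest. For a handful of machines that is usually acceptable.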

Formatting the NameNode

Now that the Hadoop files are on each node the NameNode can be formatted to set up the Hadoop File System.

hadoop_dir/bin/hadoop namenode -format

Starting the Hadoop File System

Now that the namenode has been formatted the distributed file system (DFS) can be started. This is done by using the start-dfs.sh script in the bin directory of the Hadoop installation.

hadoop_dir/bin/start-dfs.sh

The status of the Hadoop File System can be viewed from the administrative page on the master server, http://master_server:50070.

Starting the MapReduce System

The final step to setting up MapReduce is to start the MapReduce system. This is done by using the start-mapred.sh script that is located in the bin directory of the Hadoop installation.

hadoop_dir/bin/start-mapred.sh

The status of the MapReduce system can be viewed from the administrative page on the master server, http://master_server:50030.
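
Besides the web pages, the jps tool that ships with the JDK lists the running Java processes, which gives a quick check from the command line. The sketch below reads jps output on standard input and prints any expected daemon that is missing. The list of daemons is what I would expect on the master in 0.20.x and should be treated as an assumption; the function name is made up for this example.

```shell
# Daemons expected on the master node (assumed list for Hadoop 0.20.x).
EXPECTED_DAEMONS="NameNode SecondaryNameNode JobTracker"

# Read "jps" output on stdin and print each expected daemon that is absent.
missing_daemons() {
  running=$(cat)
  for daemon in $EXPECTED_DAEMONS; do
    # -w matches whole words, so NameNode does not match SecondaryNameNode.
    echo "$running" | grep -qw "$daemon" || echo "$daemon"
  done
}

# Usage on the master: jps | missing_daemons
```

No output means all three daemons were found. On the slave nodes the same idea applies with DataNode and TaskTracker in the list.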

Submitting a MapReduce Job

Now that the cluster is up and running it is ready to start accepting MapReduce jobs. This is done using the hadoop executable from the bin directory of the Hadoop installation and a jar file that contains a MapReduce program. An example of running the WordCount demo program provided with Hadoop is shown below.

hadoop_dir/bin/hadoop jar jar_location/wordcount.jar org.myorg.WordCount /file_dir_in_hdfs /output_dir_in_hdfs
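
A full run usually has three steps: copy the input into HDFS, run the job, and read the result back. The sketch below wraps them in a function using the same placeholder paths as the command above. The function name, the HADOOP variable, and the part-* output pattern are assumptions for illustration.

```shell
# Path to the hadoop executable; hadoop_dir is the same placeholder as above.
HADOOP="${HADOOP:-hadoop_dir/bin/hadoop}"

# Sketch: upload a local file, run WordCount on it, and print the result.
run_wordcount() {
  local_input=$1
  # Copy the local input file into HDFS.
  $HADOOP fs -put "$local_input" /file_dir_in_hdfs &&
  # Run the job; the output directory must not exist beforehand.
  $HADOOP jar jar_location/wordcount.jar org.myorg.WordCount /file_dir_in_hdfs /output_dir_in_hdfs &&
  # Print the reducer output files.
  $HADOOP fs -cat '/output_dir_in_hdfs/part-*'
}

# Usage: run_wordcount input.txt
```

Setting HADOOP=echo before calling the function prints the three commands without running them, which is a cheap way to check the paths.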

Performance Report in the Virtual Infrastructure Client

VMware vCenter server reports a lot of performance information and displays it in tables in the Virtual Infrastructure client. They provide a nice at-a-glance view, but do not allow for anything more. While poking around the GUI I found a feature to export the performance data to Excel by going to File > Reports > Performance. This is a nifty tool that is not very well documented.

Your Phone, Google, and the Cloud

Google has had sync available for quite some time, but up until recently it has only allowed for contacts and calendars to be synchronized between your phone and Google. The feature has been great and allowed users to easily back their data up to the “cloud” where it will forever reside. Recently this feature got even better with the addition of mail sync and push. Now your phone can maintain a connection with Google, allowing new emails, contact updates, and calendar updates to be automatically pushed to your phone. So far the service has been great (flaky at times though). The only downside: watching your cellphone battery die much faster.

MSI Wind U100 Netbook Review

MSi Wind Netbook

I recently ordered an MSi Wind 10″ Netbook. I will be traveling to Europe in a week and wanted something small and portable that will let me surf the web and do some on-the-road web development. After hours of searching the MSi U100-451US seemed like the best choice. I picked one up for $279.99 at eWiz.com.

Specs

Processor: Intel Atom N270 1.6 GHz (with hyperthreading)

RAM: 1 GB DDR (PC5300), upgraded to 2 GB

Included Peripherals: On board mic, 0.3 MP digital camera, wireless, 10/100 Ethernet, Bluetooth

Adding a 1GB RAM Stick

A great how-to on installing RAM in the Wind is available here: http://www.laptopmag.com/advice/how-to/msi-wind-ram.aspx. One thing that is not mentioned on the webpage is that some models of the Wind have a “warranty void if removed” sticker covering one of the screws on the bottom of the netbook. It took me far longer than it should have to realize that there was a screw under the sticker that was preventing me from removing the bottom.

A word to the wise: do not go with really cheap RAM. I bought a cheap RAM stick from eBay for $12 and it did not work. I ended up going to the local computer store and getting a Crucial 1GB PC 5300 RAM stick for around $40.

Installing Vista

One of my first orders of business was to install Vista Ultimate on my new netbook. At this point you are probably laughing “haha Vista on a netbook, what an idiot.” I was a little apprehensive about installing Vista on such an underpowered machine (my main PC running Vista has 8 GB of RAM with a Quad core processor). I was inspired by this article: http://www.notebookreview.com/default.asp?newsID=4505 which actually says they saw an improvement in running Vista over XP in terms of both performance and battery life. Lo and behold, Vista runs like a dream, even with the sidebar and the Aero theme running. In fact the thing that hurt my “User Experience Rating” the most was the GPU, which actually scored respectably for a netbook. Aside from the GPU, the CPU scored the next worst, which again is not surprising for the Atom chip, but this is still respectable for a netbook.

Netbook Score

These results were after I downloaded all of the drivers and MSi utilities from the MSi website. Surprisingly all of the Vista drivers are available from the support site. The version says Windows XP 32bit on all the downloads, but they contain Vista drivers as well.

Adobe CS4

After installing Vista and downloading several hundred megabytes of updates I loaded up Adobe CS4. Again you are probably saying something to the effect of “Photoshop won’t run on a netbook”. That was my exact initial response, but I figured I would give it a try anyway. I was pleasantly surprised. Photoshop actually ran very smoothly and the layout changed to take advantage of the netbook’s quirky resolution. Now I’m not saying that I have an amazing editing machine, but I can open and edit psd files on the go as well as fire up Dreamweaver to do some web development.

Photoshop on a Netbook

Conclusion

The MSi Wind is a solid mobile computing solution. I find myself using my netbook more and more. Its ultraportable and ultralight design makes it perfect for a college student who wants to take notes in class or those who want to be able to use the internet on the go without having to carry around a heavy full-sized laptop. It also can run some fairly intense applications like Photoshop with only a slight delay.

Exchange 2007 and Active Directory

As part of a project I am working on for my internship with MITRE I was tasked with building a domain containing a Server 2003 Domain Controller, an Exchange 2007 server, a Microsoft Office SharePoint Server (MOSS) 2007 server, and SQL Server 2005. Each service was installed in a Server 2003 virtual machine and configured several months ago. The Domain Controller virtual machine was having some issues yesterday, so it was wiped and a new Domain Controller was set up. All of the virtual machines were able to rejoin the domain without problem. However, the Exchange server had a number of quirks. The virtual machine hung for about 20 minutes on “applying security policy” during start up and was having a host of authentication issues that prevented Outlook Web Access (OWA) from working properly. Since the Exchange server was part of a testbed environment, the easiest solution was to build a new Exchange virtual machine. The moral of the story is that Exchange is extremely reliant on the domain controller and very sensitive to its configuration. Since the environment is entirely virtual there should have been more backup copies and snapshots of the virtual machines to revert to; both are good practices that will be applied in the future.

Totally Awesome Gaming PC for Under $750

Back in September 2008 (before the severe economic downturn) it became clear that my 2 year old laptop was no longer powerful enough to handle the average workload, let alone gaming. After months of searching and price hunting I put together the rig outlined in this post. Without cutting corners I was able to build a quite powerful machine without breaking the bank (well not too much anyway).

Parts

CPU: Intel Quad Core Q6700 2.66 GHz -$215.70

CPU Fan: Thermaltake Silent 775D -$18.96 (after $10 mail in rebate)

Motherboard: XFX nForce 680i LT -$83.01 (after $20 mail in rebate)

RAM: 8GB OCZ SLI PC6400 DDR2 800MHz (4x2048MB) -$80 (after $40 rebate on each 4 GB kit)

GPU: XFX GeForce 8800 GT -$105 (after $20 rebate)

PSU: Antec BP550 Plus 550W ATX12V -$59.99

Hard Drive: 1 TB SAMSUNG Spinpoint F1 HD103UJ -$96.99

Optical Drive: SAMSUNG 22X DVD Burner -$25.99

Case: Briza 8-Bay ATX -$44.99

Total: $730.63

All of the links above are to the actual places that I bought the parts from. There were a lot of rebates to deal with but I had shopped around for months and was convinced that the sales were good enough to warrant building my new PC. Notice that all the parts did not come from one site. I meticulously shopped each part so I could get the most out of my money.

Building the PC

I started with the motherboard, 4 GB of RAM, the CPU, and the GPU. The case arrived a few days later. I had an old 500 GB SATAII hard drive lying around that I initially salvaged for the build.

It became clear very quickly that the 400 watt power supply that came with the case would be less than ideal. I also realized that if I wanted to get an OS onto the new PC an optical drive would be a nice thing to have. A quick order at NewEgg had the missing parts within 3 business days (free 3-day shipping rocks!).

Blood Sweat and Tears

After I had assembled the system I noticed something very odd. The system would not POST with both 2 GB RAM modules installed. Hours on the phone with OCZ (their support staff is amazing) led to the recommendation that I send one of the RAM modules back. After receiving the new RAM I still had the same issue and the system was very unstable with just the one module.

Another hour on the phone, this time with XFX, yielded the suggestion that I return the motherboard to TigerDirect for a replacement. A quick call to Tiger and they emailed me a label to ship out the motherboard for free. It turns out Tiger has an awesome RMA system.

A few days later I got the replacement motherboard. I slipped in both 2 GB RAM modules and the system booted up. The increase in speed from 2 to 4 GB of memory convinced me that adding another 4 GB for $40 (after rebates of course) was well worth the money.

A few months later I decided it was time for another upgrade. The hard drive that I salvaged for the original build was very loud. I had been shopping for a 1 TB drive for a long time and finally gave in and bought one. I found a Samsung SpinPoint for $100 with free shipping from ZipZoomFly and went for it.

Choosing an OS

Since my processor supports 64-bit operating systems, it seemed like the best idea to go 64-bit. I installed an OEM version of Windows Vista Ultimate. The decision to go with Vista was based on the desire to take advantage of the eye-candy offered by the Aero theme and DirectX 10.

Results of the Build

The PC is stable and super fast. No overclocking was done because it was not justified. A screen shot of the Windows Experience Index is included below. The only thing that the system falls a bit short on is RAM. I could have easily spent a lot more money on super high quality RAM, but the OCZ SLI Ready modules that I got deliver more than sufficient performance.

Windows Experience Index screenshot