Friday, December 31, 2010

My Little NAS

I'm not sufficiently disciplined to perform backups manually. I admire people who are. If I don't automate backups, they won't happen on any regular schedule. I've done that using rsync to copy my ~/Documents directory to my home file server. In brief: I create a new full copy on the first day of every month, then use the rsync --link-dest=DIR option to create incremental backups on the remaining days of the month. The incremental backups get overwritten a month later, but the first-of-the-month backups are preserved forever. Hard drive space is cheap and getting cheaper.
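A minimal sketch of that scheme, with hypothetical paths. (The echoes print the rsync commands instead of running them; drop them to use this for real.)

```shell
#!/bin/sh
# Monthly-full / daily-incremental backup via rsync hard links.
# Paths are hypothetical; drop the echoes to run for real.
SRC="$HOME/Documents/"
DEST="/srv/backup"
DAY=$(date +%d)        # day of month, e.g. 31
MONTH=$(date +%Y-%m)   # e.g. 2010-12

if [ "$DAY" = "01" ]; then
    # First of the month: a full copy that is kept forever.
    echo rsync -a "$SRC" "$DEST/full-$MONTH/"
else
    # Other days: incremental against this month's full copy.
    # Unchanged files become hard links and take no extra space.
    echo rsync -a --delete --link-dest="$DEST/full-$MONTH/" \
        "$SRC" "$DEST/day-$DAY/"
fi
```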

The weak spot in this strategy was the lack of offsite backups. I decided to do something about that. The overall plan was to colocate a backup server at our son's place so I could back up over the Internet. There are cloud-based services that do this, but I'm not sure many support Linux clients. Instead I chose to pay an up-front equipment cost rather than a monthly fee. I also didn't want to be counting bytes to make sure I didn't exceed any quotas. And finally, I like the thought of using an Ubuntu-based NAS where I have full control over operation and setup rather than being limited by the APIs and services provided by a vendor. The backup to the remote is handled by an appropriate rsync command that runs over an ssh connection.
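The remote leg is just rsync with ssh as the transport; a sketch with a hypothetical user, host, and paths (echoed rather than run):

```shell
#!/bin/sh
# Push the local backup tree to the offsite NAS over ssh.
# User, host, and paths are hypothetical; drop the echo to run it.
CMD="rsync -a --delete -e ssh /srv/backup/ backup@offsite-nas:/srv/backup/"
echo "$CMD"
# Scheduled from cron, e.g.:  30 3 * * * /usr/local/bin/offsite-backup.sh
```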

Here is the H/W I chose:
I went with low-power drives because I was more concerned about power usage than performance. Backups will run over cable Internet connections where upstream bandwidth is capped, so extra performance would be wasted. The WD EARS drive presented an additional wrinkle: it uses 4K sectors but hides that fact from the OS. (More on this later.) The overall scheme was to mirror the two drives and install Ubuntu 10.04 LTS Server. I briefly considered a BSD-based NAS package, but it didn't do something I needed. (Boot from a RAID, IIRC.)

Another nicety is that this board seems to reliably support Wake On LAN (WOL). In fact, it is possible to WOL across the Internet if you can get your router properly configured. That gets into some arcane aspects of routing, but we were able to configure it on my DD-WRT based router as well as my son's D-Link offering. If you decide to do this, I suggest you search for a forum post, wiki, etc. that details how to do it with your equipment. With that information in hand, we found it not too difficult.
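For reference, waking a machine boils down to sending a magic packet; a sketch assuming the `wakeonlan` package, with hypothetical MAC and addresses (echoed rather than run):

```shell
#!/bin/sh
# Wake the NAS with a WOL magic packet. MAC address, broadcast
# address, and hostname are hypothetical; drop the echoes to run.
LAN="wakeonlan -i 192.168.1.255 00:11:22:33:44:55"
echo "$LAN"
# Across the Internet: forward a UDP port (typically 9) on the remote
# router to its LAN broadcast address, then target the router's
# public hostname and that port.
WAN="wakeonlan -i nas.example.org -p 9 00:11:22:33:44:55"
echo "$WAN"
```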

To install, I downloaded the 32-bit server install CD .iso and used it to create a bootable USB flash drive. The actual installation presented a few wrinkles. As mentioned, the WD EARS lies about its physical sector size. The result is that default partitioning will not align partitions to the drive's native 4K sectors. This degrades performance because unaligned writes force the drive into read-modify-write cycles on its physical sectors; reads should suffer less. I had a go at getting the partitioning right, but the installation tools did not seem to provide that capability. Were I to do this again, I'd either hook the drives up to another system or boot a live CD and partition them with whatever tools do this best. I think fdisk actually provides that capability now, but I could not manage it with the tools on the install CD. Since I was not too worried about performance, I settled for the tools at hand and moved on.
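For the record, parted can produce aligned partitions; a sketch with a hypothetical device name (echoed rather than run):

```shell
#!/bin/sh
# Partition a 4K-sector drive with proper alignment using parted.
# Device name is hypothetical; drop the echoes to run for real.
P1="parted -s /dev/sdb mklabel msdos"
# Starting at 1MiB aligns to any power-of-two physical sector size.
P2="parted -s -a optimal /dev/sdb mkpart primary ext4 1MiB 100%"
echo "$P1"
echo "$P2"
```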

I had found a couple of descriptions of how to install to a RAIDed boot partition, but they seemed not to be too helpful for 10.04 LTS. RAID is one of those corner cases, and I think boot support may be in some flux. The layout I finally wound up with was separate partitions on each drive, combined into RAID devices. I could have RAIDed the entire drives, but then I would have needed LVM to provide separate root and data partitions, and I thought life would be simpler without LVM. What finally worked was to install to a single drive whose partitions were RAIDed: two RAID1 partitions operating in degraded mode during the install. Once the system was up and running, I added the second drive's partitions to the RAIDs and everything was fully working. I think. I did not actually try removing either drive to verify that the system would still boot, so I can't assume it will should a drive fail. It seems likely the boot process would hang on the failed drive anyway. At worst, I think I might need to boot a live CD/USB to get access to my data after a failure. I'm comfortable with that.
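Completing the degraded arrays is a one-liner per array with mdadm; a sketch with hypothetical device and array names (echoed rather than run):

```shell
#!/bin/sh
# Attach the second drive's partitions to the degraded RAID1 arrays.
# Array and device names are hypothetical; drop the echoes to run.
A1="mdadm /dev/md0 --add /dev/sdb1"
A2="mdadm /dev/md1 --add /dev/sdb2"
echo "$A1"
echo "$A2"
# Watch the rebuild progress:
echo "cat /proc/mdstat"
```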

Another wrinkle turned out to be ignorance of what my backup commands were doing. As mentioned above, I was using hard links (via --link-dest) to reduce backup space when backing up to my local drive. What I did not realize was that the rsync command I used to copy to the remote backup did not preserve those links, so every hard-linked file was sent as an independent copy. I discovered this when I attempted to resize the local backup LVM partition and wiped out my local backup copy. I found I could not restore from the remote backup because it was too big, thanks to all of the duplicated files. I wound up biting the bullet and upgrading my local RAID (5x 200GB RAID5) to new drives (2x 2TB RAID1). I have since added the --hard-links (-H) option to my rsync command, which should fix the ballooning storage requirements.
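Assuming the backup tree is built with --link-dest, the remote copy needs rsync's -H; a sketch with hypothetical paths (echoed rather than run):

```shell
#!/bin/sh
# Remote copy that preserves the hard-link structure.
# -H (--hard-links) re-creates hard links on the receiver instead of
# sending each linked file as an independent copy.
# User, host, and paths are hypothetical; drop the echo to run it.
CMD="rsync -aH --delete -e ssh /srv/backup/ backup@offsite-nas:/srv/backup/"
echo "$CMD"
```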

I still have a couple of issues to deal with. The rsync command on the local PC runs under a user account. Because of that, it hits permission problems for other users who do not make their ~/Documents directory world readable. That could be sidestepped by running the local rsync command as root, but that further complicates credentials on the remote host for the ssh connection. (In other words, I could not get that to work.) I suppose the easier way is to modify the permissions on the ~/Documents directories; that seems acceptable on a home LAN used by husband and wife. The other issue is setting up notification of problems. I don't have the discipline to regularly check the health of various systems. I could automate that, but I need to set up mail delivery first. Years ago I set up sendmail and used mail spools to handle this sort of thing, but I have left postfix unconfigured. I need to dig into email configuration and sort out how to integrate it into our system, where I normally receive my email via gmail.

Thursday, December 30, 2010

The Importance of Benchmarks

I've been running a bunch of benchmarks on our various systems lately. I go through phases...

I plan to upgrade some systems in the near future so I want to make sure that I'm going to get better performance from the new stuff. (Or hopefully marvel at the increase in performance.) On the other hand, some systems are being downsized to some extent. I plan to replace an old Athlon 64 3400+ with an Atom based system similar to what I'm using for offsite NAS. I am curious if I'll be giving up some performance and if so, how much. A part of the new system will be to use USB sticks for the boot drive because I can. I've done a boatload of performance measurements on RAIDed USB drives as well as the variety of drives on other systems. It looks like four RAIDed USB sticks provide performance not too far from spinning drives. I need to sift through all of the data I've collected before I can conclude anything.

Over the holidays our son gave us his retired Dell Vostro 1700 to replace the aging Thinkpad T42 used by SWMBO. For those not familiar, the Vostro is apparently one of those laptops that is really a desktop with a hinged display. It actually has room for two laptop drives. The specs are pretty impressive, falling just a bit short of my Thinkpad T500. I had fun loading it with Ubuntu and running some benchmarks.

One of the areas where the Vostro falls short is that it only supports Wireless G. I'm sold on Wireless N and happy that my netbook (Eee PC 901) and smart phone (Droid X) both support draft 802.11n. (Is it still a draft standard?) I looked up the specs on the Dell website and found that they listed a couple of 802.11n cards. I looked around and now have a card upgrade en route for the princely sum of $15 US.

I want to make sure the card works so I found some benchmarks to run. I started with netperf. It's a nice simple package that installs a server and a client which communicate with each other to measure throughput. It is now installed on all of our systems. I also did some large file copies across the wireless LAN, but that can be affected by the speed of file I/O. And caching. I was surprised that the second time I copied a 340,393,984 byte file from the remote system, it finished in a few seconds with no network activity.
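The netperf workflow, and the cache flush that explains that instant second copy, look roughly like this (hostname hypothetical; echoed rather than run):

```shell
#!/bin/sh
# Quick LAN throughput check with netperf.
# Hostname is hypothetical; drop the echoes to run for real.
S="netserver"                              # run once on the remote host
C="netperf -H vostro -t TCP_STREAM -l 30"  # 30-second TCP stream test
echo "$S"
echo "$C"
# For file-copy tests, flush the page cache first (as root) so the
# second run isn't served straight from RAM:
echo "sync; echo 3 > /proc/sys/vm/drop_caches"
```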

The other surprise was that wireless throughput was better on the Vostro (with 802.11g) than on the Thinkpad T500, which supports 802.11n. :( I investigated and found that the router was set up to support G, not N. I don't know how that happened. It has two radios, and at one point I was trying to figure out how to set one up to run G and the other N only, hoping to get better N throughput that way, but I don't think the router is capable of that configuration. I probably misconfigured it while working on that. I have since set it to support N as well as G, and throughput for the T500 went up considerably. In fact, some netperf tests reported throughput higher than between two hosts on our wired LAN (100baseT). It also became much more variable, and shortly after the change the connection between the T500 and the AP dropped to the point where the T500 requested the WPA password. My research also revealed that the firmware that ships with Ubuntu 10.10 for the wireless card in my T500 (Intel 5300) may have problems: the 10.10 install disables 802.11n if it detects an Intel 5300/5100 card. I'm glad I haven't upgraded my T500 to Ubuntu 10.10 yet. I hope N isn't going to be flaky.

Other items of note from benchmarking:

  • An Atom processor is not all that far behind a five year old Socket 939 Athlon.
  • An Athlon 64 X2 3800+ benchmarks slower than the Athlon 64 3400+, at least for single-threaded tasks. I think this is because the faster processor is in a system with all four RAM slots populated, which forces the memory to run at a slower speed (DDR333 vs. DDR400).
  • Newer hard drives are pretty much faster than older drives. I guess that should be no surprise.
  • Four USB sticks on the same USB bus configured as RAID0 (striped) are considerably faster than a single USB stick on the same USB 2.0 bus. I would have thought that (at least for reading) the USB sticks would be close to saturating the bus. Apparently not.
Pending upgrades: an Atom based system (Atom D525, 2GB RAM) has shipped to replace the Athlon 64 3400+. Along with that I have a gigabit Ethernet switch coming. Hopefully this one will last longer than the previous one, which seemed to lose ports during electrical storms. I suppose using it to connect stuff within several feet of each other should help. (Toonsinator, NAS and the rest of the LAN.)

Monday, December 13, 2010

Get started!

I recall hearing that a URL with hyphens in it is stupid. Derp!

(So there!)

I'm fiddling with RAIUFD (Redundant Array of Inexpensive USB Flash Drives). That started with a new system I have planned. The ultimate goal is to combine a file server with another system in one case. The case is a monster Lian Li case that has space for dozens of 3 1/2" drives. When I built it about five years ago it used a RAID of six 200GB drives (5 drives in a RAID5 configuration with a cold spare). I've since upgraded it to two 2TB drives configured as a RAID1 (mirror). So I've got this huge case with an AMD Athlon 64 motherboard and lots of empty space. It idles along at about 95 watts. My plan is to combine this system with a similar desktop; the Lian Li case will hold two systems. I'm even considering powering them both from the same power supply.

The main system will be the combination of the best of the two desktops (AMD Athlon 64 X2 processor, 4GB RAM, etc.) For now. It will own the space for the rear connections and external devices such as a USB/firewire panel, DVD burner and <shudder> floppy disk drive. For now.

Behind the main board, I'll mount a micro-ATX all-in-one board which will drive the two 2TB drives and support the functionality of a NAS. The board I've chosen is an Intel D510MO based on a dual core Atom. The only external connection for it will be an Ethernet cable. (I'll open the case to attach keyboard and monitor cables as needed.) This is where the RAIUFD comes in. The NAS drives aren't partitioned for installing an OS. The D510MO has two SATA ports. I could use an add-in card for an additional SATA drive; a notebook drive would be nice. Or I could use an ATA drive. I have some lying around, but they're old and use power. I had planned on just getting a decent USB thumb drive for a boot device.

Then I ran across the idea of striping several USB drives. Google helped me find prior art: it has been done, and it provides a speed benefit over a single flash drive. I ordered four 4GB Mushkin flash drives to configure in a four-way stripe set. Cost for 16GB of storage is <$25 US.
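Building the stripe set is a couple of mdadm commands; a sketch with hypothetical device names (echoed rather than run):

```shell
#!/bin/sh
# Stripe four USB sticks into one RAID0 device with mdadm.
# Device names are hypothetical; drop the echoes to run for real.
M="mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdc /dev/sdd /dev/sde /dev/sdf"
echo "$M"
# Then put a filesystem on the array:
echo "mkfs.ext4 /dev/md0"
```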

I already have two 4GB Kingston flash drives, so I fiddled around with them. Configuring them as RAID0 is pretty straightforward. Measuring throughput is not. I finally settled on Iozone with some arguments that seem to provide a quick test that hopefully produces indicative results.

iozone -Ra -s 1024 -r 8 -i 0 -i 1 -c

Results were inconsistent from run to run, often double or half the typical value. Perhaps this is a consequence of the wear leveling or blocking algorithms in the drives. It also seemed sensitive to CPU horsepower: there was less benefit on an Atom based netbook than on my laptop (Core 2 Duo). The laptop also allowed me to put the drives on different USB busses, which had to help. In any case, before too long I'll install Ubuntu Server on this and see what happens. Since it's going to be a file server and not something I plan to use interactively, I'm not too concerned about program load times. High traffic directories like /tmp will wind up in RAM.

More prior art... I have a remote NAS built on an Atom based all in one PC (Foxconn R20-D1) with a couple mirrored 2TB drives. It's partitioned to boot off the RAID. It uses 35W when busy and works well with wake-on-LAN so I only need to wake it up when backups are scheduled. (That's all scripted, of course.)