Oct 13 2023
 

Do you have a disk in your computer to keep data on? Really? It must be quite old then. Most of us are switching to solid-state devices.

And even if your hard disk really is spinning rust, it technically isn’t one disk; it’s a number of them (individually called platters).

IBM terms all appropriate storage devices DASDs (direct-access storage devices), a name which refers to what the storage device does rather than describing how it is constructed. Except for the difficulty pronouncing it, it makes a far better name.

How about cheating and referring to them as DASes?

May 02 2012
 

One of the cool things about “the cloud” is that there are numerous different companies all offering cloud-based storage of one kind or another. You can even get quite a bit of storage for free, and different services offer different cool features – such as Dropbox where my phone is configured to automatically send photos up to it. And there are plenty of other solutions out there :-

  • Box
  • Google Drive (of course you may already be using Google Docs which means you essentially have storage related to that).
  • SkyDrive (although for some mysterious reason, Microsoft doesn’t supply a Linux client)
  • iCloud
  • Wuala
  • SpiderOak
  • Ubuntu One – which despite the name, isn’t just for Ubuntu!
  • And in a note for myself, there’s also SparkleShare which is essentially a Dropbox-style client that talks to your own servers.
Undoubtedly there are a whole ton more, but I think I’ve gotten the “big names” covered. The best strategy is of course to find the one whose client works with all the platforms you use (phone, PC, laptop, etc.), which comes with the most free storage, and which charges the least for more storage (in decreasing order of importance). Of course in the real world, you are likely to end up with more than one – simply because it’s tempting to look at the next “new thing” or because you want more cheap storage, or simply because other people insist you use service X.

Now if you use multiple cloud-storage solutions, you have a bit of a problem – different clients offering different functionality, different amounts of storage available, and remembering what you put on which “cloud-disk”. Plus of course there is the interesting problem of security – different providers provide different levels of privacy and operate in different jurisdictions where different laws apply.

Different Clients

Different clients work in different ways with different features. For instance, for a Linux user :-

  1. The Dropbox client seems to work pretty well, but it doesn’t appear in a list of filesystems (i.e. when you type df) so you can’t instantly see how much space is still available, etc. At least not in the standard way.
  2. Box(.net) lacks a Linux client, so you have to hack something together. Perfectly possible for more geeky users, but even for us there is the danger that a hackish solution may suddenly and mysteriously stop working. Or rather, that it is more likely to do so than a supported client.
  3. Ubuntu One doesn’t seem to work via a filesystem interface at all.
  4. And that seems to be the same with SpiderOak.
It may be different for Windows users (I’m too lazy to check – if anyone wants to submit details, please go ahead), but I doubt it.

Whilst cloud storage providers may offer additional features to differentiate their product, they are all essentially the same as a removable hard disk, USB memory stick, or some other kind of removable storage. Whilst the additional features are very welcome, why should we have to learn a new way of managing storage just because it is out there in the cloud?

Privacy

There is a great deal of paranoia about storing private data in the cloud with the assumption that creepy organisations such as Google will do something nasty with the data. Well maybe, but it is rather unlikely that Google would be that interested in an individual’s data. Of course just because the cryptogeeks are a little paranoid does not mean they are completely wrong – there are privacy issues involved.

Firstly, Google could be looking at your data to determine things about you that would be of interest to advertisers – to present targeted adverts to you. Which at best can be a little weird.

Next we like to believe that the laws of our country will protect us from someone picking through our personal data. That someone could be the company supplying the storage, or it could be the government in the country where the storage is hosted. That would probably be fine if the storage was restricted to one location where we could be sure that the government protected us, but where is the storage located?

Much of the time the storage is located in foreign jurisdictions where there is no guarantee that any kind of privacy will be respected – especially if a foreign government takes an interest in your data. Don’t forget the laws of, say, the USA are not designed to protect citizens of any EU country (or vice versa). There are of course agreements such as the EU Safe Harbour agreement, but it is possible that it does not offer as much protection as assumed – it is not really intended for private individuals choosing to put their own personal data into foreign jurisdictions.

Probably most of us do not have to worry about this sort of thing (although we can choose to), but some may have to be cautious. Some of us deal with personal data about third parties – sometimes very personal data – and need to consider whether storing such data in the cloud is an appropriately responsible way to look after its privacy. For example, a contractor who stores information about their clients should be taking steps to ensure that data is not accidentally leaked (or hacked and published).

The easy answer to this problem is to assume that cloud storage on its own is not safe for sensitive personal data, because there is a simple solution that still allows the cloud to be used: encrypt the data yourself before it leaves your machine. Use encryption such as TrueCrypt to ensure that even if the cloud leaks your data, it is still encrypted with a key that is not known to the cloud provider.
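As a rough illustration of the idea, here is a minimal sketch that encrypts a file locally before it goes anywhere near a synced folder. It uses the openssl command-line tool rather than TrueCrypt, purely because it is easy to script; the function names and the PASSFILE variable are made up for this example, and the -pbkdf2 option assumes OpenSSL 1.1.1 or later.

```shell
#!/bin/sh
# Sketch: encrypt a file locally before copying it into a cloud-synced folder.
# Uses "openssl enc" (OpenSSL 1.1.1 or later for -pbkdf2), not TrueCrypt.
# PASSFILE is a hypothetical file holding the passphrase on its first line;
# keep it well away from the synced folder.

encrypt_for_cloud() {
    # $1 = plaintext input, $2 = encrypted output
    openssl enc -aes-256-cbc -salt -pbkdf2 \
        -in "$1" -out "$2" -pass "file:$PASSFILE"
}

decrypt_from_cloud() {
    # $1 = encrypted input, $2 = plaintext output
    openssl enc -d -aes-256-cbc -pbkdf2 \
        -in "$1" -out "$2" -pass "file:$PASSFILE"
}
```

With PASSFILE set, something like encrypt_for_cloud clients.ods ~/Dropbox/clients.ods.enc (both paths hypothetical) leaves only the encrypted copy in the synced folder; the provider never sees the passphrase.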

Store It Twice!

There have been occasions where storage providers have removed access to storage either permanently or temporarily – such as the Megaupload site. Whilst it is perhaps unlikely, it is possible for a cloud service provider to disappear and for the customers to lose their data – even if the cloud provider claims that there is some protection against this sort of thing happening. So if you store data in the cloud, it is sensible to ensure that you have copies of that data elsewhere.
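Keeping that second copy does not need anything clever. The following is a minimal sketch, assuming the cloud client keeps an ordinary local folder in sync; the function name is invented for the example and both paths are whatever you choose to pass in.

```shell
#!/bin/sh
# Sketch: keep a second copy of a cloud-synced folder and verify it.
# Both paths are supplied by the caller; nothing here is provider-specific.

keep_second_copy() {
    src=$1   # e.g. the local sync folder
    dst=$2   # e.g. a directory on an external disk
    mkdir -p "$dst" || return 1
    # "src/." copies the contents of src, preserving modes and timestamps.
    cp -Rp "$src/." "$dst/" || return 1
    # diff -r exits non-zero if the two trees differ in any way.
    diff -r "$src" "$dst" >/dev/null
}
```

Note that this only ever adds to the copy; files deleted from the cloud folder stay in the second location, which is arguably what you want from a safety net.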

 

Oct 27 2009
 

Whether you are using ufs filesystems or zfs storage pools, Solaris has a rather nifty way of migrating storage from one SAN to another with no (or little) downtime. It works just as well for any other reason you might have for moving from one disk to another. The key advantage of the following method is reducing or eliminating downtime. Even if your users can take the hit, not having to watch a multiterabyte filesystem slowly copying from one disk to another is reason enough to use this technique.

Basically it works by using mirroring. Using mirroring to copy a disk might seem a little odd to begin with, but once you’ve seen it work you’ll be a fan.

For UFS (and SVM) Filesystems

This section assumes that the source disk device (cXXXXX) is set in the variable ${sourcedisk} and the destination is in ${destdisk}.

For UFS filesystems, the first step (which does require an outage) is to :-

  1. Stop the application that uses the filesystem being migrated.
  2. Unmount the filesystem.
  3. Encapsulate the existing filesystem device into a SVM metadevice: metainit d1001 1 1 ${sourcedisk}
  4. Create a mirror device with the new metadevice as a submirror: metainit d1000 -m d1001
  5. Change the references in /etc/vfstab from the old device name (${sourcedisk}) to the new mirror (not sub-mirror!) device – d1000
  6. Remount the filesystem and restart the application.

This should take no more than 10 minutes and is the only outage involved. There are two remaining sets of steps :-

  1. Create a new metadevice using the new disk: metainit d1002 1 1 ${destdisk}
  2. Attach the new metadevice to the mirror as an additional sub-mirror: metattach d1000 d1002

At this point, the mirror will start resilvering. It may take some time to complete, but the time it takes to do so does not really matter. In particular the resilvering process should not cause a performance problem to your application – the application I/O takes priority.
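If you would rather not keep checking by hand, the wait can be scripted. This is a minimal sketch only: it assumes that metastat prints a line containing "Resync in progress" while a submirror is being synchronised, and the function names and the METASTAT variable are invented for the example.

```shell
#!/bin/sh
# Sketch: report whether an SVM mirror is still resyncing.
# Assumption: metastat prints "Resync in progress" while a submirror syncs.
# METASTAT is an illustrative override so the check can be tried without SVM.

METASTAT=${METASTAT:-metastat}

svm_resync_running() {
    # $1 = mirror metadevice, e.g. d1000
    $METASTAT "$1" | grep "Resync in progress" >/dev/null
}

wait_for_svm_resync() {
    # Poll until metastat stops reporting a resync.
    # $1 = mirror metadevice, $2 = seconds between polls (default 60)
    while svm_resync_running "$1"; do
        sleep "${2:-60}"
    done
    echo "$1: no resync reported"
}
```

Calling wait_for_svm_resync d1000 returns once the resync message disappears; it is still worth looking at the full metastat output yourself before removing anything.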

When the resilvering is complete :-

  1. Remove the metadevice containing the old SAN disk: metadetach d1000 d1001
  2. Remove the metadevice that is no longer required: metaclear d1001
  3. Attach “nothing” to the mirror metadevice (this is to ensure that the mirror grows to the size of the new submirror): metattach d1000
  4. Finally, ignore the warning on the manual page (which is outdated) and grow the filesystem: growfs -M /mount/point /dev/md/rdsk/d1000

You will see that I have used the metadevice names d1000 (for the mirror), d1001 (for the old sub-mirror), and d1002 (for the new submirror). Whatever device names you use, it is worth trying to be consistent – it helps a lot when you have dozens of filesystems to process.
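With dozens of filesystems, it can help to generate the command list rather than type it each time. The following is a dry-run sketch that only prints the commands from the steps above in order and executes nothing; the function name is invented, the disk names in the usage note are hypothetical, and the metadevice names default to the ones used in this post.

```shell
#!/bin/sh
# Dry-run sketch: prints the SVM commands from the steps above, in order.
# Nothing is executed; review the output before running any of it by hand.
# The function name and argument order are illustrative only.

svm_migrate_plan() {
    sourcedisk=$1        # old SAN slice
    destdisk=$2          # new SAN slice
    mountpoint=$3        # where the filesystem is mounted
    mirror=${4:-d1000}   # mirror metadevice
    oldsub=${5:-d1001}   # submirror on the old disk
    newsub=${6:-d1002}   # submirror on the new disk

    echo "# Phase 1 (outage): stop the application first"
    echo "umount $mountpoint"
    echo "metainit $oldsub 1 1 $sourcedisk"
    echo "metainit $mirror -m $oldsub"
    echo "# edit /etc/vfstab to use $mirror instead of $sourcedisk"
    echo "mount $mountpoint"
    echo "# Phase 2: add the new disk and let it resync"
    echo "metainit $newsub 1 1 $destdisk"
    echo "metattach $mirror $newsub"
    echo "# Phase 3: only once the resync has finished"
    echo "metadetach $mirror $oldsub"
    echo "metaclear $oldsub"
    echo "metattach $mirror"
    echo "growfs -M $mountpoint /dev/md/rdsk/$mirror"
}
```

For example, svm_migrate_plan c1t0d0s0 c2t0d0s0 /data prints the plan for one filesystem; passing different metadevice names as the fourth to sixth arguments keeps the numbering consistent across many of them.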

ZFS Storage Pools

This is even simpler. If you have a storage pool called ${pool} which contains a single device called ${sourcedisk}, you simply :-

  1. Attach the new device: zpool attach ${pool} ${sourcedisk} ${destdisk}
  2. Wait for the resilvering to finish.
  3. Detach the old device: zpool detach ${pool} ${sourcedisk}
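The three steps above can be sketched as a small script. It is deliberately cautious: it attaches the new device, polls until the resilver is no longer reported, and then prints the detach command instead of running it, so that you can look at the pool first. It assumes that zpool status includes the text "resilver in progress" while copying; the function names and the ZPOOL and POLL_SECONDS variables are invented for this example.

```shell
#!/bin/sh
# Sketch: attach the new disk, poll until the resilver finishes, then
# print (not run) the detach command so a human can check the pool first.
# ZPOOL and POLL_SECONDS are illustrative knobs, not part of ZFS.

ZPOOL=${ZPOOL:-zpool}
POLL_SECONDS=${POLL_SECONDS:-60}

resilver_running() {
    # Assumes "zpool status" reports "resilver in progress" while copying.
    $ZPOOL status "$1" | grep "resilver in progress" >/dev/null
}

zfs_migrate() {
    pool=$1
    sourcedisk=$2
    destdisk=$3

    # Step 1: turn the single device into a two-way mirror.
    $ZPOOL attach "$pool" "$sourcedisk" "$destdisk" || return 1

    # Step 2: wait for the resilver to stop being reported.
    while resilver_running "$pool"; do
        sleep "$POLL_SECONDS"
    done

    # Step 3 is left to the operator.
    echo "Resilver no longer reported; check '$ZPOOL status $pool', then run:"
    echo "$ZPOOL detach $pool $sourcedisk"
}
```

Leaving the detach as a printed command is a design choice: if the status wording differs on your release, the worst that happens is that the script finishes early and you notice when you check the pool.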

Of course beware of anything you read on the Internet! I have not actually tested the above; I’m merely regurgitating memory that has recently been exercised – I’m doing a SAN migration at work right now.