DIY Fusion Drive: an attempt to retrofit a pre-fall 2012 Mac with an SSD and a traditional hard disk

Just a couple of days ago Apple announced that new Macs would have an option called Fusion Drive, a technology that bonds a fast SSD and a traditional hard disk together into a single “self-optimizing” volume. Apple said the new technology is included in Mountain Lion, which made me curious whether existing Macs could use the technique and have a Fusion Drive “retrofitted”.

I did a quick search on the web and found an excellent article on Anandtech and a FAQ page by Apple, evidently showing that Fusion Drive is simply a volume created using LVM (Logical Volume Management) that aggregates two physical disks. This is a fairly common practice in the Unix world. It also turns out that Apple had introduced LVM support back in 10.7.4, and some previous analysis was already available on how to use diskutil to manage LVMs, but people had failed to use the addDisk command to add a second physical drive to an LVM group. It is possible that Apple made LVM smarter in 10.8 so that it distributes data over a group more efficiently, perhaps based on the speed of the underlying physical storage. If this theory is correct, it should all be just a matter of creating an LVM and persuading the OS to run from it.

In the following guide I managed to create an LVM using an SSD and a traditional hard disk, combining the two into one volume. The machine I used is a MacBook Pro, 17 inch, early 2011. I removed the optical drive and replaced it with a big, traditional hard disk. The original HDD was replaced with a fast, 240GB SATA-3 SSD.

It is difficult to prove that this is exactly the same as having Apple’s new Fusion Drive enabled, because it is very hard to determine where individual pieces of the file system are physically stored and whether these pieces are actively moved between the two disks to achieve better performance. There is a fair chance that there really isn’t more to this than setting up the LVM. Unfortunately I don’t have access to a Mac with a real Fusion Drive to compare the outputs of commands such as “diskutil cs list”. It’s also possible that additional settings are required to enable the active management of the LVM, but according to Apple, every Mac running Mac OS X 10.8.2 should be able to read Fusion Drives in target mode, which at least seems to indicate that current Macs using the latest OS should technically be equipped to deal with a Fusion Drive without requiring further software.

So far the setup has run at the speeds I was used to when I was hosting all apps and the OS exclusively on the SSD. I will continue to verify the performance and see whether it behaves in the manner described by Apple and Anandtech. Suggestions on how to verify this most efficiently would be very welcome.

Be warned, this guide is only for tinkerers and those who don’t mind a total loss of their data. Messing around with diskutil will wipe your disks, and you have to rely on a functioning backup to get your data back.

  1. Make a full backup of your data, all of it! If you don’t know how to do this, don’t attempt to follow this guide any further. You have been warned! I used Time Machine for the backup.
    1. In order to speed things up, you may want to connect your Mac to a Time Capsule using a wired Ethernet connection and turn WiFi off.
    2. If you have a Boot Camp partition with Windows, make sure to also back up all of that data.
  2. Make sure you are on Mac OS X 10.8.2.
  3. Create a USB restore key with the latest OS X version. You can find instructions in this article on CNET. Apple also provides a tool, but I have not verified whether it will create a key with the latest version of OS X. I downloaded Mac OS X 10.8.2 from the App Store on a Mac OS X 10.7 machine. That ensured that the restore image (ESD) would come with the latest version of OS X available.
  4. Reboot to Recovery Mode with the USB key, holding Alt/Option and choosing the USB key to boot from. This takes you into recovery mode, from where you can access both physical drives properly.
  5. Delete all partitions on both physical drives. You can use Disk Utility for that.
  6. Now open the Terminal by going to Utilities -> Terminal.
  7. Create a new Logical Volume Group (LVG) with both disks.
    1. Use “diskutil list” to identify the names of your physical disks. In my case it was disk0 (SSD) and disk1 (HDD)
    2. Run “diskutil coreStorage create FusionDrive disk0 disk1”. This is the step where others seem to have failed. I figured you could create the LVM by supplying both disks at the same time, instead of creating an LVM with one disk first and then trying to use “addDisk” for the second disk, which doesn’t seem to work. You can also replace the “FusionDrive” string with your own choice of LVM group name. It does not matter what the name is.
    3. List the new LVG using “diskutil coreStorage list”. You should see something like this:

      Note the overall group size and copy the UUID of your LVG, in my case “F628F010-3CFF-4BC7-90CE-CD61EC7C44E1”, then run “diskutil coreStorage createVolume F628F010-3CFF-4BC7-90CE-CD61EC7C44E1 jhfs+ MacintoshFD 975g”. Replace the UUID with your copied LVG UUID and replace 975 with the size you want for the volume: the actual size of your LVG minus some 10G, which leaves enough room for a recovery partition. You can also rename MacintoshFD to whatever you want your main volume to be called. If the name contains spaces, make sure to enclose the string in quotes (“”).

  8. I then closed the Terminal window and restored a fresh Mac OS X install. You should also be able to restore all your data from a Time Capsule at this point. I installed the OS first to see whether the LVM was bootable and then returned to Recovery Mode to restore from my Time Capsule. After a couple of hours my machine booted back into my previous Mac OS X environment.
  9. Congratulations, now you have a Fusion Drive-esque volume. Not only will you have higher performance from this setup, you also get the aggregate size of your mechanical disk and your SSD, together in one volume.
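
For reference, the two Terminal commands from step 7 can be sketched as a small dry-run script. This is only an illustration: it prints the commands instead of running them, the function names are made up for this sketch, the disk identifiers assume the SSD is disk0 and the HDD is disk1 as on my machine, and the 985 GB group size is a hypothetical value chosen so that the result matches the 975g used above.

```shell
#!/bin/sh
# Dry-run sketch of step 7: prints the diskutil commands, does NOT run them.
# Assumptions: disk0 is the SSD and disk1 is the HDD (check "diskutil list"),
# and the group size in GB is read by hand from "diskutil coreStorage list".

GROUP_NAME="FusionDrive"     # any LVG name works
VOLUME_NAME="MacintoshFD"    # name of the final volume
SSD="disk0"
HDD="disk1"
RESERVE_GB=10                # room left free for the recovery partition

# Size argument for createVolume: group size in whole GB minus the reserve.
volume_size() {
    echo "$(( $1 - RESERVE_GB ))g"
}

# Print both commands for a given LVG UUID and group size in GB.
print_commands() {
    uuid=$1
    group_gb=$2
    echo "diskutil coreStorage create $GROUP_NAME $SSD $HDD"
    echo "diskutil coreStorage createVolume $uuid jhfs+ \"$VOLUME_NAME\" $(volume_size "$group_gb")"
}

# Example values: the UUID from this post and a hypothetical 985 GB group.
print_commands "F628F010-3CFF-4BC7-90CE-CD61EC7C44E1" 985
```

Printing the commands first makes it easy to double-check the disk identifiers and the size before typing anything destructive into the Recovery Mode Terminal.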

Like I pointed out earlier, this may be all the magic there is behind the new Fusion Drive. I’m currently trying to verify whether this setup behaves the same, or whether I just created a dumb LVM that lacks the “secret sauce”, such as a particular flag in the LVM header that I don’t know about yet. It would be interesting to compare this setup with a real Fusion Drive. If you have access to a Mac with an actual Fusion Drive shipped by Apple, I encourage you to run the “diskutil coreStorage list/info” commands and post your results in the comments. I’d love to take a look at the real setup. Please let me know if you have an idea for a good strategy to test the setup and verify that it does indeed move data around between the SSD and the HDD in a smart manner.
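
If you want to collect that output, a small helper along these lines could pull the group UUID out of the list output and build the matching info command. Treat it as a sketch under assumptions: the helper name lvg_uuid is made up, and it assumes the group shows up on a line containing “Logical Volume Group” followed by a standard UUID. The sample line below is a stand-in, not real output from a Fusion Drive Mac.

```shell
#!/bin/sh
# Sketch: extract the first LVG UUID from "diskutil coreStorage list" output.
# Assumption: the group is printed on a line containing "Logical Volume Group"
# followed by a standard 8-4-4-4-12 hexadecimal UUID.

lvg_uuid() {
    grep "Logical Volume Group" |
        grep -E -o '[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}' |
        head -n 1
}

# On a real machine: diskutil coreStorage list | lvg_uuid
# Here a made-up sample line stands in for the real output.
sample="+-- Logical Volume Group F628F010-3CFF-4BC7-90CE-CD61EC7C44E1"
uuid=$(printf '%s\n' "$sample" | lvg_uuid)

# Print the follow-up command instead of running it.
echo "diskutil coreStorage info $uuid"
```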

UPDATE 10/31/2012:

I wrote this on the weekend and didn’t have much time this week to dig into verifying whether this setup works like a real Fusion Drive (I do have a day job 🙂). In the meantime, jollyjinx posted a guide on Tumblr showing how he verified that the above setup does indeed behave the way you would expect a Fusion Drive to behave. I do have quite a bit of background IO “noise” on my machine, because the LVM is hosting the actual OS, but after following jollyjinx’s steps and verifying the results on my own machine, I am pretty much convinced at this point that Fusion Drive just works “automagically” if you use it with an SSD and a plain old hard drive. After all, the OS has several ways to know that it is attached to an SSD, either by checking the storage type (I’m sure there is some vendor PID that can be queried from the hardware) or by keeping performance logs on how the hardware behaves, which would let the OS infer which drive is faster. Even though it doesn’t leave much control over whether such behavior is really desired, this is what I would expect from Apple: to employ such heuristics in order to just do what they think is the right thing on such hardware.

As to my own results: using iostat in a shell, I was able to see how the IO load went to either one of the disks when writing or reading data, but rarely to both at the same time (activity on both at once would indicate proper RAID 0 style striping). I could also see that data was moved from the mechanical disk to the SSD after I had accessed data that was previously read from the HDD. The moment you stop reading such data, you can observe IO activity transferring over to the SSD for a while, roughly matching the amount of data you were accessing. Accessing the same data again then shows IO activity almost exclusively on the SSD instead of the HDD, which seems to indicate that the OS had decided to move the data over to the faster drive. This sure looks like exactly what Apple was promising with the Fusion Drive.
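
A rough sketch of this kind of test is below. It is not jollyjinx’s exact procedure, just a minimal write-then-read loop to run while watching iostat in a second Terminal window. The file path, the size and the function names are placeholders chosen for this sketch, and the test file is assumed to live on the combined volume.

```shell
#!/bin/sh
# Sketch of a read test to pair with iostat in a second Terminal window,
# e.g. "iostat -d -w 1 disk0 disk1" (disk names assumed from the guide above).
# Assumptions: TEST_FILE is on the combined volume; SIZE_MB is a placeholder.

TEST_FILE="${TEST_FILE:-./fusion_test.bin}"
SIZE_MB="${SIZE_MB:-8}"      # small for a trial run; raise it for a real test

# Write SIZE_MB megabytes of zeros (numeric block size works with BSD and GNU dd).
write_test_file() {
    dd if=/dev/zero of="$TEST_FILE" bs=1048576 count="$SIZE_MB" 2>/dev/null
}

# Read the file back and discard the data.
read_test_file() {
    dd if="$TEST_FILE" of=/dev/null bs=1048576 2>/dev/null
}

write_test_file
read_test_file    # first read: note which disk iostat shows as busy
read_test_file    # later reads: compare against the first one
```

Keep in mind that repeated reads of a small file are likely to be served from the RAM cache rather than from either disk, so a meaningful run needs a file considerably larger than the installed memory, and enough free space to hold it.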

Sweet, I think I’ll keep this setup!