So there's another post going around today about how physical media is on the way out, because it's more profitable and somehow legal for companies to just not let you have shit you pay for. Last time posts like that were making the rounds, there was a very good one about how it's more an issue of ownership, rather than the format. Physical media just happens to be something that existed before a company could just say "we don't want you to have this anymore" and be able to take it away, and also existed back when concepts like "What if I run a business renting these?" and "I should be able to sell this thing when I'm done with it" could still be legally protected, at least in the US.
Cohost's @cathoderaydude[1] made a wonderful video today about some ancient CD and DVD jukebox/autochanger devices, which you should immediately go watch. In it, he mentions how maintaining backups for a business just stinks. From 48:37:
Backing up data is a tremendous pain. It was then [~1993], it still is now. There's a whole subreddit [r/DataHoarder] dedicated to people trying to figure out how to store lots of data without Constant Effort to keep it alive. The general consensus is: You can't, don't try. Everything rots. Everything degrades. Burned CDs, tapes, powered down hard drives, SSDs. They all die, even if they're sitting in a cool dark place. Your best bet is to buy huge spinning disks, put them in RAID arrays, monitor them, and be ready to replace them when, not if, they stop working. Nobody has found a truly superior solution. No, don't bring up tape, we're not having that discussion.
This is of course all correct. If you're a business that needs to keep your data available from backups but still private, these are your options[2], and they stink AND cost a lot. I've worked at a place that kept tape backups, and they suck. I'm talking "the tape spends 22 hours out of 24 backing up and you have a 2 hour window to swap tapes" suck. But that's not important anyway, because businesses usually don't need to back up movies and games, which is what I want to talk about.
When data is non-private, we do have a good way to back it up. Torrents, basically. A RAID array is just putting the same data across multiple disks, so one disk failure doesn't ruin the data. When you learn about backups you learn they're all about removing single points of failure by duplicating them. Have multiple RAID cards[3] so one can't corrupt your data on multiple disks. Have multiple machines so one failing won't corrupt your data across multiple arrays. Have your data in multiple physical locations so that a storm can completely destroy one and you won't lose your data. Think about how someone would try to do this at a consumer scale. Even if you can afford the storage for two copies of your data, most people can't afford to rent a half rack in a datacenter and keep a spare machine running there. You'd probably find a friend whose place you could keep a second machine at. If the data doesn't have to be private, it'd make more sense to just give them a copy and ask them not to delete it, right?
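To make the "keep several copies so no single one can ruin you" idea concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the copy names, the fake file contents, and the `find_bad_copies` helper. It's a toy, not a backup tool.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Fingerprint a blob so copies can be compared cheaply."""
    return hashlib.sha256(data).hexdigest()


def find_bad_copies(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Return (most common digest, names of copies that disagree with it)."""
    digests = {name: digest(data) for name, data in copies.items()}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    bad = [name for name, d in digests.items() if d != majority]
    return majority, bad


# Hypothetical copies of the same file kept in three places.
copies = {
    "my_nas": b"simpsons s03e17",
    "friends_house": b"simpsons s03e17",
    "old_usb_drive": b"simpsons s03e1\x00",  # simulated bit rot
}

good, bad = find_bad_copies(copies)
print(bad)  # ['old_usb_drive']
```

Note that this only works with three or more copies. With two copies that disagree, you know something rotted but not which one, which is one more argument for having lots of people hold the data.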
Torrents are basically this: lots of people each keeping a copy. They already exist and they work at scale. There's a reason they were the single best way to get full copies of Linux distros back before high bandwidth mirrors were common, when a few CDs of data was a heck of a lot to be downloading, and why they're still arguably one of the better ways. It is, for media, a mostly solved problem.
Torrents aren't perfect. If you need a copy of the data and you don't have it on hand, it can only arrive as fast as your internet goes, at best. Most people still don't have symmetrical internet either, so it's harder to contribute to the swarm and more likely that you can't get data as fast as your ISP will let you. They're not terribly private either, as your IP is just sitting right there in the peer list. I recall, but do not care to research, stories about people attempting to corrupt the data available in a swarm of peers as well; I do not recall how well it worked, if it happened. But I believe they are, in 2023, the single best way to maintain a copy of, say, The Simpsons.
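On the corrupted-swarm worry: the original (v1) BitTorrent format splits the data into fixed-size pieces and stores a SHA-1 hash of each piece in the torrent's metadata, so a client can check every piece it receives and throw away ones that don't match. Here's a toy sketch of that check. It is not a real client, the 4-byte piece size and the sample data are made up for the demo, and it says nothing about how the attempts I half-remember actually went.

```python
import hashlib

PIECE_SIZE = 4  # absurdly small, for demonstration only


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Hash each fixed-size piece, like the list a torrent file carries."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, expected: list[bytes]) -> bool:
    """Accept a piece from a peer only if it matches the published hash."""
    return hashlib.sha1(piece).digest() == expected[index]


original = b"homer simpson"
expected = piece_hashes(original)  # published once, alongside the torrent

good_piece = original[4:8]  # what an honest peer would send for piece 1
tampered_piece = b"XXXX"    # what a hostile peer might send instead

print(verify_piece(1, good_piece, expected))      # True
print(verify_piece(1, tampered_piece, expected))  # False
```

The catch is that this only protects you if the hash list itself came from somewhere you trust.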
Which brings up another good point about how streaming sucks: you can't even trust the services to put up good quality copies of their own damn content. I remember reading they somehow fucked up old seasons of The Simpsons on Ratcast Plus when it launched. I've never used it so I can't comment further than that. For stuff that debuted in the streaming age, there's usually a copy up online pretty darn fast after it's available, but for old stuff you're still out looking for more or less DVD ISOs to circumvent any possible Poor Encoding Decisions.
-
[1] To be read in the same tone as "TV's Frank", with love.
-
[2] Stuff like AWS S3 or Backblaze is just paying someone else to maintain a lot of spinning disks that they're ready to replace when they fail.
-
[3] Nowadays it's HBAs instead of RAID cards, but you get the point; the test will usually still think that people use hardware RAID. Yuck.