Again, there are many choices, and the right one is driven by the business need.
• Do all the media files need to be online and instantly available?
• Alternatively, can restores be done in a timely manner from an LTO tape that resides on a shelf?
• Does the archive need to be kept in perpetuity?
• Should I use a cloud service such as Amazon Glacier to store my digital archive?
Whatever happens, content must be protected and secured – unless you have a minimum of two copies of every media and metadata file, you do not have it.
With big data, trying to manage your content manually is nigh-on impossible. An automated system is necessary to manage and protect content. Digital preservation and data management are not just about keeping multiple copies; they are also about mitigating future scalability issues and technology obsolescence. If files become corrupt or lost, recovering them should be easy and, for high availability, preferably automated.
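As a rough illustration of what automated protection can look like, here is a minimal sketch of a fixity check with self-repair. It assumes two copies of each file and a SHA-256 checksum recorded at ingest; the function names `sha256_of` and `verify_and_repair` are hypothetical and are not part of any product mentioned here.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_repair(primary: Path, replica: Path, expected: str) -> str:
    """Check both copies against the checksum recorded at ingest (assumed).

    Returns "ok", "repaired-primary", "repaired-replica" or "unrecoverable".
    """
    primary_ok = primary.exists() and sha256_of(primary) == expected
    replica_ok = replica.exists() and sha256_of(replica) == expected
    if primary_ok and replica_ok:
        return "ok"
    if replica_ok:
        # The primary is missing or corrupt: restore it from the good replica.
        shutil.copyfile(replica, primary)
        return "repaired-primary"
    if primary_ok:
        # The replica is missing or corrupt: restore it from the good primary.
        shutil.copyfile(primary, replica)
        return "repaired-replica"
    # Neither copy matches the recorded checksum.
    return "unrecoverable"
```

Run on a schedule across the whole archive, a check like this would find silent corruption while a good copy still exists, which is the practical reason for insisting on at least two copies.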
The aforementioned proposal uses a commercial off-the-shelf ingest platform to digitise the files, which in turn are written to MatrixStore. A series of QC checks is then run from the MatrixStore. On completion, after a week, the MatrixStore moves the content to LTO5 tape. For this business, it was felt that keeping all the high-resolution content on disk was not required. However, being able to find content (the proverbial needle in the haystack) was most definitely required, as was having the proxies instantly browsable on the MatrixStore. To begin with, a fully blown MAM system was not feasible due to budget constraints, but the business had a tactical issue it wanted to solve in terms of searching content, and MatrixStore with DropSpot allows that.
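The retention rule in this workflow can be sketched as a simple policy function. This is only an illustration, not how MatrixStore implements it: the `Clip` class and `due_for_tape` function are hypothetical, and the sketch assumes the one-week hold is counted from the moment QC completes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Clip:
    name: str
    qc_completed_at: Optional[datetime] = None  # None means QC has not finished
    is_proxy: bool = False  # proxies are never moved; they stay on disk


def due_for_tape(clips, now, hold=timedelta(days=7)):
    """Return high-res clips whose QC finished at least `hold` ago."""
    return [
        clip
        for clip in clips
        if not clip.is_proxy
        and clip.qc_completed_at is not None
        and now - clip.qc_completed_at >= hold
    ]
```

Keeping the hold period as a parameter means the same rule could cover a business that wants high-resolution content on disk for a month rather than a week.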
As an object-based storage device, MatrixStore not only protects the content with multiple copies and can write content to tape, it also stores metadata with the media clips and makes that metadata searchable. Using the client tool DropSpot, content can be found even if it has been moved off to tape. DropSpot will even supply the barcode number of the tape that contains the requested media clips.
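To make the idea concrete, here is a minimal sketch of a metadata catalogue that can still answer searches after the media has moved to tape. It does not use or represent the DropSpot or MatrixStore APIs; `ArchiveObject`, `search`, `locate` and the example metadata fields and barcode are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArchiveObject:
    name: str
    metadata: dict
    on_disk: bool
    tape_barcode: Optional[str] = None  # set once the high-res copy is on tape


def search(catalogue, **criteria):
    """Return objects whose metadata matches every criterion (case-insensitive)."""
    def matches(obj):
        return all(
            str(obj.metadata.get(key, "")).lower() == str(value).lower()
            for key, value in criteria.items()
        )
    return [obj for obj in catalogue if matches(obj)]


def locate(obj):
    """Say where the high-res media currently lives."""
    if obj.on_disk:
        return "disk"
    if obj.tape_barcode:
        return f"tape {obj.tape_barcode}"
    return "unknown"


catalogue = [
    ArchiveObject("match_1998.mxf", {"sport": "Football", "year": 1998},
                  on_disk=False, tape_barcode="ABC123L5"),
    ArchiveObject("final_2004.mxf", {"sport": "Tennis", "year": 2004},
                  on_disk=True),
]

for hit in search(catalogue, sport="football"):
    print(hit.name, "->", locate(hit))
```

The point of the design is that the metadata stays online and searchable regardless of where the media itself sits, so a search result can always say which tape to fetch.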
The proxies also remain on disk permanently, so low-res content is always available to browse as if it were on local disk, while still being shareable across the network with many concurrent users.
So what about the cloud? Media clips require lots of bandwidth – you need a solid Internet connection, which is not always available and can be expensive at the sustained rates media files demand. Can you trust your cloud provider to always be there to serve you your content? What happens if they go out of business? The alternative? Build your own private cloud. Many new MAM systems can be retrofitted to this workflow, giving a browser-style interface and offering a cloud-like service. The difference is that you host it, so if any partner in your workflow is bought out or disappears, you have mitigated the risk of loss.
For storage-specific data management and digital preservation, including how to mitigate hardware obsolescence, please refer to an earlier blog I wrote here a few years ago, all of which is still very relevant.
As is the nature of this subject, once you scratch the surface, it always morphs into something much bigger than most people originally expect. I mean, digitising a few old media files: how difficult can it be?