Problems Processing Petabytes of Data
As more data is generated than ever before, processing it in an efficient and intelligent manner becomes more complex. The challenge is how to handle all the assets. How can petabytes of high definition CCTV footage be processed to spot the moment an event occurs? How can content be automatically classified and logged with little or no human interaction? How can petabytes of video data be reformatted or transcoded into different formats without moving that content between the storage and the servers needed to perform the processing?
Traditional approaches that require content to be moved around internal or external networks are broken and inefficient.
Processing Data Where it Lives
MatrixStore object storage from Object Matrix has been designed to process data, in place, where it lives. A MatrixStore cluster combines multiple nodes, each consisting of CPU and storage, to provide a self-managing digital preservation platform. The added advantage of using intelligent nodes is that the CPU can be used for other tasks when the core digital preservation work has been completed. Those tasks can range from automated metadata extraction to detailed data analysis. MatrixStore PiP (Process in Place) is the framework provided to perform tasks on data without moving the data around. Put simply, PiP utilises the power of the MatrixStore cluster to perform the processing where the data lives.
Example: Metadata Extraction and Indexing (Adobe XMP, AS10, AS11, EXIF metadata)
Many formats of content get delivered with important and valuable metadata built in. Having a storage solution that understands those formats and that can automatically index the metadata means people or further processes are not required to perform the task. MatrixStore currently supports AS11, AS10, EXIF and Adobe XMP metadata.
Using XMP as an example: XMP files allow you to transfer your image metadata, ratings, tags, keywords, geolocation, and other attributes about the image outside of the actual JPG, CR2, or other file type. The XMP format was created by Adobe to standardize how metadata is stored and transferred and is a standard used by many news entities, photographers, photojournalists, image resellers, etc. The example shows:
Media managers at remote locations add XMP metadata to the content they are creating.
That content is sent to a central MatrixStore repository into a vault that is configured for XMP metadata extraction.
MatrixStore PiP extracts the metadata automatically when the cluster is not performing digital preservation tasks. The metadata is stored as part of the object within MatrixStore and indexed to enable search operations.
The data is not delayed for pre-processing nor moved for post-processing. It is processed in place, where it lives.
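To make the extraction and indexing steps above more concrete, here is a minimal Python sketch. It is not the MatrixStore PiP API, which this page does not describe. It assumes the XMP packet is embedded in the file as plain XML between `<x:xmpmeta>` tags, and the names `extract_xmp`, `build_index` and `SAMPLE` are hypothetical, chosen only for illustration. The index is a simple in-memory dictionary standing in for a real search database.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# An embedded XMP packet is plain XML wrapped in <x:xmpmeta> ... </x:xmpmeta>.
XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)
RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on names."""
    return tag.rsplit("}", 1)[-1]


def extract_xmp(data):
    """Return {property: [values]} from the first XMP packet found in data."""
    match = XMP_PACKET.search(data)
    if match is None:
        return {}
    root = ET.fromstring(match.group(0))
    fields = defaultdict(list)
    for desc in root.iter(RDF_NS + "Description"):
        # Simple properties can be written as attributes on rdf:Description.
        for attr, value in desc.attrib.items():
            if not attr.startswith(RDF_NS):
                fields[local_name(attr)].append(value)
        # Others are child elements, possibly holding a list of rdf:li items.
        for prop in desc:
            items = [li.text for li in prop.iter(RDF_NS + "li") if li.text]
            if items:
                fields[local_name(prop.tag)].extend(items)
            elif prop.text and prop.text.strip():
                fields[local_name(prop.tag)].append(prop.text.strip())
    return dict(fields)


def build_index(objects):
    """Map each lower-cased metadata value to the object names carrying it."""
    index = defaultdict(set)
    for name, data in objects.items():
        for values in extract_xmp(data).values():
            for value in values:
                index[value.lower()].add(name)
    return index


# Hypothetical stand-in for a stored object: binary data with an XMP packet.
SAMPLE = (
    b"\xff\xd8 binary image data "
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/" '
    b'xmlns:dc="http://purl.org/dc/elements/1.1/" xmp:Rating="5">'
    b"<dc:subject><rdf:Bag><rdf:li>flood</rdf:li><rdf:li>news</rdf:li>"
    b"</rdf:Bag></dc:subject>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b" more binary data \xff\xd9"
)

if __name__ == "__main__":
    print(extract_xmp(SAMPLE))
    print(dict(build_index({"clip-001.jpg": SAMPLE})))
```

In this sketch the object is only read, never copied elsewhere, which mirrors the idea of running extraction on the node that already holds the data. A production system would also need to handle sidecar `.xmp` files, nested XMP structures and malformed packets.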
MatrixStore PiP Benefits
Streamlines processes for sharing content across local or global teams.
Extracts metadata from ingested assets and makes it available for search via a fast and powerful distributed database.
Makes the metadata available via an API and shares metadata with technology partners’ applications.
Removes the need to move petabytes of content in and out of the archive for post-processing, because the archive’s content is processed by local CPUs within direct-attached servers.
Provides a flexible, scalable and extensible framework that allows advanced data analytics and advanced post-processing algorithms to be carried out.