Fixity and Digital Preservation
Do you know the level of preservation of your content?
A short time back, the NDSA (National Digital Stewardship Alliance) published some excellent articles that look at the different levels of digital preservation you should apply to your content and how you can measure them. The concepts they described are valid and still very relevant, but the relative importance of some of the categories they defined could be adjusted.
A fundamental goal of digital preservation is to establish and check the “fixity”, or stability, of digital content.
In the context of digital preservation, fixity is the property of a digital file or object being fixed or unchanged. This is synonymous with bit-level integrity. Fixity information offers evidence that one set of bits is identical to another.
The PREMIS data dictionary defines fixity information as “information used to verify whether an object has been altered in an undocumented or unauthorized way.” Fixity information is normally based on checksums or cryptographic hashes.
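As an illustration of checksum-based fixity information, here is a minimal Python sketch using the standard hashlib module. The function names file_checksum and is_unchanged, the SHA-256 default and the chunk size are assumptions chosen for this example; they are not taken from PREMIS or from any particular product.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum, algorithm="sha256"):
    """Compare the current checksum of a file with a previously recorded one."""
    return file_checksum(path, algorithm) == recorded_checksum
```

The checksum recorded when the content is first received becomes the fixity information; recomputing it later and comparing the two values gives evidence that the bits are still the same.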
There is a whole range of reasons to collect, maintain and verify fixity information on your digital content:
Confirm that the content has been received correctly
Ensure the content hasn’t changed unexpectedly
Ensure the content hasn’t changed during transfers
Support the repair of corrupted or altered content
Monitor hardware degradation
Allow change in a portion of the content leaving the rest intact
Support the monitoring of production or digitisation processes
Document provenance and history
Detect human errors in the manipulation of the content
There are different approaches to how and when to generate fixity information:
At regular intervals
Within storage systems
On portions of the content
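Generating fixity on portions of the content can be sketched as follows. This is only an illustrative example: the helper names block_checksums and changed_blocks and the fixed block size are assumptions made here, and real systems may split content very differently (for example per frame or per file within a package).

```python
import hashlib


def block_checksums(data, block_size=4 * 1024 * 1024):
    """Return one SHA-256 hex digest per fixed-size block of the content."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(recorded, current):
    """Return the indexes of blocks whose checksum differs between two runs."""
    changed = [
        index
        for index, (old, new) in enumerate(zip(recorded, current))
        if old != new
    ]
    # Blocks present in only one of the two lists also count as changed.
    shorter, longer = sorted((len(recorded), len(current)))
    changed.extend(range(shorter, longer))
    return changed
```

With per-block checksums, a mismatch points at the affected portion only, so the rest of the content can be left intact when a portion is repaired or deliberately changed.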
Obviously, generating fixity information requires accessing the content. Something like the uncertainty principle applies here: the act of reading the content to extract fixity information affects the systems that hold or access it, and the size of the effect depends on how, and how often, fixity is generated and checked. Some of those effects can be:
Taking CPU time away from other services while fixity information is being calculated
Degradation of the hardware holding the data
Redundant fixity information if different systems are calculating it, etc…
A very important aspect to take into account is where to store fixity information. There are different approaches:
In the object’s metadata
Together with the content
Embedded in the content
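As one example of storing fixity information together with the content, the sketch below writes a sidecar file next to each asset. The ".sha256" suffix, the "digest, two spaces, file name" line layout (the convention used by the common sha256sum tool) and the function names are assumptions for illustration, not a description of how any particular product stores fixity.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file (reads it fully; fine for a sketch)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def write_sidecar(path):
    """Write '<digest>  <file name>' to a sidecar file stored next to the content."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".sha256")
    sidecar.write_text(f"{sha256_of(path)}  {path.name}\n")
    return sidecar


def verify_sidecar(path):
    """Recompute the digest and compare it with the one recorded in the sidecar."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".sha256")
    recorded = sidecar.read_text().split()[0]
    return sha256_of(path) == recorded
```

A sidecar travels with the content when it is moved, but it is exposed to the same storage failures as the content itself, which is one reason to also keep fixity information in the object's metadata or in a separate system.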
The NDSA defined five general categories, in order of importance:
Storage and Geographic Location
File Fixity and Data Integrity
Information Security
Metadata
File Formats
In recent years in the media industry, metadata has become as important as the data itself, so we could argue that metadata should be protected at the same level of importance as the data.
At Object Matrix
From the inception of MatrixStore, our main product, we have taken the approach of protecting the metadata of the objects at the same level as the data. Metadata resides with the objects and gets the same protection and policies as the data.
We calculate and check fixity information, in the form of checksums, at the different points in our customers’ workflows where our software is involved. For instance, our client applications independently calculate checksums and verify them with MatrixStore, transparently to the users, in order to detect transfer errors.
MatrixStore lets you choose the level of integrity (that is, which checksum algorithm) to use in transfers and fixity verification so that you can control how the fixity generation affects the performance of hardware.
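The general idea of verifying a transfer with a selectable checksum algorithm can be sketched in Python as below. This is not the MatrixStore API: verified_copy and checksum are hypothetical helpers, and a local file copy stands in for a network transfer. The algorithm names are those accepted by Python's hashlib.

```python
import hashlib
import shutil


def checksum(path, algorithm):
    """Return the hex digest of a file using the named hashlib algorithm."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(source, destination, algorithm="sha256"):
    """Copy a file, then check that both ends produce the same checksum."""
    expected = checksum(source, algorithm)    # calculated on the sending side
    shutil.copyfile(source, destination)      # stand-in for the real transfer
    actual = checksum(destination, algorithm) # calculated independently afterwards
    if actual != expected:
        raise IOError(f"transfer error: {algorithm} mismatch for {destination}")
    return actual
```

Different algorithms cost different amounts of CPU time per byte, so exposing the algorithm as a parameter lets the operator trade the strength of the check against its load on the hardware.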
Internally, MatrixStore can also regenerate and verify fixity information about the content, including the metadata, in order to repair the content if an unauthorised or unexpected change has happened. MatrixStore is able to correct the content using a good instance of the object within the same system or from a remote system.
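Repairing from a good instance can be illustrated with a generic sketch like the one below. It assumes that several copies of an object are reachable as local files and that a trusted checksum was recorded earlier; repair_replicas and sha256_of are hypothetical names, and this is not a description of MatrixStore internals.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file (reads it fully; fine for a sketch)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def repair_replicas(replicas, recorded_checksum):
    """Overwrite every copy that fails verification with a copy that passes it.

    Returns the list of replicas that were repaired.
    """
    good = [path for path in replicas if sha256_of(path) == recorded_checksum]
    if not good:
        # Without at least one intact copy there is nothing to repair from.
        raise RuntimeError("no intact copy available")
    repaired = []
    for path in replicas:
        if path not in good:
            shutil.copyfile(good[0], path)
            repaired.append(path)
    return repaired
```

The recorded checksum is what makes the repair safe: it identifies which copy is the good one, rather than simply trusting whichever copy was written most recently.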
At Object Matrix, the security and integrity of the content are core features in our products in order to help our customers to achieve and maintain a high level of preservation of their content.
We have raised the level of protection of the metadata to the same level as that of the data.
You can implement a high level of digital preservation of your content as recommended by the NDSA using Object Matrix products.
About Object Matrix
Object Matrix is the award-winning software company that pioneered object storage and the modernisation of media archives. It exists to enable global collaboration, increase operational efficiencies and empower creativity through deployment of MatrixStore, the on-prem and hybrid cloud storage platform. Its focus on the media industry gives it a deep understanding of the challenges organisations face when protecting, processing and sharing video content. Customers include: BBC, Orange, France Televisions, the NBA, BT, HBO, TV Globo, MSG-N and NBC Universal.