Fixity and Digital Preservation

Do you know the level of preservation of your content?

A while ago, the NDSA (National Digital Stewardship Alliance) published some excellent guidance on the different levels of digital preservation you should apply to your content and how you can measure them. The concepts they describe remain valid and highly relevant, but the relative importance of some of the categories they define could be adjusted.

A fundamental goal of digital preservation is to establish and check the “fixity”, or stability, of the content.

In the context of digital preservation, fixity is the property of a digital file or object being fixed or unchanged. This is synonymous with bit-level integrity. Fixity information offers evidence that one set of bits is identical to another. 

The PREMIS data dictionary defines fixity information as “information used to verify whether an object has been altered in an undocumented or unauthorized way.” Fixity information is normally based on checksums or cryptographic hashes.
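
As a minimal sketch of how checksum-based fixity information might be produced, the Python snippet below streams a file through a hash from the standard hashlib module. The function names file_checksum and is_unchanged, the choice of SHA-256 and the 1 MiB chunk size are illustrative assumptions, not something prescribed by PREMIS or the NDSA.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    Reading in chunks keeps memory use flat even for very large media files.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, expected_hex, algorithm="sha256"):
    """Compare a freshly computed digest with a previously recorded one."""
    return file_checksum(path, algorithm) == expected_hex
```

A recorded digest that still matches is evidence that the bits are identical to those seen when the digest was first taken; a mismatch shows that something changed, but not what or why.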

There is a whole range of reasons to collect, maintain and verify fixity information on your digital content:

  • Confirm the content was received correctly

  • Assure the content hasn’t changed unexpectedly

  • Assure the content hasn’t changed in transfers

  • Support the repair of corrupted or altered content

  • Monitor hardware degradation

  • Allow a portion of the content to be changed while leaving the rest intact

  • Support the monitoring of production or digitisation processes

  • Document provenance and history

  • Detect human errors in the manipulation of the content

There are different approaches to how and when to generate fixity information:

  • On ingest

  • On transfer

  • At regular intervals

  • Within storage systems

  • On portions of the content
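
To illustrate the last option, here is a small sketch of fixity taken on portions of the content rather than on the whole object. The helper names portion_digests and changed_portions and the tiny default portion size are assumptions chosen for readability; a real system would use much larger portions.

```python
import hashlib


def portion_digests(data, portion_size=4):
    """Return one SHA-256 hex digest per fixed-size portion of the content."""
    return [
        hashlib.sha256(data[offset:offset + portion_size]).hexdigest()
        for offset in range(0, len(data), portion_size)
    ]


def changed_portions(recorded, current):
    """Return the indices of portions whose digest differs.

    Portions that exist in only one of the two lists also count as changed.
    """
    longest = max(len(recorded), len(current))
    return [
        index
        for index in range(longest)
        if index >= len(recorded)
        or index >= len(current)
        or recorded[index] != current[index]
    ]
```

With per-portion digests, a mismatch points at the affected region, so only that portion needs to be re-read or repaired rather than the whole object.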

Generating fixity information requires reading the content. A loose analogy with the uncertainty principle applies: the act of reading the content to compute fixity information itself affects the systems that hold or access it. How strong those effects are depends on how, and how often, fixity is generated and checked. Some of them are:

  • Taking CPU time away from other services while fixity information is being calculated

  • Degradation of the hardware holding the data

  • Redundant fixity information when several systems calculate it independently

A very important aspect to take into account is where to store fixity information. There are different approaches:

  • In the object’s metadata

  • In databases

  • Together with the content

  • Embedded in the content
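
As one possible illustration of storing fixity information together with the content, the sketch below writes a small JSON sidecar file next to each content file and later verifies against it. The ".fixity.json" suffix, the record layout and the function names are hypothetical choices made for this example only.

```python
import hashlib
import json
from pathlib import Path


def sidecar_path(content_path):
    """Return the path of the sidecar file that sits next to the content."""
    content_path = Path(content_path)
    return content_path.with_name(content_path.name + ".fixity.json")


def write_sidecar(content_path, algorithm="sha256"):
    """Record the content's digest in a JSON sidecar file and return its path."""
    content_path = Path(content_path)
    # read_bytes() loads the whole file; fine for a sketch, not for huge assets.
    digest = hashlib.new(algorithm, content_path.read_bytes()).hexdigest()
    record = {"file": content_path.name, "algorithm": algorithm, "digest": digest}
    target = sidecar_path(content_path)
    target.write_text(json.dumps(record, indent=2))
    return target


def verify_sidecar(content_path):
    """Recompute the digest and compare it with the one stored in the sidecar."""
    content_path = Path(content_path)
    record = json.loads(sidecar_path(content_path).read_text())
    actual = hashlib.new(record["algorithm"], content_path.read_bytes()).hexdigest()
    return actual == record["digest"]
```

A sidecar travels with the content when it is copied, but it shares the content's fate if the storage fails, which is one reason some archives also keep a copy of the fixity information in a separate database.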

The NDSA defined five general categories, in order of importance:

  • Storage and Geographic Location

  • File Fixity and Data Integrity

  • Information Security

  • Metadata

  • File Formats

In recent years, metadata has become as important as the data itself in the media industry, so we would argue that metadata should be protected at the same level of importance as the data.

At Object Matrix

From the inception of MatrixStore, our main product, we have protected the metadata of objects at the same level as the data. Metadata resides with the objects and gets the same protection and policies as the data.
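
To make the idea of giving metadata the same protection as data more concrete, the following sketch computes one digest for an object's data and another for its metadata. It is not a description of how MatrixStore works internally; the function name object_fixity and the use of canonical JSON (sorted keys, compact separators) are assumptions made for the example.

```python
import hashlib
import json


def object_fixity(data, metadata):
    """Return separate SHA-256 digests for an object's data and its metadata.

    The metadata is serialised as canonical JSON so that the same key/value
    pairs always produce the same digest, whatever order they were added in.
    """
    canonical = json.dumps(
        metadata, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return {
        "data_sha256": hashlib.sha256(data).hexdigest(),
        "metadata_sha256": hashlib.sha256(canonical).hexdigest(),
    }
```

Keeping the two digests separate makes it possible to tell whether an unexpected change affected the essence, the metadata, or both.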

We calculate and check fixity information, in the form of checksums, at different points in our customers’ workflows where our software is involved. For instance, our client applications calculate checksums independently and verify them against MatrixStore, transparently to the user, in order to detect transfer errors.
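
The general pattern of a client computing a checksum independently and comparing it with the one reported by the store can be sketched as follows. InMemoryStore is a hypothetical stand-in used only for this example; it is not the MatrixStore API, and the names TransferError and upload_with_verification are likewise assumptions.

```python
import hashlib


class TransferError(Exception):
    """Raised when the client and store digests disagree after a transfer."""


class InMemoryStore:
    """Hypothetical object store that reports a digest of what it received."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = bytes(data)
        # The store hashes the bytes it actually holds, not what was sent.
        return hashlib.sha256(self._objects[name]).hexdigest()


def upload_with_verification(store, name, data):
    """Upload data and confirm that both sides computed the same digest."""
    client_digest = hashlib.sha256(data).hexdigest()
    store_digest = store.put(name, data)
    if client_digest != store_digest:
        raise TransferError(f"checksum mismatch for {name!r}")
    return client_digest
```

Because each side hashes its own copy of the bytes, a mismatch indicates that the content was altered somewhere between the client and the store.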

MatrixStore lets you choose the level of integrity (that is, which checksum algorithm) to use in transfers and fixity verification so that you can control how the fixity generation affects the performance of hardware.
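
The trade-off between integrity level and hardware load can be explored with a rough timing sketch like the one below. The set of algorithms compared (CRC32 from zlib, SHA-256 and SHA-512 from hashlib) is an assumption made for illustration and is not a list of the algorithms MatrixStore offers; actual timings depend entirely on the hardware and the Python build.

```python
import hashlib
import time
import zlib

# Candidate algorithms, from cheap error detection to cryptographic hashes.
ALGORITHMS = {
    "crc32": lambda data: zlib.crc32(data),
    "sha256": lambda data: hashlib.sha256(data).digest(),
    "sha512": lambda data: hashlib.sha512(data).digest(),
}


def time_algorithms(data, repeats=5):
    """Return the best wall-clock time in seconds per algorithm for this data."""
    timings = {}
    for name, compute in ALGORITHMS.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            compute(data)
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


if __name__ == "__main__":
    sample = bytes(16 * 1024 * 1024)  # 16 MiB of zeros as stand-in content
    for name, seconds in time_algorithms(sample).items():
        print(f"{name}: {seconds:.4f} s")
```

A CRC detects accidental corruption cheaply but offers no protection against deliberate tampering, whereas a cryptographic hash costs more CPU time per byte.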

Internally, MatrixStore can also regenerate and verify fixity information for the content, including the metadata, so that it can repair the content if an unauthorised or unexpected change has happened. MatrixStore is able to correct the content using a good instance of the object from the same system or from a remote system.
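
The principle of repairing from a good instance can be sketched in a few lines. This is a simplified illustration under stated assumptions, not MatrixStore's implementation: copies are modelled as a plain dictionary mapping a location name to bytes, and repair_from_replicas is a hypothetical helper.

```python
import hashlib


def repair_from_replicas(replicas, expected_digest):
    """Overwrite corrupted copies with an intact one.

    replicas maps a location name to the bytes held there. Returns the list
    of locations that were repaired, or raises ValueError if no copy matches
    the expected SHA-256 digest.
    """

    def is_intact(data):
        return hashlib.sha256(data).hexdigest() == expected_digest

    good = next((data for data in replicas.values() if is_intact(data)), None)
    if good is None:
        raise ValueError("no intact copy available to repair from")

    repaired = []
    # Iterate over a snapshot so the mapping can be updated safely.
    for location, data in list(replicas.items()):
        if not is_intact(data):
            replicas[location] = good
            repaired.append(location)
    return repaired
```

Repair only works while at least one copy still matches the recorded digest, which is why fixity checking goes hand in hand with keeping more than one instance of the object.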

Conclusion

At Object Matrix, the security and integrity of content are core features of our products, helping our customers achieve and maintain a high level of preservation for their content.

We have raised the protection of the metadata to the same level as the data.

You can implement a high level of digital preservation of your content as recommended by the NDSA using Object Matrix products.

References:

http://www.digitalpreservation.gov/documents/NDSA-Fixity-Guidance-Report-final100214.pdf

http://www.digitalpreservation.gov/documents/NDSA_Levels_Archiving_2013.pdf

About Object Matrix

Object Matrix is the award-winning software company that pioneered object storage and the modernisation of media archives. It exists to enable global collaboration, increase operational efficiencies and empower creativity through deployment of MatrixStore, the on-prem and hybrid cloud storage platform. Its focus on the media industry gives it a deep understanding of the challenges organisations face when protecting, processing and sharing video content. Customers include: BBC, Orange, France Televisions, the NBA, BT, HBO, TV Globo, MSG-N and NBC Universal.