An electronic content management system that breaks through the paper barrier



18 March 2021 | 7 mins

Printed documents and images have long been all but inaccessible to electronic content management systems. Thanks to machine learning, those barriers are breaking down.

More than 40 years since the term "paperless office" was coined, the goal of a workplace without paper documents remains almost comically elusive. In fact, global paper production hit an all-time high in 2018.

That's a problem for organisations that want to harness all of their information assets to better understand their business. Many have adopted an electronic content management system (ECM), a type of software that captures, organises and enables the sharing of information. The ECM market is robust and is expected to grow from $26 billion in 2016 to nearly $94 billion in 2025, according to Grand View Research. ECMs excel at organising and optimising organisations' digital assets, but paper records have baffled these systems for years. ECMs struggle to catalog both analog information and the growing caches of multimedia information that organisations are collecting.

MACHINE LEARNING TO ACCESS ANALOG INFORMATION

In most cases, an electronic content management system functions by searching and tagging content for easy retrieval. That process works fine for information that is already in a readable digital format, but it doesn't glean much information from paper records, images, video and audio files. Those files need to be scanned and tagged by humans, a process that is both laborious and inefficient, since people must effectively predict what terms others will use in the future to search for the information.
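To make the tagging problem concrete, here is a minimal sketch in Python of tag-based retrieval. It is an illustration only, not how any particular ECM product works; the document ids, tags and function names are all hypothetical.

```python
from collections import defaultdict


def build_tag_index(documents):
    """Map each lowercased tag to the set of document ids carrying it."""
    index = defaultdict(set)
    for doc_id, tags in documents.items():
        for tag in tags:
            index[tag.lower()].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents tagged with exactly this term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical scanned records and the tags a human operator chose for them.
documents = {
    "scan-001": ["invoice", "Acme Supplies", "2019"],
    "scan-002": ["receipt", "Acme Supplies"],
    "scan-003": ["bill", "Northwind Paper"],
}

index = build_tag_index(documents)
print(search(index, "invoice"))  # ['scan-001']
print(search(index, "bill"))     # ['scan-003']
```

In this sketch, a search for "invoice" never finds scan-003, because the person who tagged it chose "bill" instead. That is the prediction problem described above: retrieval is only as good as the tagger's guess about future search terms.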

But thanks to machine learning, the valuable data buried in printed pages and digital media is finally seeing the light of day. Iron Mountain InSight™ is a new platform that makes this once-inaccessible content findable and usable.

Developed in collaboration with Google, Iron Mountain InSight™ uses image recognition and machine learning to scan printed documents and decipher the information within them. By sorting repeatedly through semi-structured documents such as receipts and invoices, Iron Mountain InSight™ can "learn" to recognise certain common elements. That data can then be extracted in digital form and loaded into spreadsheets, databases and applications.

For example, say an organisation wants to analyse five years' worth of printed receipts to discover which suppliers are offering the best prices or how prices have changed over time. A typical ECM makes it possible to find those documents, but then human users must extract the information by hand.
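Once the receipt data exists in structured form, the analysis itself is straightforward. The sketch below assumes each receipt has already been reduced to a row with supplier, SKU, year and unit price; the field names, sample values and functions are hypothetical and only illustrate the kind of question that becomes easy to answer.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, as they might look after extraction from printed receipts.
receipts = [
    {"supplier": "Acme Supplies", "sku": "PAPER-A4", "year": 2017, "unit_price": 4.20},
    {"supplier": "Acme Supplies", "sku": "PAPER-A4", "year": 2018, "unit_price": 4.50},
    {"supplier": "Northwind Paper", "sku": "PAPER-A4", "year": 2017, "unit_price": 4.00},
    {"supplier": "Northwind Paper", "sku": "PAPER-A4", "year": 2018, "unit_price": 4.10},
    {"supplier": "Acme Supplies", "sku": "TONER-BK", "year": 2018, "unit_price": 12.00},
]


def average_price_by_supplier(rows, sku):
    """Average unit price each supplier charged for one SKU."""
    prices = defaultdict(list)
    for row in rows:
        if row["sku"] == sku:
            prices[row["supplier"]].append(row["unit_price"])
    return {supplier: mean(values) for supplier, values in prices.items()}


def cheapest_supplier(rows, sku):
    """Supplier with the lowest average unit price for one SKU."""
    averages = average_price_by_supplier(rows, sku)
    return min(averages, key=averages.get)


def price_by_year(rows, sku, supplier):
    """Unit prices for one SKU from one supplier, ordered by year."""
    matches = [r for r in rows if r["sku"] == sku and r["supplier"] == supplier]
    return [(r["year"], r["unit_price"]) for r in sorted(matches, key=lambda r: r["year"])]


print(cheapest_supplier(receipts, "PAPER-A4"))
print(price_by_year(receipts, "PAPER-A4", "Acme Supplies"))
```

The hard part is not this arithmetic but producing the rows in the first place, which is the step that has traditionally required people to read each receipt.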

The innate learning capabilities of Iron Mountain InSight™ enable it to figure out which fields contain SKU numbers, dates, prices and quantities with little or no prompting from human operators. It can even decipher handwritten information using the same technology that automated teller machines use to read handwritten checks.

MACHINE LEARNING TO ACCESS MULTIMEDIA INFORMATION

Those recognition capabilities also extend to multimedia files. For example, the software can scan photo archives and identify people, objects and landmarks. An insurance company could use this capability to classify images from claims records by the type of damage reported. Or, security professionals could use it to match faces captured by security cameras to known criminals.

There are applications of this technology across nearly every industry. For example, financial institutions can use it to validate the completeness of loan applications and quickly flag those that need extra attention. Media organisations can reduce copyright violations by identifying video content that is similar to their own. Energy organisations can analyse decades' worth of printed data about their field assets to help with yield forecasting and maintenance scheduling. And any organisation can convert documents submitted on paper into digital content that can be manipulated, indexed and stored.

Classic ECM systems excel at automating process workflows, reducing duplication and enabling regulatory compliance. Iron Mountain InSight™ is the first solution to overcome traditional barriers to discovering and understanding information that has long been inaccessible. Paper may be with us for a long time to come, but printed content can now become part of the digital treasure trove that organisations need to become more efficient.