Tuesday, June 3, 2014

Archiving In WebCenter Content - A Deeper Look

This article develops an understanding of the practical meaning of the word 'Archiving' as it applies to Content Management in general, to practical business applications and to the ROI of your specific WebCenter implementation. Readers will review different aspects of archiving (obsolete, hidden, cut off and deleted content, information architecture, and the physical design of archiving systems) as they consider a compromise between the cost of storage and the speed of retrieval.

Introduction

WebCenter Content offers a great set of tools we can use for archiving. It gives us Retention Management, Content Expiry, Replication and many other features and tools, but are we using them efficiently? And what does Archiving mean after all?

Back in September, Deane Barker of Blend Interactive brought up a controversial topic in the Content Management Professionals group. He asked:


"'Archiving' is a popular word in content management, but what does it mean? Is there an accepted definition?"

Most of us remember the days when 'Integrated Document Archive and Retrieval Systems', or IDARS, was a large market segment in its own right. Tape archives and off-site storage were the first things that came to mind. But what about the 'News Archive' section of our corporate intranet? Should it even be called an 'Archive'? And what about an Email Archive? And what should we do with historic account statements for our customers?

All valid questions, and ones I suggest we answer before looking at the wealth of tools that are available to us as part of our Oracle WebCenter Content license.

So may I suggest that you pause for a quick second and ask yourself this important question - the question that Deane asked the crowd of just over 27,000 information management professionals back in September:

What does 'Archiving' really mean?


Which one of these options comes close to your organization's definition of 'Archiving'?

  1. Does it mean that you simply move your content to an 'Archive' section, just as so many web sites now do with their 'News Archive' section? The content is still visible to the public, but simply moved to another section?
  2. Does it mean that you restrict permissions to hide the content from the public and mark it as 'archived', but still leave it in its current location so that it stays accessible to content contributors?
  3. Or does it mean that you move the content to another location, still viewable by contributors but not publicly accessible and not in its original location?
  4. Or maybe it means that you move it to some backup medium and completely remove it from your WebCenter instance?
  5. Or does it simply mean "deletion of outdated content"?

Pondering this will help you get the most value - not just out of the tools that come with WebCenter but, far more importantly, out of the business content that your WebCenter instance is tasked with managing in the first place! So pause now and please do your thinking before continuing to read.

Archiving as a means of addressing system limitations

Back in the late 1990s we had little control over how CMS systems displayed content, so we were forced to 'archive' stuff to prevent it from showing up on the site. Those days are long gone, but even now we still face limitations of that kind.

Here's one scenario that still forces people to 'archive' content stored in Contribution Folders - the notorious limit of 1000 content items and/or child folders per contribution folder. If your system uses Contribution Folders (as opposed to the newer Framework Folders), Folders_g actually throws a csCollectionContentMaxed exception when you try to add more than 1000 items to a folder.

Now, few people are aware of this, but you can actually solve this by raising the limit through configuration variables. So that prompts another question:

If there's no foreseeable limit on the number of items you can easily store in the repository and you can simply change permissions to remove outdated content from public view - do you still need to 'archive' anything?

With the cost of storage going down year after year and search tools getting ever faster, the pressure to take content out of the repository just to improve performance has largely gone away. We are seeing this trend all around us too. Gmail has increased its free mailbox storage space from 1 GB to 15 GB in less than 10 years, so you are hardly ever forced to delete emails. 'Soft Delete', or 'Archive' as Gmail calls it, is what users now use instead of delete.

'Soft Delete' features

'Soft Delete' effectively removes content from the active view in the system, but the content can still be found through search.

And that creates another problem! The problem that comes from the proliferation of irrelevant content.

Relevant vs. Irrelevant content

The ability to find or restore anything is great, but having multiple (including outdated) versions of important technical specs may get in the way of effective communication. The famous out-of-control SharePoint repositories are the first example that comes to mind.

You don't have to remove content, but that doesn't mean that you don't need to systematically identify content that is now less relevant. If the content is less relevant it could be tagged or moved to an archive location. If it isn't relevant at all, it should be deleted or sent down the pipe of your disposition process.
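
As a rough illustration, the rule of thumb above could be written down as a small decision function. This is only a sketch in Python: the `Relevance` categories and the `triage` function are hypothetical names made up for this example and are not part of WebCenter Content.

```python
from enum import Enum


class Relevance(Enum):
    """Hypothetical relevance categories, assumed for illustration only."""
    CURRENT = "current"
    LESS_RELEVANT = "less relevant"
    IRRELEVANT = "irrelevant"


def triage(relevance: Relevance) -> str:
    """Map a relevance judgement to the action suggested in the text."""
    if relevance is Relevance.CURRENT:
        return "keep in place"
    if relevance is Relevance.LESS_RELEVANT:
        return "tag or move to an archive location"
    # Anything that is not relevant at all goes to disposition.
    return "delete or send to the disposition process"


if __name__ == "__main__":
    for level in Relevance:
        print(f"{level.value}: {triage(level)}")
```

The hard part, of course, is not the function but deciding who assigns the relevance level and how often it is reviewed.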

A discontinued product that is no longer being sold doesn't need to show on the web site. Archiving keeps the information available to content managers in case any legal or compliance issues arise. And potential litigation is another great argument for getting rid of content that you are no longer obligated to keep.

Defining your governance

Now, every organization is different, and so is the decision-making process when it comes to defining the life cycle of your content. Whether it is a single person or a cross-functional content steering committee that makes your governance decisions - your Information Management Team needs to have clear directions and a well-defined life cycle for each type of content.

Applying Retention Policies

You need to define what types of content your system is managing and what 'archive' and 'delete' mean for each type - and how each is carried out. Your corporate retention policy is key, and your legal colleagues should be part of the discussion.
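
One way to make such definitions concrete is to write them down as a simple table keyed by content type. The Python sketch below is only an illustration: the content type names, the day counts and the `due_action` helper are all placeholders assumed for this example, not recommendations and not WebCenter Content settings. Your real values come from your retention policy and your legal colleagues.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LifecyclePolicy:
    archive_meaning: str               # what 'archive' means for this content type
    archive_after_days: int            # age at which the item is archived
    dispose_after_days: Optional[int]  # None means no automatic disposal


# Placeholder policies; every value here is an assumption for illustration.
POLICIES = {
    "news_article": LifecyclePolicy("move to the News Archive section", 90, None),
    "account_statement": LifecyclePolicy("hide from public, keep for contributors", 365, 2555),
}


def due_action(content_type: str, created: date, today: date) -> str:
    """Return 'keep', 'archive' or 'dispose' for an item of the given age."""
    policy = POLICIES[content_type]
    age_days = (today - created).days
    if policy.dispose_after_days is not None and age_days >= policy.dispose_after_days:
        return "dispose"
    if age_days >= policy.archive_after_days:
        return "archive"
    return "keep"


if __name__ == "__main__":
    print(due_action("news_article", date(2014, 1, 1), date(2014, 6, 3)))
```

Keeping the meaning of 'archive' next to the timing makes it obvious when two content types use the same word for different things.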

With that in mind, we should be ready to look at the tools available in WebCenter Content that make it easy for you to apply your content life cycle decisions.


That's it for now. In the next article we will look at the set of tools available in WebCenter Content and how to best use them to implement our archiving strategies.
