Friday, April 27, 2012

Metadata - data about data

Over 30 billion documents are created and consumed every year, and over 85% are never retrieved.  Fifty percent are duplicates in some way, most often because people are afraid they will never find the original again, so they make a copy for their own use.  That copy is then modified and becomes a different version of the original.  For every dollar spent to create a document, ten are spent managing it.  These figures come from Combined Knowledge LTD & English, Bleeker and Associates, Inc.

As you can see, finding documents ("findability") and managing them cost a lot of money.  So as a community, it is important for us to devise a better way to put documents into our collaborative systems so that retrieving them is more efficient.  How easily a document can be put into a system in a way that supports later retrieval is called "putability".

Originally, the organizational structure was limited to folders.  No matter how logical the system may have seemed to its creator, it wasn't always as logical to the person searching for the data asset.  Several systems came along to help structure the folders so that items would be more "findable"; the ISO 9000 standards are one example.  ISO 9000 is more than a logical filing system, but it does include standards that direct the management of files.

Most filing systems sort files based on some common attribute, for example, name.  In a school's filing system, students' physical records may be filed in a filing cabinet according to their last name, and then by their first name.  Perhaps those records are also divided by the class the student is in (freshman, sophomore, junior or senior).  Maybe the folders are colored pink or blue to designate girls or boys.  These attributes (the name, class year and gender color) are metadata.  They are data about the data in the file, i.e. the student.  By combining those different pieces of data, the workers in the school can more quickly find the student about whom they need information.
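To make the filing cabinet example concrete, here is a minimal Python sketch of the same idea.  The record fields, the sample students and the find_records helper are all made up for illustration; nothing here is a SharePoint API.  It only shows how combining several pieces of metadata narrows the search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRecord:
    # Each field is a piece of metadata about the student's file (hypothetical names).
    last_name: str
    first_name: str
    class_year: str
    folder_color: str


# Made-up sample data standing in for the folders in the cabinet.
records = [
    StudentRecord("Adams", "Beth", "junior", "pink"),
    StudentRecord("Adams", "Carl", "senior", "blue"),
    StudentRecord("Baker", "Dana", "junior", "pink"),
    StudentRecord("Clark", "Evan", "freshman", "blue"),
]


def find_records(records, **criteria):
    """Return the records whose metadata matches every given attribute."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in criteria.items())
    ]


# The cabinet's physical order: last name first, then first name.
filed = sorted(records, key=lambda r: (r.last_name, r.first_name))

# Combining two attributes narrows the search to a single folder.
matches = find_records(records, last_name="Adams", class_year="junior")
print([m.first_name for m in matches])  # prints ['Beth']
```

Searching on last name alone would return two folders; adding the class year leaves only one, which is the benefit the school workers get from combining attributes.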

With SharePoint 2010, we have the same advantage, but brought more to the forefront and digitized for us.  We have digital metadata.  We can set up columns (either the prebuilt columns that come with SharePoint or custom columns) and fill them with the type of data we need to store about our data asset.  Those columns can be searched using the search center in SharePoint.  If we enable Managed Metadata, we can even create new datatypes that can be used as data on our data assets and search by that type of data as well.

SharePoint libraries can be configured to use a faceted filter that will bring back data assets which meet criteria in the following categories: Content Type, Choice Field, Managed Metadata, Person or Group Field, Date and Time Field, and Number Field.  The Metadata Navigation Filter acts similarly to the search on the Best Buy web site, where a user can narrow down his or her search for a television by brand name, size, resolution, screen type (LCD or Plasma), and price range.
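The narrowing behaviour of a faceted filter can be sketched in a few lines of Python.  This is not how SharePoint implements the Metadata Navigation Filter; the document list, the field names and the apply_facets and facet_counts helpers are assumptions made up for illustration.

```python
from collections import Counter

# Made-up library contents; each key besides "title" plays the role of a metadata column.
documents = [
    {"title": "Q1 budget", "content_type": "Spreadsheet", "department": "Finance", "year": 2011},
    {"title": "Q2 budget", "content_type": "Spreadsheet", "department": "Finance", "year": 2012},
    {"title": "Hiring plan", "content_type": "Document", "department": "HR", "year": 2012},
    {"title": "Audit memo", "content_type": "Document", "department": "Finance", "year": 2012},
]


def apply_facets(docs, selected):
    """Keep only the documents that match every selected facet value."""
    return [d for d in docs if all(d.get(f) == v for f, v in selected.items())]


def facet_counts(docs, facet):
    """Count how many of the remaining documents fall under each value of a facet."""
    return Counter(d[facet] for d in docs)


# First narrowing step: pick a department.
narrowed = apply_facets(documents, {"department": "Finance"})
print(facet_counts(narrowed, "content_type"))

# Second narrowing step: pick a year within the already narrowed set.
narrowed = apply_facets(narrowed, {"year": 2012})
print([d["title"] for d in narrowed])  # prints ['Q2 budget', 'Audit memo']
```

Each selection shrinks the result set, and the counts show what is left under each remaining value, much like narrowing a television search by brand and then by size.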

This type of filtering is much more intuitive than folders, and therefore more usable and sustainable, and it makes our files more findable.  Files can be found using the faceted filter immediately upon check-in, even before the crawl has been run, whereas advanced search (as powerful as it is) can't be used until the crawl is complete.

Of course, there are some drawbacks to managed metadata.  It cannot be used in datasheet view.  It cannot be used in InfoPath forms.  It cannot be used in sorted columns (which is why the faceted filter is provided).  It cannot be used in filtered views where the filter uses "contains".  Knowing these drawbacks upfront helps you decide whether managed metadata is for you.
