Friday, April 27, 2012

SharePoint 2010 Content Type Disenfranchised from Syndication Hub

Publishing content types across the enterprise is one of the biggest benefits of the syndication hub.  It allows standardization across the enterprise, and a way to ensure that the entire enterprise has access to the same authorized content types.  There is a catch though.  If the content type is changed at the local level, it *can* then detach itself from its parent in the syndication hub.  You will be able to tell this has happened when changes made at the syndication hub are no longer being accepted at the local level.  You will also see messages in the content publication log saying that a pre-import check failed because a column needed by the content type ran into a pre-existing local column with the same name.

Until now, the only resolution I had found for this was to delete any content in the corrupted library, then delete the corrupted content type.  Then I could delete the corrupted site columns from the local site columns.  Once all that was done, I could republish the content type from the syndication hub.  As you can imagine, this is not a great solution if there is live content on that site.  So, I contacted Microsoft.  Their solution was to point me to a blog that discussed this very problem.  Unfortunately, the fix described there was the very one above, and I was the one who had proposed it.  I was not very happy that the best they could do was give me my own solution.  But, I followed my own advice, and sadly moved forward.

Then, I had another library get corrupted; actually, several.  I don't know how they got corrupted, but this time there were not just a few live documents but hundreds of thousands, because we had already migrated content from 2007 to 2010.  There was no considering deleting the content this time.  I had to have a different solution.  I spent some time with my largest coffee cup and my dog.  We needed a fix and we needed it NOW.  And then something came to me.  Instead of trying to make the children accept changes, maybe what I needed to do was change the parent to match the errant child.

If I changed the parent so that it was the same as the child, then all the content types would again be in sync.  Then maybe if I changed it back at the syndication hub to the original parameters, they would all change back to the desired configuration together.  It was worth a shot.  After all, what could go wrong?  The ones still connected would change, and then just change back.  The ones not connected, if they didn't reconnect, wouldn't be any MORE disconnected.  But, to cut to the chase, IT DID WORK. :)
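The reasoning above can be sketched as a toy model in Python.  This is not SharePoint code.  The rule inside `publish` (a detached site re-attaches only when the published definition matches its local one) is my assumption, based on what I observed rather than on any documented behavior, and the site names and column are made up for illustration.

```python
def publish(hub_definition, sites):
    """Toy rule (an assumption, not documented SharePoint behavior):
    attached sites always take the hub definition; a detached site
    re-attaches only when the published definition equals its local one."""
    for site in sites:
        if site["attached"]:
            site["columns"] = dict(hub_definition)
        elif site["columns"] == hub_definition:
            site["attached"] = True


# Hypothetical column settings, for illustration only.
desired = {"Department": "required"}
drifted = {"Department": "optional"}

sites = [
    {"name": "healthy", "attached": True, "columns": dict(desired)},
    {"name": "broken", "attached": False, "columns": dict(drifted)},
]

# A normal publish leaves the broken site detached.
publish(desired, sites)
stuck = [s["name"] for s in sites if not s["attached"]]

# Step 1: make the parent match the errant child.
publish(drifted, sites)
# Step 2: change the parent back to the desired configuration.
publish(desired, sites)

in_sync = all(s["attached"] and s["columns"] == desired for s in sites)
```

In this model the healthy site flips to "optional" and back again, which is exactly the "change, and then just change back" cost I was willing to accept.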

It does take a few steps and quite a bit of patience, but let's face it, being patient is what is needed when you choose to work with Windoze in the first place.  So, get yourself a cup of java and settle in.

First, determine the list of sites on which the content type (ct) is broken.  You will refer back to this later.  Also, make a list of the values of the columns on the broken site(s).  Hopefully, your sites are all broken in the same way.  If not, you will be doing this in waves.  Luckily for me, all my sites had the same set of fields broken in the same way (i.e. three fields were set to "optional" that were supposed to be "required") so this is what I will illustrate fixing.
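If your inventory is long, a small script can tell you how many waves you are facing.  The sketch below is plain Python working on notes typed in by hand; it does not talk to SharePoint, and the site labels, column names and settings are hypothetical.

```python
from collections import defaultdict

# Hypothetical: how the columns should be set on the hub.
hub_columns = {
    "Doc Owner": "required",
    "Department": "required",
    "Review Date": "required",
    "Comments": "optional",
}

# Hypothetical: what you wrote down from each broken site.
broken_sites = {
    "site-a": {"Doc Owner": "optional", "Department": "optional",
               "Review Date": "optional", "Comments": "optional"},
    "site-b": {"Doc Owner": "optional", "Department": "optional",
               "Review Date": "optional", "Comments": "optional"},
    "site-c": {"Doc Owner": "required", "Department": "optional",
               "Review Date": "required", "Comments": "optional"},
}


def drift(expected, actual):
    """Return {column: (expected, actual)} for every column that differs."""
    return {
        name: (setting, actual.get(name))
        for name, setting in expected.items()
        if actual.get(name) != setting
    }


def plan_waves(expected, sites):
    """Group sites that are broken in exactly the same way into one wave."""
    waves = defaultdict(list)
    for site, columns in sites.items():
        key = tuple(sorted(drift(expected, columns).items()))
        waves[key].append(site)
    return dict(waves)


waves = plan_waves(hub_columns, broken_sites)
for differences, members in waves.items():
    print(members, "differ on", [name for name, _ in differences])
```

With this sample data, site-a and site-b land in one wave and site-c in another, so the hub would need two rounds of change-and-restore.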

Now that you have the list of broken sites and the list of the columns and the states that they are in, you are ready to fire up the Content Syndication Hub (CSH).  Go to the CSH Site Settings->Content Types and select the ct in question.  Write down the way the columns *should be* for the ct when it is in the fixed state.  Save this configuration.  You will need it later.  Now change the values of the ct to match the *broken* content types as they are on the broken sites.  Now "manage" the publishing for this content type and republish it. 

Good, get another cup of java, and we will now go back to the broken sites.  On each broken site, go to Site Settings->Content Type Publishing, check the "Refresh all content types" box and click OK.  This may take time depending on how many sites you need to visit.  Once you are done, get another cup of java.

Now, it is time to fire up Central Administration.  I hope you have administrative rights.  Go to Monitoring->Timer Jobs->Review Job Definitions, select the Content Type Subscriber job for the web application on which your sites are housed, and run it.  If your site collections are on more than one web app, then you will need to run more than one subscriber job.

Make sure that you allow the job sufficient time to finish before moving on to the next step each time.  I have found that the Content Type Subscriber job can take a long time, and sometimes it appears to stall for a very long time.  Once the broken sites have picked up the change, go back to the CSH, set the ct back to the configuration you saved earlier, republish it, and then repeat the refresh on each site and the subscriber job.  So, you may need multiple cups of java to get through this.  This is not the time to give up coffee!  This is not the time to have a weak bladder.  But, if you are patient, this can fix the issue where somehow the local ct has had the status (required, optional, hidden) of its columns changed.

I have not tried it in the case where the default value of a column has been changed locally in the list instead of by using the Column Default setting in the library settings, which is the only SAFE place to make that change.  The only reason I have not tested this is that I don't have a library broken in that way right now.  Should that come up, I will certainly test it and report back here.

Metadata - data about data

Over 30 billion documents are created and consumed every year and over 85% are never retrieved.  Fifty percent are duplicates in some way, most often because people are afraid they will never find the original again, so they make a copy for their own use.  That copy is then modified, and becomes a different version of the original.  For every dollar spent to create a document, ten are spent managing it.  This data comes from Combined Knowledge LTD & English, Bleeker and Associates, Inc.

As you can see, finding documents ("findability") and managing them cost a lot of money.  So as a community, it is important for us to devise a better way to put them into our collaborative systems so that retrieving them is more efficient.  This is called "putability". 

Originally, the organizational structure was limited to folders.  No matter how logical the system may have seemed to its creator, it wasn't always as logical to the person searching for the data asset.  Several systems came along whose purpose was to assist in the structure of the folders such that items would be more "findable", for example ISO 9000 standards.  While not just a logical filing system, it does include standards that direct the management of files.

Most filing systems sort files based on some common attribute, for example, name.  In a school's filing system, students' physical records may be filed in a filing cabinet according to their last name, and then by their first name.  Perhaps, those records are also divided by the class the student is in (freshman, sophomore, junior or senior).  Maybe the folders are colored blue or pink to designate girls or boys.  These attributes (the name, class year and gender color) are metadata.  They are data about the data in the file, i.e. the student.  By combining those different pieces of data, the workers in the school can more quickly find the student about whom they need information.

With SharePoint 2010, we have the same advantage, but brought more to the forefront and digitized for us.  We have digital metadata.  We can set up columns (either the prebuilt columns that come with SharePoint or custom columns) and fill those with the type of data we need to store on our data asset.  Those columns can be searched using the search center in SharePoint.  If we enable Managed Metadata, we can even create new datatypes that can be used as data on our data assets and search by that type of data as well. 

SharePoint libraries can be configured to use a faceted filter that will bring back data assets which meet criteria in the following categories: Content Type, Choice Field, Managed Metadata, Person or Group Field, Date and Time Field, and Number Field.  The Metadata Navigation Filter acts similarly to the search on the Best Buy web site, where a user can narrow down his or her search for a television by brand name, size, resolution, screen type (LCD or Plasma), and price range.
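To make the idea of a faceted filter concrete, here is a minimal sketch in Python.  It only illustrates the concept of narrowing by several attributes at once; it is not how SharePoint implements the Metadata Navigation Filter, and the `Document` class, its fields and the sample documents are all made up.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical metadata columns for illustration.
    name: str
    content_type: str
    department: str
    year: int


def facet_filter(documents, **facets):
    """Keep only the documents whose metadata equals every requested facet."""
    return [
        doc for doc in documents
        if all(getattr(doc, field) == value for field, value in facets.items())
    ]


library = [
    Document("Leave policy", "Policy", "HR", 2011),
    Document("Travel policy", "Policy", "Finance", 2012),
    Document("Onboarding form", "Form", "HR", 2012),
    Document("Dress code", "Policy", "HR", 2012),
]

# Narrow step by step, the way a shopper narrows a list of televisions.
policies = facet_filter(library, content_type="Policy")
hr_policies_2012 = facet_filter(library, content_type="Policy",
                                department="HR", year=2012)
print([doc.name for doc in hr_policies_2012])
```

Each extra facet shrinks the result set without the user needing to know which folder the file was put in.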

This type of filtering is much more intuitive than folders, and therefore more usable and sustainable, and makes our files more findable.  Files can be found using the faceted filter immediately upon check-in, even before the crawl has been run, whereas advanced search (as powerful as it is) can't be used until the crawl is complete.

Of course, there are some drawbacks to managed metadata.  It cannot be used in datasheet view.  It cannot be used in InfoPath forms.  It cannot be used in sorted columns (which is why the faceted filter is provided).  It cannot be used in filtered views where the filter uses "contains".  Knowing these drawbacks upfront helps you decide whether managed metadata is for you.