Friday, January 17, 2014

Thoughts On Programming Managed Metadata With SharePoint 2013’s CSOM Taxonomy API

Chris Domino, Director, Enterprise Architect

Introduction

Well, it's official: programmatically manipulating managed metadata still sucks in SharePoint 2013. I've complained about it here and many other places. However, to be fair, I feel like it sucks less than it did in the 2010 days. It seems more stable now. For example, the Term Store Management Tool is suffering from its "The Managed Metadata Service or Connection is currently not available" error less frequently. Also, the new term-driven navigation stuff is pretty cool.

But in the end, when I find myself in the weeds of, for example, using CSOM to automate the provisioning of taxonomy fields or the querying or taxonomy values from ListItems, the managed metadata architecture still feels hacky to me. To deal with these headaches, I'm going to cover a few of the charming peccadilloes you might encounter when using CSOM to programmatically get and set managed metadata field values.

In my opinion, the biggest hack of taxonomy in SharePoint is its dependency on hidden lists and fields and event receivers to wire everything together. Not only is this approach akin to using SQL triggers to implement application logic in a database, but it also causes acne on the otherwise beautiful face of my information architecture. Hidden "CatchAll" columns? Hidden note fields with generated internal names? Ugh. In order to see why the code needed to properly set a TaxonomyField in CSOM is interesting enough to blog about, let's start with some background.

Some Background

I always make a gold star effort to use the proper field value classes in SharePoint development, client and server APIs alike, when setting values on ListItems or SPListItems. The best example of this is the trusty little FieldUrlValue class that allows me to never care how the string representation of a url and its description should be formatted on a "Hyperlink or Picture" column.

If you dig into FieldValueUrl a bit, you'll see it has a nice, friendly, parameterless public constructor, and members for the two pieces of information that comprise it. You can new up an instance of it, populate the properties, and set it to a ListItem's field value (See Line #'s 1 - 5 below). And when you want to get an instance of one, just cast the value of the field back to it (Line #7).

Code Listing 1

  1. item["url field"] = new FieldUrlValue()
  2. {
  3. Description = "description",
  4. Url = "http://chrisdomino.com/blog"
  5. };
  6. ...
  7. string url = ((FieldUrlValue)item["url field"]).Url;

Lovely and clean. Should we expect the same from taxonomy fields? Of course. Are we that lucky? Of course not. The two field classes we'll be dealing with, TaxonomyFieldValue and TaxonomyFieldValueCollection, promisingly start out by having public constructors. But as we investigate further, we'll see that they are like our body's appendix: useless and prone to rupture.

Setting Taxonomy Values: The Problem

Single-Selection Fields

Considering TaxonomyFieldValue first, we can use it in the same manner as FieldUrlValue: new up an instance, set the Label, (to the text of the corresponding term) the TermGuid, (which is of course not really a guid, but obnoxiously a string representation of the term's unique id instead) and the WssId (the integer Id of the owning ListItem). This final parameter seems particularly hacky to me; how can an item's field value belong to a different item? When in doubt, use -1. It just works. Lame.

Anyway, once one of these is populated, we'd set it as the value of the corresponding ListItem's field, right? Then call Update? And ExecuteQuery? Finally, we'd refresh the page, and see our sparkly new managed metadata value proudly displayed? Unfortunately, all sarcasm aside, this is not in the cards for using CSOM to set a TaxonomyFieldValue.

Multi-Selection Fields

Before seeing why, let's take a look at what's going on the IEnumerable version of our field value for multi-selection managed metadata columns. The constructor for TaxonomyFieldValueCollection in CSOM is adventurous. It takes in a ClientRuntimeContext, a string for the field value, (which I'll complain about next) and finally something called a "creating field."

The first parameter is trivial, so let's consider the second. Like I said, assembling a string representation of a field value in SharePoint is bad news: it opens up the door to potential typos, hard coding, and incompatibilities across environments. Unless you're programmatically backed into a corner (as you'll see we will be) it's always better to use the API.

But indulging in TaxonomyFieldValueCollection's constructor, we need to cobble these characters together to form our second parameter. What does this object look like when wearing a nice string-y dress? You might assume that such a gown would be woven together by the concatenation of the string representations of each constituent element in the collection. But to see that that's not how this fabric is made, let's first look at the way a single TaxonomyFieldValue string is assembled:

<WssId>;#<Label>|<TermGuid>

Each angle-bracketed component is the corresponding property of a TaxonomyFieldValue. What's nice is that SharePoint will throw an exception (when ExecuteQuery is called) if the value is malformed. Although it's great to have this protection, be careful: exception handling should never be part of your logic flow. The resulting error message is actually pretty straightforward:

The given value for a taxonomy field was not formatted in the required <int>;#<label>|<guid> format.

The API is even smart enough to make sure that the guid in question corresponds to a term in the termset bound to the column. But once all of these checks pass and we have a proper string, recall that we're still expressing a single taxonomy value here. Therefore, let's see what multi-selection string values are wearing:

<Label1>|<TermGuid1>;<Label2>|<TermGuid2>

Interestingly, we don't care about that WssId all of a sudden. Also, notice how the single-selection value resembles a lookup field value (which is not at all surprising, since TaxonomyField inherits from LookupField). By comparison, the multi-selection field resembles a plain old frock that's been hastily sewn together with mismatched thread.

The kicker here is that if we intuitively use the multi-selection string representation of a taxonomy value for the second parameter of its constructor, a ServerException is thrown: "Value cannot be null." So how the hell do we build a collection of values? Maybe initialize the collection with a single value for the constructor's second parameter, (which one do you choose?) call Clear, and then finally squeeze off a bunch of Adds to fill it with terms?

Maybe in dream world where these methods actually exist and I wouldn't have to be writing this post. What works (and by "works" I mean "doesn't throw an exception") is passing in an empty string for this parameter, and then calling PopulateFromLabelGuidPairs on the TaxonomyFieldValueCollection object, passing in a multi-selection string formatted as above.

And we're not even done with this constructor! What about that "creating field" for the third value? From MSDN, this is "the Field the value is bound to." Although this makes more sense than "creating field" it's still not very helpful. "Bound to?" You can't get Field objects from a ListItem in CSOM, so do we grab it from the parent list's fields? The content type's FieldLinkCollection? The web's FieldCollection?

What works (again I'm using the same "doesn't bomb" definition for "works") is getting the field from the list's collection; any other column from any other collection throws the same "Value cannot be null" exception we saw before. I'm not going to go into any more detail here, because even after Frankenstein-ing this object together and setting it as a column value, we end up in the same place as we did with its single-selection cousin: no changes to the ListItem are persisted.

So as you can see, these field value types have cryptic properties, confusing constructors, and, at the end of the day, simply don't do anything. Unfortunately, the only path through these trying taxonomy woods is made dark by the shadows of string manipulation and hard coding. Although there are a lot of articles out there on how to code around these managed metadata malignancies, my solution is the most dynamic. If an API backs me into the aforementioned programmatic corner, I'll happily hack my way through the brush with strings.

Setting Taxonomy Values: The Solution

The secret is the aforementioned taxonomy hack I discussed at the beginning of this post: dealing with the hidden information architecture that managed metadata in SharePoint depends on. In order to persist one of these fields in CSOM, we need to set not only the column's value, but the corresponding hidden note column as well. Without this second value, the fields will not update. The fact that the TaxonomyFieldValue class doesn't do this for us automatically is a travesty.

My approach is to create an extension method off of ClientContext that takes in a ListItem and the field's internal name, and uses a model (which is just a class that holds the term data) to set the values. The only drawback is that it requires us to do some string puppeteering. But like I said, we have no choice since conventional CSOM simply does not get us there. Let's take a look at the code, which you'll see isn't even all that complicated.

First things first: the model. Named TaxonomyModel, this class has properties for the core components of a term: the name/value pair corresponding to its label and guid. Since the WssId can always be safely set to -1, there's no need to carry a value for it along for the ride within this DTO. Finally, to model the hierarchical nature of terms in a termset, each TaxonomyModel has a collection of itself. Here's what the class looks like:

Code Listing 2

  1. public class TaxonomyModel
  2. {
  3. #region Members
  4. public Guid Id { get; set; }
  5. public string Name { get; set; }
  6. public List<TaxonomyModel> Children { get; set; }
  7. #endregion
  8. #region Public Methods
  9. public override string ToString()
  10. {
  11. //return
  12. return string.Format("-1;#{0}|{1}", this.Name ?? string.Empty, this.Id == null ? Guid.Empty : this.Id);
  13. }
  14. #endregion
  15. }

Note that for the purposes of this post, the Children property is not used, although some simple recursion is all it would take to set the field's value to a child or grandchild term. But on to the good stuff: the code that programmatically sets taxonomy field values. The coolest aspect of this extension method is that it works for both single- and multi-selection columns. Other field types that have similar distinctions around selection options, such as lookups and choices, require different handling for each. But for my taxonomy setting logic, both are handled. Let's take a look:

Code Listing 3

  1. public static void SetTaxonomyFieldValue(this ClientContext context, ListItem item, string internalName, params TaxonomyModel[] values)
  2. {
  3. //set taxonomy field
  4. item[internalName] = values.Select(t => string.Format("-1;#{0}|{1}", t.Name, t.Id)).ToArray();
  5. //set hidden taxonomy field
  6. item[context.GetTaxonomyHiddenFieldName(internalName)] = string.Join(";", values.Select(t => string.Format("{0}|{1}", t.Name, t.Id)));;
  7. }

Although this method appears to be on the petite side, it actually packs quite a 1-2 punch. As previously stated, SetTaxonomyFieldValue extends ClientContext, taking in a ListItem, the internal name of the taxonomy field, and a params array of TaxonomyModels. The later allows us to more fluidly support multi-selection columns.

The first of the 1-2 punch is on Line #4, where we use a little lambda to set the specified field's value to an array of strings. Each one is formatted to match the single-selection convention for a TaxonomyFieldValue. If there's just one term, the value is set to an array with a single element; if there's many, then the array contains a string for each. This is how the same handling can apply to both single- and multi-selection columns.

Line #6 contains the follow-up punch, right in the managed metadata service application's face. First, on the left hand side of the assignment, you'll see we call a new method, GetTaxonomyHiddenFieldName, which we'll look at next. What's more interesting is the value that we're setting for it on the right: it's a single string that's formatted as a multi-selection TaxonomyFieldValueCollection. How bizarre: the single-selection convention is used as an array of strings, whereas the multi-selection variety is joined into a solitary string!

At least taxonomy is consistent in its weirdness. But instead of listening to me complain more, let's move on and take a look at GetTaxonomyHiddenFieldName. The idea is that we get the target field by its internal name from the root Web of the ClientContext, cast it to a TaxonomyField, pull the note field from it, and grab its internal name. This is the name of the secret hidden taxonomy field for our target column.

<Rant>

This name of this column is an all-lower case guid with no dashes...that doesn't match any of the unique id values for either field. Real classy, SharePoint.

</Rant>

Code Listing 4

  1. public static string GetTaxonomyHiddenFieldName(this ClientContext context, string internalName)
  2. {
  3. //initialization
  4. if (!context.Web.IsObjectPropertyInstantiated("Fields"))
  5. {
  6. //load fields
  7. context.Load(context.Web.Fields,
  8. ff => ff.Include(
  9. f => f.Id,
  10. f => f.InternalName));
  11. context.ExecuteQuery();
  12. }
  13. //get target field
  14. Field field = context.Web.Fields.ToList().SingleOrDefault(f => f.InternalName.Equals(internalName));
  15. if (field == null)
  16. return string.Empty;
  17. //get taxonomy field
  18. TaxonomyField taxField = context.CastTo<TaxonomyField>(field);
  19. context.Load(taxField, f => f.TextField);
  20. context.ExecuteQuery();
  21. //get note field
  22. Field noteField = context.Web.Fields.ToList().SingleOrDefault(f => f.Id.Equals(taxField.TextField));
  23. if (noteField == null)
  24. return string.Empty;
  25. //return