Sunday, April 10, 2011

Using oEmbed To Generate Thumbnails From Content On YouTube And Other Social Media Sites

Chris Domino, Director, Enterprise Architect

In my second-to-last post, I talked about showing Flash superimposed over Silverlight without the need to enable IsWindowless mode on the plugin. Something I foreshadowed but didn't discuss was where the Flash content came from, so I wanted to continue along the topic and elucidate this detail.

There is a new protocol out there called oEmbed. This spec, supported by all the major media providers (YouTube, Flickr, Hulu, Vimeo, etc.), is a REST-ish API that takes in a URL to a video or image and responds with an XML or JSON object containing all the metadata you'll need to integrate this content into your application.

It's actually really easy. Check out the link above for the details, but all you need to do is construct a request URL in the format the provider specifies (each one is a bit different) and deserialize the result into a .NET object. You can then persist it in a database or bind it to your UI. Here's an example of what a YouTube request will be (start by loading a page and copy-and-pasting the URL; I'm using ""):

The response will be an XML document containing several useful pieces of metadata:

  • Title
  • Flash object tag's HTML
  • Thumbnail Image URL
  • Height (both of plugin and thumbnail)
  • Width (both of plugin and thumbnail)

And much more. If you prefer JSON (which is generally lighter to deserialize than XML), the APIs will have a way to specify the format for the response, either as a query string parameter or as part of the path. As you can see, YouTube doesn't even make you encode the raw URL, although the oEmbed spec says request parameters should be URL-encoded, so encoding is the safer habit.
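
As a minimal sketch of building such a request, assuming YouTube's oEmbed endpoint at https://www.youtube.com/oembed and its format query string parameter (the BuildOEmbedUrl helper and the VIDEO_ID placeholder are made up for illustration). The sketch escapes the content URL, which is harmless for providers that also accept it raw:

```csharp
using System;

public static class OEmbedUrlBuilder
{
    // Hypothetical helper: combines a provider's oEmbed endpoint with the
    // content URL and the desired response format ("json" or "xml").
    public static string BuildOEmbedUrl(string endpoint, string contentUrl, string format)
    {
        // Escape the content URL so its own "?" and "&" cannot be mistaken
        // for parts of the oEmbed request's query string.
        return string.Format("{0}?url={1}&format={2}",
            endpoint,
            Uri.EscapeDataString(contentUrl),
            Uri.EscapeDataString(format));
    }
}
```

Calling OEmbedUrlBuilder.BuildOEmbedUrl("https://www.youtube.com/oembed", "https://www.youtube.com/watch?v=VIDEO_ID", "json") gives you a URL you can hand straight to WebClient. Providers that put the format in the path instead would need a different format string.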

Some weirdness I ran into was the concept of "protected videos." These are proprietary pieces of content on the web that the author does not want displayed anywhere except where they were originally published. My guess is that it has something to do with copyrights or distribution or promotions or advertisements; money, in other words.

It's difficult to detect this situation from the response, but an oEmbed request for protected content will fail, so make sure you have some good error handling around this. I've found your best shot is to use a Try/Catch block, catching both WebExceptions and standard Exceptions. My fail-case contingency flow was to still persist the URL, display a message to the user (in lieu of the Flash content) when it was needed in my app, and provide a link to pop the video up in a new window navigated to its home on YouTube or wherever.
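
If you want something finer-grained than a blanket catch, the oEmbed spec does name a few HTTP status codes for failed requests: 404 when the provider has nothing for the URL, 401 when the resource is private, and 501 when the requested format isn't supported. Here is a rough sketch of mapping those to an enum. The OEmbedFailure and OEmbedErrors names are made up for illustration, and whether a given provider actually sticks to these codes for protected content is an assumption you'd need to check against real responses:

```csharp
using System.Net;

// Hypothetical failure categories, following the status codes in the oEmbed spec.
public enum OEmbedFailure
{
    NotFound,        // 404: provider has nothing for the requested URL
    Unauthorized,    // 401: the URL points at a private resource
    NotImplemented,  // 501: requested response format is not supported
    Other            // anything else (timeouts, DNS failures, other codes)
}

public static class OEmbedErrors
{
    // Maps an HTTP status code to one of the failure categories above.
    public static OEmbedFailure Classify(HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.NotFound: return OEmbedFailure.NotFound;
            case HttpStatusCode.Unauthorized: return OEmbedFailure.Unauthorized;
            case HttpStatusCode.NotImplemented: return OEmbedFailure.NotImplemented;
            default: return OEmbedFailure.Other;
        }
    }

    // Pulls the status code out of a WebException when the server actually
    // answered; ex.Response is null for failures such as timeouts.
    public static OEmbedFailure Classify(WebException ex)
    {
        HttpWebResponse response = ex.Response as HttpWebResponse;
        return response != null ? Classify(response.StatusCode) : OEmbedFailure.Other;
    }
}
```

Inside a catch (WebException ex) block, OEmbedErrors.Classify(ex) would then tell you whether to show the "protected" message or treat the failure as a plain error.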

Here is an example of a wrapper method I used to take in a URL and return the raw response from a provider's oEmbed endpoint:

  1. public string GetRawoEmbed(string url)
  2. {
  3.     try
  4.     {
  5.         //initialization
  6.         WebClient wc = new WebClient();
  7.         //get base url
  8.         Uri uri = new Uri(url, UriKind.Absolute);
  9.         string baseUri = string.Format("{0}://{1}", uri.Scheme, uri.Authority).ToLower().Replace("www.", string.Empty);
  10.         wc.BaseAddress = baseUri;
  11.         //get provider
  12.         oEmbedProvider provider = this.ObjectContext.oEmbedProviders.Where(o => o.BaseURL.ToLower().Replace("www.", string.Empty).Equals(baseUri)).FirstOrDefault();
  13.         if (provider != null)
  14.         {
  15.             //get data
  16.             return ASCIIEncoding.ASCII.GetString(wc.DownloadData(string.Concat(provider.oEmbedFormattedURL, url)));
  17.         }
  18.         else
  19.         {
  20.             //not supported
  21.             return "UNSUPPORTED";
  22.         }
  23.     }
  24.     catch (WebException)
  25.     {
  26.         //protected
  27.         return "PROTECTED";
  28.     }
  29.     catch (Exception)
  30.     {
  31.         //error
  32.         return "ERROR";
  33.     }
  34. }


  • Line #12 is just a call into our database where we store the "supported" oEmbed providers. This is a "system" table in our schema that has the URL format string, friendly name, etc. for each endpoint.
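  • Line #16 uses standard ASCII encoding to get the raw data as a string out of the response bytes. Be aware that ASCII turns any non-ASCII character (an accented letter in a title, say) into "?"; Encoding.UTF8 is the safer choice if that matters to you.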
  • Errors: I know that returning strings like "ERROR" or "UNSUPPORTED" is not the most elegant way to allow your application to fail gracefully. But for our situation, this whole feature was a last-minute enhancement, and we had no choice but to cut corners. However, I feel that it's a small price to pay for getting the URL to a thumbnail image for a YouTube video essentially for free.
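
For anyone with a bit more time than we had, one less corner-cutting alternative to the sentinel strings is a small result type that carries a status alongside the raw response. This is only a sketch of the idea, not what shipped in our app; OEmbedStatus and OEmbedResult are hypothetical names:

```csharp
using System;

// Hypothetical status values mirroring the "UNSUPPORTED", "PROTECTED" and
// "ERROR" sentinel strings, plus a success case.
public enum OEmbedStatus { Ok, Unsupported, Protected, Error }

public sealed class OEmbedResult
{
    public OEmbedStatus Status { get; private set; }

    // Raw response from the provider; null unless Status is Ok.
    public string Body { get; private set; }

    private OEmbedResult(OEmbedStatus status, string body)
    {
        Status = status;
        Body = body;
    }

    public static OEmbedResult Success(string body)
    {
        return new OEmbedResult(OEmbedStatus.Ok, body);
    }

    public static OEmbedResult Failure(OEmbedStatus status)
    {
        if (status == OEmbedStatus.Ok)
        {
            throw new ArgumentException("A failure cannot have the Ok status.", "status");
        }
        return new OEmbedResult(status, null);
    }
}
```

The wrapper method would then return OEmbedResult.Failure(OEmbedStatus.Protected) from its WebException handler and OEmbedResult.Success(...) on the happy path, and callers would switch on Status instead of comparing strings. It also means a provider response that happens to be the literal text "ERROR" can't be mistaken for a failure.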

Once we get the data back from the oEmbed endpoint of our provider, the last step is to deserialize the string into an object and databind it to the UI. Here's that logic:

  1. private OEmbed DeserializeOEmbed(string result)
  2. {
  3.     //get result
  4.     if (result.Equals("ERROR"))
  5.     {
  6.         //handle error
  7.         return null;
  8.     }
  9.     else if (result.Equals("PROTECTED"))
  10.     {
  11.         //handle protected
  12.         return null;
  13.     }
  14.     else if (result.Equals("UNSUPPORTED"))
  15.     {
  16.         //handle unsupported
  17.         return null;
  18.     }
  19.     else
  20.     {
  21.         //deserialize video metadata
  22.         try
  23.         {
  24.             //open stream
  25.             using (MemoryStream ms = new MemoryStream(Encoding.Unicode.GetBytes(result)))
  26.             {
  27.                 //deserialize
  28.                 DataContractJsonSerializer serializer = new DataContractJsonSerializer(typeof(OEmbed));
  29.                 return (OEmbed)serializer.ReadObject(ms);
  30.             }
  31.         }
  32.         catch (Exception)
  33.         {
  34.             //error
  35.             return null;
  36.         }
  37.     }
  38. }

Assuming there are no errors along the way, most of the magic is in Line #28, where we use the trusty WCF DataContractJsonSerializer to turn the JSON we get from the providers into our business object. Hmm. The term "Business Object" might be a bit too stuffy to use for something as cool as dynamically displaying Flash content over Silverlight, but I'll get over it.
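
For the serializer to do its thing, the target class needs data contract attributes that map the JSON field names onto properties. A class along these lines would do. Treat it as a sketch rather than the exact class from our project: the property names are illustrative, the JSON names are the ones the oEmbed spec uses, and the OEmbedParser helper exists only to make the example self-contained:

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

[DataContract]
public class OEmbed
{
    // Name = "..." ties each property to the field name used in the oEmbed JSON.
    [DataMember(Name = "type")] public string Type { get; set; }
    [DataMember(Name = "title")] public string Title { get; set; }
    [DataMember(Name = "html")] public string Html { get; set; }
    [DataMember(Name = "width")] public int Width { get; set; }
    [DataMember(Name = "height")] public int Height { get; set; }
    [DataMember(Name = "thumbnail_url")] public string ThumbnailUrl { get; set; }
    [DataMember(Name = "thumbnail_width")] public int ThumbnailWidth { get; set; }
    [DataMember(Name = "thumbnail_height")] public int ThumbnailHeight { get; set; }
}

public static class OEmbedParser
{
    // Hypothetical helper: deserializes a JSON oEmbed response into the class above.
    public static OEmbed Parse(string json)
    {
        using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            DataContractJsonSerializer serializer = new DataContractJsonSerializer(typeof(OEmbed));
            return (OEmbed)serializer.ReadObject(ms);
        }
    }
}
```

Fields in the response that the class doesn't declare are simply ignored, so you only need to model the pieces you plan to bind to the UI, such as ThumbnailUrl for the thumbnail image.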

Well that's about it, except for one final note. In case you're wondering if there's a more dynamic way to call the oEmbed providers based on the input URL than checking a database or configuration file of "supported" endpoints, well then you're right. There is a clearing house service out there that does just that: takes a URL, and tries to parse it against dozens and dozens of providers.

Check out for the API. It's free. Just use it as you would any other oEmbed provider's API. The endpoint looks like this: /1/oembed?url=[copy-and-pasted URL from the browser here]


If you are like me, you'll be awesome and spend hours refactoring your code until it is pristinely organized into reusable methods, has the absolute minimum number of external dependencies it needs to physically run, and is as dynamic and beautiful as possible. If you're not like me, then you're probably not even reading this, which means I'm really just talking to myself. Anyway, I waged a fierce battle against my team in favor of using the clearing house. Why introduce a new database table to our schema that has nothing to do with our data model, only to effectively reduce the number of providers we support?

I was overruled, in favor of the argument that if the clearing house went away, the feature would be dead, and we'd have to do it more statically anyway. I see the wisdom and foresight of this (I know you're reading, Jonathan), but I personally dislike architecting against the apocalypse. Whenever a healthy debate ensues in the project room, I hate the arguments that begin with "Well, what if..."

There's always a "what if" when it comes to software development.

Until we get to whatever is waiting for us after the Internet, we will ultimately be dependent on it for almost every app we build. And guess what? The Internet is made out of plastic and wires and other people's code. Everything is dependent on something else. We depend on electricity not going down. We depend on routers not going down. We depend on the .NET Framework not going down. We depend on YouTube not going down. So although the clearing house is indeed the weakest link in our logical chain of dependencies, we can't eliminate it based on that alone.

If the clearing house goes down, then the feature will break. But until we have contingency plans for the rest of the components failing, I am comfortable taking that risk. Sometimes building beautiful, dynamic software requires a more ballsy architecture. I'd rather have to explain to the client why the feature went down (since it won't be my fault, technically) than why the hot new media website on the Internet isn't supported by our application.