The SUESS Series
When Silverlight first came out, I was initially excited simply because, as I told my colleagues, "Now, we can do all that Flashy stuff too!" Growing up a .NET developer, my task lists contained to-do items such as building reports, optimizing algorithms, and modeling business objects; designing animations, playing videos, and encoding media were simply not part of my life.
I remember when media first started popping up on websites years ago. "How the hell do you write a movie player?" I thought to myself, mesmerized by whatever banner ad first showed off the possibilities of the technology to me. Little did I ever think that the ability to do cool stuff like that (back when people actually tried to shoot the dancing monkey in ads to win a free Xbox) would find its way to the Microsoft stack upon which I teetered uncomfortably, at times, looking out at an uninterrupted future of stored procedures and service references.
Oh but it did! And in true Microsoft fashion, they not only made this stuff possible, they made it easy. So what's the answer to the exasperated question I asked above? It takes one little line of XAML to write a movie player. I was almost disappointed at how simple it was; the ease of such an implementation was a little anticlimactic. When I first learned basic programming concepts such as OO and recursion in high school, it was hard! When I taught myself SharePoint (2003), for example, I felt as though I had really endured a rite of passage.
But one line of XAML? Ok, I guess. Looks like I'm a Web 2.0 developer now...
How quickly I forget what it really means to be in the Microsoft development paradigm. Version (n) of a technology makes the old selling points of version (n-1) trivial, and the future selling points of version (n+1) possible. So now that playing media is easy, we can spend more time solving the ancillary problems that are now feasibly solvable.
Where do the movies that we can now play come from? Where do we put them? Does Silverlight support them? How do we make the viewing experience decent for all users across the Internet? Once we make it possible to play media, we need to keep right on going and implement every step of what I call SUESS, or the Silverlight Uploader, Encoder, and Smooth Streamer.
SUESS is a process that handles not only playing media, but uploading and encoding it as well. It takes advantage of several cutting-edge technologies to do this: Silverlight 4, WCF, Expression Encoder, and IIS Media Services (both of the latter were in version 3 at the time of this writing). The details of building each piece of SUESS are scattered around the Internet in various blogs and forums. My goal here is to bring them all together and show how these technologies interoperate to deliver a true, full-featured media solution.
Architecturally speaking, SUESS has a lot of moving parts, but they all fit together very nicely. Let's start with everyone's favorite Visio architecture diagram:
Following is a description of all of SUESS's constituent components, and the interactions among them.
The "components" section describes the system from the perspective of each machine's logical role. Grammatically, these are the nouns, whereas the following "interactions" section outlines the verbs.
- Component 1: Client - The Client obviously is the user's machine, where all Silverlight code runs. SUESS begins and ends here, starting with the file upload, and ending with viewing the media. The Uploader control is straight Silverlight and .NET on the client, using WCF to communicate with the Web and Media Servers. Both of these interactions (uploading and encoding) could be very long running, and employ progress bars to keep users' blood pressure down. The latter uses Silverlight 4's new support for the WCF Polling Duplex (bi-directional) binding, enabling the server to initiate calls and beam data down to the client!
The media viewer is another story. The out-of-the-box MediaElement does not support the IIS SmoothStreaming format. I was pretty surprised to discover that, after building this entire infrastructure to convert media into a format that was designed specifically for Silverlight, Silverlight didn't even support it out-of-the-box! The answer is actually part of the IIS SmoothStreaming SDK, which gives us the SmoothStreamingMediaElement. It's currently still in beta, and I'll talk about this more later. The SDK's download page also has some links to other great resources for ramping up on IIS SmoothStreaming.
As with any other Silverlight solution, the Silverlight 4 plugin needs to be installed on the user's browser. This requirement is of course the fuel for many heated Silverlight vs. Flash debates, but that is a topic for a different thought.
- Component 2: Web Server - The Web Server hosts the actual web site, as well as the WCF services that are used to upload the file from the client. This server has a pretty typical configuration, requiring only IIS, ASP.NET, and WCF. Basically, once the Client initiates an upload, the file is sent over the wire in 1 MB chunks (which is easily configurable depending on your infrastructure). The Web Server then creates a new file on the File Server, and simply accumulates each chunk into it until the file is completely uploaded. "Chunking" files not only makes progress bars easy to implement, but also doesn't crush the bandwidth between the server and client. I also have some tricks in the code that harden it against periodic network burps, allowing large files to upload uninterrupted.
In addition, these services also facilitate upload cancellation, which is similarly easy within the chunking paradigm. A separate call is made to the server that stops the upload, deletes the temp file, and resets the progress bar. Finally (and this is out of scope for this article but well worth mentioning), we could implement an optional clean up job to delete any file fragments resulting from uploads that do fail (since, let's face it, Internet connections still go down).
- Component 3: File Server - Regardless of how your site is hosted, this is your data tier. Although there's no reason why the uploaded files couldn't live in SQL Server, I prefer to store them on the file system, both for performance and for ease of serving. To this end, the File Server has two shared folders: one for temp uploads and one for encoded files ready to be served. Basically, all file chunks are routed through the Web Server and end up in the temp folder.
Once a file is completely uploaded, the client turns around and calls the Media Server, directing it to grab the file from its temp location on the File Server, and encode it. The new IIS SmoothStreaming-encoded version is saved in the final destination folder and the original file is deleted from the temp folder. The aforementioned clean up job runs at night (when presumably no one is uploading media) and deletes all files from the temp folder.
- Component 4: Media Server - Here is where the magic happens. This server has IIS Media Services 3.0 installed, which enables IIS SmoothStreaming. A virtual directory in IIS is mapped to the file destination folder on the File Server, and serves these files to Silverlight 4 clients. Additionally, Expression Encoder 3 (the "full" version with IIS SmoothStreaming support) is installed here. In order to use the SDK, the product must be licensed and installed. The Encoder SDK is hosted in a WCF service and does the actual work to encode native media into the SmoothStreaming format.
You might be wondering why we have two IIS servers in this farm. The main reason is that media encoding is super processor-intensive, and will suck up every last CPU cycle it can find, spiking all cores. In order to keep your web site humming along, it's a good idea to offload this work if at all possible. A dedicated Media Server also gives us a nice separation of concerns: your Web Server deals with many concurrent users consuming your site's pages; your Media Server deals with fewer people viewing movies and fewer yet uploading and encoding content. However, if you are constrained by physical (or virtual) hardware, there's no technical reason (other than performance) why SUESS couldn't grow very decent media tomatoes in a web garden.
Before diving deeply into each component's details and code in SUESS, I'd like to make another pass through architecture from the perspective of each interaction. As with any distributed system, there is a lot of communication that needs to happen to facilitate each server's role. For example, the client needs to tell the Media Server when a file is done uploading, which it can't know until the Web Server informs it that it's properly been copied to the File Server.
- Interaction A: File Upload - The "media conversation" begins when the client's Uploader control has a valid file selected. It calls a WCF service on the Web Server and sends the first "chunk" (which is an array of bytes) of the file, as well as the file name with a uniqueness-enforcing Guid appended to it.
The Web Server then reads the File Server's temp upload path from a config file, builds the full file name, and simply checks if such a file already exists. If so, it opens it, seeks to the end, and writes the chunk to the file; otherwise it creates a new file with the uploaded chunk.
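To make the "create or append" step concrete, here is a minimal sketch of what the Web Server does with each incoming chunk. It's written in Python purely for illustration (the real service is WCF and .NET), and the function name `append_chunk` and its parameters are my own placeholders, not anything from the actual SUESS code.

```python
import os
import tempfile
import uuid


def append_chunk(temp_dir: str, unique_name: str, chunk: bytes) -> int:
    """Append one uploaded chunk to the temp file and return the file's new size.

    temp_dir stands in for the File Server's temp upload path (read from config
    in the article); unique_name is the file name with the Guid already appended.
    """
    path = os.path.join(temp_dir, unique_name)
    if os.path.exists(path):
        # Upload already in progress: open, seek to the end, write the chunk.
        with open(path, "r+b") as f:
            f.seek(0, os.SEEK_END)
            f.write(chunk)
    else:
        # First chunk: create a new file containing just this chunk.
        with open(path, "xb") as f:
            f.write(chunk)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as temp_dir:
        name = f"movie_{uuid.uuid4()}.wmv"
        for chunk in (b"abc", b"def", b"gh"):
            size = append_chunk(temp_dir, name, chunk)
        print(size)  # total number of bytes accumulated so far
```

Because every call is self-contained (open, write, close), the server holds no state between chunks, which is part of what makes retrying after a network burp feasible.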
The fact that Silverlight service calls are asynchronous does some favors for us, but makes the rest of the uploading algorithm more difficult. The favor is making progress bars easy. When each async call finishes, we know exactly how much of the file we've uploaded (chunk size times number of chunks uploaded over total file size) and can simply use a Storyboard to animate a ProgressBar, since we'll always be on the UI thread.
However, being asynchronous, we can't simply "loop" through the file, and upload the next chunk when the current one finishes. So what the Uploader does is keep track of the position in the file we are in, then recursively upload the next chunk in the completed event of each previous chunk's service call. This way, we don't keep the Browser's UI thread locked up, and can maintain a very accurate progress bar. We'll go into more detail about this algorithm when we look at the code for the Uploader later.
- Interaction B: File Storage / Cancellation / Clean Up Job - The File Upload interaction is pretty straightforward. On the surface, it would appear that File Storage is as well. After the Web Server finishes copying the last chunk to the File Server, the client then kicks off the Encoding process. It's the same story with cancellation. The client breaks the recursion on its end to stop uploading, and tells the Web Server to delete the file from the File Server. Pretty easy, right?
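Until we get to the real Uploader code, here is a rough sketch of that "upload the next chunk from the completed event" pattern. It's in Python rather than Silverlight, and everything in it is an assumption for illustration: `FakeDispatcher` stands in for the UI thread's event queue, `send_chunk` stands in for the async WCF call, and the class and method names are made up. Progress is computed from bytes sent rather than chunk count, so a short final chunk doesn't overshoot 100%.

```python
from collections import deque
from typing import Callable, Deque


class FakeDispatcher:
    """Stand-in for the UI thread's event queue: callbacks run later, in order."""

    def __init__(self) -> None:
        self._queue: Deque[Callable[[], None]] = deque()

    def post(self, callback: Callable[[], None]) -> None:
        self._queue.append(callback)

    def run(self) -> None:
        while self._queue:
            self._queue.popleft()()


class Uploader:
    """Uploads one chunk at a time, chaining the next from each completion."""

    def __init__(self, data: bytes, dispatcher: FakeDispatcher,
                 send_chunk: Callable[[bytes], None],
                 on_progress: Callable[[float], None],
                 chunk_size: int = 1024 * 1024) -> None:
        self.data = data
        self.dispatcher = dispatcher
        self.send_chunk = send_chunk      # hypothetical async service call
        self.on_progress = on_progress    # e.g. drives a progress bar
        self.chunk_size = chunk_size
        self.position = 0                 # where we are in the file
        self.cancelled = False

    def start(self) -> None:
        self._upload_next()

    def cancel(self) -> None:
        # Breaks the chain: no further chunk is sent after the current one.
        self.cancelled = True

    def _upload_next(self) -> None:
        if self.cancelled or self.position >= len(self.data):
            return
        chunk = self.data[self.position:self.position + self.chunk_size]
        self.send_chunk(chunk)
        # The "completed" event arrives later, via the dispatcher, not inline.
        self.dispatcher.post(lambda: self._on_chunk_completed(len(chunk)))

    def _on_chunk_completed(self, bytes_sent: int) -> None:
        self.position += bytes_sent
        self.on_progress(self.position / len(self.data))
        self._upload_next()
```

Note that the "recursion" never deepens the call stack: each completion handler returns to the dispatcher before the next chunk's handler runs, which is why the UI thread stays responsive.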
Well, there's one little (huge) problem: the IIS double hop issue. If it weren't for this, there would be no reason to break this interaction out into its own discussion. Basically, under NTLM, IIS cannot be a client and a server at the same time. What this means is that the code behind a service call (IIS being the "server") cannot then turn around and make a call of its own (making IIS a "client") using the same credentials it was passed. This plugs a security hole in which your credentials could otherwise be used to access resources elsewhere on the network you don't explicitly have rights to.
To get around this (as circumventing security continues to accumulate an obnoxiously high percentage of time in my career) we need to either implement a delegation-based security model such as Kerberos, or, more realistically, explicitly impersonate the calls to the File Server. So from this interaction-based perspective, the Uploader is free to call the Web Server or Media Server directly, but these calls cannot be propagated along to the File Server without suffering through a second hop. (It is with great restraint that I did not include pictures of rabbits jumping around the servers in my architectural diagram above: an image the double hop issue always puts in my head.) So make sure you plan out your security model carefully (unless you are using anonymous access or a web garden, in which case this isn't a problem). You'll appreciate this when it's 1 AM, you have all kinds of access denied exception red ink in your Event Viewer, and are staring cross-eyed at the security tab of your upload folder and wondering how giving "Everyone" "Owner" permissions could possibly not be working!
- Interaction C: File Encoding - The next step is to encode the media into IIS's SmoothStreaming format. In order for SUESS to be successful, we cannot constrain users to a narrow array of files they can upload. To this end, we can use the Expression Encoder SDK to convert almost any native media a user might have into the proprietary IIS SmoothStreaming format. Interaction-wise, all we need to do is make a service call to the Media Server and tell it which file in the temp uploads path to encode. I'll defer the more in-depth discussion about SmoothStreaming for now; there is, however, one important technical footnote to this interaction.
Since the Encoder SDK is 32-bit only, we need to be careful on Windows Server 2008 R2, which is 64-bit only (or on any 64-bit OS). Although your code will compile just fine, the 64-bit w3wp process cannot load the 32-bit Encoder DLLs. This will happen even if you compile against the "Any CPU" platform in Visual Studio. To work around this, make sure your app pool in IIS runs in 32-bit mode. (To set this, right-click an app pool in IIS Manager, select "Advanced Settings," and set "Enable 32-Bit Applications" to True in the "(General)" section.)
- Interaction D: Media Encoding Process - The final interaction is the actual encoding process, which is kicked off in the previous section. Our Media Server's WCF encoder service is basically a wrapper around the Encoder SDK. The way it works is that the encoder service takes the file name from the client (along with other optional metadata, such as video quality) and creates an Encoder Job (which is the main workhorse of the SDK).
The Job object has a few events that we can hook, primarily "EncodeProgress." This, combined with WCF duplexing, allows truly bi-directional communication for this interaction. The Job object fires the event when the percentage complete changes, and WCF actually calls a method on the Silverlight Uploader down on the client, which uses the information to update a progress bar! This is huge for this interaction for several reasons. First of all, encoding in general is slow, and real-time progress bars give users a visual indicator that things are indeed chugging along on the server. Also, Silverlight 4's native support for the WCF Polling Duplex binding means no kludges for accurately reporting progress. I've seen (and experimented with myself) a lot of different polling schemes to simulate bi-directional communication between WCF and Silverlight. Now that it's all wrapped up nicely for us, we can present our users with a much richer experience much more easily. Finally, there's also a performance benefit on the network side, since, like the chunky upload, we don't have one massive service call sucking up all our bandwidth.
The rest of this interaction is between the Media Server and the File Server. The IIS SmoothStreaming format isn't just a single file with a particular encoding; it's an entire file structure with manifests, data files, etc. So after Encoder reads the source file from the File Server, it does its thing and dumps all the resultant files into the destination folder. Finally, when it's done, it deletes the originally uploaded file, and fires the "EncodeCompleted" event on the client. Since the destination folder on the File Server is already mapped to a virtual directory under the Media Server's IIS with Media Services installed, the new SmoothStreaming media is automatically ready to be smoothly streamed to Silverlight clients!
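The shape of that encoding conversation can be sketched in a few lines. This is Python, not the real Encoder SDK or WCF: `FakeJob` is a made-up stand-in for the SDK's Job object (only the event names "EncodeProgress" and "EncodeCompleted" come from the discussion above), and `ClientCallback` plays the role of the callback contract the duplex channel would invoke on the Silverlight client.

```python
from typing import Callable, List, Optional


class FakeJob:
    """Hypothetical stand-in for the SDK's Job object; not the real API."""

    def __init__(self, source_file: str) -> None:
        self.source_file = source_file
        # Handlers subscribed to the "EncodeProgress" event.
        self.encode_progress: List[Callable[[int], None]] = []

    def encode(self) -> None:
        # A real job would do the heavy lifting here; we just report progress.
        for percent in (25, 50, 75, 100):
            for handler in self.encode_progress:
                handler(percent)


class ClientCallback:
    """Methods the server would call on the client over the duplex channel."""

    def __init__(self) -> None:
        self.percentages: List[int] = []
        self.completed_file: Optional[str] = None

    def on_encode_progress(self, percent: int) -> None:
        self.percentages.append(percent)   # would update the progress bar

    def on_encode_completed(self, file_name: str) -> None:
        self.completed_file = file_name    # media is now ready to stream


def encode_for_client(file_name: str, client: ClientCallback,
                      job_factory: Callable[[str], FakeJob] = FakeJob) -> None:
    """Encoder service sketch: hook the job's event and forward it to the client."""
    job = job_factory(file_name)
    job.encode_progress.append(client.on_encode_progress)
    job.encode()
    client.on_encode_completed(file_name)
```

The point of the sketch is the direction of the calls: the client makes one request, and from then on it is the server that pushes progress and completion down to it.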
That does it for the architectural overview of SUESS. Like I said, the main goal here is to wrap all aspects of media serving on the web into one system: uploading, encoding, storing, and viewing. The Microsoft stack of IIS Media Services, Silverlight, WCF, and Expression Encoder provides an unprecedented media platform to deliver rich, "Web 2.0" content to our users. SUESS is merely the glue that sticks all these technologies together.
Look for future posts from me (and soon, I promise!) that will dive into each component with deep technical discussions and code samples. Until then!