DigitallyCreated
Blog

Only showing posts tagged with "C#"

Getting the Correct HTTP Status Codes out of ASP.NET Custom Error Pages

If you are using the default ASP.NET custom error pages, chances are your site is returning the incorrect HTTP status codes for the errors that your users are experiencing (hopefully as few as possible!). Sure, your users see a pretty error page just fine, but your users aren’t always flesh and blood. Search engine crawlers are also your users (in a sense), and they don’t care about the pretty pictures and funny one-liners on your error pages; they care about the HTTP status codes returned. For example, if a request for a page that was removed consistently returns a 404 status code, a search engine will remove it from its index. However, if it doesn’t and instead returns the wrong error code, the search engine may leave the page in its index.

This is what happens if your non-existent pages don't return the correct status code!

This is what happens if your non-existent pages don't return the correct status code!

Unfortunately, ASP.NET custom error pages don’t return the correct error codes. Here’s your typical ASP.NET custom error page configuration that goes into the Web.config:

<customErrors mode="On" defaultRedirect="~/Pages/Error">
    <error statusCode="403" redirect="~/Pages/Error403" />
    <error statusCode="404" redirect="~/Pages/Error404" />
</customErrors>

And here’s a Fiddler trace of what happens when someone request a page that should simply return 404:

A trace of a request that should have just 404'd

A trace of a request that should have just 404'd

As you can see, the request for /ThisPageDoesNotExist returned 302 and redirected the browser to the error page specified in the config (/Pages/Error404), but that page returned 200, which is the code for a successful request! Our human users wouldn’t notice a thing, as they’d see the error page displayed in their browser, but any search engine crawler would think that the page existed just fine because of the 200 status code! And this problem also occurs for other status codes, like 500 (Internal Server Error). Clearly, we have an issue here.

The first thing we need to do is stop the error pages from returning 200 and instead return the correct HTTP status code. This is easy to do. In the case of the 404 Not Found page, we can simply add this line in the view:

<% Response.StatusCode = (int)HttpStatusCode.NotFound; %>

We will need to do this to all views that handle errors (obviously changing the status code to the one appropriate for that particular error). This not only includes the pages you’ve configured in the customErrors element in the Web.config, but also any views you are using with ASP.NET MVC HandleError attributes (if you’re using ASP.NET MVC). After making these changes, our Fiddler trace looks like this:

A trace of a request that is 404ing, but still redirecting

A trace of a request that is 404ing, but still redirecting

We’ve now got the correct status code being returned, but there’s still an unnecessary redirect going on here. Why redirect when we can just return 404 and the error page HTML straight up? If we are using vanilla ASP.NET Forms, this is super easy to do with a quick configuration change; just set redirectMode to ResponseRewrite in the Web.config (this setting is new since .NET 3.5 SP1):

<customErrors mode="On" defaultRedirect="~/Error.aspx" redirectMode="ResponseRewrite">
    <error statusCode="403" redirect="~/Error403.aspx" />
    <error statusCode="404" redirect="~/Error404.aspx" />
</customErrors>

Unfortunately, this trick doesn’t work with ASP.NET MVC, as the method by which the response is rewritten using the specified error page doesn’t play nicely with MVC routes. One workaround is to use static HTML pages for your error pages; this sidesteps MVC routing, but also means you can’t use your master pages, making it a pretty crap solution. The easiest workaround I’ve found is to defenestrate ASP.NET custom errors and handle the errors manually through a bit of trickery in the Global.asax.

Firstly, we’ll handle the Error event in our Global.asax HttpApplication-derived class:

protected void Application_Error(object sender, EventArgs e)
{
    if (Context.IsCustomErrorEnabled)
        ShowCustomErrorPage(Server.GetLastError());
}

private void ShowCustomErrorPage(Exception exception)
{
    HttpException httpException = exception as HttpException;
    if (httpException == null)
        httpException = new HttpException(500, "Internal Server Error", exception);

    Response.Clear();
    RouteData routeData = new RouteData();
    routeData.Values.Add("controller", "Error");
    routeData.Values.Add("fromAppErrorEvent", true);

    switch (httpException.GetHttpCode())
    {
        case 403:
            routeData.Values.Add("action", "AccessDenied");
            break;

        case 404:
            routeData.Values.Add("action", "NotFound");
            break;

        case 500:
            routeData.Values.Add("action", "ServerError");
            break;

        default:
            routeData.Values.Add("action", "OtherHttpStatusCode");
            routeData.Values.Add("httpStatusCode", httpException.GetHttpCode());
            break;
    }

    Server.ClearError();

    IController controller = new ErrorController();
    controller.Execute(new RequestContext(new HttpContextWrapper(Context), routeData));
}

In Application_Error, we’re checking the setting in Web.config to see whether custom errors have been turned on or not. If they have been, we call ShowCustomErrorPage and pass in the exception. In ShowCustomErrorPage, we convert any non-HttpException into a 500-coded error (Internal Server Error). We then clear any existing response and set up some route values that we’ll be using to call into MVC controller-land later. Depending on which HTTP status code we’re dealing with (pulled from the HttpException), we target a different action method, and in the case that we’re dealing with an unexpected status code, we also pass across the status code as a route value (so that view can set the correct HTTP status code to use, etc). We then clear the error and new up our MVC controller, then execute it.

Now we’ll define that ErrorController that we used in the Global.asax.

public class ErrorController : Controller
{
    [PreventDirectAccess]
    public ActionResult ServerError()
    {
        return View("Error");
    }

    [PreventDirectAccess]
    public ActionResult AccessDenied()
    {
        return View("Error403");
    }

    public ActionResult NotFound()
    {
        return View("Error404");
    }

    [PreventDirectAccess]
    public ActionResult OtherHttpStatusCode(int httpStatusCode)
    {
        return View("GenericHttpError", httpStatusCode);
    }

    private class PreventDirectAccessAttribute : FilterAttribute, IAuthorizationFilter
    {
        public void OnAuthorization(AuthorizationContext filterContext)
        {
            object value = filterContext.RouteData.Values["fromAppErrorEvent"];
            if (!(value is bool && (bool)value))
                filterContext.Result = new ViewResult { ViewName = "Error404" };
        }
    }

This controller is pretty simple except for the PreventDirectAccessAttribute that we’re using there. This attribute is an IAuthorizationFilter which basically forces the use of the Error404 view if any of the action methods was called through a normal request (ie. if someone tried to browse to /Error/ServerError). This effectively hides the existence of the ErrorController. The attribute knows that the action method is being called through the error event in the Global.asax by looking for the “fromAppErrorEvent” route value that we explicitly set in ShowCustomErrorPage. This route value is not set by the normal routing rules and therefore is missing from a normal page request (ie. requests to /Error/ServerError etc).

In the Web.config, we can now delete most of the customErrors element; the only thing we keep is the mode switch, which still works thanks to the if condition we put in Application_Error.

<customErrors mode="On">
    <!-- There is custom handling of errors in Global.asax -->
</customErrors>

If we now look at the Fiddler trace, we see the problem has been solved; the redirect is gone and the page correctly returns a 404 status code.

A trace showing the correct 404 behaviour

A trace showing the correct 404 behaviour

In conclusion, we’ve looked at a way to solve ASP.NET custom error pages returning incorrect HTTP status codes to the user. For ASP.NET Forms users the solution was easy, but for ASP.NET MVC users some extra manual work needed to be done. These fixes ensure that search engines that trawl your website don’t treat any error pages they encounter (such as a 404 Page Not Found error page) as actual pages.

Performing Date & Time Arithmetic Queries using Entity Framework v1

If one is writing legacy code in .NET 3.5 SP1’s Entity Framework v1 (yes, your brand new code has now been put into the legacy category by .NET 4 :( ), one will find a severe lack of any date & time arithmetic ability. If you look at LINQ to Entities, you will see no date & time arithmetic methods are recognised. And if you look at the canonical functions that Entity SQL provides, you will notice a severe lack of any date & time arithmetic functions. Thankfully, this strangely missing functionality in canonical ESQL has been resolved as of .NET 4; however, that doesn’t help those of us who haven’t yet upgraded to Visual Studio 2010! Thankfully, all is not lost: there is a solution that only sucks a little bit.

Date & time arithmetic is a pretty necessary ability. Imagine the following simplified scenario, where one has a “Contract” entity that defines a business contract that starts at a certain date and lasts for a duration:

Contract Entity Diagram

Imagine that you need to perform a query that finds all Contracts that were active on a specific date. You may think of performing a LINQ to Entities query that looks like this:

IQueryable<Contract> contracts = 
    from contract in Contracts
    where activeOnDate >= contract.StartDate &&
          activeOnDate < contract.StartDate.AddMonths(contract.Duration)
    select contract

However, this is not supported by LINQ to Entities and you’ll get a NotSupportedException. With canonical ESQL missing any defined date & time arithmetic functions, it seems we are left without a recourse.

However, if one digs around in MSDN, one may stumble across the fact that the SQL Server Entity Framework provider defines some provider-specific ESQL functions with which one can use to do date & time arithmetic. So we can write an ESQL query to get the functionality we wish:

const string eSqlQuery = @"SELECT VALUE c FROM Contracts AS c
                           WHERE @activeOnDate >= c.StartDate
                           AND @activeOnDate < SqlServer.DATEADD('months', c.Duration, c.StartDate)";

ObjectQuery<Contract> contracts = 
    context.CreateQuery<Contract>(eSqlQuery, new ObjectParameter("activeOnDate", activeOnDate));

Note the use of the SqlServer.DATEADD ESQL function. The “SqlServer” bit is specifying the SQL Server provider’s specific namespace. Obviously this approach has some disadvantages: it’s not as nice as using LINQ to Entities, and it’s also not database provider agnostic (so if you’re using Oracle you’ll need to see whether your Oracle provider has something like this). However, the alternatives are either to write SQL manually (negating the usefulness of Entity Framework) or to download all the entities into local memory and use LINQ to Objects (almost never, ever an acceptable option!).

Notice that using the CreateQuery function to create the query from ESQL returned an ObjectQuery<T> (which implements IQueryable<T>)? This means that you can now use LINQ to Entities to change that query. For example, what if we wanted to perform some further filtering that can be customised by the user (the user’s filtering preferences are set in the ContractFilterSettings class in the below example):

private ObjectQuery<Contract> CreateFilterSettingsAsQueryable(ContractFilterSettings filterSettings, MyEntitiesContext context)
{
    IQueryable<Contract> query = context.Contracts;

    if (filterSettings.ActiveOnDate != null)
    {
        const string eSqlQuery = @"SELECT VALUE c FROM Contracts AS c
                                   WHERE @activeOnDate >= c.StartDate
                                   AND @activeOnDate < SqlServer.DATEADD('months', c.Duration, c.StartDate)";
        query = context.CreateQuery<Contract>(eSqlQuery, new ObjectParameter("activeOnDate", filterSettings.ActiveOnDate));
    }
    if (filterSettings.SiteId != null)
        query = query.Where(c => c.SiteID == filterSettings.SiteId);
    if (filterSettings.ContractNumber != null)
        query = query.Where(c => c.ContractNumber == filterSettings.ContractNumber);
    
    return (ObjectQuery<Contract>)query;
}

Being able to continue to use LINQ to Entities after creating the initial query in ESQL means that you can have the best of both worlds (forgetting the fact that we ought to be able to do date & time arithmetic in LINQ to Entities in the first place!). Note that although you can start a query in ESQL and then add to it with LINQ to Entities, you cannot start a query in LINQ to Entities and add to it with ESQL, hence why the ActiveOnDate filter setting was done first.

For those who use .NET 4, unfortunately you’re still stuck with being unable to do date & time arithmetic with LINQ to Entities by default (Alex James told me on Twitter a while back that this was purely because of time constraints, not because they are evil or something :) ). However, you get canonical ESQL functions that are provider independent (so instead of using SqlServer.DATEADD above, you’d use AddMonths). If you really want to use LINQ to Entities (which is not surprising at all), you can create ESQL snippets in your model (“Model Defined Functions”) and then create annotated method stubs which Entity Framework will recognise when you use them inside a LINQ to Entities query expression tree. For more information see this post on the Entity Framework Design blog, and this MSDN article.

EDIT: Diego Vega kindly pointed out to me on Twitter that, in fact, Entity Framework 4 comes with some methods you can mix into your LINQ to Entities queries to call on EF v4’s new canonical date & time ESQL functions. Those methods (as well as a bunch of others) can be found in the EntityFunctions class as static methods. Thanks Diego!

Deep Inside ASP.NET MVC 2 Model Metadata and Validation

One of the big ticket features of ASP.NET MVC 2 is its ability to perform model validation at the model binding stage in the controller. It also has the ability to store metadata about your model itself (such as what properties are required ones, etc) and therefore can be more intelligent when it automatically renders an HTML interface for your data model.

Unfortunately, the way ASP.NET MVC 2 handles validation (in the controller) is at odds with my preferred way to handle validation (in the business layer, with validation errors batched and thrown up as exceptions back to the controller. I used xVal). In order to understand ASP.NET MVC 2’s validation I wanted to know how it worked, not just how to use it. Unfortunately, at time of writing, there seemed to be very few sources of information that described the way MVC 2’s model metadata and validation worked at a high, architectural level. The ASP.NET MVC MSDN class documentation is pretty horrible (it’s anaemic, in terms of detail). Sure, there are blog articles on how to use it, but none (that I could find, anyway) on how it really all works together under the covers.  And if you don’t know how it works, it can be quite difficult to know how to use it most effectively.

To resolve this lack of documentation, this blog post attempts to explain the architecture of the ASP.NET MVC 2 metadata and validation features and to show you, at a high level, how they work. As I am not a member of the MVC team at Microsoft, I don’t have access to any architecture documentation they may have, so keep in mind everything in this blog was deduced by reading the MVC source code and judicious application of .NET Reflector. If I’ve made any mistakes, please let me know in the comments.

Model Metadata

ASP.NET MVC 2 has the ability to create metadata about your data model and use this for various tasks such as validation and special rendering. Some examples of the metadata that it stores are whether a property on a data model class is required, whether it’s read only, and what its display name is. As an example, the “is required” metadata would be useful as it would allow you to automatically write HTML and JavaScript to ensure that the user enters a value in a textbox, and that the textbox’s label is appended with an asterisk (*) so that the user knows that it is a required field.

Figure 1. ModelMetadata and Providers Class Diagram

Figure 1. ModelMetadata and Providers Class Diagram

The ASP.NET MVC Model Metadata system stores metadata across many instances of the ModelMetadata class (and subclasses). ModelMetadata is a recursive data structure. The root ModelMetadata instance stores the metadata for the root data model object type, and also stores references to a ModelMetadata instance for each property on that root data model type. Figure 1 shows visual example of this for an example “Book” data model class.

Figure 2. Model Metadata Recursive Data Structure

Figure 2. Model Metadata Recursive Data Structure Diagram (note that not all properties on ModelMetadata are shown)

As you can see in Figure 2, there is a root ModelMetadata that describes the Book instance. You’ll notice that its ContainerType and PropertyName properties are null, which indicates that this is the root ModelMetadata instance. This root ModelMetadata instance has a Properties property that provides lazy-loaded access to ModelMetadata instances for each of the Book type’s properties. These ModelMetadata instances do have ContainerType and PropertyName set, because since they are for properties, they have a container type (ie. Book) and a name (ie. Title).

So how do we get instances of ModelMetadata for a data model, and how is the model’s metadata divined in the first place? This is where ModelMetadataProvider classes come in (see Figure 1). ModelMetadataProvider classes are able to create ModelMetadata instances. However, ModelDataMetadataProvider itself it is an abstract class, and delegates all its functionality to subclasses. The DataAnnotationsModelMetadataProvider is a implementation that is able to create model metadata based off the attributes in the System.ComponentModel.DataAnnotations namespace that are applied to data model classes and their properties. For example, the RequiredAttribute is used to determine whether a property is required or not (ie. it causes the ModelMetadata.IsRequired property to be set).

You’ll notice that the DataAnnotationsModelMetadataProvider does not inherit directly from ModelMetadataProvider (see Figure 1). Instead, it inherits from the AssociatedMetadataProvider abstract class. The AssociatedMetadataProvider class actually performs the getting of the attributes off the class and its properties by using an ICustomTypeDescriptor behind the scenes (which turns out to be the internal AssociatedMetadataTypeTypeDescriptor class, which is constructed by the AssociatedMetadataTypeTypeDescriptionProvider class). The AssociatedMetadataProvider then delegates the actual responsibility of building a ModelMetadata instance using the attributes it found to a subclass. The DataAnnotationsModelMetadataProvider inherits this responsibility.

As you can see, the whole metadata-creation system is very modular and allows you to generate model metadata in any way you want by just using a different ModelMetadataProvider class. The Provider class that the system uses by default is set by creating an instance of it and putting it in the static property ModelMetadataProviders.Current. By default, an instance of DataAnnotationsModelMetadataProvider is used.

Model Validation

ASP.NET MVC 2 includes a very modular and powerful validation system. Unfortunately, it’s very coupled to ASP.NET MVC as it uses ControllerContexts throughout itself. This means you can’t move its use into your business layer; you have to keep it in the controller. The modularity and power comes with a trade-off: architectural complexity. The validation system architecture contains quite a lot of indirection, but when you step through it you can see the reasons why the indirection exists: for extensibility.

Figure 3. ModelValidator and ModelValidatorProviders Diagram (click to enlarge)

Figure 3. ModelValidator and ModelValidatorProviders Diagram (click to enlarge)

Figure 3 shows an overview of the validation system architecture. Key to the system are ModelValidators, which when executed, are able to validate a part of the model in a specific way and return ModelValidationResults, which describe any validation errors that occurred with a simple text message. ModelValidators also have a collection of ModelClientValidationRules, which are simple data structures that represent the client-side JavaScript validation rules to be used for that ModelValidator.

ModelValidators are created by ModelValidatorProvider classes when they are asked to validate a particular ModelMetadata object. Many ModelValidatorProviders can be used at the same time to provide validation (ie. return ModelValidators) on a single ModelMetadata. This is illustrated by the ModelValidatorProviders.Providers static property, which is of type ModelValidatorProviderCollection. This collection, by default, holds instances of the ClientDataTypeModelValidatorProvider, DataAnnotationsModelValidatorProvider and DataErrorInfoModelValidatorProvider classes. This collection has a method (GetValidators) that allows you to easily execute all contained ModelValidatorProviders and get all the returned ModelValidators back. In fact, this is what is called when you call ModelMetadata.GetValidators.

The DataAnnotationsModelValidatorProvider is able to create ModelValidators that validate a model held in a ModelMetadata object by looking at System.ComponentModel.DataAnnotations attributes that have been placed on the data model class and its properties. In a similar fashion to the DataAnnotationsModelMetadataProvider, the DataAnnotationsModelValidatorProvider leaves the actual retrieving of the attributes from the data types to its base class, the AssociatedValidatorProvider.

The ClientDataTypeModelValidator produces ModelValidators that don’t actually do any server-side validation. Instead, the ModelValidators that it produces provide the ModelClientValidationRules needed to enforce data typing (such as numeric types like int) on the client. Obviously no server-side validation is needed to be done for data types since C# is a strongly-typed language and only an int can be in an int property. On the client-side, it is quite possible for the user to type a string (ie “abc”) into a textbox that requires an int, hence the need for special ModelClientValidationRules to deal with this.

The DataErrorInfoModelValidatorProvider is able to create ModelValidators that validate a data model that implements the IDataErrorInfo interface.

You may have noticed that ModelValidator is an abstract class. ModelValidator delegates the actual validation functionality to its subclasses, and it’s these concrete subclasses that the different ModelValidatorProviders create. Let’s take a deeper look at this, with a focus on the ModelValidators that the DataAnnotationsModelValidatorProvider creates.

Figure 4. ModelValidator and ModelClientValidationRule Class Hierarchy (click to enlarge)

Figure 4. ModelValidator and ModelClientValidationRule Class Hierarchy (click to enlarge)

As you can see in Figure 4, there is a ModelValidator concrete class for each System.ComponentModel.DataAnnotations validation attribute (ie. RangeAttributeAdapter is for the RangeAttribute). The DataAnnotationsModelValidator provides the general ability to execute a ValidationAttribute. The generic DataAnnotationsModelValidator<TAttribute> simply allows subclasses to access the concrete ValidationAttribute that the ModelValidator is for through a type-safe property.

You’ll notice in Figure 4 how each DataAnnotationsModelValidator<TAttribute> subclass has a matching ModelClientValidationRule. Each of those subclasses (ie. ModelClientValidationRangeRule) represents the client-side version of the ModelValidator. In short, they contain the rule name and whatever data needs to go along with it (for example, for the ModelClientValidationStringLengthRule, the specific string length is set inside that object instance). These rules can be serialised out to the page so that the page’s JavaScript validation logic can hook up client-side validation. (See Phil Haack’s blog on this for more information).

The DataAnnotationsModelValidatorProvider implements an extra layer of indirection that allows you to get it to create custom ModelValidators for any custom ValidationAttributes you might choose to create and use on your data model. Internally it has a Dictionary<Type, DataAnnotationsModelValidationFactory>, where the Type of a ValidationAttribute is associated with a DataAnnotationsModelValidationFactory (a delegate) that can create a ModelValidator for the attribute type it is associated with. You can add to this dictionary by calling the static DataAnnotationsModelValidatorProvider.RegisterAdapterFactory method.

The CompositeModelValidator is a private inner class of ModelValidator and is returned when you call the static ModelValidator.GetModelValidator method. It is a special ModelValidator that is able to take a ModelMetadata class, validate it by using its GetValidators method, and validate its properties by calling GetValidators on each of the ModelMetadata instances returned for each model property. It collects all the ModelValidationResult objects returned by all the ModelValidators that it executes and returns them in one big bundle.

Putting Metadata and Validation to Use

Now that we’re familiar with the workings of the ASP.NET MVC 2 Metadata and Validation systems, let’s see how we can actually use them to validate some data. Note that you don’t need to actually cause validation to happen explicitly, as ASP.NET MVC will trigger it to happen during model binding. This section just shows how the massive system we’ve looked at is kicked into execution.

The Controller.TryValidateModel method is a good example of showing model metadata creation and model validation in action:

protected internal bool TryValidateModel(object model, string prefix)
{
    if (model == null)
        throw new ArgumentNullException("model");

    ModelMetadata modelMetadata = ModelMetadataProviders.Current.GetMetadataForType(() => model, model.GetType());
    ModelValidator compositeValidator = ModelValidator.GetModelValidator(modelMetadata, base.ControllerContext);

    foreach (ModelValidationResult result in compositeValidator.Validate(null))
    {
        this.ModelState.AddModelError(DefaultModelBinder.CreateSubPropertyName(prefix, result.MemberName), result.Message);
    }

    return this.ModelState.IsValid;
}

The first thing we do in the above method is create the ModelMetadata for the root object in our model (which is the model object passed in as a method parameter). We then get a CompositeModelValidator that will internally get and execute all the ModelValidators for the validation of the root object type and all its properties. Finally, we loop over all the validation errors gotten from executing the CompositeModelValidator (which executes all the ModelValidators it fetched internally) and add each error to the controller’s ModelState.

The CompositeModelValidator really hides the actual work of getting and executing the ModelValidators from us. So let’s see how it works internally.

public override IEnumerable<ModelValidationResult> Validate(object container) 
{
    bool propertiesValid = true;

    foreach (ModelMetadata propertyMetadata in Metadata.Properties) 
    {
        foreach (ModelValidator propertyValidator in propertyMetadata.GetValidators(ControllerContext)) 
        {
            foreach (ModelValidationResult propertyResult in propertyValidator.Validate(Metadata.Model)) 
            {
                propertiesValid = false;
                yield return new ModelValidationResult 
                {
                    MemberName = DefaultModelBinder.CreateSubPropertyName(propertyMetadata.PropertyName, propertyResult.MemberName),
                    Message = propertyResult.Message
                };
            }
        }
    }

    if (propertiesValid) 
    {
        foreach (ModelValidator typeValidator in Metadata.GetValidators(ControllerContext)) 
            foreach (ModelValidationResult typeResult in typeValidator.Validate(container)) 
                yield return typeResult;
    }
}

First it validates all the properties on the object that is held by the ModelMetadata it was created with (the Metadata property). It does this by getting the ModelMetadata of each property, getting the ModelValidators for that ModelMetadata, then executing each ModelValidator and yield returning the ModelValidationResults (if any; there will be none if there were no validation errors).

Then, if the all the properties were valid (if there were no ModelValidationResults returned from any of the ModelValidators) it proceeds to validate the root ModelMetadata by getting its ModelValidators, executing each one, and yield returning any ModelValidationResults.

The ModelMetadata.GetValidators method that the CompositeModelValidator is using simply asks the default ModelValidatorProviderCollection to execute all its ModelValidatorProviders and returns all the created ModelValidators:

public virtual IEnumerable<ModelValidator> GetValidators(ControllerContext context)
{
    return ModelValidatorProviders.Providers.GetValidators(this, context);
}

Peanut-Gallery Criticism

The only criticism I have is levelled against the Model Validation system that for some reason requires a ControllerContext to be passed around internally (see the ModelValidator constructor). As far as I can see, that ControllerContext is never actually used by any out of the box validation component, which leads me to question why it needs to exist. The reason I dislike it is because it prevents me from even considering using the validation system down in the business layer (as opposed to in the controller) without coupling the business layer to ASP.NET MVC. However, there may be a good reason as to why it exists that I am not privy to.

Conclusion

In this blog post we’ve addressed the problem of a serious lack of documentation that explains the inner workings and architecture of the ASP.NET MVC 2 Model Metadata and Validation systems (that existed at least at the time of writing). We’ve systematically deconstructed and looked at the architecture of the two systems and seen how the architecture enables you to easily generate metadata about your data model and validate it, and how it provides the flexibility to plug in, change and extend the basic functionality to meet your specific use case.

DigitallyCreated Utilities v1.0.0 Released

After a hell of a lot of work, I am happy to announce that the 1.0.0 version of DigitallyCreated Utilities has been released! DigitallyCreated Utilities is a collection of many neat reusable utilities for lots of different .NET technologies that I’ve developed over time and personally use on this website, as well as on others I have a hand in developing. It’s a fully open source project, licenced under the Ms-PL licence, which means you can pretty much use it wherever you want and do whatever you want to it. No viral licences here.

The main reason that it has taken me so long to release this version is because I’ve been working hard to get all the wiki documentation on CodePlex up to scratch. After all, two of the project values are:

  • To provide fully XML-documented source code
  • To back up the source code documentation with useful tutorial articles that help developers use this project

And truly, nothing is more frustrating than code with bad documentation. To me, bad documentation is the lack of a unifying tutorial that shows the functionality in action, and the lack of decent XML documentation on the code. Sorry, XMLdoc that’s autogenerated by tools like GhostDoc, and never added to by the author, just doesn’t cut it. If you can auto-generate the documentation from the method and parameter names, it’s obviously not providing any extra value above and beyond what was already there without it!

So what does DCU v1.0.0 bring to the table? A hell of a lot actually, though you may not need all of it for every project. Here’s the feature list grouped by broad technology area:

  • ASP.NET MVC and LINQ
    • Sorting and paging of data in a table made easy by HtmlHelpers and LINQ extensions (see tutorial)
  • ASP.NET MVC
    • HtmlHelpers
      • TempInfoBox - A temporary "action performed" box that displays to the user for 5 seconds then fades out (see tutorial)
      • CollapsibleFieldset - A fieldset that collapses and expands when you click the legend (see tutorial)
      • Gravatar - Renders an img tag for a Gravatar (see tutorial)
      • CheckboxStandard & BoolBinder - Renders a normal checkbox without MVC's normal hidden field (see tutorial)
      • EncodeAndInsertBrsAndLinks - Great for the display of user input, this will insert <br/>s for newlines and <a> tags for URLs and escape all HTML (see tutorial)
    • IncomingRequestRouteConstraint - Great for supporting old permalink URLs using ASP.NET routing (see tutorial)
    • Improved JsonResult - Replaces ASP.NET MVC's JsonResult with one that lets you specify JavaScriptConverters (see tutorial)
    • Permanently Redirect ActionResults - Redirect users with 301 (Moved Permanently) HTTP status codes (see tutorial)
    • Miscellaneous Route Helpers - For example, RouteToCurrentPage (see tutorial)
  • LINQ
    • MatchUp & Federator LINQ methods - Great for doing diffs on sequences (see tutorial)
  • Entity Framework
    • CompiledQueryReplicator - Manage your compiled queries such that you don't accidentally bake in the wrong MergeOption and create a difficult to discover bug (see tutorial)
    • Miscellaneous Entity Framework Utilities - For example, ClearNonScalarProperties and Setting Entity Properties to Modified State (see tutorial)
  • Error Reporting
    • Easily wire up some simple classes and have your application email you detailed exception and error object dumps (see tutorial)
  • Concurrent Programming
    • Semaphore/FifoSemaphore & Mutex/FifoMutex (see tutorial)
    • ReaderWriterLock (see tutorial)
    • ActiveObject - Easily inherit from ActiveObject to separately thread your class (see tutorial)
  • Unity & WCF
    • WCF Client Injection Extension - Easily use dependency injection to transparently inject WCF clients using Unity (see tutorial)
  • Miscellaneous Base Class Library Utilities
    • SafeUsingBlock and DisposableWrapper - Work with IDisposables in an easier fashion and avoid the bug where using blocks can silently swallow exceptions (see tutorial)
    • Time Utilities - For example, TimeSpan To Ago String, TzId -> Windows TimeZoneInfo (see tutorial)
    • Miscellaneous Utilities - Collection Add/RemoveAll, Base64StreamReader, AggregateException (see tutorial)

DCU is split across six different assemblies so that developers can pick and choose the stuff they want and not take unnecessary dependencies if they don’t have to. This means if you don’t use Unity in your application, you don’t need to take a dependency on Unity just to use the Error Reporting functionality.

I’m really pleased about this release as it’s the culmination of rather a lot of work on my part that I think will help other developers write their applications more easily. I’m already using it here on DigitallyCreated in many many places; for example the Error Reporting code tells me when this site crashes (and has been invaluable so far), the CompiledQueryReplicator helps me use compiled queries effectively on the back-end, and the ReaderWriterLock is used behind the scenes for the Twitter feed on the front page.

I hope you enjoy this release and find some use for it in your work or play activities. You can download it here.

C# Using Blocks can Swallow Exceptions

While working with WCF for my part time job, I came across this page on MSDN that condemned the C# using block as unsafe to use when working with a WCF client. The problem is that the using block can silently swallow exceptions without you even knowing. To prove this, here’s a small sample:

public static void Main()
{
    try
    {
        using (new CrashingDisposable())
        {
            throw new Exception("Inside using block");
        }
    }
    catch (Exception e)
    {
        Console.WriteLine("Caught exception: " + e.Message);
    }
}

private class CrashingDisposable : IDisposable
{
    public void Dispose()
    {
        throw new Exception("Inside Dispose");
    }
}

The above program will write “Caught exception: Inside Dispose” to the console. But where did the “Inside using block” exception go? It was swallowed by the using block! How this happens is more obvious when you unroll the using block into the try/finally block that it really is (note the outer braces that limit crashingDisposable’s scope):

{
    CrashingDisposable crashingDisposable = new CrashingDisposable();
    try
    {
        throw new Exception("Inside using block");
    }
    finally
    {
        if (crashingDisposable != null)
            ((IDisposable)crashingDisposable).Dispose(); //Dispose exception thrown here
    }
}

As you can see, the “Inside using block” exception is lost entirely. A reference to it isn’t even present and the exception from the Dispose call is the one that gets thrown up.

So, how does this affect you in WCF? Well, when a WCF client is disposed it is closed, which may throw an exception. So if, while using your client object, you encounter an exception that escapes the using block, the client will be disposed and therefore closed, which could throw an exception that will hide the original exception. That’s just bad for debugging.

This is obviously undesirable behaviour, so I’ve written a construct that I’ve dubbed a “Safe Using Block” that stops the exception thrown in the using block from being lost. Instead, the safe using block gathers both exceptions together and throws them up inside an AggregateException (present in DigitallyCreated.Utilities.Bcl, but soon to be found in .NET 4.0) Here’s the above using block rewritten as a safe using block:

new CrashingDisposable().SafeUsingBlock(crashingDisposable =>
{
    throw new Exception("Inside using exception");
});

When this code runs an AggregateException is thrown that contains both the “Inside using exception” exception and the “Dispose” exception. So how does this work? Here’s the code:

public static void SafeUsingBlock<TDisposable>(this TDisposable disposable, Action<TDisposable> action)
    where TDisposable : IDisposable
{
    try
    {
        action(disposable);
    }
    catch (Exception actionException)
    {
        try
        {
            disposable.Dispose();
        }
        catch (Exception disposeException)
        {
            throw new DigitallyCreated.Utilities.Bcl.AggregateException(actionException, disposeException);
        }

        throw;
    }

    disposable.Dispose(); //Let it throw on its own
}

SafeUsingBlock is defined as an extension method off any type that implements IDisposable. It takes an Action delegate, which represents the code to run inside the “using block”. Since the method uses generics, the Action delegate is handed the concrete type of IDisposable you created, not just an abstract IDisposable.

You can use a safe using block in a very similar fashion to a normal using block, except for one key thing: goto-style statements won’t work. So for example, you can’t use return, break, continue, etc inside the safe using block and expect it to affect the method outside. You must keep in mind that you’re writing inside a new anonymous method, not inside a code block.

SafeUsingBlock is part of DigitallyCreated Utilities, however, it is currently only available in the trunk repository, not as a part of a release. Once I’m done working on the documentation, I’ll put it out in a full release.

DigitallyCreated v4.0 Launched

After over a month of work (in between replaying Mass Effect and my part-time work at Onset), I’ve finally finished the first version of DigitallyCreated v4.0. DigitallyCreated has been my website for many many years and has seen three major revisions before this one. However, those revisions were only HTML layout revisions; v3.0 upgraded the site to use a div-based CSS layout instead of a table layout, and v2.0 was so long ago I don’t even remember what it upgraded over v1.0.

Version 4 is DigitallyCreated’s biggest change since its very first version. It has received a brand new look and feel, and instead of being static HTML pages with a touch of 2 second hacked-up PHP thrown in, v4.0 is written from the ground up using C#, .NET Framework 3.5 SP1, and ASP.NET MVC. Obviously, there’s no point re-inventing the wheel (ignoring the irony that this is a custom-made blog site), so DigitallyCreated use a few open-source libraries (.NET ones and JavaScript ones) in its makeup:

  • DigitallyCreated Utilities – a few libraries full of lots of little utilities ranging from MVC HtmlHelpers, to DateTime and TimeZoneInfo utilities, to Entity Framework utilities and LINQ extensions, to a pretty error and exception emailing utility.
  • DotNetOpenAuth – a library that implements the OpenID standard, used for user authentication
  • XML-RPC.NET – a library that allows you to create XML-RPC web services (since WCF doesn’t natively support it)
  • Unity Application Block – a Microsoft open-source dependency injection library (used to decouple the business and UI layers)
  • xVal – a library that enables declarative, attribute-based, server-side & client-side form validation
  • jQuery – the popular JavaScript library, without which JavaScript development is like pulling teeth without anaesthetic
  • jQuery.Validate – an extension to jQuery that makes client-side form validation easier (used by xVal)
  • openid-selector (modified) – the JavaScript that powers the pretty logos on the OpenID sign in page
  • qTip – a jQuery extension that makes doing rich tooltips easier
  • IE6Update – a JavaScript library that encourages viewers still on IE6 to upgrade to a browser that doesn’t make baby Jesus cry
  • SyntaxHighlighter – a JavaScript library that handles the syntax highlighting of any code snippets I use

So what are some of the cool features that make v4.0 so much better? Firstly, I’ve finally got a real blogging system. Using an implementation of the MetaWeblog API (plus some small extras) I can use Windows Live Writer (which is awesome, FYI) to write my blogs and post them up to website. The site automatically generates RSS feeds for almost everything to do with blogs. You can get an RSS feed for all blogs, for all blogs in a particular category or tag, for all comments, for comments on a blog, or for all comments on all blogs in a particular category or tag. Thanks to ASP.NET MVC, that was really easy to do, since most of the feeds are simply a different “view” of the blog data. And finally, the blog system now allows people to comment on blog posts, a long needed feature.

The new blog system makes it so much more convenient for me to blog now. Previously, posting a blog was a chore: write the blog, put the HTML into a new file, tell the hacky PHP rubbish code about the new file (increment the total number of posts), manually update the RSS feed XML, and then upload all the changes via FTP. Now I just write the post in Live Writer and click publish.

The URLs that the blog post use are now slightly search engine optimised (ie. they use a slug, that effectively puts the post title in the URL). I’ve made sure that the old permalinks to blog posts still work for the new site. If you use an old URL, you’ll get permanently redirected to the new URL. Permalinks are a promise. :)

Other than the big blog system, there are some other cool things on the site. One is the fact you can sign in with your OpenID. Doing your own authentication is so 2006 :). From the user’s perspective it means that they don’t need to sign up for a new account just to comment on the blog posts. They can simply log in with their existing OpenID, which could be their Google or Yahoo account (or soon their Windows Live ID). If their OpenID provider supports it (and the user allows it), the site pulls some of their details from their OpenID, like their nickname, email, website, and timezone, so they don’t have to enter their details manually. I use the user’s email address to pull and display their Gravatar when they submit a comment. All the dates and times on the site are adjusted to match the logged in user’s time zone.

Another good thing that is that home page is no longer a waste of space. In v3.0, the home page displayed some out of date website news and other old information. Because uploading blog posts was a manual process, I couldn’t be bothered putting blog snippets on the home page, since that would have meant me doing it manually for every blog post. In v4.0, the site does it for me. It also runs a background thread, polls my Twitter feed, and pulls my tweets for display.

I’ve also trimmed a lot of content. The old DigitallyCreated had a lot of content that was ancient, never updated, and read like it had been written by a 15 year old (it had been!). I cut all that stuff. However, I kept the resume page, and jazzed it up with a bit of JavaScript so that it’s slish and sexy. It no longer shows you my life history back into high school by default. Instead, it shows you what you care about: the most recent stuff. The older things are still there, but are hidden by default and can be shown by clicking “show me more”-style JavaScript links.

Behind the scenes there’s an error reporting system that catches crashes and emails nice pretty exception object dumps to me, so that I know what went wrong, what page it occurred on, and when it happened so that I can fix it. And of course, how can we forget funny error pages.

Of course, all this new stuff is just the beginning of the improvements I’m making to the site. There are more features that I wanted to add, but I had to draw the release line somewhere, so I drew it here. In the future, I’m looking at:

  • Not displaying the full blog posts in the blog post lists. Instead, I’ll have a “show me more”-style link that shows the rest of the post when you click it.
  • A right-hand panel on the blog pages which will contain, in particular, a list of categories and a coloured tag cloud. Currently, the only way to browse tags and categories is to see one used on a post and to click it. That’s rubbish discoverability.
  • “Reply to” in comments, so that people can reply to other comments. I may also consider a peer reviewed comment system where you can “thumbs-up” and “thumbs-down” comments.
  • The ability to associate multiple OpenIDs with the same account here at DigitallyCreated. This could be useful for people who want to change their OpenID, or use a new one.
  • A full file upload system, which can only be used by myself for when I want to upload a file for a friend to download that’s too big for email (like a large installer or something). The files will be able to be secured so that only particular OpenIDs can download them.
  • An in-browser blog editing UI, so that I’m not tied to using Windows Live Writer to write blogs.
  • Blog search. I might fiddle with my own implementation for fun, but search is hard, so I may end up delegating to Google or Bing.

In terms of deployment, I’ve had quite the saga, which is the main reason why the site is late in coming. Originally, I was looking at hosting at either WebHost4Life or DiscountASP.NET. DiscountASP.NET looked totally pro and seemed like it was targeted at developers, unlike a lot of other hosts that target your mum. However, WebHost4Life seemed to provide similar features with more bandwidth and disk space for half the price. Their old site had a few awards on it, which lent it credibility to me (they’ve since changed it and those awards have disappeared). So I went with them over DiscountASP.NET. Unfortunately, their services never worked correctly. I had to talk to support to get signed up, since I didn’t want to transfer my domain over to them (a common need, surely!). I had to talk to support to get FTP working. I had to talk to support when the file manager control panel didn’t work. The last straw for me was when I deployed DigitallyCreated, and I got an AccessViolationException complaining that I probably had corrupt memory. After talking to support, they fixed it for me and declared the site operational. Of course, they didn’t bother to go past the front page (and ignored the broken Twitter feed). All pages other than the front page were broken (seemed like the ASP.NET MVC URL routing wasn’t happening). I talked to support again and they spent the week making it worse, and at the end the front page didn’t even load any more.

So instead I discovered WinHost, who offer better features than WebHost4Life (though with less bandwidth and disk space), but for half the price and without requiring any yearly contract! I signed up with them today, and within a few hours, DigitallyCreated was up and running perfectly. Wonderful! They even offer IIS Manager access to IIS and SQL Server Management Studio access to the database. Thankfully, I risked WebHost4Life knowing that they have a 30-day money back guarantee, so providing they don’t screw me, I’ll get my money back from them.

Now that the v4.0 is up and running, I will start blogging again. I’ve got some good topics and technologies that I’ve been saving throughout my hiatus in January to blog about for when the site went live. DigitallyCreated v4.0 has been a long time coming, and I hope you like it, and now, thanks to comments, you can tell me how much it sucks!

Entity Framework, TransactionScope and MSDTC

Update: Please note that the behaviour described in this article only occurs when using SQL Server 2005. SQL Server 2008 (and .NET 3.5+) can handle multiple connections within a transaction without requiring MSDTC promotion.

I've been tightening up code on a website I'm writing for work, and as such I've been improving the transactional integrity of some of the code that talks to our database (written using Entity Framework). Namely, I've been using TransactionScope to create transactions at specific isolation levels to ensure that no weird concurrency issues can slip in.

TransactionScope is very powerful. It has the ability to maintain your transaction across two (or more) database connections (or at least the SQL Server database code that uses it does) or even after you close the connection to the database. This is done by promoting your transaction up to being a "distributed transaction" that is managed by the Microsoft Distributed Transaction Coordinator (MSDTC) when you start to use multiple connections, or close the connection that you currently have a transaction in. So, essentially, as soon as your transaction becomes something that SQL Server can't handle with its normal transaction, the transaction is palmed off to MSDTC to manage.

An MSDTC transaction comes at a performance price as the transaction is no longer a lightweight transaction managed by SQL Server internally, but a heavyweight MSDTC transaction that is much more powerful. So, if at all possible, we do not want to use MSDTC (unless, of course, we actually need it).

However, there is a funny (or not so funny, when you think about it) behaviour that Entity Framework exhibits that causes your innocent transaction to be promoted to being an MSDTC transaction. Consider this code:

using (TestDBEntitiesContext context = new TestDBEntitiesContext())
{
    using (TransactionScope transaction = new TransactionScope())
    {
        var authors = (from author in context.Authors
                       select author).ToList();

        int count = (from author in context.Authors
                     select author).Count();

        transaction.Complete();
    }
}

This looks like a pretty innocent bit of code and looks like it should not result in transaction promotion. By putting the lifetime of the transaction (its using block) inside the lifetime of the ObjectContext, we ensure that the transaction cannot outlive any connection used by the context and therefore cause a promotion to an MSDTC transaction.

However, this code causes a transaction promotion on line 8. You can see this by ensuring the MSDTC service ("Distributed Transaction Coordinator") is stopped and then waiting for the exception that will be thrown because the SqlConnection is unable to promote the transaction since MSDTC is not running.

Why does this occur? A bit of digging on MSDN comes up with this bit of innocuous documentation:

Promotion of a transaction to a DTC may occur when a connection is closed and reopened within a single transaction. Because Object Services opens and closes the connection automatically, you should consider manually opening and closing the connection to avoid transaction promotion.

By sticking some breakpoints in the code above, we can observe this behaviour in action. The connection (ObjectContext.Connection) is closed by default, is opened quickly for the first query then closed immediately, then opened again for the second query, then closed. This second connection that is opened causes the transaction promotion!

At first glance this seemed to me to be an inefficient way of handling the connection! It's not that uncommon that one would want to do more than one thing with the ObjectContext in sequence and having a connection opened and closed for each query seems really inefficient.

However, upon further thought, I realised the reason why the Entity Framework team does this is probably to cover the use case where you have a long-lived ObjectContext (unlike here where we create it quickly, use it, and then throw it away). If the ObjectContext is going to be around for a long time (perhaps you've got one hanging around supplying data to a WPF dialog) we don't want it to hog a connection for all that time (99% of the time it will be idling waiting for the user!).

However, this "feature" gets in our way when using the ObjectContext in the manner above. To change this behaviour you need to sack the ObjectContext from the connection management job and do it yourself:

using (TestDBEntitiesContext context = new TestDBEntitiesContext())
{
    using (TransactionScope transaction = new TransactionScope())
    {
        context.Connection.Open();

        var authors = (from author in context.Authors
                       select author).ToList();

        int count = (from author in context.Authors
                     select author).Count();

        transaction.Complete();
    }
}

Notice on line 5 above we are now manually opening the connection ourselves. This ensures that the connection will be open for the duration of both our queries and will be closed when the ObjectContext goes out of scope at the end of its using block.

Using this technique we can avoid the accidental promotion of our lightweight database transaction to an heavy MSDTC transaction and thereby scrape back some lost performance.

Speeding up .NET Reflection with Code Generation

One of the bugbears I have with Entity Framework in .NET is that when you use it behind a method in the business layer (think AddAuthor(Author author)) you need to manually wipe non-scalar properties when adding entities. I define non-scalar properties as:

  • EntityCollections: the "many" ends of relationships (think Author.Books)
  • The single end of relationships (think Book.Author)
  • Properties that participate in the Entity Key (primary key properties) (think Author.ID). These need to be wiped as the database will automatically fill them (identity columns in SQL Server)

If you don't wipe the relationship properties when you add the entity object you get an UpdateException when you try to SaveChanges on the ObjectContext.

You can't assume that someone calling your method knows about this Entity Framework behaviour, so I consider it good practice to wipe the properties for them so they don't see unexpected UpdateExceptions floating up from the business layer. Obviously, having to write

author.ID = default(int);
author.Books.Clear();

is no fun (especially on entities that have more properties than this!). So I wrote a method that you can pass an EntityObject to and it will wipe all non-scalar properties for you using reflection. This means that it doesn't need to know about your specific entity type until you call it, so you can use whatever entity classes you have for your project with it. However, calling methods dynamically is really slow. There will probably be lots faster ways of doing it in .NET 4.0 because of the DLR, but for us still back here in 3.5-land it is still slow.

So how can we enjoy the benefits of reflection but without the performance cost? Code generation at runtime, that's how! Instead of using reflection and dynamically calling methods and setting properties, we can instead use reflection once, create a new class at runtime that is able to wipe the non-scalar properties and then reuse this class over and over. How do we do this? With the System.CodeDom API, that's how (maybe in .NET 4.0 you could use the expanded Expression Trees functionality). The CodeDom API allows you to literally write code at runtime and have it compiled and loaded for you.

The following code I will go through is available as a part of DigitallyCreated Utilities open source libraries (see the DigitallyCreated.Utilities.Linq.EfUtil.ClearNonScalarProperties() method).

What I have done is create an interface called IEntityPropertyClearer that has a method that takes an object and wipes the non-scalar properties on it. Classes that implement this interface are able to wipe a single type of entity (EntityType).

public interface IEntityPropertyClearer
{
    Type EntityType { get; }

    void Clear(object entity);
}

I then have a helper abstract class that makes this interface easy to implement by the code generation logic by providing a type-safe method (ClearEntity) to implement and by doing some boilerplate generic magic for the EntityType property and casting in Clear:

public abstract class AbstractEntityPropertyClearer<TEntity> : IEntityPropertyClearer
    where TEntity : EntityObject
{
    public Type EntityType
    {
        get { return typeof(TEntity); }
    }
    
    public void Clear(object entity)
    {
        if (entity is TEntity)
            ClearEntity((TEntity)entity);
    }

    protected abstract void ClearEntity(TEntity entity);
}

So the aim is to create a class at runtime that inherits from AbstractEntityPropertyClearer and implements the ClearEntity method so that it can clear a particular entity type. I have a method that we will now implement:

private static IEntityPropertyClearer GeneratePropertyClearer(EntityObject entity)
{
    Type entityType = entity.GetType();
    ...
}

So the first thing we need to do is to create a "CodeCompileUnit" to put all the code we are generating into:

CodeCompileUnit compileUnit = new CodeCompileUnit();
compileUnit.ReferencedAssemblies.Add(typeof(System.ComponentModel.INotifyPropertyChanging).Assembly.Location); //System.dll
compileUnit.ReferencedAssemblies.Add(typeof(EfUtil).Assembly.Location);
compileUnit.ReferencedAssemblies.Add(typeof(EntityObject).Assembly.Location);
compileUnit.ReferencedAssemblies.Add(entityType.Assembly.Location);

Note that we get the path to the assemblies we want to reference from the actual types that we need. I think this is a good approach as it means we don't need to hardcode the paths to the assemblies (maybe they will move in the future?).

We then need to create the class that will inherit from AbstractEntityPropertyClearer and the namespace in which this class will reside:

//Create the namespace
string namespaceName = typeof(EfUtil).Namespace + ".CodeGen";
CodeNamespace codeGenNamespace = new CodeNamespace(namespaceName);
compileUnit.Namespaces.Add(codeGenNamespace);

//Create the class
string genTypeName = entityType.FullName.Replace('.', '_') + "PropertyClearer";
CodeTypeDeclaration genClass = new CodeTypeDeclaration(genTypeName);
genClass.IsClass = true;
codeGenNamespace.Types.Add(genClass);

Type baseType = typeof(AbstractEntityPropertyClearer<>).MakeGenericType(entityType);
genClass.BaseTypes.Add(new CodeTypeReference(baseType));

The namespace we create is the namespace of the utility class + ".CodeGen". The class's name is the entity's full name (including namespace) where all "."s are replaced with "_"s and PropertyClearer appended to it (this will stop name collision). The class that the generated class will inherit from is AbstractEntityPropertyClearer but with the generic type set to be the type of entity we are dealing with (ie if the method was called with an Author, the type would be AbstractEntityPropertyClearer<Author>).

We now need to create the ClearEntity method that will go inside this class:

CodeMemberMethod clearEntityMethod = new CodeMemberMethod();
genClass.Members.Add(clearEntityMethod);
clearEntityMethod.Name = "ClearEntity";
clearEntityMethod.ReturnType = new CodeTypeReference(typeof(void));
clearEntityMethod.Parameters.Add(new CodeParameterDeclarationExpression(entityType, "entity"));
clearEntityMethod.Attributes = MemberAttributes.Override | MemberAttributes.Family;

Counterintuitively (for C# developers, anyway), "protected" scope is called MemberAttributes.Family in CodeDom.

We now need to find all EntityCollection properties on our entity type so that we can generate code to wipe them. We can do that with a smattering of LINQ against the reflection API:

IEnumerable<PropertyInfo> entityCollections = 
        from property in entityType.GetProperties()
        where
          property.PropertyType.IsGenericType &&
          property.PropertyType.IsGenericTypeDefinition == false &&
          property.PropertyType.GetGenericTypeDefinition() ==
          typeof(EntityCollection<>)
        select property;

We now need to generate statements inside our generated method to call Clear() on each of these properties:

foreach (PropertyInfo propertyInfo in entityCollections)
{
    CodePropertyReferenceExpression propertyReferenceExpression = new CodePropertyReferenceExpression(new CodeArgumentReferenceExpression("entity"), propertyInfo.Name);
    clearEntityMethod.Statements.Add(new CodeMethodInvokeExpression(propertyReferenceExpression, "Clear"));
}

On line 3 above we create a CodePropertyReferenceExpression that refers to the property on the "entity" variable which is an argument that we defined for the generated method. We then add to the method an expression that invokes the Clear method on this property reference (line 4). This will give us statements like entity.Books.Clear() (where entity is an Author).

We now need to find all the single-end-of-a-relationship properties (like book.Author) and entity key properties (like author.ID). Again, we use some LINQ to achieve this:

//Find all single multiplicity relation ends
IEnumerable<PropertyInfo> relationSingleEndProperties = 
        from property in entityType.GetProperties()
        from attribute in property.GetCustomAttributes(typeof(EdmRelationshipNavigationPropertyAttribute), true).Cast<EdmRelationshipNavigationPropertyAttribute>()
        select property;

//Find all entity key properties
IEnumerable<PropertyInfo> idProperties = 
        from property in entityType.GetProperties()
        from attribute in property.GetCustomAttributes(typeof(EdmScalarPropertyAttribute), true).Cast<EdmScalarPropertyAttribute>()
        where attribute.EntityKeyProperty
        select property;

We then can iterate over both these sets in the one foreach loop (by using .Concat) and for each property we will generate a statement that will set it to its default value (using a default(MyType) expression):

//Emit assignments that set the properties to their default value
foreach (PropertyInfo propertyInfo in relationSingleEndProperties.Concat(idProperties))
{
    CodeExpression defaultExpression = new CodeDefaultValueExpression(new CodeTypeReference(propertyInfo.PropertyType));

    CodePropertyReferenceExpression propertyReferenceExpression = new CodePropertyReferenceExpression(new CodeArgumentReferenceExpression("entity"), propertyInfo.Name);
    clearEntityMethod.Statements.Add(new CodeAssignStatement(propertyReferenceExpression, defaultExpression));
}

This will generate statements like author.ID = default(int) and book.Author = default(Author) which are added to the generated method.

Now that we've generated code to wipe out the non-scalar properties on an entity, we need to compile this code. We do this by passing our CodeCompileUnit to a CSharpCodeProvider for compilation:

CSharpCodeProvider provider = new CSharpCodeProvider();
CompilerParameters parameters = new CompilerParameters();
parameters.GenerateInMemory = true;
CompilerResults results = provider.CompileAssemblyFromDom(parameters, compileUnit);

We set GenerateInMemory to true, as we just want these types to be available as long as our App Domain exists. GenerateInMemory causes the types that we generated to be automatically loaded, so we now need to instantiate our new class and return it:

Type type = results.CompiledAssembly.GetType(namespaceName + "." + genTypeName);
return (IEntityPropertyClearer)Activator.CreateInstance(type);

What DigitallyCreated Utilities does is keep a Dictionary of instances of these types in memory. If you call on it with an entity type that it hasn't seen before it will generate a new IEntityPropertyClearer for that type using the method we just created and save it in the dictionary. This means that we can reuse these instances as well as keep track of which entity types we've seen before so we don't try to regenerate the clearer class. The method that does this is the method that you call to get your entities wiped:

public static void ClearNonScalarProperties(EntityObject entity)
{
    IEntityPropertyClearer propertyClearer;

    //_PropertyClearers is a private static readonly IDictionary<Type, IEntityPropertyClearer>
    lock (_PropertyClearers)
    {
        if (_PropertyClearers.ContainsKey(entity.GetType()) == false)
        {
            propertyClearer = GeneratePropertyClearer(entity);
            _PropertyClearers.Add(entity.GetType(), propertyClearer);
        }
        else
            propertyClearer = _PropertyClearers[entity.GetType()];
    }

    propertyClearer.Clear(entity);
}

Note that this is done inside a lock so that this method is thread-safe.

So now the question is: how much faster is doing this using code generation instead of just using reflection and dynamically calling Clear() etc?

I created a small benchmark that creates 10,000 Author objects (Authors have an ID, a Name and an EntityCollection of Books) and then wipes them with ClearNonScalarProperties (which uses the code generation). It does this 100 times. It then does the same thing but this time it uses reflection only. Here are the results:

  Average Standard Deviation
Using Code Generation 466.0ms 39.9ms
Using Reflection Only 1817.5ms 11.6ms

As you can see, when using code generation, the code runs nearly four times as fast compared to when using reflection only. I assume that when an entity has more non-scalar properties than the paltry two that Author has this speed benefit would be even more pronounced.

Even though this code generation logic is harder to write than just doing dynamic method calls using reflection, the results are really worth it in terms of performance. The code for this is up as a part of DigitallyCreated Utilities (it's in trunk and not part of an official release at the time of writing), so if you're interested in seeing it in action go check it out there.

ASP.NET MVC Model Binders + RegExs + LINQ == Awesome

I was working on some ASP.NET MVC code today and I created this neat little solution that uses a custom model binder to automatically read in a bunch of dynamically created form fields and project their data into a set of business entities which were returned by the model binder as a parameter to an action method. The model binder isn't particularly complex, but the way I used regular expressions and LINQ to identify and collate the fields I needed to create the list of entities from was really neat and cool.

I wrote a Javascript-powered form (using jQuery) that allowed the user to add and edit multiple items at the same time, and then submit the set of items in a batch to the server for a save. The entity these items represented looks like this (simplified):

public class CostRangeItem
{
    public short VolumeUpperBound { get; set; }
    public decimal Cost { get; set; }
}

The Javascript code, which I won't go into here as it's not particularly cool (just a bit of jQuery magic), creates form items that look like this:

<input id="VolumeUpperBound:0" name="VolumeUpperBound:0" type="text />
<input id="Cost:0" name="Cost:0" type="text" />

The multiple items are handled by adding more form fields and incrementing the number after the ":" in the id/name. So having fields for two items means you end up with this:

<input id="VolumeUpperBound:0" name="VolumeUpperBound:0" type="text />
<input id="Cost:0" name="Cost:0" type="text" />
<input id="VolumeUpperBound:1" name="VolumeUpperBound:1" type="text />
<input id="Cost:1" name="Cost:1" type="text" />

I chose to use the weird ":n" format instead of a more obvious "[n]" format (like arrays) since using "[" is not valid in an ID attribute and you end up with a page that doesn't validate.

I created a model binder class that looked like this:

private class CostRangeItemBinder : IModelBinder
{
    public object BindModel(ControllerContext controllerContext, ModelBindingContext bindingContext)
    {
        ...
    }
}

To use the model binder on an action method, you use the ModelBinderAttribute to specify the type of model binder you want ASP.NET MVC to bind the parameter with:

public ActionResult Create([ModelBinder(typeof(CostRangeItemBinder))] IList<CostRangeItem> costRangeItems)
{
    ...
}

The first thing I did was create two regular expressions so that I could identify which fields I needed to read in (the VolumeUpperBound ones and the Cost ones). I chose to use regular expressions because they're much simpler to write (once you learn them!) than some manual string hacking code, and I could use them to not only identify which fields I needed to read, but also to extract the number after the ":" so I could link up the appropriate VolumeUpperBound field with the appropriate Cost field to create a CostRangeItem object. These are the regular expressions I used:

private static readonly Regex _VolumeUpperBoundRegex = new Regex(@"^VolumeUpperBound:(\d+)$");
						
private static readonly Regex _CostRegex = new Regex(@"^Cost:(\d+)$");

Both regexs work like this: the ^ at the beginning and the $ at the end means that, in the matching string, there cannot be any characters before or after the characters specified in between those symbols. It effectively anchors the match to the beginning and the end of the string. Inside those symbols, each regex matches their field name and the ":" character. They then define a special "group" using the ()s. Using the group allows me to pull out the sub-part of the regex match defined by this group later on. The \d+ means match one or more digits (0-9), which matches the number after the ":".

Note that I've made the Regex objects static readonly members of my binder class. Keeping a single instance of the regex object, which is immutable, saves .NET from having to recompile a new Regex every time I bind, which I believe is a relatively expensive process.

Then, in the BindModel method, I first make sure that the model binder has been used on the correct type in the action method:

if (bindingContext.ModelType.IsAssignableFrom(typeof(List<CostRangeItem>)) == false)
    return null;

I then use LINQ to perform matches against all the fields in the current page submit (via the BindingContext's ValueProvider IDictionary). I do this using a Select() call, which performs the match and puts the results into an anonymous class. I then filter out any fields that didn't match the regex using a Where() call.

var vubMatches = bindingContext.ValueProvider
                    .Select(kvp => new 
                        {
                            Match = _VolumeUpperBoundRegex.Match(kvp.Key), 
                            Value = kvp.Value.AttemptedValue
                        })
                    .Where(o => o.Match.Success);
					
var costMatches = bindingContext.ValueProvider
                    .Select(kvp => new 
                        {
                            Match = _CostRegex.Match(kvp.Key), 
                            Value = kvp.Value.AttemptedValue
                        })
                    .Where(o => o.Match.Success);

Now I need to link up VolumeUpperBound fields with their associated Cost fields. To do this, I do an inner equijoin using LINQ. I join on the result of that regex group that I created that extracts the number from the field name. This means I join VolumeUpperBound:0 with Cost:0, and VolumeUpperBound:1 with Cost:1 and so on.

List<CostRangeItem> costRangeItems = 
        (from vubMatch in vubMatches
         join costMatch in costMatches 
           on vubMatch.Match.Groups[1].Value equals costMatch.Match.Groups[1].Value
         where ValuesAreValid(vubMatch.Value, costMatch.Value, controllerContext)
         select new CostRangeItem
                   {
                       Cost = Decimal.Parse(costMatch.Value),
                       VolumeUpperBound = Int16.Parse(vubMatch.Value)
                   }
        ).ToList();

return costRangeItems;

I then use a where clause to ensure the values submitted are actually of the correct type, since the Value member is actually a string at this point (I pass them into a ValuesAreValid method, which returns a bool). This method marks the ModelState with an error if it finds a problem. I then use Select to create a CostRangeItem per join and copy in the values from the form fields using an object initialiser. I can then finally return my List of CostRangeItem. The ASP.NET MVC framework ensures that this result is passed to the action method that declared that it wanted to use this model binder.

As you can see, the solution ended up being a neat declarative way of writing a model binder that can pull in and bind multiple objects from a dynamically created form. I didn't have to do any heavy lifting at all; regular expressions and LINQ handled all the munging of the data for me! Super cool stuff.

Entity Framework Compiled Queries Bake in the MergeOption

Updated on 2009/09/26

I've had some unexpected behaviour with change tracking and compiled queries while using .NET 3.5 SP1's Entity Framework. It's something that you likely won't notice and the resultant bug acts like a race condition: it only exhibits itself when things are executed in a particular order. Unfortunately, the Entity Framework documentation doesn't seem to make this behaviour clear in its documentation, which leaves you scratching your head when it actually occurs.

ObjectQuery.MergeOption allows you to control how the Entity Framework handles objects with respect to change tracking. I tend to set it to MergeOption.NoTracking when I want all objects created from a particular LINQ query returned pre-detached from the ObjectContext (very useful in web apps). However, when you want to query some objects, change their details and save the changes back to the DB, you will use MergeOption.AppendOnly, which is the default. This causes the ObjectContext to "remember" your objects and track the changes you make to them so it knows what it needs to save back to the DB when you go SaveChanges() on the ObjectContext.

Let me paint you a scenario. Maybe you use compiled queries to make your Entity Framework code quick as a fox. Maybe you use them to do something simple like get a Product from a database by its ID:

public static readonly Func<AdvWorksEntities, int, Product>
        GetProduct = CompiledQuery.Compile(
            (AdvWorksEntities context, int id) =>
                (from product in context.Product
                 where product.ProductID == id
                 select product).FirstOrDefault());

Maybe you have two methods that call that compiled query (these are rather contrived examples):

public Product GetProductById(int id)
{
    using (AdvWorksEntities context = new AdvWorksEntities())
    {
        context.Product.MergeOption = MergeOption.NoTracking;
        return CompiledQueries.GetProduct(context, id);
    }
}

public void ChangeProductName(int id, string newName)
{
    using (AdvWorksEntities context = new AdvWorksEntities())
    {
        Product product = CompiledQueries.GetProduct(context, id);
        product.Name = newName;
        context.SaveChanges();
    }
}

Nothing looks immediately wrong with that code (well, it didn't to me anyway).

Maybe you've got a user who opens the View Product page. The GetProductById method is called and the GetProduct compiled query is compiled and run. The user then decided the product has a crappy name, and he/she wants to change it. They change the name, and ChangeProductName gets called. The user then sees that the product name has, in fact, not been changed and gets pissed off.

What went wrong? You, the brow-beaten developer, open Visual Studio, start the app, and call ChangeProductName, which causes the GetProduct compiled query to be compiled and run. It works okay, the name is changed succesfully, and you're left scratching your head. You call it again, and it works again! What the hell is going on?

You restart the app and step through exactly what the user did. You call the GetProductById method, then call ChangeProductName. Bam, silent failure! SaveChanges returns successfully but your changes were not saved to the DB! The query is the same, but it's like somehow the MergeOption.NoTracking is coming over from the call to GetProductId. But this doesn't make sense since you're using a whole new ObjectContext! At this point you're cursing Entity Framework and looking at the documentation, which says nothing (to be fair, when you look into the details like I did, it makes sense. But it's just unintuitive and odd, and really needs to be highlighted in the documentation explicitly).

When this situation happened to me, I created a small test app to try and figure out why this was occurring. Here's what I found. The very first use of the GetProduct query causes it to get compiled. The query itself uses the ObjectQuery (which is IQueryable) that you set the MergeOption on. The expression tree for this query is saved and used for every query from then on. And here's the key: the saved expression tree includes the MergeOption that you set. So if you use MergeOption.NoTracking on the very first call to that compiled query, every query done with that compiled query from then on will be a NoTracking query, since the MergeOption is baked into it. It doesn't matter than you pass the compiled query a different ObjectContext with a different ObjectQuery that has a different MergeOption set. It will forevermore be NoTracking.

So what can you do? At this point, it seems that you will need different compiled queries for the different MergeOptions. If you use GetProduct with NoTracking and with AppendOnly, you will now need GetProductWithNoTracking and GetProductWithAppendOnly.

So watch out for this issue; it's really easy to run into because the query's MergeOption is set outside of the compiled query's declaration, therefore it's not obvious that it is actually baked into the compiled query. And it's a pain to discover, because it doesn't crop up unless you execute your methods in a particular order.

Update: I've developed a solution that makes this problem easier to deal with: a class that transparently duplicates a compiled query, allowing you to easily have multiple versions of a compiled query, one per MergeOption. The solution is currently in the DigitallyCreated Utilities repository and the tutorial for it can be found here.