Serializing objects to XML is easy in .NET thanks to the XmlSerializer class, but developers will quickly find that the built-in serializer is limited and not easy to extend. A more flexible approach is needed to support complex serialization needs. Today I’ll show you Fluently-XML’s domain-specific language for configuring serialization behaviors, and I’ll dive (a bit) into how it’s implemented.
Warning: I confess that I am a newb at creating domain-specific languages (DSLs). I may very well have approached the design and implementation of the DSL in completely the wrong way. If so, please feel free to enlighten me in the comments. 🙂
Crafting the DSL
The Fluently-XML project was born out of a real, concrete business need. As such, I had plenty of requirements to help drive the design of the DSL. I started from the top down and worked as if the DSL already existed: I took an object that needed special serialization behavior, wrote code that specified the desired behavior, and created tests to verify the object was serialized correctly. Here’s an example of using the DSL to control how Fluently-XML tracks object identity:
    public class Bar
    {
        public int BarId { get; set; }
        public int CustomId { get; set; }
        public string Name { get; set; }
        public string Value { get; set; }
        public string Label { get; set; }
    }

    public class BarSerialization : FluentSerializationSpec
    {
        /// <summary> </summary>
        public BarSerialization()
        {
            WhenSerializing<Bar>()
                .SerializeAllAncestorsAsThisType();

            WhenDeserializing<Bar>()
                .DetermineIdentityBy(b => b.BarId);
        }
    }
And here’s a corresponding test case:
    [TestFixture]
    public class When_deserializing_bar_array
    {
        ...

        [SetUp]
        public void When()
        {
            var factory = new FluentSerializerFactory(x =>
            {
                x.ApplyConfigFrom<BarSerialization>();
            });

            _serializer = factory.CreateSerializer();
            _deserializer = factory.CreateDeserializer();

            _originalBars = new[]
            {
                new Bar { BarId = 1 },
                new Bar { BarId = 1 },
                new Bar { BarId = 2 }
            };

            _xml = _serializer.Serialize(_originalBars).ToString();
            _deserializedBars = _deserializer.Deserialize<Bar[]>(_xml);
        }

        ...

        [Test]
        public void Deserializing_bars_does_not_create_duplicates()
        {
            Assert.That(_deserializedBars[0], Is.SameAs(_deserializedBars[1]));
        }
    }
Approaching the design of the DSL in a top-down way helped guide me towards a language that was naturally suited to solving my particular problem.
Implementing the DSL – Separating Specification From Execution through CQRS
I approached the implementation of the DSL in a test-driven development manner: I started with a failing test, wrote code that didn’t compile using a non-existent DSL, and began fleshing things out until I had compiling code and a passing test. Early on I decided that the DSL would build up configuration in a way that kept the DSL-related classes simple, and that I’d “compile” that configuration into a form that was better suited to the serialization/deserialization core logic.
This proved to be a wise decision, as it kept the implementation simple and easy to work with. This approach effectively separated the DSL’s “commands” (the “do this when serializing this type”) from the core serialization framework’s “queries” (the “how do I serialize this property on this type?”). If this sounds familiar, it’s probably because you are at least somewhat familiar with Command-Query Responsibility Segregation (CQRS). This design was indeed inspired by CQRS principles as well as ideas I picked up from Jeremy Miller’s blog over the years.
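To make the split between building configuration and querying it a little more concrete, here is a minimal sketch. It is not Fluently-XML's actual implementation: `SerializationConfig`, `CompiledConfig`, `IgnoreProperty`, and `ShouldSerialize` are hypothetical names invented for illustration, and property names are plain strings to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical "command" side: a DSL would call methods like this to record settings.
public class SerializationConfig
{
    private readonly Dictionary<Type, HashSet<string>> _ignored =
        new Dictionary<Type, HashSet<string>>();

    public void IgnoreProperty(Type type, string propertyName)
    {
        HashSet<string> names;
        if (!_ignored.TryGetValue(type, out names))
        {
            names = new HashSet<string>();
            _ignored[type] = names;
        }
        names.Add(propertyName);
    }

    // "Compiles" the collected settings into a separate snapshot for the core to read.
    public CompiledConfig Compile()
    {
        var copy = new Dictionary<Type, HashSet<string>>();
        foreach (var pair in _ignored)
        {
            copy[pair.Key] = new HashSet<string>(pair.Value);
        }
        return new CompiledConfig(copy);
    }
}

// Hypothetical "query" side: the serialization core only asks questions of it.
public class CompiledConfig
{
    private readonly Dictionary<Type, HashSet<string>> _ignored;

    public CompiledConfig(Dictionary<Type, HashSet<string>> ignored)
    {
        _ignored = ignored;
    }

    public bool ShouldSerialize(Type type, string propertyName)
    {
        HashSet<string> names;
        return !(_ignored.TryGetValue(type, out names) && names.Contains(propertyName));
    }
}

public class Bar
{
    public string Name { get; set; }
    public string Label { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var config = new SerializationConfig();
        config.IgnoreProperty(typeof(Bar), "Label");

        CompiledConfig compiled = config.Compile();

        // Changes made after compiling do not leak into the snapshot.
        config.IgnoreProperty(typeof(Bar), "Name");

        Console.WriteLine(compiled.ShouldSerialize(typeof(Bar), "Label")); // False
        Console.WriteLine(compiled.ShouldSerialize(typeof(Bar), "Name"));  // True
    }
}
```

The builder can stay small and mutable because nothing in the serialization core ever touches it directly; the core only sees the compiled snapshot.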
Here’s an example of the DSL implementation. In fact, this is actually one of the most complicated bits in the entire DSL API:
    /// <summary> </summary>
    internal class TypeDeserializationSpec<T> : ITypeDeserializationSpec<T>
    {
        private readonly IDeserializationConfig _config;

        public TypeDeserializationSpec(IDeserializationConfig config)
        {
            _config = config;
        }

        /// <summary> </summary>
        public ITypeDeserializationSpec<T> DetermineIdentityBy(Func<T, object> identitySelector)
        {
            //Selector must be converted for use with the Reflection-based core deserialization process.
            Func<object, object> wrappedSelector = o => identitySelector((T)o);
            _config.SetIdentityFunction(typeof(T), wrappedSelector);
            return this;
        }

        ...
    }
I’m not exaggerating: this one method is probably the most complicated thing in the entire DSL. The tricky bit is the conversion from a generic Func<T,object> to a non-generic Func<object,object>. That’s actually the solution to a problem that took me quite a while to figure out. The DSL has to be generic in order to provide a nice, strongly-typed developer experience. At runtime, however, the serialization framework works with the objects to be serialized and deserialized in a non-generic way, meaning it references everything as if it were a System.Object. The core serialization classes cannot be generic because they need to access properties, methods, etc. at runtime that can’t even be guessed at compile time. I struggled with that problem for quite a while before finding an elegant solution. I’m actually quite proud of that one line of code, even if the solution is trivially simple. 🙂
Ignoring that, the important piece here is how simple the actual DSL implementation is: it’s really just a wrapper that provides a nice way to build up some configuration data that will be “compiled” into serialization/deserialization objects at runtime. I’ll talk more about how the config is handled in the next post.
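The generic-to-non-generic conversion described above can be tried in isolation. The sketch below is a standalone illustration, not Fluently-XML code: `IdentityRegistry`, `Register`, and `GetIdentity` are hypothetical names standing in for the configuration object and the reflection-based core.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a non-generic configuration store.
public class IdentityRegistry
{
    private readonly Dictionary<Type, Func<object, object>> _selectors =
        new Dictionary<Type, Func<object, object>>();

    // Generic entry point: callers write a strongly-typed lambda.
    public void Register<T>(Func<T, object> identitySelector)
    {
        // The cast back to T lives inside the wrapper, so the stored delegate is non-generic.
        _selectors[typeof(T)] = o => identitySelector((T)o);
    }

    // Non-generic lookup, the way code that only sees System.Object would call it.
    public object GetIdentity(object instance)
    {
        return _selectors[instance.GetType()](instance);
    }
}

public class Bar
{
    public int BarId { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var registry = new IdentityRegistry();
        registry.Register<Bar>(b => b.BarId);

        object untyped = new Bar { BarId = 7 };
        Console.WriteLine(registry.GetIdentity(untyped)); // prints 7
    }
}
```

The caller keeps compile-time checking on `b => b.BarId`, while the store itself never needs to know about `Bar`.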
Improving the DSL – Using Interfaces To Limit Methods
One of the goals I had for the DSL was to avoid the “AutoMapper problem.” While I do love AutoMapper, I really don’t care for the DSL it uses to specify custom mapping behavior. It relies on nested lambdas that can look quite ugly:
    Mapper.CreateMap<Widget, WidgetViewModel>()
        .ForMember(dest => dest.OwnerName, opt => opt.MapFrom(src => src.TheOwner.Name));
I wanted to avoid this nesting of lambdas. I accomplished this by having each “token” or operation in the DSL return an interface that exposed only the tokens that were valid at that point. Here’s the Intellisense list you are presented with immediately after a “WhenSerializing” statement:
And again after specifying which property the statement applies to:
At this point, I can either specify additional options on the property, or I can select a different property, so the Intellisense gives me both options:
Note that the “Using” token no longer appears in the list. “Using” is actually a poorly-named method (which I will fix eventually), but its purpose is to allow you to completely override how Fluently-XML serializes a type. As such, the token is only valid as the first (and only) statement for a particular type. Once you’ve specified custom serialization behavior for a specific property, it no longer makes sense to completely override how serialization is to be performed. There are other tokens I need to perform similar filtering on, such as “IgnoreAllProperties”, and I should really filter out methods inherited from Object as well, such as ToString… but I digress.
You might be thinking that all this mess with interfaces must make the DSL’s implementation a nightmare, but it actually doesn’t. There are only two actual classes in the DSL for serialization: one for class-level statements, and another for property-level statements. Each implements all the methods for all the applicable class/property interfaces. It’s only through the return types of each method that the visibility of tokens is controlled. This keeps the implementation quite simple while still providing a clean, filtered API for specifying serialization behavior.
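The return-type trick can be shown with a small sketch. It simplifies things in ways the real library does not: it uses a single class rather than two, takes property names as strings instead of lambdas, and every name in it (`IInitialSpec`, `IPropertySpec`, `SpecBuilder`, `AsAttribute`, `Recorded`) is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Tokens valid at the very start of a statement.
public interface IInitialSpec
{
    void Using(string customSerializerName);
    IPropertySpec Serialize(string propertyName);
}

// Tokens valid once a property has been selected: there is no "Using" here.
public interface IPropertySpec
{
    IPropertySpec AsAttribute();
    IPropertySpec Serialize(string propertyName);
}

// One concrete class implements every token; only the declared
// return types decide which tokens the caller can see next.
public class SpecBuilder : IInitialSpec, IPropertySpec
{
    public List<string> Recorded { get; } = new List<string>();

    public void Using(string customSerializerName)
    {
        Recorded.Add("using:" + customSerializerName);
    }

    public IPropertySpec Serialize(string propertyName)
    {
        Recorded.Add("property:" + propertyName);
        return this; // same object, narrower view
    }

    public IPropertySpec AsAttribute()
    {
        Recorded.Add("attribute");
        return this;
    }
}

public static class Program
{
    public static void Main()
    {
        var builder = new SpecBuilder();
        IInitialSpec start = builder;

        start.Serialize("Name").AsAttribute().Serialize("Value");

        // start.Serialize("Name").Using("custom"); // would not compile: IPropertySpec has no Using

        Console.WriteLine(string.Join(", ", builder.Recorded));
        // prints: property:Name, attribute, property:Value
    }
}
```

Because every method returns `this` typed as an interface, adding a new "state" to the language means adding an interface and adjusting return types, not writing another class.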
There are advantages to AutoMapper’s DSL, namely that an incomplete statement in the DSL won’t even compile. This example wouldn’t even compile as it’s not valid code:
    Mapper.CreateMap<Widget, WidgetViewModel>()
        .ForMember(dest => dest.OwnerName);
However, an incomplete statement in Fluently-XML’s DSL will not generate a compile-time error:
    WhenSerializing<Widget>()
        //Missing a token after property selection!
        .Serialize(w => w.Name);
Indeed, it won’t even generate a runtime error. I have a couple of vague ideas about how to solve this problem, but I doubt it will occur often enough to be worth solving. In the end, I much prefer Fluently-XML’s approach over AutoMapper’s.
Coming Up Next…
I hope this post has shed some light on the design of Fluently-XML’s DSL as well as given you some insight into the tradeoffs one must make when building a DSL. In the next post, I’ll show you how configuration built up from the DSL is converted into objects to perform serialization at runtime.