Moving to WordPress?

I'm not a big fan of PHP, but Blogger seems like a very difficult way to run a blog and, to be honest, I don't even quite know how to work on templates here. So I've decided to move to WordPress soon. I will keep this blog, but eventually it will just be a redirect.

DTOs and why you should be using them

If you've worked on any modern, decent-sized application, you know that the de facto standard is a layered design, where operations are split into layers corresponding to certain functionality: for example, a Data Access Layer that is nothing more than an implementation of your repositories using NHibernate, Entity Framework, etc. While that is a very good idea for most scenarios, it brings a bit of a problem with it: you need to pass a lot of calls between layers, and sometimes that isn't just calling a DLL inside your solution; sometimes it's calling a service hosted somewhere over the network.

The problem

If your app calls services and receives data from them (obviously?), then you might encounter something like this in your service:
public Person AddPerson(string name, string lastName, string email)
Now, let's first look at the parameters and why this is probably not a very good definition. 

In this method, you have three arguments: name, lastName and email. What happens if somebody needs a telephone number? Well, we just add another argument! Dead easy! Yeah, no. Suppose we make it more interesting and say we have Workers and Customers, both inheriting from Person; we would then have something like this:
public Person AddWorker(string name, string lastName, string email)
public Person AddCustomer(string name, string lastName, string email)
If you need to add that telephone number now and go for that extra parameter, you have to change code in two places, so you touch more code. And what happens when we touch more code? Simple: we introduce more bugs.

The Good

Now, what happens if you have this?
public Worker AddWorker(Worker worker)
public Customer AddCustomer(Customer customer)

DTO stands for Data Transfer Object, and that is precisely what these classes do: we use them to transfer data through our services. For one, the code is much easier to read now! But there is another benefit: if Worker and Customer inherit from Person, as they should considering they are both a Person, then we can safely add that telephone number to Person without having to change the signature of the service. Yes, our service will now receive an extra piece of data, but we don't have to change the service signature in the code, just the DTO it receives.
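To make that concrete, here is a sketch of what these DTOs might look like (the Telephone property and the Worker/Customer-specific fields are hypothetical, purely for illustration):
public class Person
{
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
    public string Telephone { get; set; } // added later without touching AddWorker or AddCustomer
}

public class Worker : Person
{
    public string EmployeeNumber { get; set; } // hypothetical worker-only field
}

public class Customer : Person
{
    public string AccountNumber { get; set; } // hypothetical customer-only field
}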

Now, more on the common use for DTOs; as Martin Fowler states, a DTO is

An object that carries data between processes in order to reduce the number of method calls.

Now, it's fairly obvious that using DTOs for input arguments is good, but what about output? Well, it's a similar story, with a small twist. Considering that many people today use ORMs to access the database, it's very likely that you already have a Worker, Customer and Person class, because they are part of your domain model or they are generated by LINQ to SQL (not a huge fan, but many people still use it). So, should you be returning those entities from your services? Not a very good idea, and I have some reasons for it.

One very simple reason is that the objects generated by these frameworks are usually not serialization friendly, because they sit on top of proxy classes that are a pain to serialize into JSON or XML. Another potential problem is when your entity doesn't quite fit the response you want to give; what happens if your service has something like this?
public Salary CalculateWorkerSalary(Worker worker)
You could have a very simple method just returning a double, but let's think of a more convoluted solution to illustrate the point; imagine Salary being like this:
public class Salary
{
    public double FinalSalary { get; }
    public double TaxDeducted { get; }
    public double Overtime { get; }
}
So, this is our class, and Overtime means it's coupled to a specific worker, because not everybody does the same amount of overtime. Now, what happens if we also need the tax code for that salary? Or the overtime rate used in the calculation? (Assuming these are not stored on the salary table.) More importantly, what happens if we don't want whoever is calling the API to see the overtime the Worker is doing? Well, the entity is not fit for purpose and we need a DTO where we can put all of this, simple as that.
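For instance, a hypothetical SalaryDTO for that response could carry exactly what the caller is allowed to see, no more and no less:
// Hypothetical response DTO: adds TaxCode and OvertimeRate,
// and deliberately leaves out the raw Overtime figure.
public class SalaryDTO
{
    public double FinalSalary { get; set; }
    public double TaxDeducted { get; set; }
    public string TaxCode { get; set; }
    public double OvertimeRate { get; set; }
}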

The Bad

However, DTOs are not all glory; there is a problem with them, and it's the fact that they bloat your application, especially if you have a large application with many entities. If that's the case, it's up to you to decide when a DTO is worth it and when it's not; like many things in software design, there is no hard rule and it's very easy to get it wrong. But for most cases where you pass complex data around, you should be using DTOs.

The Ugly

There is another problem with DTOs, and it's the fact that you end up with a lot of code like this:
var query = _workerRepository.GetAll();
var workers = query.Select(ConvertWorkerDTO).ToList();
return workers;
Where ConvertWorkerDTO is just a method looking pretty much like this:
public WorkerDTO ConvertWorkerDTO(Worker worker)
{
    return new WorkerDTO
    {
        Name = worker.Name,
        LastName = worker.LastName,
        Email = worker.Email
    };
}
Wouldn't it be cool if you could do it without a mapping method, like this:
var query = _workerRepository.GetAll();
var workers = query.Select(x => WorkerDTO.BuildFromEntity<Worker, WorkerDTO>(x)).ToList();
return workers;
Happily, there is a simple way to achieve a result like this one, and it's by combining two very powerful tools: inheritance and reflection. Just have a BaseDTO class that all of your DTOs inherit from, and give it a method like that one, which handles the conversion by mapping property to property. A fairly simple, yet fully working, version could be this:
public static TDTO BuildFromEntity<TEntity, TDTO>(TEntity entity)
{
    var dto = Activator.CreateInstance<TDTO>();
    var dtoProperties = typeof(TDTO).GetProperties();
    var entityProperties = typeof(TEntity).GetProperties();

    foreach (var property in dtoProperties)
    {
        // Skip DTO properties we can't write to.
        if (!property.CanWrite)
            continue;

        // Look for an entity property with the same name and type.
        var entityProp =
            entityProperties.FirstOrDefault(x => x.Name == property.Name && x.PropertyType == property.PropertyType);

        if (entityProp == null)
            continue;

        if (!property.PropertyType.IsAssignableFrom(entityProp.PropertyType))
            continue;

        // Copy the value from the entity to the DTO.
        var propertyValue = entityProp.GetValue(entity, new object[] { });
        property.SetValue(dto, propertyValue, new object[] { });
    }

    return dto;
}
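To close the loop, the wiring is just inheritance; something along these lines (WorkerDTO here is the hypothetical DTO from the earlier snippets):
public abstract class BaseDTO
{
    // BuildFromEntity (shown above) lives here, so every DTO inherits it.
}

public class WorkerDTO : BaseDTO
{
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}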

And Finally...

The bottom line is that, like with everything, you can over-engineer your way into adding far too many DTOs to your system, but ignoring them is not a very good solution either; and adding one or two to a project with more than 15 entities just to feel you're using them is about as good as using a single interface just so you can say you build decoupled systems.

What's your view on this? Do you agree? Disagree? Share what you think on the comments!

EDIT: As a side note, it's worth checking this article, which talks a lot about the subject.

Empower your lambdas!

If you’ve used generic repositories, you will encounter one particular problem, matching items using dynamic property names isn't easy. However, using generic repositories has always been a must for me, as it saves me having to write a lot of boilerplate code for saving, updating and so forth. Not long ago, I had a problem, I was fetching entities from a web service and writing them to the database and given that these entities had relationships, I couldn’t retrieve the same entity and save it twice, so I had a problem.
Whenever my code fetched entities from the service, it had to work out whether an entity had been loaded previously and, instead of saving it twice, just update the last-updated time and any properties that might have changed. To begin with, I had some simple code in a base web service consumer class like this:
var client = ServiceUtils.CreateClient();
var request = ServiceUtils.CreateRequest(requestUrl);
var resp = client.ExecuteAsGet(request, "GET");
var allItems = JsonConvert.DeserializeObject<List<T>>(resp.Content);
This was all very nice and, so far, very generic (using DeserializeObject<T>). However, I had to check whether an item had been fetched before, and an item's identity could be determined by one or more properties; my internal Id was meaningless in this context for deciding whether an object already existed. So I had to come up with another approach. I created a basic attribute called IdentityProperty: whenever a property defined the identity of an object externally, I would annotate it with it, so I ended up with entities like this:
public class Person : Entity
{
    [IdentityProperty]
    public string PassportNumber { get; set; }
    [IdentityProperty]
    public string SocialSecurityNumber { get; set; }

    public string Name { get; set; }
}
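The attribute itself is just a marker with no behaviour; a minimal sketch of it could be:
// Marker attribute: flags the properties that define an entity's
// identity as far as the external service is concerned.
[AttributeUsage(AttributeTargets.Property)]
public class IdentityPropertyAttribute : Attribute
{
}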
This marks all the properties that define identity in the context of the web services. So far, so good: my entities now know what defines them in the domain, and I need my generic service consumer to find them in the database so I don't get duplicates. Now, considering that all my entities fetched from a web service have a LastCached and a Timeout property, ideally I would have something like this:
foreach (var item in allItems)
{
    var calculatedLambda = CalculateLambdaMatchingEntity(item);
    var match = repository.FindBy(calculatedLambda);

    if (match == null)
    {
        // Never seen before: cache it for the first time.
        item.LastCached = DateTime.Now;
        item.Timeout = cacheControl;
    }
    else
    {
        // Already cached: only refresh it if the cache entry has expired.
        var timeout = match.LastCached.AddSeconds(match.Timeout);
        if (DateTime.Now > timeout)
        {
            // Update entity using reflection
            item.LastCached = DateTime.Now;
        }
    }
}
Well, actually, this is what I have, but the good stuff is in the CalculateLambdaMatchingEntity method. The idea behind that method is to build a lambda to be passed to FindBy, using only the properties that carry the IdentityProperty attribute. So, my method looks like this:
private Expression<Func<T, bool>> CalculateLambdaMatchingEntity<T>(T entityToMatch)
{
    var properties = typeof(T).GetProperties();
    var expressionParameter = Expression.Parameter(typeof(T));
    Expression resultingFilter = null;

    foreach (var propertyInfo in properties)
    {
        var hasIdentityAttribute = propertyInfo.GetCustomAttributes(typeof(IdentityPropertyAttribute), false).Any();

        // Only properties marked with [IdentityProperty] take part in the match.
        if (!hasIdentityAttribute)
            continue;

        // Build "entity.Property == currentValue" for this property.
        var propertyCall = Expression.Property(expressionParameter, propertyInfo);
        var currentValue = propertyInfo.GetValue(entityToMatch, new object[] { });
        var comparisonExpression = Expression.Constant(currentValue, propertyInfo.PropertyType);
        var component = Expression.Equal(propertyCall, comparisonExpression);

        // AND the comparisons together as we go.
        if (resultingFilter == null)
            resultingFilter = component;
        else
            resultingFilter = Expression.AndAlso(resultingFilter, component);
    }

    if (resultingFilter == null)
        return null; // no identity properties: nothing to match on

    // Wrap the combined body in a single lambda: x => x.Prop1 == v1 && x.Prop2 == v2 ...
    return Expression.Lambda<Func<T, bool>>(resultingFilter, expressionParameter);
}
Fancy code apart, what this does is just iterate through the properties of the object and construct a lambda matching the object received as a sample. So, for our sample class Person, if our service retrieves a person with passport number "SAMPLE" and social security number "ANOTHER", the generated lambda would be the equivalent of issuing a query like

repository.FindBy(person => person.PassportNumber == "SAMPLE" && person.SocialSecurityNumber == "ANOTHER")

Performance you say?

If you've read the about section of my blog, you'll know that I work for a company that cares about performance, so once I did this, I knew the next step was benchmarking the process. It doesn't matter that it was for a personal project; I had to know that the performance made it a viable idea. So I ended up doing a set of basic tests benchmarking the total time the update foreach would take, and I came up with these results:
Scenario                 Matching data   Ticks     Faster?
Lambda calculation       Yes             5570318   Yes
No lambda calculation    Yes             7870450
Lambda calculation       No              1780102   No
No lambda calculation    No              1660095
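For reference, the ticks above come from timing the whole update loop; the measurement was along these lines (a simplified sketch, assuming a System.Diagnostics.Stopwatch around the loop shown earlier):
var stopwatch = Stopwatch.StartNew();

foreach (var item in allItems)
{
    // the match-and-update logic shown earlier
}

stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedTicks);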
These results are actually quite simple to explain. When there is no matching data, the overhead of building the lambda makes it lose its edge, because no items match the query anyway. However, when there are matching items, the power of lambdas shows up: the query doesn't have to be built from scratch each time, it receives a previously built expression tree, so it's faster to execute. So, back to the original title: empower your lambdas!
If you have another point of view on any of this, feel free to leave a comment, even if it's to prove me wrong; nobody knows everything, so I might well be mistaken here. On the other hand, if this helps, then my job here is done.