How deep is your clone?

July 30, 2008 at 12:35 PMAndre Loker

Someone asked in a forum:

Why is there no ICloneable<T> only ICloneable?

Good question. It wouldn't have taken too much effort to introduce a generic version when .NET 2.0 came out. But I think MS had a good reason not to introduce ICloneable<T>. And that's because they realized that ICloneable sucks in the first place! Why is that? Because it is effectively undefined what cloning does.

Let's have a look at the MSDN library. It has this to say about ICloneable:

Supports cloning, which creates a new instance of a class with the same value as an existing instance.

And this about ICloneable.Clone:

Creates a new object that is a copy of the current instance.

...

Clone can be implemented either as a deep copy or a shallow copy. In a deep copy, all objects are duplicated; whereas, in a shallow copy, only the top-level objects are duplicated and the lower levels contain references.

All right, doesn't sound too bad, does it. There are two interesting points, though:

Point 1: Clone returns a new object. Really? Not necessarily. System.String implements ICloneable.Clone like this:

   1: public object Clone(){
   2:   return this;
   3: }

Not necessarily problematic, as strings are immutable, but still explicitly against the documentation of ICloneable.Clone.

Point 2: shallow vs. deep copy. This is hell, trust me. Any implementor of ICloneable is free to choose "how deeply" it copies itself. This can give a multitude of different meanings to Clone().

Let us have a look at an example of shallow copy. To implement Clone as a shallow copy method create a new instance of the class and set all instance variables to the value of the original class. This is in fact what Object.MemberwiseClone() does, so let's just use that:

   1: class Place {
   2:   public string Name { get; set; }
   3:   public string Postcode { get; set; }
   4: }
   5:  
   6: class Address {
   7:   public string Street { get; set; }
   8:   public string HouseNumber { get; set; } // string to support '23b'
   9:   public Place Place { get; set; }
  10: }
  11:  
  12: class Order : ICloneable {
  13:   public Address ShippingAddress { get; set; }
  14:  
  15:   public object Clone() {
  16:     // shallow copy
  17:     return MemberwiseClone();
  18:   }
  19: }

It's easy to do. MemberwiseClone() creates a new instance of Order. The returned instance uses the same Address object as the original Order. That's fine until someone does this:

   1: Order order = // order from database 
   2: Order similarOrder = (Order) order.Clone();
   3: similarOrder.ShippingAddress.Street = "Somewhere";
   4: similarOrder.ShippingAddress.HouseNumber = "12c";

By changing the Address instance of similarOrder (which is the same as order.Address) we changed the shipping address of the original order. Whoops. Might be better to do a deep copy! Here's the modified code that does a deep copy:

   1: class Place : ICloneable {
   2:   public string Name { get; set; }
   3:   public string Postcode { get; set; }
   4:  
   5:   public object Clone() {
   6:     // shallow copy is enough here
   7:     return MemberwiseClone();
   8:   }
   9: }
  10:  
  11: class Address : ICloneable {
  12:   public string Street { get; set; }
  13:   public string HouseNumber { get; set; } // string to support '23b'
  14:   public Place Place { get; set; }
  15:  
  16:   public object Clone() {
  17:     return new Address() {
  18:       Street = Street,
  19:       HouseNumber = HouseNumber,
  20:       Place = (Place) Place.Clone()
  21:     };
  22:   }
  23: }
  24:  
  25: class Order : ICloneable {
  26:   public Address ShippingAddress { get; set; }
  27:  
  28:   public object Clone() {
  29:     // deep copy
  30:     return new Order {
  31:       ShippingAddress = (Address)ShippingAddress.Clone()
  32:     };
  33:   }
  34: }

Now we are on the safe side. We can mess with the address of a cloned order anyway we like without affecting the original order. But wait... all of a sudden we realize that the postcode of the address was wrong, so we fix that:

   1: Order order = // order from database 
   2: Order similarOrder = (Order)order.Clone();
   3: similarOrder.ShippingAddress.Place.Postcode = "1234";

But now we have a new problem: by doing a deep copy we duplicated the Place instance as well. If we change the postcode in one of the instances, it won't affect the other one - but it should! So what we actually need in this case is a semi-deep copy. Some parts of the object graph have to be copied deeply (the Address), some parts need a shallow copy (the Place). There's clearly no general pattern in this.

While this example is a bit made up you might find such situations in practice. Sometimes you won't have any chance to avoid it. But you see that it can get complicated. More complicated than a single one-size-fits-all interface like ICloneable could handle. In general "cloning" an object is by far not as transparent as ICloneable.Clone might suggest. If you need some sort of copying function, go ahead. Give it a clear name and implement it in a reasonable way. But don't implement ICloneable as it can rise false assumptions.

MS probably realized this problem. They did not want to advertise "general purpose cloning" by introducing another ICloneable interface which would make cloning even more convenient for the user.

Posted in: Patterns

Tags: , ,

Comments (3) -

Excellent post!  I've seen this first hand -- is it a shallow, deep, semi-deep, semi-shallow-but-kinda-deep, etc?

I think a lot of folks (myself included) have migrated to a "copy constructor" pattern for creating clones.  The logic for defining how to create a new instance of a class is then prominent.

Order original = new Order();
Order clone = new Order(original);
etc.


To clone every type of entity.

In your base class:

public TEntity DeepClone()
{
  using(MemoryStream memoryStream = new MemoryStream())
  {
    IFormatter formatter = new BinaryFormatter();
    formatter.Serialize(memoryStream, this);
    memoryStream.Seek(0, SeekOrigin.Begin);
    return formatter.Deserialize(memoryStream) as TEntity;
  }
}

Andre Loker
Germany Andre Loker says:

@Matt: while this does not completely remove the need to clearly document which parts are shallow copied and which parts deeply copied, your approach will most likely let the user think more about what they do.

@Martin: this will certainly create a deep copy as long as every object in the graph is serializable. It could lead to some strange side effects, though, if you have objects in the graph that should not exist more than once (singleton instances).

Pingbacks and trackbacks (1)+