Friday, June 27, 2014

How CopyOnWriteArrayList works internally in Java

CopyOnWriteArrayList | ArrayList Java
In this article, we will see one of the most important concurrent collection class CopyOnWriteArrayList and we also look into the working of CopyOnWriteArrayList class including its comparison with ArrayList and in which situation CopyOnWriteArrayList should be preferred over ArrayList.


Few Drawbacks of ArrayList
  • If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.Means whenever there is access or modification we need synchronization that is a overhead.
  • The iterator will throw a ConcurrentModificationException


What is copy-on-write concept
Copy-on-write is an optimization strategy used in computer programming. Copy-on-write stems from the understanding that when multiple separate tasks use identical copies of the same information, it is not necessary to create separate copies of that information for each process, instead they can all be given pointers to the same resource. When there are many separate processes all using the same resource it is possible to make significant resource savings by sharing resources this way. However, when a local copy has been modified, the copy-on-write paradigm has no provision that the shared resource has in the meantime not been updated by another task or tasks.
So Copy-on-write is therefore only suggested if only the latest update is important and occasional use of a slightly stale value is not harmful. Copy-on-write is the name given to the process of identifying when a task attempts to make a change to the shared information, creating a separate (private) copy of that information for the task and redirecting the task to making changes to the private copy to prevent its changes from becoming visible to all the other tasks


CopyOnWriteArrayList
As the name suggest, "copy on write" means whenever there is write operation such as add, set and so on. it copies the underlying array. So it is a thread-safe variant of ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array. So basic idea is whenever we add or remove to the CopyOnWriteArrayList, the underlying array is copied with the modification. Remember this point, we'll see shortly in the code how CopyOnWriteArrayList do this.
It means, whenever there is modification done by thread that update the ArrayList, all other threads holding an older copy of different array. This is ordinarily too costly, but may be more efficient than alternatives when traversal operations vastly outnumber mutations, and is useful when you cannot or don't want to synchronize traversals, yet need to preclude interference among concurrent threads.

This completes our basic understanding what is the need CopyOnWriteArrayList compared to ArrayList. Now we'll look into the code of CopyOnWriteArrayList to see how it implements the same concept.


Working of CopyOnWriteArrayList



If you see the underlying array reference is marked as volatile so that readers do not need to use a lock to see changes to the referenced array which means array update is an atomic operation and hence reads will always see the array in a consistent state. When a write operation occurs this volatile reference is only updated in the final statement via setArray() method. Up until this point any read operations will return elements from the old copy of the array. 
Also, the write lock is required to prevent concurrent modification, which may result the array holding inconsistent data or changes being lost

Advantage of volatile
Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change. Here volatile only applies to the array reference itself, not to the content of the array.

add(E e) method


If you notice in the above code, lock is acquired before adding the element in line 415 and getArray() is called which returned the underlying array which is declared volatile. After copying and modification the array is updated using the setArray() method, finally release the lock. This same concept applies to all the modification operations.

Iterator of CopyOnWriteArrayList
Iterator use "snapshot" style method uses a reference to the state of the array at the point that the iterator was created. Iterator of CopyOnWriteArrayList is fail-safe and doesn't throw ConcurrentModificationException even if underlying CopyOnWriteArrayList is modified once iteration begins because iterator is operating on separate copy of ArrayList. Consequently all the updates made on CopyOnWriteArrayList is not available to iterator.

How to get the most updated version of array
To get the most updated version do a new read like list.iterator(). Let's have a look to understand it more


In the above code, if you notice it call the getArray() method to return the iterator which always return the updated version of array. COWIterator implements the ListIterator.

How ArrayList throw ConcurrentModificationException
First, we will see the code of iterator of both ArrayList and CopyOnWriteArrayList from that we analyzed how ArrayList throw exception.

CopyOnWriteArrayList  Iterator




ArrayList Iterator








Here modCount is the ArrayList variable that holds the modification count and every time we use add, remove or trimToSize method, it increments. expectedModCount is the iterator variable that is initialized when we create iterator with same value as modCount. This explains why we don’t get exception if we use set method to replace any existing element. So basically iterator throws ConcurrentModificationException if list size is changed. Which is not the case with next() function of CopyOnWriteArrayList because we have the copy of array and no other thread will modify the same array.
Sometimes we want to add or remove elements from the list if we find some specific element, in that case we should use Concurrent Collection class – CopyOnWriteArrayList. 

When CopyOnWriteArrayList is preferred
A CopyOnWriteArrayList is preferable to a synchronized ArrayList when the expected number of reads and traversals greatly outnumber the number of updates to a list.


Examples of CopyOnWriteArrayList and ArrayList













If you know anyone who has started learning Java, why not help them out! Just share this post with them. Thanks for studying today!...

13 comments:

  1. Very well written article. Appreciated. Thank you.

    ReplyDelete
  2. Iterator of CopyOnWriteArrayList can not perform remove operation otherwise we get Run-time exception saying UnsupportedOperationException.last program is incorrect

    ReplyDelete
    Replies
    1. The point which you made that remove cannot be called on iterator of CopyOnWriteArrayList is correct. But, in the last program remove is being called on the list itself. Not on the iterator of the list. So, the program is correct.

      Delete
  3. Hi,

    Very nice material. Thanks for that.

    Regarding this part:

    "Iterator use "snapshot" style method uses a reference to the state of the array at the point that the iterator was created. Iterator of CopyOnWriteArrayList is fail-safe and doesn't throw ConcurrentModificationException even if underlying CopyOnWriteArrayList is modified once iteration begins because iterator is operating on separate copy of ArrayList. Consequently all the updates made on CopyOnWriteArrayList is not available to iterator."

    I've one question: I don't get how could this be possible. The `snapshot` still refers to the volatile variable in the main memory. And when the value is updated using setArray(), this snaphot could still possible see this updated state, right?

    ReplyDelete
  4. This was nice and amazing and the given contents were very useful and the precision has given here is good.

    Data Science Training In Bangalore

    ReplyDelete
  5. Superb Blog post.Thanks for sharing it with us. It was really interesting.
    Java Course in Nagpur

    ReplyDelete
  6. Thank you for sharing this! I always appreciate engaging with such outstanding content that provides valuable information. The presented ideas are exceptional and innovative, making the post thoroughly enjoyable. Keep up the fantastic work.
    You can also reed- Java vs. Other Programming Languages: Which Course Should You Take?

    ReplyDelete