Monday, September 30, 2013

Semaphore vs CountDownLatch in java


Before looking at the difference between Semaphore and CountDownLatch, I recommend going through the following posts; the difference is easier to understand once each is explained with detailed examples.


CountDownLatch with examples
Semaphore with examples



CountDownLatch and Semaphore are used in different scenarios, and the main differences are:


  • CountDownLatch is used when one thread must wait until a set of other threads completes its work. For instance, before starting the main service you may want to finish initialization tasks such as setting up the database connection and starting the logging service. Once the count reaches zero, the latch cannot be reset.
  • Semaphore is used to control the number of concurrent threads that are using a resource. That resource can be something like shared data or a file. The count on a Semaphore can go up and down as different threads call acquire() and release().
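To make the contrast concrete, here is a small hedged sketch (class and variable names are my own) showing a one-shot latch next to a reusable semaphore:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

public class LatchVsSemaphoreDemo {
    public static void main(String[] args) throws InterruptedException {
        // CountDownLatch: a one-shot gate that opens after 3 init tasks finish.
        CountDownLatch initDone = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            new Thread(initDone::countDown).start(); // e.g. db, logging, cache
        }
        initDone.await();                // main service waits for all three
        System.out.println("All init services are up");

        // Semaphore: a reusable counter limiting concurrent access.
        Semaphore permits = new Semaphore(2);
        permits.acquire();               // counter goes down...
        permits.release();               // ...and back up; a latch cannot be reset
        System.out.println("Available permits: " + permits.availablePermits());
    }
}
```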






Related Post
Daemon Thread in java with demo example
Concept of Upcasting and downcasting in java with example
Why java doesn't support operator overloading
When to use ConcurrentHashMap



Thanks for reading the post!
Please share or comment, as it encourages me to write more :)



Friday, September 27, 2013

Semaphore in Java : Concurrency

In this post we'll look at one of the synchronizers in the java.util.concurrent library: Semaphore. We'll use it in the simplest way possible so that it is easy to remember and understand.

What is Semaphore?




A semaphore controls access to shared resources. It maintains a counter that specifies the number of resources it controls.
Access to a resource is allowed while the counter is greater than zero; a counter of zero means no resource is available at the moment, so access is denied. In short, a semaphore grants n-member access to a resource.






A semaphore acts as a limiter of available resource pool depth; for example, a semaphore with a capacity of 10 allows a maximum of 10 threads to acquire it at once, and any further threads that attempt to acquire it will block until one of the other threads releases it.

This is somewhat different from ordinary mutual exclusion or monitor locking, which is typically used to prevent more than one thread from simultaneously modifying the same variables and causing inconsistent results or program state.

Synchronization vs Semaphore
Synchronized allows only one thread of execution to access the resource at the same time. Semaphore allows up to n threads of execution to access the resource at the same time.


Two basic methods are used for acquiring and releasing resources from a semaphore:
  • acquire() : decrements the counter
  • release() : increments the counter

If a thread calls acquire() and the counter is zero (i.e., resources are unavailable), the thread waits until the counter is non-zero and then gets the resource for use. Once the thread is done using the resource, it calls release() to increment the resource availability counter.

Note: if the number of permits is 1, then at a given time only one thread can access the resource; in this case, using the semaphore is similar to using a lock.

A semaphore initialized to one, and used such that it only has at most one permit available, can serve as a mutual exclusion lock. This is more commonly known as a binary semaphore, because it only has two states.
Semaphores are often used to restrict the number of threads that can access some (physical or logical) resource.

  • Semaphore(int permits) : This constructor creates a Semaphore with the given number of permits, where permits is the number of threads that can access the resource at a time.
  • void acquire() : Acquires a permit if one is available; otherwise it blocks until a permit becomes available (decrements the counter).
  • void release() : Returns a permit to the semaphore (increments the counter).





Example: First we'll see an example of a binary semaphore. To show how to use it, we'll implement a print queue that concurrent tasks can use to print their jobs. This print queue will be protected by a binary semaphore, so only one thread can print at a time.
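Since the original code listing is not reproduced here, the following is a minimal sketch of such a print queue (the job-duration logic is illustrative):

```java
import java.util.concurrent.Semaphore;

class PrintQueue {
    private final Semaphore semaphore = new Semaphore(1); // binary semaphore

    public void printJob(Object document) {
        try {
            semaphore.acquire();               // step 1: acquire the permit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        try {
            long duration = (long) (Math.random() * 100);
            System.out.printf("%s: printing a job for %d ms%n",
                    Thread.currentThread().getName(), duration);
            Thread.sleep(duration);            // step 2: use the shared printer
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            semaphore.release();               // step 3: release the permit
        }
    }
}

public class BinarySemaphoreDemo {
    public static void main(String[] args) {
        PrintQueue queue = new PrintQueue();
        for (int i = 0; i < 5; i++) {
            new Thread(() -> queue.printJob(new Object()), "Job-" + i).start();
        }
    }
}
```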


How it works...
The key to this example is in the printJob() method of the PrintQueue class. This method shows the three steps you must follow when you use a semaphore to implement a critical section, and protect the access to a shared resource:

  • First, you acquire the semaphore, with the acquire() method.
  • Then, you do the necessary operations with the shared resource.
  •  Finally, release the semaphore with the release() method.

Another important point in this example is the constructor of the PrintQueue class and the initialization of the Semaphore object. You pass the value 1 as the parameter of this constructor, so you are creating a binary semaphore. The initial value of the internal counter is 1, so you will protect the access to one shared resource, in this case, the print queue.

The Semaphore class has two additional versions of the acquire() method:

  • acquireUninterruptibly(): Like acquire(), this method blocks the thread when the semaphore's internal counter is 0 until the semaphore is released. But while a thread blocked in acquire() can be interrupted, making that method throw an InterruptedException, this version ignores interruption and doesn't throw any exception.
  • tryAcquire(): This method tries to acquire the semaphore. If it can, it returns true. If it can't, it returns false immediately instead of blocking until the semaphore is released. It's your responsibility to take the correct action based on the return value.

Fairness in semaphores
The concept of fairness is used by the Java language in all classes that can have various threads blocked waiting for the release of a synchronization resource (for example, a semaphore). The default mode is the non-fair mode: when the synchronization resource is released, one of the waiting threads is selected to get it, without any particular criteria. The fair mode changes this behavior and forces the selection of the thread that has been waiting the longest.
As with other classes, the Semaphore class accepts a second, boolean parameter in its constructor. Passing false creates a semaphore that works in non-fair mode, the same behavior you get when you omit the parameter; passing true creates a semaphore that works in fair mode.
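A quick sketch of the two constructor forms (using the real isFair() accessor):

```java
import java.util.concurrent.Semaphore;

public class FairSemaphoreDemo {
    public static void main(String[] args) {
        Semaphore nonFair = new Semaphore(1);     // same as new Semaphore(1, false)
        Semaphore fair = new Semaphore(1, true);  // waiting threads served FIFO
        System.out.println("nonFair.isFair() = " + nonFair.isFair()); // false
        System.out.println("fair.isFair()    = " + fair.isFair());    // true
    }
}
```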


Controlling Concurrent access to multiple copies of a resource 
The previous example used binary semaphores. Those semaphores protect access to a single shared resource, or to a critical section that can be executed by only one thread at a time. But semaphores can also be used when you need to protect several copies of a resource, or a critical section that can be executed by more than one thread at the same time.
In this example, you will learn how to use a semaphore to protect more than one copy of a resource: a print queue that can print documents on three different printers.
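A hedged sketch of the multi-printer queue (named MultiPrintQueue here to avoid clashing with the binary example; the printer-tracking logic is illustrative):

```java
import java.util.concurrent.Semaphore;

class MultiPrintQueue {
    private final Semaphore semaphore = new Semaphore(3); // one permit per printer
    private final boolean[] freePrinters = {true, true, true};

    public void printJob(Object document) {
        try {
            semaphore.acquire();           // blocks when all 3 printers are busy
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        int printer = getPrinter();
        try {
            System.out.printf("%s: printing on printer %d%n",
                    Thread.currentThread().getName(), printer);
            Thread.sleep(100);             // simulate the print job
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            synchronized (freePrinters) {
                freePrinters[printer] = true;  // hand the printer back
            }
            semaphore.release();
        }
    }

    private int getPrinter() {
        synchronized (freePrinters) {
            for (int i = 0; i < freePrinters.length; i++) {
                if (freePrinters[i]) {
                    freePrinters[i] = false;
                    return i;
                }
            }
        }
        return -1; // unreachable: the semaphore admits at most 3 threads
    }
}

public class MultiPrinterDemo {
    public static void main(String[] args) {
        MultiPrintQueue queue = new MultiPrintQueue();
        for (int i = 0; i < 6; i++) {
            new Thread(() -> queue.printJob(new Object()), "Thread-" + i).start();
        }
    }
}
```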

How it works...
The key to this example is in the PrintQueue class. The Semaphore object is created with 3 as the constructor parameter. The first three threads that call the acquire() method get access to the critical section, while the rest are blocked. When a thread finishes the critical section and releases the semaphore, another thread acquires it.
In the critical section, the thread gets the index of the printer assigned to print its job. This part adds realism to the example, but it doesn't use any code related to semaphores.

The acquire(), acquireUninterruptibly(), tryAcquire(), and release() methods have an additional version that takes an int parameter. This parameter represents the number of permits the calling thread wants to acquire or release, that is, the number of units to subtract from or add to the semaphore's internal counter. In the case of the acquire(), acquireUninterruptibly(), and tryAcquire() methods, if the internal counter holds fewer permits than requested, the thread blocks until the counter reaches that value or greater.




If you know anyone who has started learning Java, why not help them out! Just share this post with them. Thanks for studying today!...

Tuesday, September 24, 2013

Java Collection : ArrayList, Vector, LinkedList, HashSet, LinkedHashSet, TreeSet, HashMap, Hashtable, LinkedHashMap, TreeMap

In this post, we'll dig into the basic collections that we use in day-to-day Java programming. I hope this post will help you understand and remember the concepts of the collections framework.

What is Collection?
In simple terms, a collection is a container: an object that groups multiple elements into a single unit.
Think of a Collection as a bag. Typically, collections represent data items that form a natural group, such as a collection of cards, a collection of letters, or a mapping of names to phone numbers.
  • Like C++'s Standard Template Library (STL)
  • Can grow as necessary
  • Contain only Objects
  • Heterogeneous
  • Can be made thread safe
  • Can be made not-modifiable
There are really three overloaded uses of the word "collection"
  1. collection (lowercase c), which represents any of the data structures in which objects are stored and iterated over.
  2. Collection (capital C), which is actually the java.util.Collection interface from which Set, List, and Queue extend. (That's right, extend, not implement. There are no direct implementations of Collection.)
  3. Collections (capital C and ends with s) is the java.util.Collections class that holds a pile of static utility methods for use with collections

The Collections Framework in the java.util package is loaded with interfaces and utilities.

What Is a Collections Framework?
A collections framework is a unified architecture for representing and manipulating collections. All collections frameworks contain the following:
  • Interfaces  Represent different types of collections, such as sets, lists, and maps. These interfaces form the basis of the framework.
  • Implementations  Primary implementations of the collection interfaces.
  • Algorithms Static methods that perform useful functions on collections, such as sorting a list.

Advantages of a collections framework
  • A usable set of collection interfaces : By implementing one of the basic interfaces -- Collection, Set, List, or Map -- you ensure your class conforms to a common API and becomes more regular and easily understood.
  • A basic set of collection implementations : Using an existing, common implementation makes your code shorter and quicker to download. Also, using existing core Java code ensures that any improvements to the base code will also improve the performance of your code.
  • Reduces programming effort by providing data structures and algorithms so you don't have to write them yourself.
  • Provides interoperability between unrelated APIs by establishing a common language to pass collections back and forth.
  • Reduces the effort required to learn APIs by eliminating the need to learn multiple ad hoc collection APIs.
  • Reduces the effort required to design and implement APIs by not requiring you to produce ad hoc collections APIs.

Collection is an interface with declarations of the methods common to most collections including add(), remove(), contains(), size(), and iterator().


Key Interfaces and Classes of the Collections Framework


Key Classes of the Collections Framework


Collections come in four basic flavors
  • Lists Lists of things (classes that implement List)
  • Sets Unique things (classes that implement Set).
  • Maps Things with a unique ID (classes that implement Map).
  • Queues Things arranged by the order in which they are to be processed.




List interface

A List cares about the index.

An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.








  • Lists typically allow duplicate elements
  • Access to elements via indexes, like arrays: add(int, Object), get(int), remove(int), set(int, Object)
  • Search for elements : indexOf(Object), lastIndexOf(Object)
  • Specialized iterator, called ListIterator
  • Extraction of a sublist : subList(int fromIndex, int toIndex)
  • add(Object) adds at the end of the list
  • remove(Object) removes the first occurrence of the object from the list
  • In list1.equals(list2), the ordering of the elements is taken into consideration
  • Extra requirement on the hashCode method:
    list1.equals(list2) implies that list1.hashCode()==list2.hashCode()

The List interface provides a special iterator, called a ListIterator, that allows element insertion and replacement, and bidirectional access in addition to the normal operations that the Iterator interface provides. A method is provided to obtain a list iterator that starts at a specified position in the list.

Category of List operation


Positional access — manipulates elements based on their numerical position in the list

  • Object get(int index);
  • Object set(int index, Object element); // Optional
  • void add(int index, Object element); // Optional
  • Object remove(int index); // Optional
  • boolean addAll(int index, Collection c);  // Optional

Search — searches for a specified object in the list and returns its numerical position.
  • int indexOf(Object o);
  • int lastIndexOf(Object o);

Iteration — extends Iterator semantics to take advantage of the list's sequential nature
  • ListIterator listIterator();
  • ListIterator listIterator(int index);

Range-view — performs arbitrary range operations on the list.
  • List subList(int from, int to);


==============================================================================================
ArrayList
This is a resizable-array implementation of the List interface.
  • ArrayList is an array based implementation where elements can be accessed directly via the get and set methods.
  • Default choice for simple sequence.
  • It gives you fast iteration and fast random access
  • It is an ordered collection (by index), but not sorted
The iterators returned by this class's iterator and listIterator methods are fail-fast: if the list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

We'll see fail-fast behavior in the following example.
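Since the original listing isn't reproduced here, a minimal sketch of the fail-fast behavior might look like this:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class FailFastDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("one");
        list.add("two");
        list.add("three");

        Iterator<String> it = list.iterator();
        try {
            while (it.hasNext()) {
                String s = it.next();
                if (s.equals("two")) {
                    list.add("four"); // structural modification outside the iterator
                }
            }
        } catch (ConcurrentModificationException e) {
            // The very next call to it.next() fails fast.
            System.out.println("Iterator failed fast: " + e);
        }
    }
}
```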


Choose this over a LinkedList when you need fast iteration but don't expect to do a lot of insertion and deletion.

==============================================================================================
Vector


  • A Vector is basically the same as an ArrayList, but Vector methods are synchronized for thread safety.
  • You'll normally want to use ArrayList instead of Vector because the synchronized methods add a performance hit you might not need.
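A small sketch contrasting Vector with the modern synchronized-wrapper approach:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

public class VectorDemo {
    public static void main(String[] args) {
        Vector<String> vector = new Vector<>();   // methods are synchronized
        vector.add("one");
        vector.add("two");
        vector.insertElementAt("zero", 0);        // legacy Vector-only method
        System.out.println(vector);               // [zero, one, two]

        // Modern alternative: wrap an ArrayList when you need synchronization.
        List<String> syncList = Collections.synchronizedList(new ArrayList<>());
        syncList.add("one");
        System.out.println(syncList);             // [one]
    }
}
```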





Difference between ArrayList and Vector? *interview question

  • Vectors are synchronized, ArrayLists are not.
    If multiple threads access an ArrayList concurrently then we must externally synchronize the block of code which modifies the list either structurally or simply modifies an element. Structural modification means addition or deletion of element(s) from the list. Setting the value of an existing element is not a structural modification.
  • Data Growth Methods
    A Vector defaults to doubling the size of its array, while the ArrayList increases its array size by 50 percent.
    Depending on how you use these classes, you could end up taking a large performance hit while adding new elements. It's always best to set the object's initial capacity to the largest capacity that your program will need.
Vector is considered effectively obsolete nowadays; prefer ArrayList, wrapped with Collections.synchronizedList when synchronization is needed.

==============================================================================================

LinkedList

  • A LinkedList is ordered by index position, like ArrayList, except that the elements are doubly-linked to one another.
  • This linkage gives you new methods (beyond what you get from the List interface) for adding and removing from the beginning or end, which makes it an easy choice for implementing a stack or queue.

We'll see a short demo of a few LinkedList methods.
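A minimal sketch of a few LinkedList-specific methods:

```java
import java.util.LinkedList;

public class LinkedListDemo {
    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>();
        list.add("b");
        list.addFirst("a");        // constant-time insert at the head
        list.addLast("c");         // constant-time insert at the tail
        System.out.println(list);             // [a, b, c]
        System.out.println(list.peekFirst()); // a  (stack/queue-style access)
        System.out.println(list.pollLast());  // c  (removes and returns the tail)
        System.out.println(list);             // [a, b]
    }
}
```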





Note that this implementation is not synchronized. If multiple threads access a linked list concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.


A palindrome is a word or phrase that reads the same forward and backward. For instance, "abcba". We'll see how to do this with the help of LinkedList and ListIterator


Example of ListIterator
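A hedged sketch of the palindrome check using a LinkedList and two ListIterators (the helper name isPalindrome is my own):

```java
import java.util.LinkedList;
import java.util.ListIterator;

public class PalindromeDemo {
    static boolean isPalindrome(String text) {
        LinkedList<Character> chars = new LinkedList<>();
        for (char c : text.toCharArray()) {
            chars.add(c);
        }
        // One iterator walks forward from the head, the other backward
        // from the tail; ListIterator supports both directions.
        ListIterator<Character> forward = chars.listIterator();
        ListIterator<Character> backward = chars.listIterator(chars.size());
        while (forward.nextIndex() < backward.previousIndex()) {
            if (!forward.next().equals(backward.previous())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("abcba")); // true
        System.out.println(isPalindrome("abca"));  // false
    }
}
```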


Difference between ArrayList and LinkedList? *interview question
LinkedList and ArrayList are two different implementations of the List interface.
  • LinkedList implements it with a doubly-linked list.
  • ArrayList implements it with a dynamically resizing array.
  • LinkedList<E> allows for constant-time insertions or removals using iterators, but only sequential access of elements. In other words, you can walk the list forwards or backwards, but finding a position in the list takes time proportional to the size of the list.
  • ArrayList<E>, on the other hand, allows fast random read access, so you can grab any element in constant time. But adding or removing anywhere other than the end requires shifting all the subsequent elements over, either to make an opening or to fill the gap.
  • Use LinkedList when you need fast insertion and deletion, which are slow in ArrayList.
  • Use ArrayList when you need fast iteration and fast random access, which are slow in LinkedList.

==============================================================================================


Set Interface


A Set cares about uniqueness—it doesn't allow duplicates. Your good friend the equals() method determines whether two objects are identical.

HashSet
  • A HashSet is an unsorted, unordered Set.
  • It uses the hashcode of the object being inserted, so the more efficient your hashCode() implementation the better access performance you'll get.
  • Use this class when you want a collection with no duplicates and you don't care about order when you iterate through it.
  • This class permits the null element
  • If you attempt to add an element to a set that already exists in the set, the duplicate element will not be added, and the add() method will return false
This class offers constant-time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets.
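A small sketch of the duplicate and null behavior described above:

```java
import java.util.HashSet;
import java.util.Set;

public class HashSetDemo {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(set.add("java"));  // true  - element added
        System.out.println(set.add("java"));  // false - duplicate rejected
        System.out.println(set.add(null));    // true  - one null is permitted
        System.out.println(set.size());       // 2
    }
}
```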
==============================================================================================



LinkedHashSet
  • A LinkedHashSet is an ordered version of HashSet that maintains a doubly-linked List across all elements. 
  • Use this class instead of HashSet when you care about the iteration order.
  • When you iterate through a HashSet the order is unpredictable, while a LinkedHashSet lets you iterate through the elements in the order in which they were inserted.
  • permits null elements

It provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets.

As you saw with HashSet, iteration order is unpredictable; LinkedHashSet, by contrast, maintains insertion order.
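A minimal sketch of the contrast:

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class LinkedHashSetDemo {
    public static void main(String[] args) {
        Set<String> linked = new LinkedHashSet<>();
        linked.add("banana");
        linked.add("apple");
        linked.add("cherry");
        System.out.println(linked);  // [banana, apple, cherry] - insertion order

        Set<String> hashed = new HashSet<>(linked);
        System.out.println(hashed);  // order is unpredictable
    }
}
```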


Note: the HashSet and LinkedHashSet implementations are not synchronized. If multiple threads access a set concurrently, and at least one of the threads modifies it, it must be synchronized externally.
Difference between HashSet and LinkedHashSet? *interview question
LinkedHashSet performance is likely to be just slightly below that of HashSet, due to the added expense of maintaining the linked list with one exception:
  • Iteration over a LinkedHashSet requires time proportional to the size of the set, regardless of its capacity. 
  • Iteration over a HashSet is likely to be more expensive, requiring time proportional to its capacity.
==============================================================================================

TreeSet

The TreeSet is one of two sorted collections (the other being TreeMap). It uses a Red-Black tree structure, and guarantees that the elements will be in ascending order, according to natural order.

  • Guarantees log(n) time cost for the basic operations (add, remove and contains).
  • Offers a few handy methods to deal with the ordered set like first(), last(), headSet(), and tailSet() etc
  • TreeSet will not allow a null element; if you try to add null, it will throw a NullPointerException

Note that the ordering maintained by a set must be consistent with equals if it is to correctly implement the Set interface.
This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal.

The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.

Few functions of TreeSet with the help of example

  • ceiling(E e) : Returns the least element in this set greater than or equal to the given element, or null if there is no such element.
  • floor(E e) : Returns the greatest element in this set less than or equal to the given element, or null if there is no such element.
  • higher(E e) : Returns the least element in this set strictly greater than the given element, or null if there is no such element.
  • lower(E e) : Returns the greatest element in this set strictly less than the given element, or null if there is no such element.
  • SortedSet<E> headSet(E toElement) : Returns a view of the portion of this set whose elements are strictly less than toElement.
  • SortedSet<E> tailSet(E fromElement) : Returns a view of the portion of this set whose elements are greater than or equal to fromElement.
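A small sketch exercising the navigation methods listed above:

```java
import java.util.TreeSet;

public class TreeSetDemo {
    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int n : new int[] {10, 20, 30, 40}) {
            set.add(n);
        }
        System.out.println(set);              // [10, 20, 30, 40] - ascending order
        System.out.println(set.ceiling(25));  // 30 - least element >= 25
        System.out.println(set.floor(25));    // 20 - greatest element <= 25
        System.out.println(set.higher(20));   // 30 - least element > 20
        System.out.println(set.lower(20));    // 10 - greatest element < 20
        System.out.println(set.headSet(30));  // [10, 20] - strictly less than 30
        System.out.println(set.tailSet(30));  // [30, 40] - greater than or equal to 30
    }
}
```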





Difference between HashSet and TreeSet? *interview question
HashSet
  • HashSet allows a null element
  • class offers constant time performance for the basic operations (add, remove, contains and size).
  • Does not guarantee that the order of elements will remain constant over time
  • Iteration performance depends on the initial capacity and the load factor of the HashSet.
  • It's quite safe to accept default load factor but you may want to specify an initial capacity that's about twice the size to which you expect the set to grow
TreeSet
  • TreeSet will not allow a null element; if you try to add null, it will throw a NullPointerException
  • log(n) time cost for the basic operations (add, remove and contains).
  • Elements of the set are sorted (in ascending natural order, or the order specified via its constructor)
  • Doesn't offer any tuning parameters for iteration performance
==============================================================================================


Map Interface

  • A Map cares about unique identifiers.
  • The Map implementations let you do things like search for a value based on the key, ask for a collection of just the values, or ask for a collection of just the keys.
  • Like Sets, Maps rely on the equals() method to determine whether two keys are the same or different.




  • A map cannot contain duplicate keys; each key can map to at most one value.
  • The Map interface provides three collection views, which allow a map's contents to be viewed as a set of keys, collection of values, or set of key-value mappings.
  • Great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map


HashMap

  • The HashMap gives you an unsorted, unordered Map, it does not guarantee that the order will remain constant over time.
  • HashMap allows one null key and multiple null values in a collection
  • When you need a Map and you don't care about the order (when you iterate through it), then HashMap is the way to go; the other maps add a little more overhead
  • The more efficient your hashCode() implementation, the better access performance you'll get.
  • The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.
  • This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.


An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs.
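A minimal sketch of the null-handling and capacity parameters described above (the capacity values are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapDemo {
    public static void main(String[] args) {
        // Initial capacity 32, default load factor 0.75: rehash after 24 entries.
        Map<String, Integer> map = new HashMap<>(32, 0.75f);
        map.put("one", 1);
        map.put("one", 100);        // same key: the old value is replaced
        map.put(null, 0);           // one null key is allowed
        map.put("two", null);       // null values are allowed too
        System.out.println(map.get("one")); // 100
        System.out.println(map.size());     // 3
    }
}
```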

==============================================================================================


Hashtable
  • Just as Vector is a synchronized counterpart to the sleeker, more modern ArrayList, Hashtable is the synchronized counterpart to HashMap.
  • Remember that you don't synchronize a class, so when we say that Vector and Hashtable are synchronized, we just mean that the key methods of the class are synchronized

How Hashtable differ from HashMap? *interview question
  • Hashtable doesn't let you have anything that's null whereas HashMap allows one null key and multiple null values in a collection
  • HashMap is not synchronized, whereas Hashtable is synchronized.
  • HashMap cannot be shared among multiple threads without proper synchronization, whereas Hashtable is thread-safe and can be shared between multiple threads.
  • The Iterator in HashMap is fail-fast, whereas the Enumeration in Hashtable is not fail-fast.
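A small sketch of the null-handling difference:

```java
import java.util.HashMap;
import java.util.Hashtable;

public class HashtableDemo {
    public static void main(String[] args) {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("key", "value");
        try {
            table.put(null, "value"); // Hashtable rejects null keys...
        } catch (NullPointerException e) {
            System.out.println("Hashtable: null key rejected");
        }
        try {
            table.put("key2", null);  // ...and null values
        } catch (NullPointerException e) {
            System.out.println("Hashtable: null value rejected");
        }

        HashMap<String, String> map = new HashMap<>();
        map.put(null, null);          // HashMap accepts both
        System.out.println("HashMap size: " + map.size()); // 1
    }
}
```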


==============================================================================================

LinkedHashMap
  • Like its Set counterpart, LinkedHashSet, the LinkedHashMap collection maintains insertion order (or, optionally, access order).
  • Though slower than HashMap for adding and removing elements, you can expect faster iteration with a LinkedHashMap.
Example will be same as above.

TreeMap
  • TreeMap is a sorted Map.
And you already know that by default, this means "sorted by the natural order of the elements."
Like TreeSet, TreeMap lets you define a custom sort order (via a Comparator passed to the constructor, or the keys' own Comparable implementation) that specifies how the elements should be compared to one another when they're being ordered.
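A minimal sketch of natural versus custom ordering:

```java
import java.util.Comparator;
import java.util.TreeMap;

public class TreeMapDemo {
    public static void main(String[] args) {
        // Natural (ascending) key order by default.
        TreeMap<String, Integer> natural = new TreeMap<>();
        natural.put("banana", 2);
        natural.put("apple", 1);
        natural.put("cherry", 3);
        System.out.println(natural.keySet());  // [apple, banana, cherry]
        System.out.println(natural.firstKey()); // apple

        // Custom order via a Comparator passed to the constructor.
        TreeMap<String, Integer> reversed = new TreeMap<>(Comparator.reverseOrder());
        reversed.putAll(natural);
        System.out.println(reversed.keySet()); // [cherry, banana, apple]
    }
}
```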


The iterators returned by the iterator methods of all the classes above are fail-fast, as you have already seen in the examples.



Related Post
4 ways to iterate over Map in java
120+ Core java interview Question including Collection related also
7 new feature of Java 7
Garbage Collection from Interview perspective


If you know anyone who has started learning Java, why not help them out! Just share this post with them. 
Thanks for studying today!...

Thursday, September 19, 2013

Garbage Collection in Java

Java Memory Management, with its built-in garbage collection, is one of the language’s finest achievements. It allows developers to create new objects without worrying explicitly about memory allocation and deallocation, because the garbage collector automatically reclaims memory for reuse. This enables faster development with less boilerplate code, while eliminating memory leaks and other memory-related problems. At least in theory. This is what we'll understand in this post.

Garbage Collection
When a Java object becomes unreachable to the program, it becomes eligible for garbage collection.
The main use of garbage collection is to identify and discard objects that are no longer needed by a program so that their resources can be reclaimed and reused.
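A hedged sketch of unreachability (System.gc() is only a hint; the JVM may ignore it, so the final message can go either way):

```java
import java.lang.ref.WeakReference;

public class GcDemo {
    public static void main(String[] args) {
        Object data = new Object();
        WeakReference<Object> ref = new WeakReference<>(data);
        data = null;                 // the object is now unreachable
        System.gc();                 // only a hint; the JVM may ignore it
        System.out.println(ref.get() == null
                ? "object was reclaimed"
                : "object not reclaimed yet");
    }
}
```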

Advantages can be :

  • Increase productivity as no need to worry about memory issues.
  • Program integrity
  • Garbage collection is an important part of Java's security strategy. Java programmers are unable to accidentally crash the Java virtual machine by incorrectly freeing memory.



Automatic Garbage Collection
Automatic garbage collection is the process of looking at heap memory, identifying which objects are in use and which are not, and deleting the unused objects.
An in-use, or referenced, object is one to which some part of your program still maintains a pointer. An unused, or unreferenced, object is no longer referenced by any part of your program, so the memory it uses can be reclaimed.

In a programming language like C, allocating and deallocating memory is a manual process. In Java, the process of deallocating memory is handled automatically by the garbage collector.

Basic process of Garbage collection can be described as follows:

  1. Marking
  2. Normal Deletion
  3. Deletion with compacting
Marking : In this phase the garbage collector identifies which pieces of memory are in use and which are not.
All objects are scanned to make this determination, which can be a very time-consuming process if every object in the system must be scanned.




Normal Deletion : In this phase, the unreferenced objects are removed, leaving the referenced objects and pointers to the free space.




Deletion with compacting : For further performance improvement, in addition to deleting unreferenced objects, the remaining referenced objects can be compacted. This makes subsequent memory allocation much easier and faster.



Maybe you have seen generational garbage collection terms such as young generation, old generation and permanent generation. We'll discuss these in the following section.


The first question that comes to mind is:

Why Generational Garbage collection?
You have already seen that, in the basic process, marking and compacting all the objects in a JVM is inefficient. As more and more objects are allocated, the list of objects grows and grows, leading to longer and longer garbage collection times. Empirical analysis shows, however, that most objects are short-lived, and generational collection exploits exactly this observation.

JVM Generations
  • Young generation
  • Old or Tenured Generation
  • Permanent Generation : Now this is replaced by Metaspace in Java 8
Heap is divided into smaller parts or generations : Young Generation, Old Generation and Permanent Generation.


Java uses generational garbage collection. This means that if you have an object foo (which is an instance of some class), the more garbage collection events it survives (if there are still references to it), the further it gets promoted. It starts in the young generation (which itself is divided into multiple spaces - Eden and Survivor) and would eventually end up in the tenured generation if it survived long enough.


  • Young Generation is where all new objects are allocated and aged.
  • Eden Space (heap): The pool from which memory is initially allocated for most objects.
  • Survivor Space (heap): The pool containing objects that have survived the garbage collection of the Eden space.
  • Tenured Generation (heap): The pool containing objects that have existed for some time in the survivor space.
  • Permanent Generation (non-heap): The pool containing all the reflective data of the virtual machine itself, such as class and method objects. With Java VMs that use class data sharing, this generation is divided into read-only and read-write areas.
Note: the permanent generation is not part of the heap. It's a separate space for class definitions and related data, as well as where interned strings live.

When the young generation fills up, this causes a minor garbage collection. Minor collections can be optimized assuming a high object mortality rate. A young generation full of dead objects is collected very quickly. Some surviving objects are aged and eventually move to the old generation.

Stop the World Event - All minor garbage collections are "Stop the World" events. This means that all application threads are stopped until the operation completes. The Old Generation is used to store long-surviving objects. Typically, a threshold age is set for young generation objects, and when that age is reached the object is moved to the old generation. Eventually the old generation needs to be collected. This event is called a major garbage collection.

Major garbage collections are also Stop the World events. A major collection is often much slower because it involves all live objects, so for responsive applications major garbage collections should be minimized. Note also that the length of the Stop the World pause for a major garbage collection is affected by the kind of garbage collector used for the old generation space.

The Permanent generation contains metadata required by the JVM to describe the classes and methods used in the application. The permanent generation is populated by the JVM at runtime based on classes in use by the application. In addition, Java SE library classes and methods may be stored here.

Classes may get collected (unloaded) if the JVM finds they are no longer needed and space may be needed for other classes. The permanent generation is included in a full garbage collection.


Generation Garbage Collection process
Now that you have an idea of why the heap is divided into different generations, we'll see how these spaces interact with each other, with pictures showing the object allocation and aging process in the JVM.

1. First, any new objects are allocated to the eden space. Both survivor spaces start out empty.




2. When the eden space fills up, a minor garbage collection is triggered.




3. Referenced objects are moved to the first survivor space. Unreferenced objects are deleted when the eden space is cleared.




4. At the next minor GC, the same thing happens for the eden space. Unreferenced objects are deleted and referenced objects are moved to a survivor space. However, in this case, they are moved to the second survivor space (S1). In addition, objects from the last minor GC in the first survivor space (S0) have their age incremented and are moved to S1. Once all surviving objects have been moved to S1, both S0 and eden are cleared. Notice we now have differently aged objects in the survivor space.




5. At the next minor GC, the same process repeats. However this time the survivor spaces switch. Referenced objects are moved to S0. Surviving objects are aged. Eden and S1 are cleared.




6. After a minor GC, when aged objects reach a certain age threshold (8 in this example) they are promoted from young generation to old generation.




7. As minor GCs continue to occur, objects will continue to be promoted to the old generation space.




8. So that pretty much covers the entire process for the young generation. Eventually, a major GC will be performed on the old generation, which cleans up and compacts that space.
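The aging and promotion mechanics in the steps above can be sketched as a toy model. This is illustrative only: the class and field names are invented, the threshold of 8 comes from the example above, and it is not how HotSpot is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the eden/survivor/old interaction described above: live
 *  objects move from eden to the active survivor space at each minor GC,
 *  the survivor spaces swap roles, and objects whose age reaches the
 *  threshold are promoted (tenured) to the old generation. */
public class YoungGenModel {
    static final int THRESHOLD = 8;           // age at which objects are tenured

    List<Integer> eden = new ArrayList<>();   // each entry is one object's age
    List<Integer> from = new ArrayList<>();   // active survivor space (S0)
    List<Integer> to = new ArrayList<>();     // inactive survivor space (S1)
    List<Integer> old = new ArrayList<>();    // old (tenured) generation

    void allocate() { eden.add(0); }          // new objects start in eden at age 0

    /** One minor GC: age survivors, move them to the inactive survivor
     *  space (or old gen if old enough), then swap the survivor spaces. */
    void minorGc() {
        for (int age : eden) place(age + 1);
        for (int age : from) place(age + 1);
        eden.clear();
        from.clear();
        List<Integer> tmp = from; from = to; to = tmp;  // S0 and S1 switch roles
    }

    void place(int age) {
        if (age >= THRESHOLD) old.add(age); else to.add(age);
    }

    public static void main(String[] args) {
        YoungGenModel m = new YoungGenModel();
        m.allocate();                         // one long-lived object
        for (int i = 0; i < 8; i++) m.minorGc();
        System.out.println("old generation size: " + m.old.size());
        // prints: old generation size: 1
    }
}
```

Only the live object is ever touched; unreferenced entries would simply never be placed, which mirrors why a young generation full of dead objects collects quickly.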



How Garbage Collection Really Works
Many people think garbage collection collects and discards dead objects. In reality, Java garbage collection does the opposite: live objects are tracked and everything else is designated garbage. As you’ll see, this fundamental misunderstanding can lead to many performance problems.


Let's start with the heap, which is the area of memory used for dynamic allocation. In most configurations the operating system allocates the heap in advance to be managed by the JVM while the program is running. This has a couple of important ramifications:

  • Object creation is faster because global synchronization with the operating system is not needed for every single object. An allocation simply claims some portion of a memory array and moves the offset pointer forward. The next allocation starts at this offset and claims the next portion of the array.
  • When an object is no longer used, the garbage collector reclaims the underlying memory and reuses it for future object allocation. This means there is no explicit deletion and no memory is given back to the operating system.
All objects are allocated on the heap area managed by the JVM. Every item that the developer uses is treated this way, including class objects, static variables, and even the code itself. As long as an object is being referenced, the JVM considers it alive. Once an object is no longer referenced and therefore is not reachable by the application code, the garbage collector removes it and reclaims the unused memory. As simple as this sounds, it raises a question: what is the first reference in the tree?

Garbage-Collection Roots
Every object tree must have one or more root objects. As long as the application can reach those roots, the whole tree is reachable. But when are those root objects considered reachable? Special objects called garbage-collection roots are always reachable and so is any object that has a garbage-collection root at its own root.
There are four kinds of GC roots in Java:


  1. Local variables are kept alive by the stack of a thread. This is not a real object reference and thus is not visible to the application; for all intents and purposes, however, local variables are GC roots.
  2. Active Java threads are always considered live objects and are therefore GC roots. This is especially important for thread local variables.
  3. Static variables are referenced by their classes. This fact makes them de facto GC roots. Classes themselves can be garbage-collected, which would remove all referenced static variables. 
  4. JNI References are Java objects that the native code has created as part of a JNI call. Objects thus created are treated specially because the JVM does not know if it is being referenced by the native code or not. Such objects represent a very special form of GC root.

Therefore, a simple Java application has the following GC roots:


  • Local variables in the main method
  • The main thread
  • Static variables of the main class
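As a hedged illustration, here is a minimal program in which exactly those roots exist while main() runs (the names CONFIG, local and worker are invented for the example):

```java
/** Sketch of the GC roots listed above for a simple application. */
public class RootsDemo {
    // Static variable: a GC root via the loaded RootsDemo class.
    static StringBuilder CONFIG = new StringBuilder("cfg");

    public static void main(String[] args) {
        // Local variable: a GC root via main's stack frame.
        Object local = new Object();

        // An active thread is itself a GC root (as is the main thread).
        Thread worker = new Thread(() -> { /* no-op */ });
        worker.start();

        // While main() executes, CONFIG, local and both threads keep their
        // referents reachable; once main() returns, 'local' stops being a root.
        System.out.println(CONFIG.length() + "," + (local != null));
    }
}
```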



Garbage Deallocation Mechanism 
User Controlled Deallocation

Deallocation can be manual or automatic. Manual deallocation involves an explicit, programmer-initiated call to a routine like free() in C or delete in C++. The freed object is then added to a free-space list for subsequent reallocation.

Automatic Garbage Collection
The alternative to manual deallocation of heap space is garbage collection. Many garbage collection techniques exist.

Here are some of the most important approaches:

Reference Counting
This is one of the oldest and simplest garbage collection techniques.
A reference count field is added to each heap object. It counts how many references to the heap object exist. When an object’s reference count reaches zero, it is garbage and may be collected.
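A minimal sketch of the idea (the class and method names are invented, and this is not a real JVM mechanism; note the classic weakness that objects referencing each other in a cycle never reach a count of zero and thus leak under pure reference counting):

```java
import java.util.HashMap;
import java.util.Map;

/** Toy reference counting: each "object" (keyed by name) carries a count
 *  of incoming references and is reclaimed the moment it drops to zero. */
public class RefCounting {
    static Map<String, Integer> counts = new HashMap<>();
    static int reclaimed = 0;

    static void addRef(String obj) {
        counts.merge(obj, 1, Integer::sum);   // a new reference was created
    }

    static void release(String obj) {
        int c = counts.merge(obj, -1, Integer::sum);
        if (c == 0) {                         // no references left: reclaim now
            counts.remove(obj);
            reclaimed++;
        }
    }

    public static void main(String[] args) {
        addRef("a"); addRef("a");   // two references to "a"
        release("a");               // one reference remains
        release("a");               // count hits zero -> reclaimed immediately
        System.out.println("reclaimed=" + reclaimed);
        // prints: reclaimed=1
    }
}
```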

Mark-Sweep Collection
Many collectors, including mark & sweep, do nothing until heap space is nearly exhausted.
Then the collector executes a marking phase that identifies all live heap objects.
After the marking phase, any object not marked is garbage that may be freed. We then sweep through the heap, collecting all unmarked objects. During the sweep phase we also clear the marks from heap objects found to be still in use.
It's basically a two-step process:

  1. The algorithm traverses all object references, starting with the GC roots, and marks every object found as alive.
  2. All of the heap memory that is not occupied by marked objects is reclaimed. It is simply marked as free, essentially swept free of unused objects.
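The two steps can be sketched over an explicit object graph. This is a toy model with invented names (`heap`, `mark`, `sweep`), not JVM internals:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy mark-and-sweep: mark everything reachable from the roots,
 *  then sweep (discard) every unmarked object. */
public class MarkSweep {
    // Each "object" maps to the list of objects it references.
    static Map<String, List<String>> heap = new HashMap<>();

    /** Marking phase: depth-first traversal from the GC roots. */
    static Set<String> mark(List<String> roots) {
        Set<String> live = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            String obj = stack.pop();
            if (live.add(obj)) {                 // newly marked: follow its refs
                stack.addAll(heap.getOrDefault(obj, new ArrayList<>()));
            }
        }
        return live;
    }

    /** Sweep phase: reclaim everything that was not marked. */
    static int sweep(Set<String> live) {
        int before = heap.size();
        heap.keySet().retainAll(live);
        return before - heap.size();             // number of objects freed
    }

    public static void main(String[] args) {
        heap.put("root", List.of("a"));
        heap.put("a", List.of("b"));
        heap.put("b", List.of());
        heap.put("garbage", List.of("a"));   // unreachable, though it points at a live object
        int freed = sweep(mark(List.of("root")));
        System.out.println("freed=" + freed);
        // prints: freed=1
    }
}
```

Note that "garbage" is reclaimed even though it references a live object: reachability starts from the roots, not from arbitrary objects.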


Compaction
After the sweep phase, live heap objects are distributed throughout the heap space. This can lead to poor locality. If live objects span many memory pages, paging overhead may be increased. Cache locality may be degraded too.
We can add a compaction phase to mark-sweep garbage collection. After live objects are identified, they are placed together at one end of the heap. This involves another tracing phase in which global, local and internal heap pointers are found and adjusted to reflect each object’s new location.

Copying Collectors
An entire family of garbage collection techniques, called copying collectors, integrates copying with the recognition of live heap objects. Copying collectors are very popular and widely used.
Consider a simple copying collector that uses semispaces. We start with the heap divided into two halves - the from space and the to space.

Initially, we satisfy heap requests from the from space, using a simple “end of heap” pointer. When the from space is exhausted, we stop and do garbage collection. Strictly speaking, though, we don’t collect garbage: we collect live heap objects, and garbage is never touched.
We trace through global and local pointers, finding live objects. As each object is found, it is moved from its current position in the from space to the next available position in the to space, and the pointer is updated to reflect the object’s new location.
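The semispace scheme can be sketched under the same simplifications as the earlier toy models (invented names; a real collector also rewrites the pointers stored inside moved objects, which this model glosses over by storing references symbolically):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy semispace copying collector: live objects are copied from the
 *  from-space to the to-space; garbage is never touched. */
public class SemispaceModel {
    static Map<String, List<String>> fromSpace = new HashMap<>();
    static Map<String, List<String>> toSpace = new HashMap<>();

    /** Copy everything reachable from the roots into to-space, then swap
     *  the roles of the two spaces. Whatever was left behind is garbage. */
    static void collect(List<String> roots) {
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.pop();
            if (fromSpace.containsKey(obj) && !toSpace.containsKey(obj)) {
                List<String> refs = fromSpace.get(obj);
                toSpace.put(obj, refs);          // "move" the live object
                work.addAll(refs);               // then trace its references
            }
        }
        fromSpace = toSpace;                     // the spaces switch roles
        toSpace = new HashMap<>();
    }

    public static void main(String[] args) {
        fromSpace.put("root", List.of("a"));
        fromSpace.put("a", List.of());
        fromSpace.put("dead", List.of());        // never copied, never touched
        collect(List.of("root"));
        System.out.println("survivors=" + fromSpace.size());
        // prints: survivors=2
    }
}
```

Because only live objects are visited, the cost of a collection is proportional to the amount of live data, not to the amount of garbage.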

Generational Techniques
The great strength of copying collectors is that they do no work for objects that are born and die between collections. However, not all heap objects are so short-lived. In fact, some heap objects are very long-lived.
Generational garbage collection techniques were developed to better handle objects with varying lifetimes. The heap is divided into two or more generations, each with its own to and from space. New objects are allocated in the youngest generation, which is collected most frequently. If an object survives across one or more collections of the youngest generation, it is “promoted” to the next older generation, which is collected less often. Objects that survive one or more collections of this generation are then moved to the next older generation. This continues until very long-lived objects reach the oldest generation, which is collected very infrequently.

The advantage of this approach is that long-lived objects are “filtered out,” greatly reducing the cost of repeatedly processing them. Of course, some long-lived objects will die and these will be caught when their generation is eventually collected.




  1. Which part of the memory is involved in Garbage Collection?
    Heap
  2. What is the responsibility of the Garbage Collector?
    The garbage collector frees the memory occupied by unreachable objects during the execution of a Java program by deleting those unreachable objects. It ensures that the available memory is used efficiently, but does not guarantee that there will be sufficient memory for the program to run.
  3. When does an object become eligible for garbage collection?
    An object becomes eligible for Garbage Collection when no live thread can access it.
  4. Can the Garbage Collection be forced to run?
    No. Garbage collection cannot be forced. There are a few ways by which it can be requested, but there is no guarantee that these requests will be honored by the JVM.
  5. How can the Garbage Collection be requested?
    1. The methods to perform garbage collection are present in the Runtime class provided by Java. The Runtime class is a singleton for each Java program; the method getRuntime() returns that singleton instance. The method gc() can be invoked on this instance to request garbage collection.
    2. Call the System class's System.gc() method, which will request the JVM to perform GC.
  6. Does garbage collection guarantee that a program will not run out of memory?
    It does not guarantee that a program will not run out of memory.
  7. What is the purpose of finalization in Java?
    It gives an unreachable object the opportunity to perform cleanup before it gets garbage collected, for instance closing a connection or releasing resources.
  8. If an object is garbage collected, can it become reachable again?
    No
  9. How are Strings garbage collected?
    Under normal circumstances, string literals and classes are all allocated into the JVM's permanent generation and usually won't ever be collected.
    String s = new String("java-latte");
    the string object referred to by s will be on the heap and the string literal "java-latte" will be in the string pool. The objects in the string pool will not be garbage collected; they are there to be reused during the lifetime of the program, to improve performance.
    If you call intern() like this
    String s = new String("java-latte").intern();
    the object returned by intern() represents the "java-latte" string literal. However, interned strings that are not identical to string literals can be garbage collected once they become unreachable.
  10. Are static fields garbage collected?
    Static fields can't be garbage collected while the class is loaded. They can be garbage collected when the respective class loader which is responsible for loading the class is itself garbage collected.
    Basically, static variables are referenced by Class objects, which are referenced by the class loader; so unless the class loader drops the class somehow, or the class loader itself becomes eligible for garbage collection, the static variables won't be garbage collected.
  11. How many times does the garbage collector call the finalize() method for an object?
    Only once.
  12. What happens if an uncaught exception is thrown during the execution of the finalize() method of an object?
    The exception will be ignored and the garbage collection (finalization) of that object terminates.
  13. How can the call of finalize() methods on exit of the application be enabled or disabled?
    Runtime.getRuntime().runFinalizersOnExit(boolean value). Passing the boolean value will either enable or disable the finalize() call. (Note that this method is deprecated because it is inherently unsafe.)
  14. Is the garbage collector a daemon thread?
    Yes. A daemon thread runs behind the application; it is started by the JVM, and stops when all non-daemon threads stop.
  15. Garbage Collector is controlled by whom?
    The JVM controls the garbage collector; it decides when to run it. The JVM runs the garbage collector when it realizes that memory is running low, but this behavior cannot be guaranteed.
    One can request garbage collection from within a Java program, but there is no guarantee that this request will be honored by the JVM.
  16. How are instance members and static members of a class garbage collected?
    Instance members are collected along with their object: an object becomes eligible for garbage collection when no live thread can access it. Static variables are referenced by Class objects, which are referenced by ClassLoaders, so they survive unless either the ClassLoader drops the Class somehow or the ClassLoader itself becomes eligible for collection (think of unloading your application).
  17. How to make a "singleton" eligible for garbage collection?
    You must have a method that sets the static variable to null (and hope that nothing else has taken a copy of the reference). Obviously, the next time anyone asks for an instance, it would need to be recreated... at which point it's not really a singleton, of course.
  18. Storage Area of static member in JVM memory?
    Static member variables are kept in the Permanent Generation.
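Two of the answers above can be checked in code. This sketch requests (never forces) a collection via both standard APIs, and demonstrates the intern() behaviour described in question 9; whether the weak reference is actually cleared is entirely up to the JVM:

```java
import java.lang.ref.WeakReference;

/** Requesting GC via System and Runtime, and the String pooling behaviour. */
public class GcFaqDemo {
    public static void main(String[] args) {
        // Both calls below only *request* a collection; neither forces one.
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.gc();
        Runtime.getRuntime().gc();      // Runtime is a singleton
        // ref.get() may or may not be null here: there is no guarantee.

        // intern() returns the canonical pooled instance of the string.
        String heapCopy = new String("java-latte");       // distinct heap object
        System.out.println(heapCopy == "java-latte");          // prints false
        System.out.println(heapCopy.intern() == "java-latte"); // prints true
    }
}
```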

Different ways to make an object eligible for Garbage Collection?

  1. Set all available object references to null once their purpose is over
  2. Make the reference variable refer to another object:
    String str1 = "Java.latte.blogspot.in";
    String str2 = "google.com";
    //The String object referred to by str1 is not eligible for GC yet
    str1 = str2;
    //str1 now refers to the String object "google.com"; the object "Java.latte.blogspot.in" is no longer referred to by any variable and hence is eligible for GC
  3. Creating Islands of Isolation
    If you have two instance reference variables which are referring to the instances of the same class, and these two reference variables refer to each other and the objects referred by these reference variables do not have any other valid reference then these two objects are said to form an Island of Isolation and are eligible for Garbage Collection.
    class Test { Test g; }  // each Test holds a reference to another Test
    Test gc1 = new Test();
    Test gc2 = new Test();
    gc1.g = gc2;
    gc2.g = gc1;
    gc1 = null;
    gc2 = null;
    Here the two Test objects refer to each other but have no other valid references; they form an Island of Isolation and are eligible for garbage collection

How to set the Perm area size?
You can set the Perm area size with the -XX:PermSize and -XX:MaxPermSize options, but only do so when an OutOfMemoryError occurs and the cause is the Perm area size.



Terms used in Tuning Garbage Collection
  • Heap area size
    -Xms    Heap area size when starting the JVM
    -Xmx    Maximum heap area size
  • New area size
    -XX:NewRatio    Ratio of the New area to the Old area
    -XX:NewSize    New area size
    -XX:SurvivorRatio    Ratio of the Eden area to each Survivor area
Now we'll work through an example to understand these terms.

Sizing the Generations
  • The -Xmx value determines the size of the heap to reserve at JVM initialization
  • The -Xms value is the space in memory that is committed to the VM at initialization. The JVM can grow the heap up to the size of -Xmx
  • The difference between -Xmx and -Xms is virtual memory (virtually committed)
Total Heap
  • Total available memory is the most important factor affecting GC performance
  • By default the JVM grows or shrinks the heap at each GC to keep the ratio of free space to live objects at each collection within a specified range.
    -XX:MinHeapFreeRatio - when the percentage of free space in a generation falls below this value, the generation is expanded to meet this percentage. Default is 40
    -XX:MaxHeapFreeRatio - when the percentage of free space in a generation exceeds this value, the generation shrinks to meet this value. Default is 70
The Young Generation
The bigger the young generation, the less frequent the minor GCs; but this implies a smaller tenured generation, which increases the frequency of major collections.
If this isn't clear, please go through the steps with pictures mentioned above.

Young Generation Guarantee
  • The -XX:SurvivorRatio option can be used to tune the number of survivor spaces
  • Not often important for performance

 -XX:SurvivorRatio 
It is the ratio between the eden space and each survivor space; the SurvivorRatio parameter controls the size of the two survivor spaces.
As you have seen above, the young generation is divided into 3 parts.
For example, -XX:SurvivorRatio=6 sets the ratio of eden to each survivor space to 6:1, so each survivor space will be one eighth of the young generation.
To make this clear, consider x = S0 (survivor one), x = S1 (survivor two) and 6x = eden space, for a total of 8x in the young generation. It's clear that each survivor space is one eighth of the young generation.

-XX:NewSize
It is the initial size of the New area, i.e. the sum of the Survivor spaces (both S0 and S1) and the Eden space

-XX:NewRatio
NewRatio is the ratio of the New area(sum of Survivor Space (both S0 and S1) and Eden Space.) and the Old area (Old Generation).
If -XX:NewRatio=1, New area : Old area is 1:1; for a 1 GB heap, New area : Old area is 500 MB : 500 MB.
If NewRatio is 2, New area : Old area is 1:2, so the Old Generation is 2/3 of the heap. As the value gets larger, the Old area gets larger and the New area gets smaller.

-XX:+PrintTenuringDistribution 
Shows the threshold chosen by the JVM to keep the survivor spaces half full, and the ages of objects in the new generation.

Example 

Xmx / Xms = 1000 MB
NewRatio = 3
From this we can calculate 
Old area (Old Generation) = 750 MB and New area (the sum of the Survivor spaces S0 and S1 and the Eden space) = 250 MB

If SurvivorRatio = 10, then each Survivor space = 1/12 of the New area, i.e. 250/12 ≈ 20.83 ≈ 21 MB
S0 & S1 = 21 MB each
Eden space = 250 - (2 × 21) = 208 MB
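The same arithmetic, as a quick check (the class name is invented; values are in MB and rounded the same way as the worked example):

```java
/** Reproduces the sizing arithmetic above for a 1000 MB heap with
 *  -XX:NewRatio=3 and -XX:SurvivorRatio=10. */
public class HeapSizing {
    public static void main(String[] args) {
        double heap = 1000;          // -Xms / -Xmx in MB
        int newRatio = 3;            // New : Old = 1 : 3
        int survivorRatio = 10;      // Eden : each survivor = 10 : 1

        double young = heap / (newRatio + 1);           // 1000 / 4  = 250.0
        double old = heap - young;                      //           = 750.0
        double survivor = young / (survivorRatio + 2);  // 250 / 12 ~= 20.83
        double eden = young - 2 * survivor;             //          ~= 208.33

        System.out.printf("young=%.0f old=%.0f survivor=%.0f eden=%.0f%n",
                young, old, survivor, eden);
        // prints: young=250 old=750 survivor=21 eden=208
    }
}
```

The divisor for the survivor spaces is survivorRatio + 2 because the young generation is eden plus two survivor spaces: 10x + x + x = 12x.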

The amount of memory available after garbage collection will be different on your computer.
The HotSpot heap structure changed in Java 8; check this link 

If you know anyone who has started learning java, why not help them out! Just share this post with them. Thanks for studying today!...