Java as this is a very popular language, but also its use of the
new keyword is one of the seemingly simplest across various languages. That doesn’t mean it doesn’t have any surprising implications though! We’ll dive into the memory model of Java to understand why
new acts how it does. I’ll also reference the Oracle Java Language Specification at times to disambiguate.
Ultimately we’ll build up to my favourite surprising Java fact, which is why the following piece of code acts how it does:
Wait up! ⚠️
In this article, I will be referring extensively to the stack and the heap as memory concepts. Please check here for some reading links if you need to confirm your understanding before continuing.
Basic understanding of garbage collection is also required.
- new always allocates a fresh object on the heap
- Both arrays and lists are objects
- Primitives are never newed
- Except when they’re boxed
- Boxed types such as Integer cache small values
At its core, Java’s
new keyword means “create a new object, place it on the heap, and give me a handle to it.”
Here I use the word object as a specific technical term: in Java, a type is either primitive or an object.
There are 8 primitives: byte, short, int, long, float, double, boolean, and char. If declared in a function, a primitive will usually be stored on the stack, ready to be cleared out when the function completes.
What is an object then? An object is anything created through use of the
new keyword — which will always be called on a class. This is the bread and butter of Java:
Dog is a class, and
dog is an object.
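A minimal sketch of the class/object distinction. The original listing is not reproduced here, so the Dog class, its name field, and the DogDemo wrapper below are hypothetical and only illustrate the idea:

```java
class DogDemo {
    // Hypothetical class used only for illustration.
    static class Dog {
        private final String name;

        Dog(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    public static void main(String[] args) {
        // Dog is the class; dog is a handle to an object that new placed on the heap.
        Dog dog = new Dog("Rex");
        Dog other = new Dog("Rex");

        System.out.println(dog.getName()); // prints Rex
        // Two calls to new give two distinct objects, even with identical contents.
        System.out.println(dog == other);  // prints false
    }
}
```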
Note how I’ve been talking about objects, not Objects.
Object is a class that every other class in Java must ultimately derive from. If you don’t explicitly choose a class to extend from, Object will be implicitly chosen for you. In this case, Dog implicitly extends Object.
All objects are creations of classes, which means that all objects must have all the methods of Object.
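As a rough illustration, the sketch below declares a hypothetical Dog class with no extends clause and checks that it still has Object as its parent, along with a few inherited methods. The class and method names are assumptions made for this example:

```java
class ObjectMethodsDemo {
    // Hypothetical class with no explicit "extends" clause.
    static class Dog {
    }

    static boolean parentIsObject() {
        // Dog never declared a superclass, so Object was chosen implicitly.
        return Dog.class.getSuperclass() == Object.class;
    }

    public static void main(String[] args) {
        Dog dog = new Dog();

        System.out.println(parentIsObject()); // prints true

        // Methods inherited from Object are available on every object.
        System.out.println(dog.equals(dog));                   // prints true
        System.out.println(dog.hashCode() == dog.hashCode());  // prints true
        System.out.println(dog.toString().startsWith(Dog.class.getName() + "@")); // prints true
    }
}
```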
Have you ever come across a piece of code where it looks like someone is calling
new on a primitive? They’ll actually be calling it on a class which looks like the primitive.
You cannot new a primitive. You can, however, new a class which boxes the primitive: in this case, Integer. This will take the stack allocated int, and move it onto the heap. However, in this case we then immediately automatically unbox the Integer back to an int for assignment to the variable.
int is a primitive, and Integer is a class that creates an object when we call new on it.
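A small sketch of what that looks like in practice. The method and variable names are hypothetical. Note that the Integer(int) constructor is deprecated in modern Java, so the warning is suppressed here purely so the example compiles cleanly:

```java
class BoxedNewDemo {
    // Integer(int) is deprecated in modern Java; it is used here only because
    // the text discusses calling new on a wrapper class.
    @SuppressWarnings({"deprecation", "removal"})
    static int boxThenUnbox(int value) {
        Integer boxed = new Integer(value); // heap-allocated wrapper object
        int unboxed = boxed;                // automatically unboxed back to a primitive
        return unboxed;
    }

    public static void main(String[] args) {
        System.out.println(boxThenUnbox(5)); // prints 5
    }
}
```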
If that was a mouthful of a paragraph, let’s break it down a bit further.
Boxing (or autoboxing) is the practice of wrapping a primitive (like int) in a matching object (like an instance of Integer). These are known as primitive wrapper classes. This means the primitive is no longer on the stack, and is now owned by the wrapper on the heap. As such, instead of being freed when the function ends, its lifetime is now controlled by garbage collection.
Unboxing is where we convert a primitive wrapper class back to a primitive — either automatically, or by calling a function like
intValue() on it.
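The round trip can be sketched as follows, with hypothetical names. When javac autoboxes an int it emits a call to Integer.valueOf, which is why no new appears in the source:

```java
class BoxingDemo {
    static int roundTrip(int primitive) {
        Integer boxed = primitive;        // autoboxing: compiled to Integer.valueOf(primitive)
        int viaMethod = boxed.intValue(); // explicit unboxing
        int automatic = boxed;            // automatic unboxing
        return viaMethod + automatic;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(21)); // prints 42
    }
}
```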
Where possible, prefer primitives: they use less memory, are more cache efficient, and are easier to reason about. On a typical 64-bit JVM, an Integer uses 16 bytes, vs an int using only 4.
However, you MUST box primitives in a wrapper class when making a
List of some type. You cannot have a
List<int>, but you can have a
List<Integer>. We’ll talk about why later.
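A short sketch of a list of boxed values, assuming java.util.ArrayList as the list implementation (the text does not name one). The commented-out line shows the declaration the compiler rejects:

```java
import java.util.ArrayList;
import java.util.List;

class BoxedListDemo {
    static int sum(List<Integer> values) {
        int total = 0;
        for (int value : values) { // each Integer is unboxed to an int here
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // List<int> numbers = new ArrayList<>(); // does not compile: type arguments must be reference types
        List<Integer> numbers = new ArrayList<>();
        numbers.add(1); // autoboxed to Integer
        numbers.add(2);
        numbers.add(3);
        System.out.println(sum(numbers)); // prints 6
    }
}
```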
Let’s first confirm the difference between an array and a list in Java.
An array is a fixed length block of memory where we can place anything we want, as long as we know the size of it. This includes both primitives and objects. Arrays are almost always allocated on the heap, not the stack — this is in contrast to say C++.
A list is some collection of objects, where the size can grow and shrink as we add and remove values. Again, objects here is used technically: you cannot have a list of primitives.
So, are arrays (or lists?) primitives, objects, or something else entirely?
Take a look at the
new keyword used in there — this means we can tell it’s an object.
Above we talked about objects vs Objects. An array’s parent class is Object, so it gets all the methods of that too.
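A minimal sketch of an array behaving like any other object. The names are hypothetical, and the original listing is not reproduced here:

```java
class ArrayObjectDemo {
    static Class<?> parentOfIntArray() {
        int[] numbers = new int[3]; // new allocates the array object
        return numbers.getClass().getSuperclass();
    }

    public static void main(String[] args) {
        int[] numbers = new int[3];

        Object asObject = numbers; // an array can be assigned to an Object reference
        System.out.println(asObject instanceof int[]);            // prints true
        System.out.println(parentOfIntArray() == Object.class);   // prints true
        // Methods inherited from Object work on arrays too.
        System.out.println(numbers.equals(numbers));              // prints true
    }
}
```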
Strings in Java are not primitives. Though you do not need new to make one, the language has special support to convert a string literal to a heap allocated String.
- Oracle Java Language Specification
Note that internally, the JVM may actually reuse the same piece of memory if it can see that two or more
Strings have the same value. As
Strings are immutable, this is a safe optimization. (Note the word “may”, not “will”. It’s a hint.)
This is mostly a legacy issue. This StackOverflow answer covers it really well.
Essentially, when making a
List<Dog>, Java compiles it down to a
List<Object> so it can generically perform its array operations. All objects inherit from Object, however no primitives do.
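One way to see this erasure at work is to compare the runtime classes of two lists with different type arguments. This is only a sketch with hypothetical names, using java.util.ArrayList as an assumed implementation:

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // The type arguments exist only at compile time; at runtime both are plain ArrayList.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints true
    }
}
```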
If you want a list-like interface over primitive types, you can use a library such as Trove, which provides the TIntArrayList class.
No. Every time you call
new, you will always get a new fresh object. Guaranteed.
No (and yes). Java’s entire core is built around using
new when you need to, so the language is highly optimized for the scenario anyway.
Let’s quickly talk about garbage collection and the heap. In Java, our accessible heap is cut into two major sections: Young Generation, and Old Generation. (There is more nuance than this, but this is the key concept you need to know.)
When you call
new, the newly created object is put into the Young Generation. This area of memory is regularly garbage collected. The belief is that most objects that are created are likely to be thrown away quickly and never used again (like temporary variables).
If an object survives enough garbage collections in Young Generation, it is moved to Old Generation. This is garbage collected far less regularly, the idea being that if an object has stuck around for a while, it’s likely to stick around for much longer, and doesn’t need to be checked so much.
However, this doesn’t mean you can constantly call
new whenever you want: remember that the fastest code is that which isn’t run. The same applies to
Just before, I mentioned that calling
new guarantees a brand new object. This is true even of
Strings, as seen in this example:
Notice however that
t share the same piece of memory. It makes sense to reuse memory where we can.
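The original listing is not reproduced here, so the following is only a sketch of the kind of comparison being described. The variable names s, t and u are assumptions. Two identical literals refer to one String object, while new String(...) always produces a separate one:

```java
class StringIdentityDemo {
    static boolean literalsShareObject() {
        String s = "hello";
        String t = "hello";
        return s == t; // same object: identical literals are shared
    }

    static boolean newCreatesFreshObject() {
        String s = "hello";
        String u = new String("hello");
        // Different objects in memory, but the same characters.
        return s != u && s.equals(u);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());   // prints true
        System.out.println(newCreatesFreshObject()); // prints true
    }
}
```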
That mindset is behind the Java Integer Cache. Let’s look at the code example in full:
The key to figuring out what is happening here can be seen in the difference between line 7 and line 19. In Java, == does not mean “are these things equal?”, it means:
- If these two values are primitives, do they share the same primitive value?
- If these two values are objects, do they occupy the same space in memory?
As such, if we have two new Integers, they will not == each other, as we know from above that they will both be given a new piece of memory to own. Instead, we want to call the equals() method, which Integer has overridden to compare the actual value.
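A minimal sketch of the difference between == and equals() on two wrapper objects created with new. The names are hypothetical, and the deprecated Integer(int) constructor is used only because the point depends on calling new:

```java
class IntegerEqualityDemo {
    // Returns { identity comparison, value comparison } for two freshly created wrappers.
    @SuppressWarnings({"deprecation", "removal"})
    static boolean[] compare(int value) {
        Integer first = new Integer(value);
        Integer second = new Integer(value);
        return new boolean[] { first == second, first.equals(second) };
    }

    public static void main(String[] args) {
        int a = 1000;
        int b = 1000;
        System.out.println(a == b); // prints true: primitives compare by value

        boolean[] result = compare(1000);
        System.out.println(result[0]); // prints false: two separate objects
        System.out.println(result[1]); // prints true: equals() compares the wrapped value
    }
}
```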
Above, we talk about how most objects are thrown away immediately, and those that stick around are likely to stay for a long time and be used a lot.
Java holds the belief that if you’re going to use an integer number, in an overwhelming number of cases, it’ll be small (between -128 and 127). It also believes you’re likely to throw these numbers away and re-use them a lot. If you’re going to be making a large number of predictable Integer objects, why not simply cache them?
Instead of remaking
new Integer(1), which we know is 16 bytes, putting it in Young Generation a lot of times, then throwing it away, wasting both CPU and memory, why not keep it around in a special part of memory?
When you write Integer a = 127;, the value is boxed through Integer.valueOf, which first checks the Integer cache to see if an object for that value already exists. Since you did not use the new keyword, it doesn’t have to guarantee a new piece of memory!
If the value is reused, when we use
== we check the memory addresses of the two values — and since they’re the same, return true.
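A sketch of the cache in action, with hypothetical names. Values from -128 to 127 are guaranteed to be cached. Outside that range the result of == depends on the JVM and its settings (HotSpot, for example, lets the upper bound be raised with -XX:AutoBoxCacheMax), so no fixed output is claimed for 128:

```java
class IntegerCacheDemo {
    static boolean sameObject(int value) {
        Integer a = value; // autoboxed via Integer.valueOf, which consults the cache
        Integer b = value;
        return a == b;     // compares memory addresses, not values
    }

    public static void main(String[] args) {
        System.out.println(sameObject(127)); // prints true: inside the guaranteed cache range
        System.out.println(sameObject(128)); // typically false on a default JVM, but not guaranteed
        System.out.println(Integer.valueOf(128).equals(128)); // prints true regardless of caching
    }
}
```

Whatever the cache does, equals() remains the reliable way to compare two boxed values.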
new keyword itself is fairly straightforward, with no hidden side effects. However, Java’s memory model and other legacy decisions can produce effects that seem surprising in the context of how unambiguous
new keyword is the same. Java has a
new keyword to make the transition seem simpler for C++ developers, however there is no matching
delete keyword but it has nothing to do with
new or memory management!