When to use EntityManager.find() vs EntityManager.getReference() with JPA


I usually use the getReference method when I do not need to read database state (that is, call a getter) and only want to change state (that is, call a setter). As you may know, getReference returns a proxy object, which works together with a powerful feature called automatic dirty checking. Consider the following:

    public class Person {
        private String name;
        private Integer age;
    }

    public class PersonServiceImpl implements PersonService {
        public void changeAge(Integer personId, Integer newAge) {
            Person person = em.getReference(Person.class, personId);
            // person is a proxy
            person.setAge(newAge);
        }
    }

If I call the find method, the JPA provider will, behind the scenes, execute

    SELECT NAME, AGE FROM PERSON WHERE PERSON_ID = ?
    UPDATE PERSON SET AGE = ? WHERE PERSON_ID = ?

If I call the getReference method, the JPA provider will, behind the scenes, execute

    UPDATE PERSON SET AGE = ? WHERE PERSON_ID = ?

Do you know why?

When you call getReference, you get a proxy object, something like this one (the JPA provider takes care of implementing the proxy):

    public class PersonProxy {
        // JPA provider sets up this field when you call getReference
        private Integer personId;
        private String query = "UPDATE PERSON SET ";
        private boolean stateChanged = false;

        public void setAge(Integer newAge) {
            stateChanged = true;
            // append the changed column to the pending UPDATE statement
            query += "AGE = " + newAge;
        }
    }

So, before the transaction commits, the JPA provider checks the stateChanged flag to decide whether or not to update the Person entity. If no rows are updated by the UPDATE statement, the JPA provider will throw EntityNotFoundException, according to the JPA specification.


Assuming you have a parent Post entity and a child PostComment as illustrated in the following diagram:

[Diagram: parent Post entity and child PostComment entity]

If you call find when you try to set the @ManyToOne post association:

    PostComment comment = new PostComment();
    comment.setReview("Just awesome!");

    Post post = entityManager.find(Post.class, 1L);
    comment.setPost(post);

    entityManager.persist(comment);

Hibernate will execute the following statements:

    SELECT p.id AS id1_0_0_,
           p.title AS title2_0_0_
    FROM   post p
    WHERE  p.id = 1

    INSERT INTO post_comment (post_id, review, id)
    VALUES (1, 'Just awesome!', 1)

The SELECT query is useless this time because we don’t need the Post entity to be fetched. We only want to set the underlying post_id Foreign Key column.

Now, if you use getReference instead:

    PostComment comment = new PostComment();
    comment.setReview("Just awesome!");

    Post post = entityManager.getReference(Post.class, 1L);
    comment.setPost(post);

    entityManager.persist(comment);

This time, Hibernate will issue just the INSERT statement:

    INSERT INTO post_comment (post_id, review, id)
    VALUES (1, 'Just awesome!', 1)

Unlike find, getReference only returns an entity Proxy which has nothing but the identifier set. If you access the Proxy's state, the associated SQL statement will be triggered, as long as the EntityManager is still open.

However, in this case, we don’t need to access the entity Proxy. We only want to propagate the Foreign Key to the underlying table record so loading a Proxy is sufficient for this use case.

When loading a Proxy, you need to be aware that a LazyInitializationException can be thrown if you try to access the Proxy reference after the EntityManager is closed.
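To make the Proxy behavior described above more tangible, here is a minimal plain-Java simulation. It is only a sketch and not how Hibernate implements proxies: ToySession, LazyPost and the "First post" title are hypothetical names made up for the illustration, and IllegalStateException stands in for LazyInitializationException. It shows that reading the identifier costs nothing, that reading other state triggers one simulated SELECT, and that access after the session is closed fails.

```java
import java.util.Map;

// Toy stand-in for an EntityManager. NOT the JPA API, only an illustration.
class ToySession {
    private final Map<Long, String> postTitlesById; // pretend "post" table
    private boolean open = true;
    int selectCount = 0; // how many simulated SELECTs were issued

    ToySession(Map<Long, String> postTitlesById) {
        this.postTitlesById = postTitlesById;
    }

    // Like getReference: returns a proxy without touching the "database".
    LazyPost getReference(long id) {
        return new LazyPost(id, this);
    }

    void close() {
        open = false;
    }

    // Called by the proxy the first time its state is needed.
    String loadTitle(long id) {
        if (!open) {
            // Hibernate would raise LazyInitializationException at this point.
            throw new IllegalStateException("session closed, cannot initialize proxy " + id);
        }
        selectCount++;
        return postTitlesById.get(id);
    }
}

// Toy proxy: it knows its identifier and loads the rest on demand.
class LazyPost {
    private final long id;
    private final ToySession session;
    private String title;
    private boolean initialized = false;

    LazyPost(long id, ToySession session) {
        this.id = id;
        this.session = session;
    }

    // Reading the identifier never triggers a load.
    long getId() {
        return id;
    }

    // Reading any other state triggers a single simulated SELECT.
    String getTitle() {
        if (!initialized) {
            title = session.loadTitle(id);
            initialized = true;
        }
        return title;
    }
}

class LazyProxyDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession(Map.of(1L, "First post"));

        LazyPost post = session.getReference(1L);
        System.out.println(post.getId() + " selects=" + session.selectCount);    // 1 selects=0
        System.out.println(post.getTitle() + " selects=" + session.selectCount); // First post selects=1

        LazyPost late = session.getReference(1L);
        session.close();
        try {
            late.getTitle(); // first access happens after close
        } catch (IllegalStateException e) {
            System.out.println("failed after close: " + e.getMessage());
        }
    }
}
```

In the PostComment example, only the identifier of the Proxy is ever needed, which is why no SELECT shows up there.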


This makes me wonder, when is it advisable to use the EntityManager.getReference() method instead of the EntityManager.find() method?

EntityManager.getReference() is a really error-prone method, and there are very few cases where client code needs to use it.
Personally, I have never needed to use it.

EntityManager.getReference() and EntityManager.find(): no difference in terms of overhead

I disagree with the accepted answer, and particularly with this part:

If I call the find method, the JPA provider will, behind the scenes, execute

    SELECT NAME, AGE FROM PERSON WHERE PERSON_ID = ?
    UPDATE PERSON SET AGE = ? WHERE PERSON_ID = ?

If I call the getReference method, the JPA provider will, behind the scenes, execute

    UPDATE PERSON SET AGE = ? WHERE PERSON_ID = ?

This is not the behavior that I get with Hibernate 5, and the javadoc of getReference() doesn't say such a thing:

Get an instance, whose state may be lazily fetched. If the requested instance does not exist in the database, the EntityNotFoundException is thrown when the instance state is first accessed. (The persistence provider runtime is permitted to throw the EntityNotFoundException when getReference is called.) The application should not expect that the instance state will be available upon detachment, unless it was accessed by the application while the entity manager was open.

EntityManager.getReference() spares a query to retrieve the entity in two cases:

1) If the entity is already stored in the Persistence context, that is, the first-level cache.
This behavior is not specific to EntityManager.getReference(): EntityManager.find() will also spare a query to retrieve the entity if the entity is stored in the Persistence context.

You can check this first point with any example.
You can also rely on the actual Hibernate implementation.
Indeed, EntityManager.getReference() relies on the createProxyIfNecessary() method of the org.hibernate.event.internal.DefaultLoadEventListener class to load the entity.
Here is its implementation:

    private Object createProxyIfNecessary(
            final LoadEvent event,
            final EntityPersister persister,
            final EntityKey keyToLoad,
            final LoadEventListener.LoadType options,
            final PersistenceContext persistenceContext) {
        Object existing = persistenceContext.getEntity( keyToLoad );
        if ( existing != null ) {
            // return existing object or initialized proxy (unless deleted)
            if ( traceEnabled ) {
                LOG.trace( "Entity found in session cache" );
            }
            if ( options.isCheckDeleted() ) {
                EntityEntry entry = persistenceContext.getEntry( existing );
                Status status = entry.getStatus();
                if ( status == Status.DELETED || status == Status.GONE ) {
                    return null;
                }
            }
            return existing;
        }
        if ( traceEnabled ) {
            LOG.trace( "Creating new proxy for entity" );
        }
        // return new uninitialized proxy
        Object proxy = persister.createProxy( event.getEntityId(), event.getSession() );
        persistenceContext.getBatchFetchQueue().addBatchLoadableEntityKey( keyToLoad );
        persistenceContext.addProxy( keyToLoad, proxy );
        return proxy;
    }

The interesting part is:

Object existing = persistenceContext.getEntity( keyToLoad );

2) If we never actually use the entity, which echoes the "lazily fetched" wording of the javadoc.
Indeed, the entity is only really loaded once a method is invoked on it.
So the gain would apply to a scenario where we want to load an entity without needing to use it? In real applications this need is really uncommon, and in addition the getReference() behavior is also very misleading, as the next part explains.
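The first case above can be illustrated with a small plain-Java simulation of a persistence context. This is a sketch under stated assumptions, not Hibernate code: ToyPerson, ToyPersistenceContext and the sample row "Ann" are hypothetical names invented for the illustration, and the simulated SELECT is just a counter. It shows that once an entity sits in the context, neither find nor getReference issues another query.

```java
import java.util.HashMap;
import java.util.Map;

// Toy entity: "loaded" is false while the instance is only an unloaded reference.
class ToyPerson {
    final long id;
    final String name;
    final boolean loaded;

    ToyPerson(long id, String name, boolean loaded) {
        this.id = id;
        this.name = name;
        this.loaded = loaded;
    }
}

// Toy persistence context. NOT Hibernate code, only an illustration.
class ToyPersistenceContext {
    private final Map<Long, String> table; // pretend PERSON table: id -> name
    private final Map<Long, ToyPerson> firstLevelCache = new HashMap<>();
    int selectCount = 0; // how many simulated SELECTs were issued

    ToyPersistenceContext(Map<Long, String> table) {
        this.table = table;
    }

    // Like find(): look in the context first, SELECT only on a miss.
    ToyPerson find(long id) {
        ToyPerson existing = firstLevelCache.get(id);
        if (existing != null) {
            return existing;
        }
        selectCount++;
        String name = table.get(id);
        if (name == null) {
            return null; // no such row
        }
        ToyPerson person = new ToyPerson(id, name, true);
        firstLevelCache.put(id, person);
        return person;
    }

    // Like getReference(): look in the context first, otherwise return an
    // unloaded reference without issuing any SELECT.
    ToyPerson getReference(long id) {
        ToyPerson existing = firstLevelCache.get(id);
        if (existing != null) {
            return existing;
        }
        return new ToyPerson(id, null, false);
    }
}

class FirstLevelCacheDemo {
    public static void main(String[] args) {
        ToyPersistenceContext ctx = new ToyPersistenceContext(Map.of(1L, "Ann"));

        ToyPerson first = ctx.find(1L);          // miss: one simulated SELECT
        ToyPerson again = ctx.find(1L);          // hit: no SELECT
        ToyPerson viaRef = ctx.getReference(1L); // hit: same instance, no SELECT

        System.out.println(ctx.selectCount);  // 1
        System.out.println(first == again);   // true
        System.out.println(first == viaRef);  // true
    }
}
```

The lookup at the top of both methods plays the role of the persistenceContext.getEntity( keyToLoad ) call quoted earlier.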

Why favor EntityManager.find() over EntityManager.getReference()

In terms of overhead, getReference() is not better than find(), as discussed in the previous point.
So why use one or the other?

Invoking getReference() may return a lazily fetched entity.
Here, the lazy fetching doesn't refer to the relationships of the entity but to the entity itself.
It means that if we invoke getReference() and the Persistence context is then closed, the entity may never be loaded, so the result is really unpredictable. For example, if the proxy object is serialized, you could get a null reference as the serialized result, or if a method is invoked on the proxy object, an exception such as LazyInitializationException is thrown.

It means that the EntityNotFoundException, which is the main reason to use getReference() (to treat an instance that does not exist in the database as an error), may never be thrown even though the entity does not exist.

EntityManager.find() makes no attempt to throw EntityNotFoundException if the entity is not found. Its behavior is both simple and clear. You will never be surprised, because it always returns either a loaded entity or null (if the entity is not found), but never a proxy that may not have been loaded.
So EntityManager.find() should be favored in the vast majority of cases.
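The difference in failure timing can be sketched in plain Java. This is a simulation under stated assumptions, not the JPA API: ToyStore, LazyName and the sample data are hypothetical names made up for the illustration, and NoSuchElementException stands in for EntityNotFoundException. A find-like lookup reports a missing row immediately as null, while a getReference-like lookup succeeds and only fails when the state is first read, if it is read at all.

```java
import java.util.Map;
import java.util.NoSuchElementException;

// Toy data store. NOT the JPA API, only an illustration.
class ToyStore {
    private final Map<Long, String> namesById; // pretend table: id -> name

    ToyStore(Map<Long, String> namesById) {
        this.namesById = namesById;
    }

    // Like find(): hits the store right away; null means "no such row".
    String find(long id) {
        return namesById.get(id);
    }

    // Like getReference(): always succeeds; existence is checked on first access.
    LazyName getReference(long id) {
        return new LazyName(id, namesById);
    }
}

// Toy lazy handle: the lookup is deferred until get() is called.
class LazyName {
    private final long id;
    private final Map<Long, String> namesById;

    LazyName(long id, Map<Long, String> namesById) {
        this.id = id;
        this.namesById = namesById;
    }

    String get() {
        String name = namesById.get(id);
        if (name == null) {
            // JPA would throw EntityNotFoundException here.
            throw new NoSuchElementException("no row with id " + id);
        }
        return name;
    }
}

class MissingRowDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore(Map.of(1L, "Ann"));

        System.out.println(store.find(42L)); // null, known immediately

        LazyName ref = store.getReference(42L); // no error yet
        try {
            ref.get(); // the missing row is only detected here
        } catch (NoSuchElementException e) {
            System.out.println("late failure: " + e.getMessage());
        }
    }
}
```

If ref.get() were never called, the missing row would go unnoticed, which is the unpredictability described above.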