Hibernate HQL And Performance
Wednesday, 23 August 2006
The Hibernate ORM tool gives you the ability to write SQL-esque queries using HQL to do custom joining, filtering, etc. to pull Objects from your database. The documentation gives you a lot of examples of the things you can do, but I haven’t seen any caveats or warnings.
Database Performance
As far as database performance goes, there are two major things to look at first:
- How many queries are run?
- How expensive are the individual queries?
Not too earth shattering is it? Basically if you run fewer queries of the same cost you’re better off. Likewise, if you make the queries themselves cost less (by optimizing the queries themselves, creating the proper indexes, etc.) then they will run faster. So of course the best is to do both: run fewer, faster queries. (Yes, I’m still waiting on my Nobel prize.)
I’ll talk more about fewer queries later…
To make queries faster, you mostly are working in the database. You depend on good tools and good statistics. If the size and kind of data changes, you might have to redo this stuff.
To Optimize your database queries:
- Run some queries examining their execution plans
- Find some possible columns to index
- Create an index
- Re-run the queries and examine the execution plans again
- Keep it if it’s faster, get rid of it if it’s not
- Go back to the first step and repeat
Hibernate and Caches
Hibernate does one thing: It maps Objects to a Relational database. Hibernate is really pretty good at that mapping and can support all kinds of schemas. So you should be able to (relatively) easily map your objects to your schema.
Hibernate also has two potential caching schemes, which it calls Level-1 and Level-2 caching. Level-1 caching is done through the Hibernate session. As long as the Hibernate session is open, any object that you have loaded will be pulled from the session if you query for it again.
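To make the session-level idea concrete, here is a minimal sketch in plain Java that imitates it. This is not Hibernate's implementation: ToySession, loadFromDatabase and the query counter are hypothetical names made up for illustration, and the "database" is just a method that counts how often it is called.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a session-scoped (Level-1) cache. All names are hypothetical.
class ToySession {
    private final Map<Integer, String> loaded = new HashMap<>();
    private int queryCount = 0;

    // Pretend database round-trip; it only counts how often it is called.
    private String loadFromDatabase(int id) {
        queryCount++;
        return "Person#" + id;
    }

    // Return the already-loaded object if this session has seen the id before,
    // otherwise go to the "database" once and remember the result.
    String get(int id) {
        return loaded.computeIfAbsent(id, this::loadFromDatabase);
    }

    int getQueryCount() {
        return queryCount;
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.get(1);
        session.get(1); // served from the session, no second round-trip
        session.get(2);
        System.out.println("queries: " + session.getQueryCount()); // prints "queries: 2"
    }
}
```

Three lookups cost two simulated queries, because the repeated lookup for id 1 never leaves the session.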
The Level-2 cache is a longer-running, more advanced caching scheme. It allows you to store objects across Hibernate sessions. You’re often discouraged from using Level-2 caching, but it is very nice for read-only objects that you don’t expect to change in the database (think of pre-defined type information and the like). Again, if you query for one of these objects using Hibernate, then you’ll get an object from the Level-2 cache.
Notice how the Level-1 and Level-2 caches prevent Hibernate from having to re-query the database for a lot of objects. This of course can be a huge performance benefit. Likewise, Hibernate supports Lazy Loading of collections, so if your object is related to a collection of other objects, Hibernate will wait to load them until you need them. Once they’ve been loaded though, they are in the Object graph, so accessing them a second time does not require another round-trip to the database.
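The lazy loading behavior can be imitated with a few lines of plain Java. Again, this is only a sketch under my own assumptions, not how Hibernate's persistent collections are actually written: LazyAddresses and its loader are hypothetical, and the loader stands in for the query that would fetch the collection.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Toy lazy collection holder: the loader runs on first access only.
// All names are hypothetical; the loader stands in for a database query.
class LazyAddresses {
    private final Supplier<List<String>> loader;
    private List<String> addresses; // stays null until the first access
    private int loads = 0;

    LazyAddresses(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    // First call runs the loader; later calls reuse the loaded list.
    List<String> get() {
        if (addresses == null) {
            addresses = loader.get();
            loads++;
        }
        return addresses;
    }

    int getLoads() {
        return loads;
    }

    public static void main(String[] args) {
        LazyAddresses lazy = new LazyAddresses(() -> Arrays.asList("1 Main St", "2 Oak Ave"));
        System.out.println("loads before access: " + lazy.getLoads());     // prints 0
        lazy.get();
        lazy.get();
        System.out.println("loads after two accesses: " + lazy.getLoads()); // prints 1
    }
}
```

Nothing is loaded until someone asks for the collection, and asking twice still costs only one simulated query.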
All of this lazy loading and caching is about reducing the number of queries you need to run against the database. You can also tweak your Hibernate mapping files to implement things like batching (loading children of multiple parents in one query) to greatly reduce the number of queries that need to be run. You can also specify to pre-load a related object using a left join if you will always need the object and want to get both in the same query. Most of the decisions are dependent on your application and what you are doing, but they are very easy to play with in your configuration and see if they improve your application performance.
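To see why batching matters, here is a rough back-of-the-envelope simulation in plain Java. It only counts simulated queries; BatchDemo, its methods and the batch size of 5 are hypothetical choices for illustration, and the real batching logic inside Hibernate is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Toy comparison of per-parent loading vs. batch loading of children.
// All names are hypothetical; "queries" are just counted, never executed.
class BatchDemo {
    private int queryCount = 0;

    // One simulated query per parent (the classic N+1 shape).
    void loadChildrenOneByOne(List<Integer> parentIds) {
        for (int i = 0; i < parentIds.size(); i++) {
            queryCount++; // think: where parent_id = ?
        }
    }

    // One simulated query per chunk of batchSize parents.
    void loadChildrenInBatches(List<Integer> parentIds, int batchSize) {
        for (int start = 0; start < parentIds.size(); start += batchSize) {
            queryCount++; // think: where parent_id in (?, ?, ...)
        }
    }

    int getQueryCount() {
        return queryCount;
    }

    public static void main(String[] args) {
        List<Integer> parents = new ArrayList<>();
        for (int id = 1; id <= 10; id++) {
            parents.add(id);
        }

        BatchDemo oneByOne = new BatchDemo();
        oneByOne.loadChildrenOneByOne(parents);
        System.out.println("one by one: " + oneByOne.getQueryCount()); // prints 10

        BatchDemo batched = new BatchDemo();
        batched.loadChildrenInBatches(parents, 5);
        System.out.println("batches of 5: " + batched.getQueryCount()); // prints 2
    }
}
```

Ten parents cost ten child queries one at a time, but only two when loaded five at a time.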
Why the hard time for HQL?
All of the Caching and tweaking you can do in your Hibernate mappings (or using Annotations) is totally wasted if you use HQL queries to load your objects.
If you specify fetch="join" in your mapping to do a left join and load a dependent object, that doesn’t get used when you use HQL to load the object, so you will be doing more queries than you need.
If you have natural mappings of parent/child relationships then the following code will only generate a single query to load the Person and a single query to get the Addresses.
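Person p = (Person) session.get(Person.class, 1); // one query for the Person
List<Address> address = p.getAddresses();          // first access: one query for the Addresses
List<Address> address2 = p.getAddresses();         // already loaded: no query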
This code still only generates two queries:
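Person p = (Person) session.createQuery("from Person where id=:id")
    .setParameter("id", 1).uniqueResult();  // one query for the Person
List<Address> address = p.getAddresses();   // first access: one query for the Addresses
List<Address> address2 = p.getAddresses();  // already loaded: no query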
But the following code generates twice as many queries to load the addresses.
Person p = (Person) session.createQuery("from Person where id=:id")
    .setParameter("id", 1).uniqueResult();
// Each HQL query below goes back to the database, even though it asks for the same rows.
List<Address> address = session
    .createQuery("from Address where person_id=:id")
    .setParameter("id", 1).list();
List<Address> address2 = session
    .createQuery("from Address where person_id=:id")
    .setParameter("id", 1).list();
Of course this is a totally contrived example, but if you’ve built out a large system with a Service Facade and DAOs these kinds of things can easily be hidden deep in the application where it would be hard to know whether a call would trigger a database call or not. So be very conscious of using HQL queries and the consequences of using them.
Hibernate rewards you for using natural relationships in your Objects. It rewards you with performance for building a POJO based Object Oriented system.
Hibernate HQL Rules
Rule #1: Don’t use HQL.
Rule #2: If you really need to use HQL, see Rule #1.
Rule #3: If you really, really need HQL and you know what you’re doing, then carefully use HQL.
Ok, so if I’m right about this, why is this not at the top of the HQL documentation? Don’t you think they should talk about this as a method of last resort?
Time to start reading POJOs in Action again.
No. 1 — May 30th, 2007 at 9:04 am
Good lord, I hope God almighty smites you down for your treacherous lies. HQL is friking amazing, and only a billion times better than the poor alternative of the criteria API; which is dooooooooooooooog slow.
I suggest you get some experience using HQL in a production environment before you tarnish the internet with lowly “opinions”.
No. 2 — May 30th, 2007 at 9:13 am
That’s funny… I didn’t mention the Criteria API at all.
The point, for those who obviously don’t get it, is that you can misuse and abuse HQL and give up a lot of the power of mapping objects, and thus Level 1 caching that Hibernate provides.
No. 3 — July 4th, 2007 at 7:02 pm
“If you specify a fetch="join" in your mapping to do a left join and load a dependent object, that doesn’t get used when you use HQL to load the object, so you will be doing more queries than you need.”
With HQL, one can just do the following to get the object and the dependent object in one DB trip: “from parent as p left join fetch p.dependentObject” - important keyword “fetch” on the join. I believe this would be equivalent to using the explicit config of fetch="join" which gets ignored with HQL as you state.
From hibernate.org: In addition, a “fetch” join allows associations or collections of values to be initialized along with their parent objects, using a single select. This is particularly useful in the case of a collection. It effectively overrides the outer join and lazy declarations of the mapping file for associations and collections. http://www.hibernate.org/hib_docs/v3/reference/en/html/queryhql.html#queryhql-joins-forms
No. 4 — July 5th, 2007 at 9:40 am
You of course can make all kinds of modifications to HQL sprinkled all over your data access layer. But that knowledge is already represented in your mapping file, so this is repeating that knowledge in many places. And it doesn’t “override” the declarations in the mappings, because those declarations aren’t ever used by HQL in the first place.
The main problem is when going back and running the same query that could have been expressed as an object relationship. The natural object relationships are where hibernate shines. HQL should be reserved for very special situations.
No. 5 — October 26th, 2007 at 9:32 am
I’ve been using NHibernate HQL and Criteria in a production environment for years and when used properly Criteria is a LOT faster than HQL. I’m guessing this is not the case with Hibernate Criteria.
No. 6 — October 26th, 2007 at 10:30 am
Rob,
That’s exactly what I was trying to say. Use natural relationships and criteria queries to load objects in a domain driven way. Do not do a transaction script pattern of a bunch of HQL queries buried in an application un-aware data access layer.
No. 7 — November 30th, 2007 at 7:45 am
The criteria API has the same “limitations” in regard to the cache that you are pointing out with HQL.
No. 8 — December 15th, 2007 at 3:39 pm
There’s virtually no performance difference between HQL and Criteria. The principal benefit of the Criteria API is that the code is cleaner and more abstracted from the final generated SQL. However, the more complex your queries become, the deeper you must delve into the inner workings of Hibernate–which kind-of defeats the purpose.
I find I still end up using HQL for most of my DAOs, however, because HQL can return scalar results which are far more performant than fully-hydrated objects. Sometimes I don’t need a whole domain object: I just want a couple of properties. For instance: let’s say I have a Customer class that has dozens of simple properties (id, name, account number, etc.), as well as a bunch of associated collections (Addresses, Bank Accounts, Transactions, etc.). If I do a criteria query on the Customer.class, I’ll get a bunch of Customer objects back with all the simple (non-lazy) properties, and PersistentCollection placeholders for my (lazy-loaded) collections.
But what if I only need a list of names and ids? I don’t need placeholder objects taking up space in my JVM if they’re never used. A good example of this is drop-down auto-complete forms: you type something in, a list of partially matched Customers comes back. You’re not interested in the Customer’s addresses, or their lastModified date, or any of that stuff. You really just want an array of strings.
> But that knowledge is already represented in your mapping file, so this
> is repeating that knowledge in many places
The Hibernate docs actually recommend *against* specifying fetching strategies in the mappings. From the docs (19.1.2. - Tuning fetch strategies): “Usually, we don’t use the mapping document to customize fetching. Instead, we keep the default behavior, and override it for a particular transaction…”
No. 9 — January 8th, 2008 at 9:45 am
I’d have to agree with the Hibernate docs on fetch strategies: if you specify the fetch strategy in the domain mapping (whether by annotation or the xml), then you’re committing every use of that relationship to the same strategy, whether that is appropriate or not.
E.g. you have a Brokerage that contains many Brokers. When you fetch a list of Brokerages, you do not want the child relationship brokers to be eagerly fetched - it should be allowed to remain in the normal lazy mode. However, when you’re fetching a specific Brokerage for display, in a context that wants to also display a table of the Brokers within the Brokerage, then you want the Brokers to be fetched eagerly (perhaps with a “from parent as p left join fetch p.dependentObject” DAO method) as part of the Brokerage fetch to avoid an N+1 fetch scenario.
(If Brokers and Brokerages are not your bag, you can substitute Brokerage with Order, and Broker with OrderLine for the same effect)
Providing the DAO layer is the primary method for fetching objects (and you never pollute the service/UI layers with hibernate manipulations of the persisted object graph) then it is quite reasonable to have transaction specific logic - it’s still only in one place, just not enforced at a POJO domain object level, where there is no knowledge of the transactional context of an operation.
No. 10 — January 13th, 2008 at 8:03 pm
Matt,
>But what if I only need a list of names and ids? I don’t need placeholder
>objects taking up space in my JVM if they’re never used.
So how can one achieve such a query which does not pull in all the properties?
Thanks,
Dan
No. 11 — January 27th, 2008 at 6:37 am
Matt,
you can easily load scalar values with the criteria API, you do not need HQL for this.
Example:
You have a Company with a one-to-one relation to an Address.
You just want to query id and name of the Company and the countryCode of its related Address.
myCompanyCriteria.createAlias("address", "joinedAddress");
myCompanyCriteria.setProjection(
    Projections.projectionList()
        .add( Projections.id(), "id" )
        .add( Projections.property("name"), "name" )
        .add( Projections.property("joinedAddress.countryCode"), "countryCode" )
);
myCompanyCriteria.setResultTransformer( new AliasToBeanResultTransformer(CompanyVO.class) );
The class CompanyVO just needs setters for id, name and countryCode.
Regards
No. 13 — September 24th, 2008 at 9:42 pm
Dude,
This is a completely stupid article. Don’t use SQL eh?? Have you ever built a real app? I wish google didn’t index your post!!!