Groovy Mocking a Method That Exists on Object

In Groovy, the find method exists on Object. find is also an instance method on EntityManager, which is commonly used in JPA to get an instance based on its database id. I was trying to create a Mock of EntityManager like:

def emMock = new MockFor(EntityManager)
emMock.demand.find { Class clazz, Object rquid -> return new Request(rqUID: rquid) }
def em = emMock.proxyDelegateInstance()

This gave me an error though:

groovy.lang.MissingMethodException: No signature of method: com.onelinefix.services.RequestRepositoryImplTest$_2_get_closure1.doCall() is applicable for argument types: (groovy.mock.interceptor.Demand) values: [groovy.mock.interceptor.Demand@a27ebd9]
Possible solutions: doCall(java.lang.Class, java.lang.Object), findAll(), findAll(), isCase(java.lang.Object), isCase(java.lang.Object)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264) ~[groovy-all-1.8.6.jar:1.8.6]
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:877) ~[groovy-all-1.8.6.jar:1.8.6]
at groovy.lang.Closure.call(Closure.java:412) ~[groovy-all-1.8.6.jar:1.8.6]
at groovy.lang.Closure.call(Closure.java:425) ~[groovy-all-1.8.6.jar:1.8.6]

It turns out that find is a DefaultGroovyMethod that’s added to all Objects, and this was causing the mock.demand to get confused because it thought I was trying to call Object.find.

Luckily there is a workaround and that’s to use the ordinal argument for find.

emMock.demand.find(1) { Class clazz, Object rquid -> return new Request(rqUID: rquid) }

That’s enough of a change to invoke the mock version instead of the DefaultGroovyMethod version.
Now if I only knew why….

Struts2 Map Form to Collection of Objects

The Struts2 documentation contains examples that are often basic at best, which can make it challenging to figure out how to do things sometimes. I was working on creating a form that would allow me to select values from a list to connect two objects in a One-to-Many relationship. This is a common relationship for many things. In this example, I’ll use a User and a Role class to demonstrate the concept.

For background, here’s a JPA mapped User and Role class.

import java.util.List;
import javax.persistence.*;
 
@Entity
public class User {
 
    private Long id;
    // ... other member variables
    private List<Role> roles;
 
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    public Long getId() {
        return id;
    }
 
    public void setId(Long id) {
        this.id = id;
    }
 
    @OneToMany
    @JoinTable(name = "UserRoles",
            joinColumns = @JoinColumn(name = "user_Id"),
            inverseJoinColumns = @JoinColumn(name = "role_Id"),
            uniqueConstraints = @UniqueConstraint(columnNames = {"user_Id", "role_Id"})
    )
    public List<Role> getRoles() {
        return roles;
    }
 
    public void setRoles(List<Role> roles) {
        this.roles = roles;
    }
 
    // ... other properties
}
 
@Entity
public class Role {
 
    private Long id;
 
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    public Long getId() {
        return id;
    }
 
    public void setId(Long id) {
        this.id = id;
    }
 
    // ... other properties
}

A list of Roles exists in the database. When a User is created, they are assigned one or more Roles. The User and Roles are connected in the Database through a join table as defined in the mapping. At this point I created some DAO Repository classes to manage the persistence and an Action to handle the form values. (The details of JPA, setting up Persistence and Actions are beyond the scope of this post.)

The part that caused me the most grief ended up being the form. I wanted to add a checkbox list that contained all of the Roles. The example on the Struts site for checkboxlist, a control with 43 documented properties, is:

<s:checkboxlist name="foo" list="bar"/>

Needless to say, there was some ‘figuring out’ to be done.

The form itself is pretty vanilla for the most part. The checkboxlist is the interesting part because it’s what allows us to map the User to the Roles. I knew that I was looking for something to put into the value property of the control that would tell it to pre-select the Role values that were already associated with the User.

I started out with something like:

<s:checkboxlist name="user.roles.id"
                    list="roles"
                    listKey="id"
                    listValue="name"
                    label="%{getText('label.roles')}"
                    value="user.roles"/>

That didn’t work. When you think about it, that makes sense: the keys in the list are ids and the values supplied are Role objects. So I needed to figure out how to get the ids from the Roles. I could have done that in the Action class, but it seemed like there should be a better way, a way that would allow me to continue in more of a Domain fashion.

Doing some research into OGNL, I came upon the list projections section, which was the key.

The OGNL projection syntax gives us user.roles.{id}. Basically, it is a list comprehension that takes a list of Role objects and turns it into a list of Role ids. That list of ids becomes the list of values that will be preselected.
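In plain Java, the projection is equivalent to mapping each Role in the list to its id. A minimal sketch (Role here is a stand-in for the JPA entity above, not the full class):

```java
import java.util.ArrayList;
import java.util.List;

public class ProjectionDemo {

    // Minimal stand-in for the JPA Role entity above
    public static class Role {
        private final Long id;
        public Role(Long id) { this.id = id; }
        public Long getId() { return id; }
    }

    // Plain-Java equivalent of the OGNL projection user.roles.{id}:
    // map each Role in the list to its id
    public static List<Long> projectIds(List<Role> roles) {
        List<Long> ids = new ArrayList<Long>();
        for (Role role : roles) {
            ids.add(role.getId());
        }
        return ids;
    }
}
```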

Knowing that, I can now create a checkbox list that will include the pre-selected values on an edit form:

<s:checkboxlist name="user.roles.id"
                    list="roles"
                    listKey="id"
                    listValue="name"
                    label="%{getText('label.roles')}"
                    value="user.roles.{id}"/>

Enjoy.

Using Quartz.NET, Spring.NET and NHibernate to run Scheduled Tasks in ASP.NET

Running scheduled tasks in web applications is not normally a straightforward thing to do. Web applications are built to respond to requests from users. This request/response lifecycle doesn’t map well to a long-running thread that wakes up to run a task every 10 minutes or at 2 AM every day.

ASP.NET Scheduled Task Options

Using ASP.NET running on Windows, there are a number of different options you could choose to implement this. Windows’ built-in Scheduled Tasks can be run to periodically execute a program. A Windows Service could be constructed that uses a Timer or a Thread to periodically do the work. Both Scheduled Tasks and a Windows Service require you to write a standalone program. You can share DLLs from your Web application, but in the end it is a separate app that needs to be maintained. Another option, if you go this route, is to turn the Scheduled Task or Service into a simple Web Service or REST client that can call your Web application but doesn’t need any knowledge of the jobs themselves.

Another option is an Open Source tool called Quartz.NET. Quartz.NET is based on the popular Java scheduled task runner called (not surprisingly) Quartz. Quartz.NET is a full-featured system that manages Jobs that do the work and Triggers that allow you to specify when you want those jobs run. It can run in your web application itself or as an external service.

The simplest approach to get started is to run Quartz.NET directly in your Web application as a process in IIS. The downside to this is that IIS will periodically recycle its processes and won’t necessarily start a new one until a new web request is made. Assuming you can deal with this nondeterministic behavior, running in an IIS process will be fine. It also creates a relatively easy path that will allow you to migrate to the external service process at a later point if need be.

I’m an ALT.NET kind of .NET developer, so I like to use tools like NHibernate for ORM and Spring.NET for Dependency Injection, AOP and generally wiring everything together. The good news is that Spring.NET supports Quartz.NET through its Scheduling API; start there for some basic information on using Quartz.NET with Spring. The bad news is that the documentation is a bit thin and the examples are basic. I attempt to remedy that in part here.

Using Quartz.NET, NHibernate and Spring.NET to run Scheduled Tasks

The goal is to integrate an existing Spring managed object like a Service or a DAL that uses NHibernate with a Quartz Job that will run on a periodic basis.

To start with, you need to create an interface for your service and then implement that interface. The implementation I’ll leave to you and your problem, but you can imagine the example below using one or more NHibernate DALs to look up Users, find their email preferences, etc.

Implementing Services and Jobs

public interface IEmailService
{
    void SendEveryoneEmails();
}

When implementing your Job you need to know a few details about how Quartz works:

  1. The first thing to understand is that if you are going to use the AdoJobStore to keep your Jobs and Triggers in the database, the Job needs to be Serializable. Generally speaking, your DAL classes, NHibernate Sessions and the like are not going to be serializable. To get around that, we make the properties set-only so that they will not be serialized when the Job is stored in the database.
  2. The second thing to understand is that your Job will not be running in the context of the Web application or a request, so anything you have set up to create connections (such as an OpenSessionInView filter) will not apply to Jobs run by Quartz. This means that you will need to set up your own NHibernate Session for all of the dependent objects to use. Luckily, Spring provides some help with this in the SessionScope class; this is the same class that the OpenSessionInView filter is built on.

Using the Service interface you created, you then create a Job that Quartz.NET can run. Quartz.NET provides the IJob interface that you can implement. Spring.NET provides a base class that implements that interface, QuartzJobObject, which helps deal with injecting dependencies.

using NHibernate;
using Quartz;
using Spring.Data.NHibernate.Support;
using Spring.Scheduling.Quartz;
 
public class CustomJob : QuartzJobObject
{
    private ISessionFactory sessionFactory;
    private IEmailService emailService;
 
    // Set only so they don't get serialized
    public ISessionFactory SessionFactory { set { sessionFactory = value; } }
    public IEmailService EmailService { set { emailService = value; } }
 
    protected override void ExecuteInternal(JobExecutionContext ctx)
    {
        // Session scope is the same thing as used by OpenSessionInView
        using (var ss = new SessionScope(sessionFactory, true))
        {
            emailService.SendEveryoneEmails();
            ss.Close();
        }
    }
}

Wiring Services and Jobs Together with Spring

Now that you have your classes created you need to wire everything together using Spring.

First we have our DALs and Services wired into Spring with something like the following:

<object id="UserDAL" type="MyApp.DAL.UserDAL, MyApp.Data">
  <property name="SessionFactory" ref="NHibernateSessionFactory" />
</object>
<object id="EmailService" type="MyApp.Service.EmailService, MyApp.Service">
  <property name="UserDAL" ref="UserDAL" />
</object>

Next you create a Job that references the Type of the Job class you just created. The type is referenced instead of an instance because the lifecycle of the Job is managed by Quartz itself; it deals with instantiation, serialization and deserialization of the object. This is a bit different from what you might normally expect from a Spring-managed service.

<object id="CustomJob" type="Spring.Scheduling.Quartz.JobDetailObject, Spring.Scheduling.Quartz">
    <property name="JobType" value="MyApp.Jobs.CustomJob, MyApp.Jobs" />
</object>

Once your Job is created, you create a Trigger that will run the Job based on your rules. Quartz (and Spring) offer two types of Triggers: SimpleTriggers and CronTriggers. SimpleTriggers allow you to specify things like “run this task every 30 minutes”. CronTriggers follow a crontab format for specifying when Jobs should run. The CronTrigger is very flexible but can be a little confusing if you aren’t familiar with cron. It’s worth getting to know for that flexibility, though.

<object id="CustomJobTrigger" type="Spring.Scheduling.Quartz.CronTriggerObject, Spring.Scheduling.Quartz">
    <property name="JobDetail" ref="CustomJob"/>
    <property name="CronExpressionString" value="0 0 2 * * ?" /> <!-- run every morning at 2 AM -->
    <property name="MisfireInstructionName" value="FireOnceNow" />
</object>

The last piece is the SchedulerFactory configuration. The SchedulerFactory brings together the Jobs and Triggers with all of the other configuration needed to run Quartz.NET jobs.

A couple of things to understand about configuring the SchedulerFactory:

  1. Setting the DbProvider property (where DbProvider is the db:provider setup used by your NHibernate configuration) tells the SchedulerFactory to use the AdoJobStore and keep the Jobs and Trigger information in the database. The tables need to exist already; Quartz provides a script for creating them.
  2. Running on SQL Server requires a slight change to Quartz. Quartz uses a locking mechanism to prevent Jobs from running concurrently, and for some reason the default configuration uses a FOR UPDATE query that is not supported by SQL Server. (I don’t understand exactly why a .NET utility wouldn’t work with SQL Server out of the box.) To fix the locking, the quartz.jobStore.selectWithLockSQL QuartzProperty needs to be set, as shown in the full configuration below.
  3. The JobFactory is set to the SpringObjectJobFactory because it handles the injection of dependencies into QuartzJobObjects like the one we created above.
  4. SchedulerContextAsMap is a property on the SchedulerFactory that allows you to set properties that will be passed to your Jobs when they are created by the SpringObjectJobFactory. This is where you set all of the Property names and the corresponding instance references to Spring configured objects. Those objects will be set into your Job instances whenever they are deserialized and run by Quartz.

Here’s the whole SchedulerFactory configuration put together:

<object id="SchedulerFactory" type="Spring.Scheduling.Quartz.SchedulerFactoryObject, Spring.Scheduling.Quartz">
    <property name="JobFactory">
        <object type="Spring.Scheduling.Quartz.SpringObjectJobFactory, Spring.Scheduling.Quartz"/>
    </property>
    <property name="SchedulerContextAsMap">
        <dictionary>
            <entry key="EmailService" value-ref="EmailService" />
            <entry key="SessionFactory" value-ref="NHibernateSessionFactory" />
        </dictionary>
    </property>
    <property name="DbProvider" ref="DbProvider"/>
    <property name="QuartzProperties">
        <dictionary>
            <entry key="quartz.jobStore.selectWithLockSQL" value="SELECT * FROM {0}LOCKS WHERE LOCK_NAME=@lockName"/>
        </dictionary>
    </property>
    <property name="triggers">
        <list>
            <ref object="CustomJobTrigger" />
        </list>
    </property>
</object>

Conclusion

Scheduled tasks in ASP.NET applications shouldn’t be too much trouble anymore. Reusing existing Service and DAL classes allows you to easily create scheduled tasks using existing, tested code. Quartz.NET looks to be a good solution for these situations.

Hibernate Query Translators

I’ve recently been doing some performance testing and tuning on an application. It makes use of Hibernate for the data access and ORM and Spring to configure and wire everything together. As I was looking through all of the configuration, I came upon the fact that we were using the ClassicQueryTranslatorFactory. The job of the Query Translator is to turn HQL queries into SQL queries. The ClassicQueryTranslatorFactory is the version that was included in Hibernate 2. Hibernate 3 introduced a new Query Translator, the ASTQueryTranslatorFactory, which makes use of Antlr, a Java-based parser generator in the vein of lex and yacc.

I switched out the ClassicQueryTranslatorFactory for the ASTQueryTranslatorFactory and saw an immediate performance boost of about 15% for the application. I also noticed that fewer queries were being generated for page loads. Of course, this application uses quite a bit of HQL, so if you do not make extensive use of HQL you might not see the same benefits.
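The switch itself is just a one-property change. In Hibernate 3 the translator is selected with the hibernate.query.factory_class property; a sketch of setting it programmatically (in our case the property lived in the Spring-managed Hibernate configuration instead):

```java
import java.util.Properties;

public class TranslatorConfig {

    // Hibernate 3 property selecting the HQL-to-SQL query translator.
    // The Hibernate 2 era value would be
    // org.hibernate.hql.classic.ClassicQueryTranslatorFactory.
    public static Properties hibernateProperties() {
        Properties props = new Properties();
        props.setProperty("hibernate.query.factory_class",
                "org.hibernate.hql.ast.ASTQueryTranslatorFactory");
        return props;
    }
}
```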

I have yet to see any documentation or any other evidence to support the claim that the newer ASTQueryTranslatorFactory would offer better performance, but in my case it seems like it has. Has anyone else noticed this behavior?

Hibernate HQL And Performance

The Hibernate ORM tool gives you the ability to write SQL-esque queries using HQL to do custom joining, filtering, etc. to pull Objects from your database. The documentation gives you a lot of examples of the things you can do, but I haven’t seen any caveats or warnings.

Database Performance

As far as database performance goes there are two major things to start with when you want to understand your database performance:

  • How many queries are run?
  • How expensive are the individual queries?

Not too earth-shattering, is it? Basically, if you run fewer queries of the same cost, you’re better off. Likewise, if you make the queries themselves cost less (by optimizing the queries, creating the proper indexes, etc.), then they will run faster. So of course the best is to do both: identify how to run fewer, faster queries. (Yes, I’m still waiting on my Nobel prize.)

I’ll talk more about fewer queries later…

To make queries faster, you mostly work in the database. You depend on good tools and good statistics. If the size or kind of data changes, you might have to redo this work.

To Optimize your database queries:

  1. Run some queries examining their execution plans
  2. Find some possible columns to index
  3. Create an index
  4. Re-run the queries and examine the execution plans again
  5. Keep it if it’s faster, get rid of it if it’s not
  6. Goto 1

Hibernate and Caches

Hibernate does one thing: It maps Objects to a Relational database. Hibernate is really pretty good at that mapping and can support all kinds of schemas. So you should be able to (relatively) easily map your objects to your schema.

Hibernate also has two potential caching schemes: what it calls Level-1 and Level-2 caching. Level-1 caching is done through the Hibernate Session. As long as the Session is open, any object that you have loaded will be pulled from the Session if you query for it again.
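The Level-1 cache is essentially an identity map keyed by id. A toy sketch of the idea in plain Java (this illustrates the concept only; it is not Hibernate’s actual implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class ToySession {

    private final Map<Long, Object> firstLevelCache = new HashMap<Long, Object>();
    private int queryCount = 0;

    // Simulates session.get(): the "database" is only hit on a cache miss
    public Object get(Long id) {
        Object cached = firstLevelCache.get(id);
        if (cached != null) {
            return cached;   // served from the Level-1 cache, no query issued
        }
        queryCount++;        // pretend we ran a SELECT here
        Object loaded = new Object();
        firstLevelCache.put(id, loaded);
        return loaded;
    }

    public int getQueryCount() {
        return queryCount;
    }
}
```

Loading the same id twice returns the identical instance and only one “query” is ever issued.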

The Level-2 cache is a longer-running, more advanced caching scheme. It allows you to store objects across Hibernate Sessions. You’re often discouraged from using Level-2 caching, but it is very nice for read-only objects that you don’t expect to change in the database (think of pre-defined type information and the like). Again, if you query for one of these objects using Hibernate, then you’ll get the object from the Level-2 cache.

Notice how the Level-1 and Level-2 caches prevent Hibernate from having to re-query the database for a lot of objects. This of course can be a huge performance benefit. Likewise, Hibernate supports Lazy Loading of collections, so if your object is related to a collection of other objects, Hibernate will wait to load them until you need them. Once they’ve been loaded though, they are in the Object graph, so accessing them a second time does not require another round trip to the database.
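Lazy loading can be sketched the same way, again as a toy stand-in rather than Hibernate’s proxy machinery:

```java
import java.util.Arrays;
import java.util.List;

public class LazyAddresses {

    private List<String> addresses;   // null until first access
    private int queryCount = 0;

    // Simulates a lazily loaded collection: the query runs only once,
    // on first access; later calls return the already-loaded list
    public List<String> getAddresses() {
        if (addresses == null) {
            queryCount++;  // pretend we hit the database here
            addresses = Arrays.asList("123 Main St", "456 Oak Ave");
        }
        return addresses;
    }

    public int getQueryCount() {
        return queryCount;
    }
}
```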

All of this lazy loading and caching is about reducing the number of queries you need to run against the database. You can also tweak your Hibernate mapping files to implement things like batching (loading children of multiple parents in one query) to greatly reduce the number of queries that need to be run. You can also specify to pre-load a related object using a left join if you will always need the object and want to get both in the same query. Most of the decisions are dependent on your application and what you are doing, but they are very easy to play with in your configuration and see if they improve your application performance.
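For example, batching is turned on in the mapping file with the batch-size attribute, and eager joining with fetch. A sketch (the class and column names here are illustrative):

```xml
<class name="Person" table="person">
    <id name="id" column="id"/>
    <!-- batch-size: load the addresses of up to 10 Persons in one query -->
    <set name="addresses" inverse="true" batch-size="10">
        <key column="person_id"/>
        <one-to-many class="Address"/>
    </set>
    <!-- alternatively, fetch="join" on the set pre-loads it with a left join -->
</class>
```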

Why the hard time for HQL?

All of the caching and tweaking you can do in your Hibernate mappings (or using Annotations) is totally wasted if you use HQL queries to load your objects.

If you specify fetch="join" in your mapping to do a left join and load a dependent object, that setting does not get used when you use HQL to load the object, so you will be doing more queries than you need.

If you have natural mappings of parent/child relationships, then the following code will only generate a single query to load the Person and a single query to get the Addresses.

Person p = (Person) session.get(Person.class, 1L);
List<Address> address = p.getAddresses();
List<Address> address2 = p.getAddresses();

This code still only generates two queries:

Person p = (Person) session.createQuery("from Person where id = :id")
        .setParameter("id", 1L)
        .uniqueResult();
List<Address> address = p.getAddresses();
List<Address> address2 = p.getAddresses();

But the following code generates twice as many queries to load the addresses.

Person p = (Person) session.createQuery("from Person where id = :id")
        .setParameter("id", 1L)
        .uniqueResult();
List<Address> address = session
        .createQuery("from Address where person.id = :id")
        .setParameter("id", 1L)
        .list();
List<Address> address2 = session
        .createQuery("from Address where person.id = :id")
        .setParameter("id", 1L)
        .list();

Of course this is a totally contrived example, but if you’ve built out a large system with a Service Facade and DAOs, these kinds of things can easily be hidden deep in the application, where it is hard to know whether a call will trigger a database query or not. So be very conscious of using HQL queries and the consequences of using them.

Hibernate rewards you for using natural relationships in your Objects. It rewards you with performance for building a POJO based Object Oriented system.

Hibernate HQL Rules

Rule #1: Don’t use HQL.
Rule #2: If you really need to use HQL, see Rule #1.
Rule #3: If you really, really need HQL and you know what you’re doing, then carefully use HQL.

Ok, so if I’m right about this, why is this not at the top of the HQL documentation? Don’t you think they should talk about this as a method of last resort?

Time to start reading POJOs in Action again.

Tapestry and Hibernate Take 2

In a previous post I lamented about Tapestry and Hibernate fighting each other. This was very soon after I had started using Tapestry for the first time, and I wanted to talk a little bit more about it now that I’ve been using it for a while. So now I can rant with much more detail.

Cut to the Chase

The short answer is that I have more mixed feelings about it now than I did before, but I still think that Tapestry and Hibernate are not a good combination. There is too much of an impedance mismatch between the two. They are both very complex frameworks with leaky abstractions that make you have to know too much to use them well.

Complexity and Leaky Abstractions Sink Ships

Hibernate and Tapestry are incredibly complex pieces of software. Throw in a lot of hidden, “magic” functionality in Tapestry from its use of HiveMind and it’s even harder to understand what’s going on. This complexity and lack of transparency mean that understanding the deterministic behavior of an application is incredibly difficult. It seems like people “just try different things until something works”. I would call this Programming by Coincidence. The abstractions are just leaky enough that you have to know the implementation details to use the framework effectively.

Example:
The @Persist annotation (or the persist property in the jwc/page files, etc.) stores objects in the Web Session. If you modify the state of the object, that change is not persisted, because Tapestry does not know the object was changed; you have to actually call the setter of the persisted property to get a new value stored. This doesn’t sound like that big of a deal until you try to integrate something like Tapernate to help with Hibernate persistence issues. Tapernate merges the persistent Hibernate object back into the current Hibernate Session. If you’ve modified an @Persist bean and Tapernate merges it back into the Session, you end up with stale data in your Hibernate Session unless you call the setter and set the value to null. But if you set the value to null, you break the back button, because someone will click back and then get a NullPointerException when they try to submit the form.

I should not put all of the blame on Tapestry, though. Hibernate is itself a hugely complex piece of software. (ORM is a complex problem, so some of this complexity might be unavoidable.) There are so many possible ways of mapping objects that you can never truly know whether you’ve done it the right or the best way, and “best” is always dependent on your situation and how your application is used, which makes it even harder to discuss with others. Hibernate does a lot of work to ensure the consistency of its Session; if there’s a possibility of it being inconsistent, it throws exceptions.

So you get LazyInitializationExceptions because the Session is closed, NonUniqueObjectExceptions because an object with the same id is already in the Session, TransientObjectExceptions because you didn’t schedule a new object for saving in the Session, etc., etc. Way too much implementation detail! Of course, if you ever suggest to the Hibernate people that this is too much detail and they should “just handle it”, you get told how smart they are, how dumb you are, and how you wouldn’t like the consequences of them not throwing these exceptions. Isn’t that helpful.

As Ted Neward said, ORM is the Vietnam of Computer Science.

Conclusion

This can’t be the best we can do. We’ve seen interesting things like Active Record come out of the Ruby world. Active Record is not perfect, but I have yet to see a NonUniqueObjectException or a lazy loading error, which is a Good Thing™ since I don’t care about those details. I think the application-centric, declarative configuration of Spring is much easier to understand and less error-prone than the configuration of HiveMind.

New axioms:

  • All things being equal, using fewer frameworks is better.
  • Using less complex frameworks is better.
  • Managing complexity is the premier goal of software development.
  • Code reuse should not trump simplicity of design and simplicity of implementation.