Sunday, December 2, 2012

Postgres random in subquery

There was a use-case in a project where a huge amount of test data was required, but the data had to meet a constraint on one property. The database schema looked like the following:

item:
item_name | price | color  | store
Item 1    | 32    | red    | null
Item 2    | 3     | blue   | null
Item 3    | 7     | yellow | null

store:
store_id | name    | coordinate
1        | Store 1 | 12°34′56″N 12°34′56″E
2        | Store 2 | 12°34′56″N 12°34′56″W
3        | Store 3 | 12°34′56″S 12°34′56″E
4        | Store 4 | 12°34′56″S 12°34′56″W

The requirement was that the item table had to be populated with approximately 1 million items, and these items should be randomly associated with the stores. The stores were previously defined, and the store table had already been populated. The items were partially defined; only the store association was null.

The database was Postgres 8.3.

The first SQL attempt was the following:

update item set store = (select store_id from store order by random() limit 1) where item.store is null;
The result was:
item_name | price | color  | store
Item 1    | 32    | red    | 2
Item 2    | 3     | blue   | 2
Item 3    | 7     | yellow | 2

All items received the same store id. It seemed as if the random selection was evaluated only once and the result was reused for every record. This can happen because the inner select does not depend on the outer query in any way, so Postgres can treat it as a completely independent query and evaluate it just once.

The solution is to make the subquery depend on the outer query, even if the dependency looks quite silly from the developer's point of view:

update item t1 set store = (select t2.store_id from store t2 where t1.item_name = t1.item_name order by random() limit 1) where t1.store is null;

The dependency here is the where t1.item_name = t1.item_name part: it references a non-null column of the outer row, so the subquery becomes correlated and has to be re-evaluated for every item. Using this query, the result was exactly what we wanted to achieve:

item_name | price | color  | store
Item 1    | 32    | red    | 2
Item 2    | 3     | blue   | 4
Item 3    | 7     | yellow | 1


Thursday, October 18, 2012

Some random JPA pitfalls

There are some common mistakes that are made even by people who are not novices at building JPA 2.0 Criteria queries.

One of the most common mistakes is the use of the isNull() and isNotNull() functions.
The other day I spent a considerable amount of time debugging code that contained the following:

...
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Foo> criteriaQuery = cb.createQuery(Foo.class);
Root<Foo> root = criteriaQuery.from(Foo.class);
Predicate predicate = cb.conjunction();
predicate = cb.and(predicate, cb.isNull(root.get(Foo_.amount)));
...
criteriaQuery.where(predicate);
...

In this case, the generated SQL did not include the expected "AND foo.amount IS NULL" part. This form of the expression is meant for other cases, such as a CASE expression, and preferably operates on constants.

The correct form of the predicate is this:

predicate = cb.and(predicate, root.get(Foo_.amount).isNull());

The resulting SQL now contains the foo.amount IS NULL part, so the query finally works. The hard part is that the mistake is not so trivial to spot (at least for me) by looking at just the source code; but if debugging is possible and the generated SQL can be inspected, then pointing out the mistake is not a big deal.
This applies to the isNotNull() and in(Collection<E>) functions as well.
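For illustration, here is a minimal sketch of the same pattern applied to isNotNull() and in(); it assumes the Foo entity from above plus a hypothetical Foo_.status String attribute, and is not taken from the original code:

CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Foo> criteriaQuery = cb.createQuery(Foo.class);
Root<Foo> root = criteriaQuery.from(Foo.class);
Predicate predicate = cb.conjunction();
// foo.amount IS NOT NULL - built on the expression, not via CriteriaBuilder.isNotNull()
predicate = cb.and(predicate, root.get(Foo_.amount).isNotNull());
// foo.status IN (...) - again the Expression.in(Collection) form (Foo_.status is hypothetical)
predicate = cb.and(predicate, root.get(Foo_.status).in(Arrays.asList("NEW", "ACTIVE")));
criteriaQuery.where(predicate);
List<Foo> result = em.createQuery(criteriaQuery).getResultList();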

Another common pitfall is trying to build two different queries with the same predicate instance: one creates a predicate for a query, executes the query, and then tries to run another select with the previously created predicate. A typical use-case is a count query and a select query where the same filters are applied.
Unfortunately this cannot be done with a predicate that was built against a different query's root instance.
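What does work (a minimal sketch, assuming the Foo entity from above and a hypothetical buildFilter() helper) is to rebuild the same filter logic separately for each query, against that query's own root:

// hypothetical helper that applies the same filter logic to whichever root it is given
private Predicate buildFilter(CriteriaBuilder cb, Root<Foo> root) {
    return cb.and(root.get(Foo_.amount).isNotNull());
}

// count query, with its own root
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Long> countQuery = cb.createQuery(Long.class);
Root<Foo> countRoot = countQuery.from(Foo.class);
countQuery.select(cb.count(countRoot)).where(buildFilter(cb, countRoot));
Long total = em.createQuery(countQuery).getSingleResult();

// select query, with its own root but the same filter logic
CriteriaQuery<Foo> selectQuery = cb.createQuery(Foo.class);
Root<Foo> selectRoot = selectQuery.from(Foo.class);
selectQuery.select(selectRoot).where(buildFilter(cb, selectRoot));
List<Foo> rows = em.createQuery(selectQuery).getResultList();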

The third one is not quite a mistake, but can easily become one; it is listed here basically for prevention.
What is the difference between the following two queries, and which one can go wrong?

TypedQuery<Long> randomSelectForSingleResult = em.createQuery(criteriaQuery);
long result = randomSelectForSingleResult.getSingleResult();

TypedQuery<Long> randomSelectForSingleResult = em.createQuery(criteriaQuery);
List<Long> resultList = randomSelectForSingleResult.getResultList();
long result = 0;
if (!resultList.isEmpty())
    result = resultList.get(0);

The problem is that if the result of the first kind of selection does not contain EXACTLY one element, getSingleResult() throws an exception. This may not happen if the selection is a count, but in other cases it can; so if there is the tiniest, slightest chance that the query returns more or fewer than one result, the second form should be used.
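For completeness, here is a minimal sketch of what the first form has to guard against; the exception types are the standard javax.persistence ones:

Long result;
try {
    result = randomSelectForSingleResult.getSingleResult();
} catch (NoResultException e) {
    // zero rows matched
    result = null;
} catch (NonUniqueResultException e) {
    // more than one row matched - decide what that means for the use-case
    throw e;
}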

Friday, October 12, 2012

received message with Unknown Transaction ID -1: ignoring message

Recently I ran into an interesting problem. In a pretty complex architecture (several Glassfish servers, multiple Liferay nodes) which had run without a problem for over a year, we had to implement a minor improvement. It was a pretty standard JMS message to be sent from one Glassfish instance to another. We already had several JMS messages cruising around in the system, so this seemed a quite straightforward task.

Some key facts about the setup: Spring JTA transaction management (Spring Framework 3.0), the built-in JMS implementation of Glassfish 3.0.1, and Liferay 6.0.6. The database used read-committed isolation and XA transactions; the JMS provider supported local transactions.

Previously we had not encountered any problems regarding transactions and JMS message sending, as long as the service implementation was annotated with @org.springframework.transaction.annotation.Transactional(readOnly = false).

The required change was that a file should be uploaded to the Liferay Document Library at the end of the method, then a JMS message should be sent to broadcast the event. After implementing the method, the JMS send failed with the following exception:

java.lang.RuntimeException: com.sun.messaging.jms.JMSException: [SEND_REPLY(9)] [C4036]: A broker error occurred. :[500] transaction failed: Unexpected Broker Exception: [received message with Unknown Transaction ID -1: ignoring message] user=admin, broker=localhost:2076(50283)

After a while it became clear that if the Liferay file upload was left out of the method, everything worked as expected. The transaction was active for the method; the stack trace was clear about that. There was no point in debugging and patching the Liferay part, because after an upgrade to a newer version it might have to be patched again for our code to work.

So it seemed as if Liferay terminated (committed) the transaction during the file upload. We needed to step out of the current transaction while leaving it open, then return to it after the file upload completed, and continue as before. Since the transaction context is bound to the current thread, the way to do that was to start a new thread to do the dirty work. The Java standard library has a built-in solution for this:

// run the upload on a separate thread, outside the current (thread-bound) transaction
ExecutorService executorService = Executors.newSingleThreadExecutor();
Future<Object> future = executorService.submit(new Callable<Object>() {
  @Override
  public Object call() throws Exception {
    try {
      // do the file upload in the separate thread
      DLFileEntryLocalServiceUtil.
        addFileEntry(userId, groupId, folderId, finalFilename,
        finalFilename, description, (String) null, null,
        finalBytes, serviceContext);
    } catch (final SystemException e) {
      // return the exception instead of throwing, so the caller can inspect it
      return e;
    } catch (final PortalException e) {
      return e;
    }
    return Boolean.TRUE;
  }
});

// wait for the upload to finish, then continue in the original transaction
Object object = future.get();
executorService.shutdown();
if (object instanceof Boolean) {
  if (((Boolean) object).booleanValue()) {
    // everything OK
  } else {
    // something went wrong
  }
} else if (object instanceof Exception) {
  // something went wrong
}

This implementation certainly has its drawbacks: since the upload happens outside the transaction, extra care must be taken to remove the uploaded file if the transaction fails afterwards. In our case this was not an issue: it was acceptable to keep a file that had been uploaded by a transaction that later rolled back.
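If the orphaned file did matter, one option under the Spring setup described above (a minimal sketch; cleanUpUploadedFile() is a hypothetical helper, not part of the original code) would be to register a transaction synchronization right after the upload and remove the file whenever the surrounding transaction does not commit:

// register a callback on the current, still-open transaction
TransactionSynchronizationManager.registerSynchronization(
    new TransactionSynchronizationAdapter() {
      @Override
      public void afterCompletion(int status) {
        if (status != TransactionSynchronization.STATUS_COMMITTED) {
          // hypothetical helper that deletes the file uploaded outside the transaction
          cleanUpUploadedFile(groupId, folderId, finalFilename);
        }
      }
    });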