Architecturally Elegant -

Composable querying with Scala

Jyri-Matti Lähteenmäki


Tags: jpa2 scala sql

It's been a while since my last post, but finally I was able to find enough time to re-implement the whole querying-thing and experiment with composability. My initial implementation was partly mutable (orgh, sorry about that...) which made true composability somewhat difficult (well, impossible, I guess).

I've been using JPA2 (Java Persistence API, version 2) for a while now at my day job, and I finally figured out a way to make the metamodel and the criteria API at least somewhat useful. This was done by separating the construction of the queries from actually executing them, which conveniently allowed me to use the same query with different projections, pagination etc.

But no matter what I tried I just couldn't figure out how to construct the queries by composing them from reusable parts. The predicate seems to be the biggest unit of reusability, but it's more of a joke than anything useful. My current belief is that JPA2 just cannot be used to create composable queries.

Now welcome Scala. LINQ in .NET also boasts with composability, and why shouldn't it, since I guess it really works. Scala has its for comprehension, and behold, it can be used to create composable queries. Here are some examples that actually work (well, since SQL generation isn't one of the most interesting problems in software science, my generated SQL might be more or less incorrect, but it does seem to work correctly on a h2 database):

object Queries {
  def wellPaidEmployees(es: View[Employee]) = for {
    e <- es if e.salary.isDefined && e.salary.get > 3000
  } yield e

  def namesAndSalariesOf(es: View[Employee]) = for {
    e <- es
  } yield (e.fullName, e.salary)

  def namesAndSalariesOfWellPaidEmployees(es: View[Employee]) =

  def increasedSalariesForDepartment(ds: View[Department]) = for {
    d <- ds
    e <- d.employees if e.salary.isDefined
  } yield (e.fullName, "Old salary: ", e.salary.get,
           "New salary: ", e.salary.get + e.salary.get*2%42)

  val itDepartment = for (d <- Departments if == "IT") yield d

  val rndDepartment = for {
    d <- Departments if contains "research"
  } yield d

  val rndEmployees = for (d <- rndDepartment; e <- d.employees) yield e

  val wellPaidRnDEmployees = wellPaidEmployees(rndEmployees)

  val exceptionalSalaries = for {
    e <- Employees if e.fullName contains "Bill"
  } yield e.salary.get

  val employeesFromITandRndDepartments = (for {
    d1 <- itDepartment
    d2 <- rndDepartment
    e1 <- d1.employees
    e2 <- d2.employees
  } yield Set(e1, e2)).flatten.distinct

  val amountOfEmployeesFromITandRndDepartments =
      employeesFromITandRndDepartments size

  val exceptionalSalariesFromRndDepartment = for {
    e <- rndEmployees if exceptionalSalaries contains e.salary.get
  } yield e.salary

  val namesAndSalariesOfWellPaidRnDEmployees =

  val increasedSalariesForRnD = increasedSalariesForDepartment(rndDepartment)

Now look at that, it's just beautiful! I can use independent queries as part of other queries, or define incomplete queries that can be completed by providing the missing parts. Also, there's no need for the actual database session while constructing the queries. Since all the queries are immutable, they can be defined as singleton values.

And all these compile and work if I throw away all SQL stuff and just use case classes and in-memory collections. I only need to change View[_] to Traversable[_] or declare something like type View[E] = Traversable[E]. The last of the queries, when executed, generates SQL like this:

SELECT e48.fullName, 'Old salary: ', e48.salary, 'New salary: ',
      FROM Department d49
      WHERE LOWER( LIKE '%research%') d50
INNER JOIN Employee e48 ON
WHERE e48.salary IS NOT NULL

While these examples already demonstrate some implemented "SQL features", I guess I'm now going to spend some time implementing a bunch of more stuff to see if I run into trouble. After that, it would be fascinating to try querying XML...