Architecturally Elegant - lahteenmaki.net

Functional programming in Java

Jyri-Matti Lähteenmäki

2022-08-22

Tags: fp java

Background

I work at a software consultancy called Solita, where I've spent the past ~10 years implementing software systems related to Finnish railways. Part of what we've been doing is a spatial and temporal data model of railway infrastructure and planning restrictions, with rich RESTful APIs. These are also available as open data for those who are interested: see https://rata.digitraffic.fi/infra-api/ and https://rata.digitraffic.fi/jeti-api/. This and other Digitraffic data can be used to implement applications like Rafiikka, if you happen to like such things.

End of commercial ;)

The Digitraffic APIs are just the tip of the iceberg. The systems we've been implementing are rather large since they span many important functionalities. Building, maintaining, and further developing this kind of large codebase requires tools, such as programming languages, that prevent us from shooting ourselves in the foot. Unfortunately, since this is an oldish enterprise project, Haskell is out of the question and the backend codebase is in Java. If we can't use a superior tool like Haskell, we can at least use some of its good parts in Java, like functional programming.

In 2012 Java was around version 6 (Java 7 was released in July 2011) and 99% of Java programmers hadn't even heard of functional programming let alone lambdas or even first-class functions. Having learned some Scala and Haskell, I had discovered some of the benefits that functional programming can bring to a codebase, and wanted to promote it even if we had to use Java.

However, to be able to use functional programming without major pain, the syntax has to be lightweight enough. And we are talking about Java here. Luckily Java 6 already had "compile-time macros" of a kind, namely Annotation processors, which made it possible to generate some of the structures Java was missing back then (and still is in version 18, released in March 2022), like field references or partial application.

The annotation processor we implemented, called meta-utils, is available on GitHub. It works with Java 6 and later. This allowed us to practice functional programming much like it's done today in modern Java.

Unfortunately, we couldn't find a suitable functional programming library for Java. One candidate was LambdaJ by Mario Fusco. While it was a revolutionary library in its time, it had some fundamental issues, like using reflection and being based on proxying, so we thought it might be too big a risk.

Another candidate was FunctionalJava by Tony Morris. While it had loads of functional programming paradigms implemented already back then, I was afraid it would be too "mathy" for the rest of the team. Even though I happen to be somewhat interested in mathy and category-theoretic stuff, not that many software developers share those interests, at least as deeply.

Yet another candidate was Google Collections (nowadays known as Guava). Being from Google it had a good chance of being maintained and further developed, but it seemed to have some problems. Namely, I felt it wasn't opinionated enough, and it provided functions that help in using nulls, while I wanted to forbid nulls from our codebase. Google had also decided that they would never support tuples, which I found extremely useful. This was a warning sign that it might prove difficult or even impossible to get contributions in.

One more candidate, which appeared later after our project was already going strong, was RetroLambda by my former colleague Esko Luontola. Being able to use lambdas already in a Java 6 codebase would have made some of the structures generated by meta-utils obsolete, or at least not that necessary. However, due to all the essential complexity in our spatial-temporal-relational-railway-modeling, we didn't want to take in any more complexity and were hoping to be able to upgrade to Java 8 soon anyway, so we never used it. We were finally able to upgrade to Java 8 after 5-or-so years, around 2018...

I don't remember if we found more candidates back then, but the result was that we decided to write our own CollectionUtils module containing the map/filter/Option/etc concepts we found useful, and see where that would lead us. Over the years I was interested in trying out different things, and the module ended up on GitHub as functional-utils.

What follows is a small introduction to these libraries. Note that when I talk about Java I'm referring to features available in Java 6. No lambdas, no function/method/constructor references, and obviously no field references or partial application.

meta-utils

The main point of meta-utils is that Java, as it is, is missing some structures needed to practice functional programming with a lightweight syntax. We programmers, however, already have these structures in our minds, so generating them during compilation shouldn't bring much mental overhead.

Think about some Java structures. Java has functions:

public class MyClass {
    public static Integer length(String str) {
        return str.length();
    }
}

length is a function from a String to an Integer. If we wanted to transform a list of Strings to their lengths, however, there's no way to reference this function except by manually wrapping the call inside an anonymous inner class:

// Apply and map provided by a library:
public interface Apply<S,T> {
    public T apply(S source);
}
public <S,T> List<T> map(Apply<S,T> f, List<S> lst) { ... };

List<String> strings = Arrays.asList("foo", "quux");
List<Integer> lengths = map(new Apply<String,Integer>() {
    public Integer apply(String source) {
        return length(source);
    }
}, strings);

If you think this is syntactically acceptable for functional programming, I would certainly like to hear why you feel that way :)

All we need is a way to reference the length function to be able to pass it to map. Meta-utils generates this for us at compile-time, so that we can write it more succinctly:

List<String> strings = Arrays.asList("foo", "quux");
List<Integer> lengths = map(MyClass_.length, strings);

// which is almost the same as in modern Java:
//List<Integer> lengths = map(MyClass::length, strings);

What meta-utils generates in the background is pretty much the same thing we already wrote by hand:

public class MyClass_ {
    public static final MetaMethods.M1<String, Integer> length = new MetaMethods.M1<String, Integer>(MyClass.class, "length", String.class) {
        public final Integer apply(final String str) {
            return MyClass.length(str);
        }
    };
}

For each class Foo a class named Foo_ is generated in the same package, containing all the generated structures. Java annotation processors could generate class files directly, but we chose to generate source files so that everyone could open them in the IDE like any class to see what's going on. Visible magic is at least a bit less magical (looking at you, Lombok).

Consider another Java construct, a field:

public class Person {
    public String name;
}

If we think about functional programming where pretty much everything is a function, can a field be considered just a function that could be referenced like an ordinary function? Could we transform a list of Persons to a list of their names?

List<Person> persons = ...;
List<String> names = map(Person_.name, persons);

If you think about a field, it's just a function from the class instance to the field value. In this case from a Person to its name. A similar structure is generated:

public class Person_ {
    public static final MetaFieldProperty<Person, String> name = new MetaFieldProperty<Person, String>(Person.class, "name") {
        public final String apply(final Person $self) {
            return $self == null ? null : $self.name;
        }
    };
}

A constructor is a function from its arguments to the class instance:

public class Person {
    public String name;

    public Person(String name) {
        this.name = name;
    }
}

List<Person> persons = map(Person_.$, Arrays.asList("Mario", "Tony", "Esko"));

Constructors in Java don't have names, so we had to generate names for them. Since the dollar sign is legal in Java identifiers but by convention reserved for generated code, the meta-constructors are named $, $1, $2, etc.

Java, being an object-oriented language, also has methods. I've learned that most Java developers haven't really thought about how these structures relate to functions, and a method seems to be especially difficult. If you happen to have experience in Python, however, it might be obvious that a method is a function from the instance and the method parameters to the method return value:

public class Person {
    public String kind() {
        return "human";
    }
}

List<Person> persons = ...;
List<String> kinds = map(Person_.kind, persons);

For constructors and methods, some familiar structures are generated:

public class Person_ {
    public static final MetaConstructors.C1<String, Person> $ = new MetaConstructors.C1<String, Person>(Person.class, String.class) {
        public Person apply(String name) {
            return new Person(name);
        }
    };

    public static final MetaMethods.M1<Person, String> kind = new MetaMethods.M1<Person, String>(Person.class, "kind") {
        public final String apply(final Person $self) {
            return $self == null ? null : $self.kind();
        }
    };
}

In the generated classes you can see things like MetaMethods.M1 etc. These are just functions of one argument that carry some additional metadata, like the Java Member (Field, Method, ...) and the name of the member. There are similar types for other arities to represent functions of a different number of arguments.
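To make "a function plus metadata" a bit more concrete, here is a minimal hand-written sketch in Java 6-style syntax. The names MetaSketch, Apply, and NamedFunction1 are hypothetical and only for illustration; they are not the actual meta-utils types, which carry more information than this.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MetaSketch {
    // Hypothetical one-argument function interface (not the real library type).
    interface Apply<S, T> {
        T apply(S source);
    }

    // A function that also remembers which method it refers to.
    static abstract class NamedFunction1<S, T> implements Apply<S, T> {
        final Class<?> owner;
        final String name;
        final Class<?> argType;

        NamedFunction1(Class<?> owner, String name, Class<?> argType) {
            this.owner = owner;
            this.name = name;
            this.argType = argType;
        }

        // Metadata lookup: resolve the reflective Method only when asked for.
        Method member() throws NoSuchMethodException {
            return owner.getDeclaredMethod(name, argType);
        }
    }

    static Integer length(String str) {
        return str.length();
    }

    // Roughly what a generated reference to `length` could look like.
    static final NamedFunction1<String, Integer> LENGTH =
        new NamedFunction1<String, Integer>(MetaSketch.class, "length", String.class) {
            public Integer apply(String str) {
                return MetaSketch.length(str);
            }
        };

    static <S, T> List<T> map(Apply<S, T> f, List<S> xs) {
        List<T> result = new ArrayList<T>();
        for (S x : xs) {
            result.add(f.apply(x));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(map(LENGTH, Arrays.asList("foo", "quux"))); // [3, 4]
        System.out.println(LENGTH.member().getName());                  // length
    }
}
```

The call itself never goes through reflection; the reflective Member is only there for code that wants to inspect what the function refers to.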

This is where Tuples come in. A tuple is just a fixed-size list of values where the type of each value can be different and is statically known. In Haskell, every function always takes only a single argument (currying) and multiple arguments can be simulated with tuples. In Java it would be nice to unify multiple arguments and tuples, so all the generated meta-functions have two apply-methods: one taking multiple arguments and another taking a similar tuple. Thus we can map over multiple-arity functions:

class MyClass {
    public static Integer length(String str1, String str2) {
        return str1.length() + str2.length();
    }
}

List<Tuple2<String,String>> tuples = ...;
List<Integer> lengths = map(MyClass_.length, tuples);

This is especially handy when you have a list of large tuples, like data queried from a database with an appropriate library (see this post and this), and you want to map a function or a constructor over them.
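The "two apply methods" idea can be sketched by hand as follows. This is only an illustration under my own naming: Pair, Apply, and Function2 here are hypothetical minimal types, not the ones in meta-utils or functional-utils.

```java
import java.util.ArrayList;
import java.util.List;

class TupleApplySketch {
    // Hypothetical minimal pair type; the library's tuple types are richer.
    static final class Pair<A, B> {
        final A _1;
        final B _2;

        Pair(A a, B b) {
            _1 = a;
            _2 = b;
        }
    }

    interface Apply<S, T> {
        T apply(S source);
    }

    // A two-argument function that is also a one-argument function over a pair.
    static abstract class Function2<A, B, R> implements Apply<Pair<A, B>, R> {
        abstract R apply(A a, B b);

        // The tuple variant simply unpacks the pair and delegates.
        public final R apply(Pair<A, B> pair) {
            return apply(pair._1, pair._2);
        }
    }

    static final Function2<String, String, Integer> TOTAL_LENGTH =
        new Function2<String, String, Integer>() {
            Integer apply(String a, String b) {
                return a.length() + b.length();
            }
        };

    static <S, T> List<T> map(Apply<S, T> f, List<S> xs) {
        List<T> result = new ArrayList<T>();
        for (S x : xs) {
            result.add(f.apply(x));
        }
        return result;
    }

    static List<Integer> demo() {
        List<Pair<String, String>> pairs = new ArrayList<Pair<String, String>>();
        pairs.add(new Pair<String, String>("foo", "quux"));
        pairs.add(new Pair<String, String>("", "a"));
        // The same function object is accepted wherever a function over pairs is expected.
        return map(TOTAL_LENGTH, pairs);
    }

    public static void main(String[] args) {
        System.out.println(TOTAL_LENGTH.apply("foo", "quux")); // 7
        System.out.println(demo());                            // [7, 1]
    }
}
```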

These were the most important structures. Additionally, for a class like this:

public class Person {
    public String name;
    public int age;
}

there are things like this found in the generated class:

public class Person_ {
    public enum $FieldNames {
        name,
        age
    }

    public static Tuple2<MetaFieldProperty<Person, String>,MetaFieldProperty<Person, Integer>> $Fields() {
        return Tuple.of(
            Person_.name,
            Person_.age);
    }
}

$FieldNames is an enumeration containing all the field names. In some cases this can be used to turn forgetting to handle one of a class's fields into a compiler error. $Fields() returns all the meta fields as a tuple. This can be used, for example, to create a whole meta-representation of a data class without resorting to reflection.

Since all the meta structures are generated, nothing prevents us from generating them for code written by other people. We successfully generated meta structures for Joda-Time, Hibernate, Wicket, and the whole Java standard library.

functional-utils

The main idea of functional-utils was to provide some functional programming constructs without implementing half the world while trying. For example, it doesn't provide an immutable collection implementation but just operates with Java's built-in abstractions (mainly Iterable). It does provide some stuff missing from Java, like Option, Either, fluent comparators, functions, tuples, monoids, builders, and basic lenses, but most lines of code tend to come from supporting arities up to 32.

Concepts

Uniformity

Creating and transforming collections of different kinds should be easy and intuitive. Java has several different kinds of "collections":

  • Object arrays
  • primitive arrays
  • Collections (Sets, Lists, Maps...)
  • Strings
  • other CharSequences
  • Enumerations

Whenever you want to create a collection or transform one to another, you should be able to do it in about the same way:

List<Integer>       newlist                  = newList(42);
Set<Integer>        newset                   = newSet(42);
Map<String,Integer> newmap                   = newMap(Pair.of("a", 42));
List<Integer>       listFromSet              = newList(newset);
Integer[]           arrayFromList            = newArray(Integer.class, newlist);
Set<Integer>        setFromArray             = newSet(new int[] {42});
Integer[]           objectArrayFromPrimitive = newArray(new int[] {42});
int[]               primitiveArrayFromObject = newArray(new Integer[] {42});
List<Integer>       listFromEnumeration      = newList(new Vector<Integer>().elements());
List<Character>     listFromString           = newList(it("foo")); // `it` converts string to Iterable<Character>

Similarly, methods in class Functional like map should be callable the same way with an Iterable, Array, or String, even though the return type would always be an Iterable.
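One way to get this kind of uniformity is to adapt every source to an Iterable and provide thin overloads on top. The following is a minimal sketch of that idea only; the class UniformitySketch and these particular implementations of it and newList are my own illustration and are not taken from functional-utils.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class UniformitySketch {
    // View any CharSequence (including String) as an Iterable of its characters.
    static Iterable<Character> it(final CharSequence chars) {
        return new Iterable<Character>() {
            public Iterator<Character> iterator() {
                return new Iterator<Character>() {
                    private int index = 0;

                    public boolean hasNext() {
                        return index < chars.length();
                    }

                    public Character next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        return chars.charAt(index++);
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    // The one "real" implementation: copy any Iterable into an unmodifiable list.
    static <T> List<T> newList(Iterable<T> xs) {
        List<T> result = new ArrayList<T>();
        for (T x : xs) {
            result.add(x);
        }
        return Collections.unmodifiableList(result);
    }

    // Overloads for other "collection-like" sources delegate to the Iterable version.
    static <T> List<T> newList(T[] xs) {
        return newList(Arrays.asList(xs));
    }

    static List<Character> newList(CharSequence chars) {
        return newList(it(chars));
    }

    public static void main(String[] args) {
        System.out.println(newList(new Integer[] {1, 2})); // [1, 2]
        System.out.println(newList("foo"));                // [f, o, o]
    }
}
```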

Immutability

All structures should be immutable unless otherwise noted. For example, newList(a,b,c) creates an immutable list, whereas a mutable list can be acquired by newMutableList().

Type safety

All functions, if possible, should be safe such that they are impossible (or at least difficult) to use in the wrong way. Make illegal states unrepresentable. For example, newSortedSet requires the element to implement Comparable, or alternatively, a comparator is required as another argument. Or newMap(Iterable) requires a SemiGroup argument to define how duplicate keys are handled.
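As a rough illustration of how such a requirement can be pushed to the compiler, here is a sketch of two newSortedSet overloads using a bounded type parameter. The class TypeSafetySketch, the Point type, and these signatures are assumptions made for the example; the actual signatures in functional-utils may differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

class TypeSafetySketch {
    // Only callable when the element type is Comparable.
    static <T extends Comparable<? super T>> SortedSet<T> newSortedSet(Iterable<T> xs) {
        SortedSet<T> result = new TreeSet<T>();
        for (T x : xs) {
            result.add(x);
        }
        return Collections.unmodifiableSortedSet(result);
    }

    // For any other element type the caller has to supply the ordering explicitly.
    static <T> SortedSet<T> newSortedSet(Comparator<? super T> order, Iterable<T> xs) {
        SortedSet<T> result = new TreeSet<T>(order);
        for (T x : xs) {
            result.add(x);
        }
        return Collections.unmodifiableSortedSet(result);
    }

    // A type that deliberately does not implement Comparable.
    static final class Point {
        final int x;

        Point(int x) {
            this.x = x;
        }
    }

    static final Comparator<Point> BY_X = new Comparator<Point>() {
        public int compare(Point a, Point b) {
            return a.x < b.x ? -1 : (a.x == b.x ? 0 : 1);
        }
    };

    public static void main(String[] args) {
        System.out.println(newSortedSet(Arrays.asList(3, 1, 2))); // [1, 2, 3]

        // newSortedSet(Arrays.asList(new Point(1)));
        // ^ rejected by the compiler: Point is not Comparable

        SortedSet<Point> points = newSortedSet(BY_X, Arrays.asList(new Point(2), new Point(1)));
        System.out.println(points.first().x); // 1
    }
}
```

Without the bound, the same mistake would only surface at runtime as a ClassCastException from TreeSet.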

Laziness

Laziness is sometimes awesome. It's nice to handle infinite collections:

Iterable<Integer> naturalNumbers = range(1);
List<Integer> first10evenNaturalNumbers = newList(take(10, filter(Predicates.even, naturalNumbers)));

But the most important thing is that we can filter and map (etc) really large sets without worrying about performance even if in the end we need just a single value of them. Even though it's a double-edged sword, all functions in functional-utils try to be as lazy as possible.
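For readers who haven't implemented lazy sequences in Java, the following self-contained sketch shows how range, filter, and take can be built on plain Iterable so that nothing is computed until somebody iterates. It is written from scratch for illustration (LazySketch and Predicate are hypothetical names) and is not the functional-utils implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class LazySketch {
    interface Predicate<T> {
        boolean accept(T value);
    }

    // Infinite sequence start, start + 1, ... (integer overflow is ignored in this sketch).
    static Iterable<Integer> range(final int start) {
        return new Iterable<Integer>() {
            public Iterator<Integer> iterator() {
                return new Iterator<Integer>() {
                    private int next = start;
                    public boolean hasNext() { return true; }
                    public Integer next() { return next++; }
                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    // Pulls elements from the source only until the next match is found.
    static <T> Iterable<T> filter(final Predicate<T> p, final Iterable<T> xs) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                final Iterator<T> it = xs.iterator();
                return new Iterator<T>() {
                    private T pending;
                    private boolean hasPending = false;

                    public boolean hasNext() {
                        while (!hasPending && it.hasNext()) {
                            T candidate = it.next();
                            if (p.accept(candidate)) {
                                pending = candidate;
                                hasPending = true;
                            }
                        }
                        return hasPending;
                    }

                    public T next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        hasPending = false;
                        return pending;
                    }

                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    // Checks the count before touching the source, so an infinite source is safe.
    static <T> Iterable<T> take(final int n, final Iterable<T> xs) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                final Iterator<T> it = xs.iterator();
                return new Iterator<T>() {
                    private int taken = 0;
                    public boolean hasNext() { return taken < n && it.hasNext(); }
                    public T next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        taken++;
                        return it.next();
                    }
                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    static List<Integer> firstEvens(int count) {
        Predicate<Integer> even = new Predicate<Integer>() {
            public boolean accept(Integer value) { return value % 2 == 0; }
        };
        List<Integer> result = new ArrayList<Integer>();
        for (Integer i : take(count, filter(even, range(1)))) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(firstEvens(10)); // [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
    }
}
```

The downside hinted at by "double-edged sword" is visible here too: every new iteration re-runs the whole pipeline, and any side effects in the predicate happen at iteration time rather than where the pipeline was built.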

No nulls

Nulls were a billion-dollar mistake, and one of the biggest mistakes in Java. Every Java project should avoid using nulls as much as possible. Functional-utils does pass through nulls in many places, and Option.of provides a bridge between the nullable and non-nullable worlds, but the library doesn't require or enforce using nulls.

Some examples

Option is important since there's no Optional in Java 6. This is the main replacement for nulls and a bridge between nullable and non-nullable worlds. Options can be constructed from values or null and manipulated with several methods. Also, Option implements Iterable so it can be iterated over like a regular collection, and used in a for loop:

Option<String> optionContainingFoo = Some("foo");
Option<String> emptyOption = None();

Option<String> valueResultsInSome = Option.of("foo");
Option<Object> nullResultsInNone = Option.of(null);

for (String str: Some("foo")) {
    // executed once
}

for (String str: emptyOption) {
    // never executed
}
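How an Option can double as an Iterable is easy to sketch: "some" iterates over a single element and "none" over nothing. The class below is a hypothetical minimal version written for this explanation; the real Option in functional-utils has many more methods and may be structured differently.

```java
import java.util.Collections;
import java.util.Iterator;

class OptionSketch {
    // Hypothetical minimal Option that is iterable as a collection of 0 or 1 elements.
    static abstract class Option<T> implements Iterable<T> {
        // Bridge from the nullable world: null becomes none, anything else becomes some.
        static <T> Option<T> of(T valueOrNull) {
            return valueOrNull == null ? Option.<T>none() : some(valueOrNull);
        }

        static <T> Option<T> some(final T value) {
            return new Option<T>() {
                public Iterator<T> iterator() {
                    return Collections.singletonList(value).iterator();
                }
            };
        }

        static <T> Option<T> none() {
            return new Option<T>() {
                public Iterator<T> iterator() {
                    return Collections.<T>emptyList().iterator();
                }
            };
        }
    }

    // Counts how many times a for loop over the option executes its body.
    static int countElements(Option<String> option) {
        int count = 0;
        for (String ignored : option) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countElements(Option.of("foo")));         // 1
        System.out.println(countElements(Option.<String>of(null)));  // 0
    }
}
```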

Either is, unfortunately, missing from Java, causing use cases like "This value can be either 'foo' or any other string" to be encoded with magic constants instead of the more elegant Either<Foo,String>:

Either<String,Integer> left = Either.left("foo");
Either<String,Integer> right = Either.right(42);

Option<String>  some = left.left;
Option<Integer> none = left.right;

Tuples can be constructed and accessed in a lightweight syntax making them useful for any case where naming things feels like a waste of time. It's also easy to join tuples, append and prepend values, or take a prefix or a suffix. In our project tuples are used heavily in returning rows from the database, and creating test data for unit tests.

Tuple2<String, Integer> tuple = Tuple.of("foo", 42);
Tuple2<String, Integer> appended = Tuple.of("foo").append(42);

String string = tuple._1;
Tuple1<Integer> suffix = Tuple.of("foo", 42).drop1();

Building complex comparators is a lot easier with some utilities than without them:

Comparator<Person> byNameReversed = Compare.by(Person_.name).reverse();
Comparator<Person> byNameAndAge = Compare.by(Person_.name).then(
                                  Compare.by(Person_.age));

Comparator<Map.Entry<String,Integer>> byKey = Compare.byKey();
Comparator<List<Person>> byElementField = Compare.byIterable(Compare.by(Person_.name));    

Functions can be simple identities or constants or have any arity (up to 32). They can be applied partially, or composed with other functions:

Function1<String, String> identity = Function.id();
Function1<?,Integer>      constant = Function.constant(42);
Function0<Integer>        value    = Function.of(42);

Function2<String, String, Integer> someTwoArgFunction = ...;
Integer                    completelyApplied  = someTwoArgFunction.apply("foo", "bar");
Function1<String, Integer> firstValueApplied  = someTwoArgFunction.ap("foo");
Function1<String, Integer> secondValueApplied = someTwoArgFunction.apply(Function.__, "foo");

public class Person {
    public int age;
    public Person(String name) {
        this.age = name.length();
    }
}
Function1<String,Integer> composed = Person_.$.andThen(Person_.age);
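Partial application and composition are both just "return a new function object that remembers something". Here is a minimal sketch with hypothetical Function1 and Function2 classes; the method names ap and andThen mirror the ones used above, but the implementation is my own and not the library's.

```java
class FunctionSketch {
    static abstract class Function1<A, R> {
        abstract R apply(A a);

        // Composition: run this function first, then feed its result to `next`.
        <S> Function1<A, S> andThen(final Function1<? super R, S> next) {
            final Function1<A, R> self = this;
            return new Function1<A, S>() {
                S apply(A a) {
                    return next.apply(self.apply(a));
                }
            };
        }
    }

    static abstract class Function2<A, B, R> {
        abstract R apply(A a, B b);

        // Partial application: fix the first argument, leaving a one-argument function.
        Function1<B, R> ap(final A a) {
            final Function2<A, B, R> self = this;
            return new Function1<B, R>() {
                R apply(B b) {
                    return self.apply(a, b);
                }
            };
        }
    }

    // Repeats a string the given number of times.
    static final Function2<String, Integer, String> REPEAT =
        new Function2<String, Integer, String>() {
            String apply(String s, Integer times) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < times; i++) {
                    sb.append(s);
                }
                return sb.toString();
            }
        };

    static final Function1<String, Integer> LENGTH =
        new Function1<String, Integer>() {
            Integer apply(String s) {
                return s.length();
            }
        };

    // Fix the string to "ab", then measure the repeated result.
    static final Function1<Integer, Integer> LENGTH_OF_REPEATED_AB =
        REPEAT.ap("ab").andThen(LENGTH);

    public static void main(String[] args) {
        System.out.println(REPEAT.ap("ab").apply(2));        // abab
        System.out.println(LENGTH_OF_REPEATED_AB.apply(3));  // 6
    }
}
```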

Monoids and SemiGroups are concepts that few are familiar with, but which turn out useful quite often:

List<Long> longs = newList(1L, 2L); // example input, assumed here so that the names below hold
long    three   = reduce(Monoids.longSum, longs);
long    two     = reduce(Monoids.longProduct, longs);
boolean notTrue = reduce(Monoids.booleanConjunction, newList(true, false));
String  foobar  = reduce(Monoids.stringConcat, newList("foo", "bar"));

Map<String, Long> first        = emptyMap();
Map<String, Long> second       = emptyMap();
Map<String, Long> valuesSummed = reduce(Monoids.mapCombine(SemiGroups.longSum), newList(first, second));
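If these concepts are unfamiliar: a semigroup is a type with an associative way to combine two values, and a monoid is a semigroup that also has a neutral "zero" element. The sketch below shows why that is enough to implement reduce, and why maps form a monoid as soon as their values form a semigroup. All names here (MonoidSketch and its members) are hypothetical stand-ins, not the functional-utils classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class MonoidSketch {
    interface SemiGroup<T> {
        T combine(T a, T b);
    }

    interface Monoid<T> extends SemiGroup<T> {
        T zero();
    }

    static final Monoid<Long> LONG_SUM = new Monoid<Long>() {
        public Long zero() { return 0L; }
        public Long combine(Long a, Long b) { return a + b; }
    };

    static final Monoid<Long> LONG_PRODUCT = new Monoid<Long>() {
        public Long zero() { return 1L; }
        public Long combine(Long a, Long b) { return a * b; }
    };

    // Maps form a monoid: the empty map is the zero, and values under the same key
    // are merged with the given semigroup. Null values are treated as absent here.
    static <K, V> Monoid<Map<K, V>> mapCombine(final SemiGroup<V> values) {
        return new Monoid<Map<K, V>>() {
            public Map<K, V> zero() {
                return new HashMap<K, V>();
            }

            public Map<K, V> combine(Map<K, V> a, Map<K, V> b) {
                Map<K, V> result = new HashMap<K, V>(a);
                for (Map.Entry<K, V> e : b.entrySet()) {
                    V existing = result.get(e.getKey());
                    result.put(e.getKey(),
                        existing == null ? e.getValue() : values.combine(existing, e.getValue()));
                }
                return result;
            }
        };
    }

    // Thanks to zero(), reduce is well-defined even for an empty input.
    static <T> T reduce(Monoid<T> m, Iterable<T> xs) {
        T acc = m.zero();
        for (T x : xs) {
            acc = m.combine(acc, x);
        }
        return acc;
    }

    static Map<String, Long> summedExample() {
        Map<String, Long> first = new HashMap<String, Long>();
        first.put("a", 1L);
        first.put("b", 2L);
        Map<String, Long> second = new HashMap<String, Long>();
        second.put("b", 3L);
        return reduce(MonoidSketch.<String, Long>mapCombine(LONG_SUM), Arrays.asList(first, second));
    }

    public static void main(String[] args) {
        System.out.println(reduce(LONG_SUM, Arrays.asList(1L, 2L)));     // 3
        System.out.println(reduce(LONG_PRODUCT, Arrays.asList(1L, 2L))); // 2
        System.out.println(summedExample().get("b"));                    // 5
    }
}
```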

The Functional class contains many functions familiar from any functional library, like map, filter, take, etc. Most of these accept Arrays, Strings, or CharSequences as an argument in addition to the most useful Iterable. Almost all the functions operate lazily by returning another Iterable. See the class Javadocs for some examples.

Builders are sometimes useful (although personally I consider them an antipattern), and when we have builders we have a way to construct objects, thus giving us also Lenses:

public class Person {
    public String name;
    public Option<Integer> salary;
    public Person(String name, Option<Integer> salary) {
        this.name = name;
        this.salary = salary;
    }
}

Builder<Person> personBuilder = Builder.of(Person_.$Fields(), Person_.$);
Person person = personBuilder
    .with(Person_.name, "Jyri-Matti")
    .without(Person_.salary)
    .build();
String jyrimatti = person.name;
Option<Integer> none = person.salary;

Lens<Person,String> name_ = Lens.of(Person_.name, personBuilder);

String stillJyrimatti = name_.get(person);

Person newPerson = name_.set(person, "Jüppe");
String jyppe = newPerson.name;
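Stripped of the builder machinery, a lens is just a getter paired with a "setter" that returns a modified copy instead of mutating. The following hand-written sketch shows that core idea only; LensSketch, its Person, and its Lens are hypothetical and much simpler than what Lens.of derives from the meta fields and the builder.

```java
class LensSketch {
    // Immutable data class used only for this example.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // A lens focuses on one part F of a whole T.
    static abstract class Lens<T, F> {
        abstract F get(T target);

        // Returns a new T with the focused part replaced; the original is untouched.
        abstract T set(T target, F newValue);
    }

    static final Lens<Person, String> NAME = new Lens<Person, String>() {
        String get(Person p) {
            return p.name;
        }

        Person set(Person p, String newName) {
            return new Person(newName, p.age);
        }
    };

    public static void main(String[] args) {
        Person ada = new Person("Ada", 36);
        Person grace = NAME.set(ada, "Grace");

        System.out.println(NAME.get(ada));   // Ada (original unchanged)
        System.out.println(NAME.get(grace)); // Grace
        System.out.println(grace.age);       // 36 (other fields carried over)
    }
}
```

Writing set by hand for every field is exactly the boilerplate that generated meta fields and a generic builder can remove.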

Conclusion

Although Java as a programming language doesn't come even close to languages like Haskell, functional programming is still quite possible. I wouldn't yet call modern Java that functional, but functional-utils and similar libraries can take it quite far.

If you happen to be an author of a functional programming library yourself, feel free to borrow any ideas you find useful. Suggestions or recommendations for meta-utils and functional-utils are greatly appreciated; please post a comment or create a GitHub issue.