These cases are exactly what I'll discuss.
I'll also discuss refactoring of 'unconsciously functionalish' code (like new AndFilter(new FieldMatchesPatternFilter(new FieldReference("name"), ".*John.*"), new BlaBlaBlaFilter())) for readability.
DISCLAIMER:
Most of this article is not at all about FP. It is about increasing readability of certain common patterns which, when used unreadably, render FP unusable.
I'll talk a little about FP in the end of the article.
With respect to FP, java has the following problems:
- The infamous 'kingdom of nouns' (the tradition of considering everything to be an object or noun and thinking that to add two numbers, one has to ask a NumberAdder)
- Lack of a convenient syntax for constructing data, like tuples, lists etc
- Cumbersome syntax for generics and poor type inference and polymorphism implementation
Our weapons agains these problems shall be 'import static', varargs and a little bit of gunpowder.
1. Increasing readability of data construction
Very bad:
private static final Set<String> INTERESTING_TAGS = new HashSet<String>();
Map<String,Integer> WEIGHTS = new HashMap<String,Integer>
static {
INTERESTING_TAGS.add("A");
INTERESTING_TAGS.add("FORM");
INTERESTING_TAGS.add("INPUT");
INTERESTING_TAGS.add("SCRIPT");
INTERESTING_TAGS.add("OBJECT");
WEIGHTS.put("bad", -2);
WEIGHTS.put("poor", -1);
WEIGHTS.put("average", 0);
WEIGHTS.put("nice", 1);
WEIGHTS.put("outstanding", 3);
}
Bad:
private static final Set<String> INTERESTING_TAGS = new HashSet<String>(Arrays.asList(new String[] {
"A","FORM","INPUT","SCRIPT","OBJECT"
}));
for(double guessedScale : new double[] {0.01, 0.1, 1, 10, 100}) {
...
}
Good:
public class CollectionUtils {
public static T[] ar(T... ts) {return ts;}
public static Set<T> set(T... ts) {
return new HashSet<T>(Arrays.asList(ts));
}
public static Map<K,V> zipMap(K[] keys, V[] values) {...}
}
import static CollectionUtils.set;
import static CollectionUtils.ar;
private static final Set<String> INTERESTING_TAGS = set("A","FORM","INPUT","SCRIPT","OBJECT");
for(double guessedScale : ar(0.01, 0.1, 1, 10, 100)) {
...
}
Map<String,Integer> WEIGHTS = zipMap(
ar("bad", "poor", "average", "nice", "outstanding"),
ar(-2, -1, 0, 1, 3));
What can be easier? It's just the good old data abstraction that everyone remembers from the second chapter of SICP!
Its usefulness is particularly noticeable in unittests, where one has to test things on complex data structures.
In those cases the best decision is often to encode these structures in strings and to write a smallish parser for them.
For instance:
assertTrue(GraphAnalyzer.isConnected(graph("1->2 2->3 3->1")));
2. Increasing readability of combinators
Bad:
Filter f = new AndFilter(first, second);
Good:
public abstract class Filters {
public static Filter and(Filter a, Filter b) {return new AndFilter(a,b);}
}
import static Filters.*;
Filter f = and(first,second);
Even better:
public abstract class Filter {
Filter and(Filter other) {return Filters.and(this,other);}
}
Filter f = first.and(second).and(third);
And yet even better:
public abstract class Filters {
public static Filter ALWAYS_TRUE = new AlwaysTrue();
public static Filter and(Filter... filters) {
Filter res = ALWAYS_TRUE;
for(Filter f : filters) res = res.and(f);
return res;
}
}
Another example.
Bad:
enum StringComparisonKind {EXACT, REGEX, GLOB}
enum StringPosition {ANYWHERE, WHOLE_STRING, STARTS_WITH, ENDS_WITH}
public class StringCondition {
...
public StringCondition(String pattern, StringComparisonKind comparisonKind, StringPosition position) {...}
}
conditions.add(new StringCondition("foo", StringComparisonKind.REGEX, StringPosition.ANYWHERE))
Good:
import static StringComparisonKind.*;
import static StringPosition.*;
public class StringConditions {
public static regexWhole(String regex) {return new StringCondition(regex, REGEX, WHOLE_STRING);}
public static regexAnywhere(String regex) {return new StringCondition(regex, REGEX, ANYWHERE);}
public static exactWhole(String pattern) {return new StringCondition(pattern, EXACT, WHOLE_STRING);}
...
}
import static StringConditions.*;
conditions.add(regexAnywhere("foo"));
Abstraction, abstraction, abstraction - it is surprisingly underestimated and underused.
3. Increasing readability of anonymous classes
...by extracting them into constants, local variables, named classes or whichever named entity.
Bad:
List<Order> orders = CollectionUtils.flatten(CollectionUtils.map(customers, new Function<Customer, List<Order>>() {
public List<Order> apply(Customer customer) {
return customer.getOrders();
}
}));
Good:
import static CollectionUtils.*;
private static final Function<Customer, List<Order>> GET_ORDERS = new Function<Customer, List<Order>>() {
public List<Order> apply(Customer customer) {
return customer.getOrders();
}
};
List<Order> orders = flatten(map(customers, GET_ORDERS));
4. Increasing readability of combinator classes
Bad:
CustomerProcessor taxes = new ComputeTaxesCustomerProcessor();
These suffixes are absolutely useless. They could be useful if java didn't have packages or static typing: suffixes would prevent us from polluting the global namespace or from accidentally mixing one And with an other; but java has them, and suffixes are as useless as hungarian notation.
Good:
CustomerProcessor taxes = new ComputeTaxes();
5. Increasing readability of cumbersome generic types - poor man's typedef
Bad:
class FooEverythingDoer {
...
Map<String, String> getProperties(Foo foo) {...}
void putProperties(Foo foo, Map<String, String> properties) {...}
Map<Foo, Map<String, String>> getPropertiesBatch(Iterable<Foo> foos) {...}
Foo findByProperties(Map<String, String> partOfProperties) {...}
...
}
Good:
class Properties extends Map<String,String> {
(constructors matching super here)
}
class FooEverythingDoer {
...
Properties getProperties(Foo foo) {...}
void putProperties(Foo foo, Properties properties) {...}
Map<Foo, Properties> getPropertiesBatch(Iterable<Foo> foos) {...}
Foo findByProperties(Properties partOfProperties) {...}
...
}
Bad:
class MagicBarMerger {
public void mergeIntoDb(List<MagicBar> bars) {
List<MagicBar> existingBars = barDao.getAllBars();
Map<Integer, List<Pair<MagicBar, MagicBar>>> pairsById = joinOnId(bars, existingBars);
List<MagicBar> merged = new ArrayList<MagicBar>();
for(Pair<MagicBar, MagicBar> pair : pairsById) {
merged.add(merge(pair));
}
barDao.removeAllBars():
barDao.putBarsBatch(merged);
}
private Map<Integer, Pair<MagicBar, MagicBar>> joinOnId(List<MagicBar> as, List<MagicBar> bs) {
....
}
private MagicBar merge(Pair<MagicBar, MagicBar> bars) {
....
}
}
Good:
class MagicBarMerger {
private static class Bars extends Pair<MagicBar,MagicBar> {(constructor matching super)}
public void mergeIntoDb(List<MagicBar> bars) {
List<MagicBar> existingBars = barDao.getAllBars();
Map<Integer, Bars> pairsById = joinOnId(bars, existingBars);
List<MagicBar> merged = new ArrayList<MagicBar>();
for(Bars bars : pairsById) {
merged.add(merge(bars));
}
barDao.removeAllBars():
barDao.putBarsBatch(merged);
}
private Map<Integer, Bars> joinOnId(List<MagicBar> as, List<MagicBar> bs) {
....
}
private MagicBar merge(Bars bars) {
....
}
}
6. Increasing readability of generics again, utilizing whatever type inference java has
Bad:
Map<Integer, List<String>> namesById = new HashMap<Integer, List<String>>();
Good:
public class CollectionUtils {
public static <K,V> Map<K,V> newMap() {
return new HashMap<K,V>();
}
}
import static CollectionUtils.newMap;
Map<Integer, List<String>> namesById = newMap();
These are similar to ar(T... ts) and set(T... ts) of rule 1.
7. Increasing debuggability
Sometimes I feel a bit uneasy when, during debugging, I inspect an object just to find that is has class FooProcessor$3$1 and two fields, 'innerProcessor' and 'value' with a class of FooProcessor$4$2 and a value of 1.
Therefore it is useful to give meaningful names to all classes that escape scope of the method or enclosing class (most do, according to rule 3).
Giving them meaningful toString's is even better. They don't take much to write but they give a lot to read during debugging.
Higher-order combinators like AND and OR are often used to glue a statically unknown number of parts. Then one ends up with an object structure like AND(x, AND(y, AND(z, ...)))). Such structures are inconvenient to view in the debugger and are even more inconvenient to step through.
So, it is sometimes a good idea to make the AND combinator and alike glue not 2 but an arbitrary number of parts, and have the static factory method 'Filters.and(first,second)' (you did abstract the constructor, right? ;) ) check whether 'first' or 'second' are already AND's and, if so, glue them into a bigger one.
Then the recursive structure will turn into an nice iterative one, pleasing to be viewed in debugger, serialized to XML and to have its loop-over-parts stepped through.
****
Here goes some heavy FP artillery.
0. Read the J vocabulary ;)
This one: http://www.jsoftware.com/help/dictionar
And get a feeling of what the primitive functions and combinators do. You don't have to start writing in J or even to learn it, just read the descriptions of primitives.
J is an outstanding example of a functional language without statical typing and without even closures. This means that most J combinators can be implemented and used in java without stumbling over java's limitations.
My favourites are "&." (f &. g = f^-1 . g . f), "/." (x f/. y = a vector of f(g) where g ranges over groups of x grouped by y) and "/:" (x /: y - x vector, sorted by corresponding values of the y vector)
8. Currying the last argument
(Remember a rule of thumb: the last argument should be the most frequently varying one, then it will be convenient to curry)
Bad:
public class CollectionUtils {
public static <T> List<T> filter(Filter<T> filter, List<T> ts) {...}
public static <K, V> List<V> map(Function<K, V> fun, List<K> ts) {...}
}
Good:
public class CollectionUtils {
public static <T> Function<List<T>, List<T>> filter(Filter<T> filter) {...}
public static <K, V> Function<List<K>, List<V>> map(Function<K, V> fun) {...}
}
Why:
Because the combinators now are combineable.
public class FP {
public static <A,B,C> Function<A,C> chain(Function<A,B> first, Function<B,C> second) {...}
}
Or like this:
public abstract class Function<K,V> {
public abstract V apply(K argument);
public Function<K,U> then(Function<V,U> second) {return FP.chain(this, second);}
}
Now we can write things like:
Let Order be a tuple (Customer, Product, Time)
public abstract class Aggregate<K,V> extends Function<Collection<K>, V> {}
// select K,V from S group by K
public abstract class Partitioned<S,K,V> extends Function<Collection<S>, Map<K, V>> {}
Function<Order,Product> getProduct() {..}
Function<Product,Price> getPrice() {..}
Filter<Product> categoryEquals(String pat) {..}
Partitioned<Order,Month,T> byMonth(Aggregate<Order, T> inner) {..}
And, with that:
public Partitioned<Order,Month,Customer> mostGenerousCustomerByMonth() {
Function<Order, Price> ORDER_PRICE = getProduct().then(getPrice());
Aggregate<Order, Order> MOST_EXPENSIVE_ORDER =
over(getProduct().then(categoryEquals("Cars")), maximizeBy(ORDER_PRICE));
return byMonth(MOST_EXPENSIVE_ORDER.then(Order.GET_CUSTOMER));
}
{
...
Map<Month, Customer> goodGuysIn2007 = filter(timeBetween(year(2007), year(2008)))
.then(mostGenerousCustomerByMonth())
.apply(ourOrders);
...
}
9. Pairs and binary relations
Nearly every project has a home-brewn Pair class and uses lists like List<Pair<Foo,Bar>>
I personally like collecting values of a function 'foo' on the left parts of pairs, or collecting pairs of foo(left) and bar(right), or retaining only pairs whose right part satisfies predicate qux, or stuff.
God would kill a kitten if we didn't create some combinators for that and bless List<Pair<Foo,Bar>> with the ability of having them as member methods:
class BiRelation<L,R> {
...
List<Pair<L,R>> allEntries() {..}
static BiRelation<L,R> rel(List<Pair<L,R>> pairs) {..}
BiRelation<R,L> flip() {..}
BiRelation<L,R> filterL(Predicate<L> p) {..}
BiRelation<R,L> filterR(Predicate<R> p) {return flip().filterL(p).flip();}
BiRelation<L,List<R>> compressL() {..}
BiRelation<List<L>,R> compressR() {return flip().compressL().flip();}
BiRelation<P,Q> fmap(Function<L,P> f, Function<R,Q> g) {return rel(map(pair(f,g)).apply(allEntries()));}
List<T> map(Function<Pair<L,R>, T> fun) {...}
...
}
This also illustrates another cool thing, the "worker/wrapper transformation" (the flip()/.../flip() thing).
Now we can write a BFS:
private static BiRelation<Node, Node> bfs(BiRelation<Node, Node> graph, Node root) {
Set<Node> roots = singleton(root);
Set<Node> reached = new LinkedHashSet<Node>();
reached.add(root);
List<Pair<Node,Node>> res = newList();
while(true) {
BiRelation<Node,Node> nextLayer = graph.filterL(memberOf(roots)).filterR(not(memberOf(reached)));
if(nextLayer.isEmpty())
break;
res.addAll(nextLayer.allEntries());
reached.addAll(roots = nextLayer.rightSet());
}
return rel(res);
}
Or the Schwartz transform (used when sorting an array on the value of a function, in order to avoid computing the function multiple times for each element. The essence is that we glue a vector of function values to the array, sort the resulting vector of pairs on its second element and then unglue the function values):
public static Function<T,Pair<T,U>> attach(Function<T,U> fun) {..}
public static Function<Pair<T,U>,T> detachR() {..}
public static Function<Pair<T,U>,U> detachL() {..}
public static <U extends Comparable<? super U>>
Function<List<T>,List<T>> schwartzSortBy(Function<T,U> fun)
{
return map(attach(fun)).then(sortBy(second())).then(map(detachR()));
}
Or this one:
BiRelation<Customer, Order> customer_order = ... BiRelation<Product, Integer> product_buyersCount = customer_order.fmap(ID, GET_PRODUCT).compressR().fmap(SIZE, ID).flip();
Or many other things.
flip() is my favourite; many things are symmetric: pairs, relations, invertible functions, Map's etc.
Ternary relations can be rotated: TriRelation<A,B,C>.rotate213() etc. However, their use quickly gets mind-boggling.
10. Coarse combinators
Some processors use the VLIW architecture (Very Large Instruction Word). One instruction contains several actions: for example, multiply-and-add-to-accumulator, divide-and-compute-square-root, send-email-and-switch-screen-resolution :) This is done to improve instruction-level parallelism.
In the case of java FP we can utilize a similar trick to increase readability and decrease parentheses; maybe also for performance: a coarse combinator can be implemented more effectively than through the plain combination of primitive combinators.
Instead of this:
Aggregate<Customer, Order> everyonesOrdersIn2008 = map(GET_ORDERS).then(concat()).then(filter(timeBetween(year(2007), year(2008))))
Use this:
public static <K,V> Aggregate<K,V> concatMapOver(Function<K,V> fun, Filter<K> filter) {...}
Aggregate<Customer, Order> everyonesOrdersIn2008 = concatMapOver(GET_ORDERS, timeBetween(year(2007), year(2008)));
That's it.
Anonymous
July 3 2008, 02:34:03 UTC 3 years ago
Good stuff
I have some minor differences over readability in some cases, but the good presented here far outweighs my quibbles.July 3 2008, 13:09:17 UTC 3 years ago
July 3 2008, 13:29:35 UTC 3 years ago
Anonymous
July 7 2008, 23:14:01 UTC 3 years ago
Anonymous functions
Well the lack of anonymous functions surely doesn't help... Of course I know about anonymous classes but they're not quite the same thing.- Peter Odding
Anonymous
July 6 2008, 00:02:12 UTC 3 years ago
Spiffy framework
Hi.Very nice post! Some of the FP-stuff requires a bit of getting used to though before I would say it's more readable ;-) But great stuff anyhow. Some of the ideas you present, I've already put in a small framework called 'spiffy framework', I was wondering if it's worth adding a package with some of the FP stuff? The cool way you use inference for instantiating maps is just awesome! I'm adding that right now. Was that your idea, or did you pick it up from somewhere?
The framework is available at http://spiffyframework.sourceforge.net/ should anyone be interested.
cheers,
Kasper
July 6 2008, 04:43:01 UTC 3 years ago
Re: Spiffy framework
Thanks! I'll have a look at Spiffy.As for adding a package with some of the FP, I'm not sure about that: it seems to me that there are not too many useful general-purpose functions (map,filter,concatMap,stuff about pairs and some more); probably not many enough to put them into a package. I think in most cases a domain-specific solution is better.
However, inclusion of a good FP package will be motivating for the readers of it to use FP, to write their domain-specific solutions and to increase amount of good code in the world :)
As for the maps, that was my idea, and actually I first thought of it while writing the post; now I am using it everywhere, too. In combination with factory methods one gets a bunch of statics like newUnorderedMap()/newLinkedMap() which better emphasize the choice of HashMap vs LinkedHashMap (because new HashMap() is such a common construction that one never knows whether the person wanted to emphasize that order doesn't matter or just forgot about LinkedHashMap at the moment).
Anonymous
July 6 2008, 17:59:09 UTC 3 years ago
Re: Spiffy framework
By package, I meant a java package inside the spiffy framework ;-) Let me know if you/we could do some FP util methods.July 6 2008, 18:25:38 UTC 3 years ago
Re: Spiffy framework
Please understand me correctly but spiffy seems to me like a rather diverse but small collection of rather diverse useful classes; I beleive that more focused libraries are more convenient to use and actually are more used. So, I personally would prefer to either write a new well-thought FP library from scratch and publish it on sourceforce, or to select the most 'promising' of FP libraries on sourceforge and collaborate with the author on its enhancement. However, some time ago I searched for fp libraries in java and found nothing that I could now classify as 'having room for sufficient enhancement to be usable'.Having all that said, I am definitely not against you including FP stuff with ideas from this post in spiffy, and I would be grateful if you notified me of that in the case, so that we could discuss it.
By the way, I think you will be interested to have a look at this one, by my colleague: http://people.yandex-team.ru/~stepanche
Anonymous
July 6 2008, 21:03:00 UTC 3 years ago
Re: Spiffy framework
Well, the idea is that spiffy framework will be distributed as a core and several separated libraries. but i will only do so when it warrants it. as of now, thee library is far too small to be split up. I constantly find myself missing stuff from spiffy, so I honestly do believe it's a spiffy framework ;-)cheers for the link..
kasper
Anonymous
July 7 2008, 19:57:53 UTC 3 years ago
You might be interested in...
Functional Java http://functionaljava.org/Anonymous
July 7 2008, 23:11:37 UTC 3 years ago
Thank you
The subject says it all! In my experience Java code gets verbose quickly because of all the typing. This post helped me further reduce the amount of boilerplate in my code. I've experimented with functional idioms in Java before but I've never gone as far as your post describes. Bookmarked!- Peter Odding
Anonymous
January 25 2012, 20:41:16 UTC 4 months ago
I'd like to see the end of the practice of putting the leading curly bracket at the end of the preceding line. Things are far more readable when brackets line up. I guess this is done to make the code appear different from the traditional way C/C++ is written, but this is an odd, and arbitrary, point to suddenly attempt to make the languages appear different.
Needlessly wrapping lines at 80 or whatever characters has to stop. If the newline point(s) make since because it displays a listing of useful things in a column then it can be OK, but to arbitrarily throw in some newline because you think someone from 1985 is going to either print the source on paper or view your code on their Commodore 64 is foolish.
Variables should have some sort of useful prefix or suffix. The fact that so many do not even add a simple "m" to the beginning of a member variable name shows how many too many programmers are actually concerned with making code more difficult to read to protect their job. When asked, defenders of these mystery variables will always cite some sort of weak argument: the funniest is when they claim that the colors and tooltips provided by the editor make that irrelevant. Aside from the fact that I cannot move a cursor with my mind, I am amused at how these are also the ones with the biggest phobia over crossing the 80 characters-in-a-line threshold because of the sheer number of people (who exist on in their mind) who print code. Hum... I didn't know tooltips worked on paper...
My member variables are always prefixed with an 'm' and (if they represent a GUI control) have a suffix representing their control type. It's 2012 folks. You have lots of pixels on a screen. Use them. If you can actually program then the fact that someone else can read your code shouldn't be too much of a job threat.