Part 3 – Java 8 and Lambda Expressions

Do Java 8 lambda expressions improve our code?  It’s almost time to find out.  We’ll make real use of them later, but as a trivial first example let’s see what they look like in place of a for loop.  The real power will show up as we make our package more configurable.

If you are unfamiliar with lambda expressions, can I suggest the Oracle Java 8 Lambda Quick Start?  It’s a great tutorial.  If you’re wondering what they are good for, let me give you the 5-second ScrumBucket definition:

Anything that helps to clearly express your code’s behavior is a good thing.  So using meaningful method names, commonly used libraries, and the most direct syntax possible is always a good thing.  Be warned, terse and clear are not the same thing.  Lisp was terse, but never clear.  Lambda expressions don’t let you do things you couldn’t before, but they do help remove the clutter of anonymous classes and other bizarre constructs.

Proof test: if using a lambda expression makes your code clearer, then use it.
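To make the anonymous class point concrete, here is a minimal sketch that writes the same check both ways.  The class name, the constant names, and the "starts with http" rule are all made up for illustration; they are not from the crawler code.

```java
import java.util.function.Predicate;

public class AnonymousVsLambdaSketch {

    // Pre-lambda style: an anonymous class, five lines of ceremony around one expression.
    static final Predicate<String> IS_HTTP_ANONYMOUS = new Predicate<String>() {
        @Override
        public boolean test(String url) {
            return url.startsWith("http");
        }
    };

    // Lambda style: the same behavior, minus the clutter.
    static final Predicate<String> IS_HTTP_LAMBDA = url -> url.startsWith("http");

    public static void main(String[] args) {
        String url = "http://example.com/";
        System.out.println(IS_HTTP_ANONYMOUS.test(url) == IS_HTTP_LAMBDA.test(url));
    }
}
```

Both constants behave identically; the only difference is how much you have to read to find the one line that matters.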

For this installment we are going to look at our process links function and see if we can clean it up with Java 8.  Here’s the original method:

This is really easy to read for you Java 7 types.  The newer for(Type type : container) syntax is a huge readability improvement over the old iterator.next() or for(int i = 0; …) days.  Everything else is very clear and direct.  Heck, it even uses Apache StringUtils to minimize all that null checking business.  What could be easier?
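A method of the kind described might look roughly like the sketch below.  It is not the project’s actual processLinks: it assumes the links are plain strings rather than parsed HTML elements, and isBlank is a hypothetical stand-in for Apache StringUtils so the example runs without extra jars.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForLoopSketch {

    // Hypothetical stand-in for Apache StringUtils.isBlank: true for null, empty, or whitespace.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Keeps every link that is not blank, using the for(Type type : container) syntax.
    static List<String> processLinks(List<String> hrefs) {
        List<String> urlList = new ArrayList<>();
        for (String href : hrefs) {
            if (!isBlank(href)) {
                urlList.add(href);
            }
        }
        return urlList;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList("http://a.example/", "", null, "http://b.example/");
        System.out.println(processLinks(hrefs)); // prints [http://a.example/, http://b.example/]
    }
}
```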

links.forEach(link -> { … }

Java 8 has a lambda-based replacement for the for loop called forEach.  It looks like this:
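As a minimal sketch of the idea, assuming plain string links instead of parsed HTML elements, a forEach version might look like this.  The isBlank helper is a hypothetical stand-in for Apache StringUtils, and the line numbers mentioned in the text refer to the project’s own listing, not to this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForEachSketch {

    // Hypothetical stand-in for Apache StringUtils.isBlank.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Same behavior as a plain for loop, but the loop body is passed to forEach as a lambda.
    static List<String> processLinks(List<String> hrefs) {
        List<String> urlList = new ArrayList<>();
        hrefs.forEach(href -> {
            if (!isBlank(href)) {
                urlList.add(href);
            }
        });
        return urlList;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList("http://a.example/", "", null, "http://b.example/");
        System.out.println(processLinks(hrefs)); // prints [http://a.example/, http://b.example/]
    }
}
```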

That doesn’t really look all that different.  It didn’t help readability at all.  Line 2 makes me feel like I’m writing Ruby and lines 11-12 feel like JavaScript.  However, that’s no reason to shut out lambda expressions.  It’s just that this trivial example doesn’t show their power.  Let’s try streams.

links.stream().filter().forEach()

A powerful new toy built on lambdas is stream().  It’s sort of like Unix pipes.  You glue parts together in a stream and each part passes its output on to the next.  Here’s processLinks using a stream:
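A stream version of such a method might be sketched as follows.  Again this assumes plain string links and a hypothetical isBlank helper standing in for Apache StringUtils; it is an illustration of the pipe-like shape, not the project’s code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StreamSketch {

    // Hypothetical stand-in for Apache StringUtils.isBlank.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // The filter stage decides what to work on; the forEach stage does the work.
    static List<String> processLinks(List<String> hrefs) {
        List<String> urlList = new ArrayList<>();
        hrefs.stream()
             .filter(href -> {
                 return !isBlank(href);
             })
             .forEach(href -> {
                 urlList.add(href);
             });
        return urlList;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList("http://a.example/", "", null, "http://b.example/");
        System.out.println(processLinks(hrefs)); // prints [http://a.example/, http://b.example/]
    }
}
```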

Now things are getting a little more interesting.  The big deal here is we’ve clearly separated our ‘filter’ (what we want to work on) from our operation.  That is a good thing really.  It does nothing we couldn’t do before, but it’s a little more expressive and clear.  Many examples of filter() are cool and don’t need the { } because the condition is a single expression.  It looks more like this one from dreamsys software:
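The brace-free style can be sketched with the same made-up string links: when a lambda body is a single expression, both the { } and the return keyword can be dropped.  The class, method, and helper names here are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterSketch {

    // Hypothetical stand-in for Apache StringUtils.isBlank.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Single-expression lambdas: no braces, no return keyword.
    static List<String> processLinks(List<String> hrefs) {
        List<String> urlList = new ArrayList<>();
        hrefs.stream()
             .filter(href -> !isBlank(href))
             .forEach(href -> urlList.add(href));
        return urlList;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList("http://a.example/", "   ", "http://b.example/");
        System.out.println(processLinks(hrefs)); // prints [http://a.example/, http://b.example/]
    }
}
```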

Performance

Is code clarity all that is important?  No, performance has its place, but it is always a distant second.  ScrumBucket subscribes to doing your performance work after you find out where your performance hot spots are.  For example, most of the runtime of a crawler is spent waiting on the network and parsing the HTML.  Heck, the parsing is going to overshadow this little bit of fluff by orders of magnitude.  Still, I was a little curious about what actually happens at runtime.  Just for fun, I set a breakpoint on the urlList.add() call to see the stack for all three implementations.  Here it is:

for(Element link : elements)

[Screenshot: call stack for the for loop]

forEach(link ->

[Screenshot: call stack for forEach]

stream().filter().forEach()

[Screenshot: call stack for the stream version]
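If you want to repeat the experiment without a debugger, one rough approach is to count stack frames at the point where the link would be added.  This is a sketch under stated assumptions: record() is a hypothetical stand-in for urlList.add(), the links are plain strings, and the exact frame counts depend on your JVM, so only the relative ordering is meaningful.

```java
import java.util.Arrays;
import java.util.List;

public class StackDepthSketch {

    // Depth observed the last time record() ran.
    static int lastDepth;

    // Stand-in for urlList.add(): notes how many frames are on the stack right now.
    static void record(String link) {
        lastDepth = Thread.currentThread().getStackTrace().length;
    }

    static int depthWithForLoop(List<String> links) {
        for (String link : links) {
            record(link);
        }
        return lastDepth;
    }

    static int depthWithForEach(List<String> links) {
        links.forEach(link -> record(link));
        return lastDepth;
    }

    static int depthWithStream(List<String> links) {
        links.stream()
             .filter(link -> !link.isEmpty())
             .forEach(link -> record(link));
        return lastDepth;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList("http://a.example/");
        System.out.println("for loop: " + depthWithForLoop(links));
        System.out.println("forEach:  " + depthWithForEach(links));
        System.out.println("stream:   " + depthWithStream(links));
    }
}
```

On a typical JVM the three numbers should climb in that order, since forEach adds its own frame plus the lambda, and the stream pipeline adds several more on top.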

I think you get the point: each step up in coolness puts more frames on the stack.  Cool has a price.  Changing your for loops to forEach lambda expressions isn’t worth it.  However, as we will see, lambdas are a much better solution than anonymous classes.

Still, we’re going to use lambda expressions when possible for the rest of this project, just because it’s cool.  Please see our GitHub repo for the complete code.