Keep Calm and Dev On | Streams API

One of my favorite Java 8 features is the introduction of the Streams API. Not to be confused with Input/Output Streams, the Streams API is Java’s ‘new’ way to process data that is expressive, efficient and, relatively speaking, more functional than ever. Streams are ‘new’ in the sense that Lambdas were new to Java when version 8 was released in 2014. On the other hand, the notion of a Lambda is not really anything ‘new’ in the context of functional programming.

Without further ado, check out this super simple example and see if you can figure out what it does.  Streams are intentionally expressive.  The idea is that you can read the code and just understand what is going on.

Java Code Block:

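import java.util.*;
import java.util.stream.Collectors;

// isEven and isGreaterThanThree are small boolean helpers assumed to be defined elsewhere.
List<Integer> numberList = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 2, 4));

Set<Integer> evenNumberSet = numberList.stream()
        .filter(number -> isEven(number))             // keep only the even numbers
        .filter(number -> isGreaterThanThree(number)) // keep only the numbers greater than three
        .collect(Collectors.toSet());                 // gather the unique results into a Set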

Ok, so what is going on here? Well, if you guessed that we start with a list of integers, filter out the non-even ones, then filter out results not greater than three, then collect the unique results into a new Set, then you are correct! (For the list above, assuming the helper methods do what their names say, that leaves a Set containing just 4.) Although this is among the simplest examples, I think it still illustrates just how expressive the Streams API is. It reads like a book, left to right, top to bottom.
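If you’d like to run the example yourself, here is a minimal self-contained sketch. The snippet above doesn’t show the helper methods, so the class name, the bodies of isEven and isGreaterThanThree, and the evenNumbersGreaterThanThree wrapper method are all my own assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class EvenNumberExample {

    // Assumed helper: the snippet above does not show its body.
    static boolean isEven(int number) {
        return number % 2 == 0;
    }

    // Assumed helper: the snippet above does not show its body.
    static boolean isGreaterThanThree(int number) {
        return number > 3;
    }

    // The same pipeline as above, wrapped in a (hypothetical) method so it can be reused.
    static Set<Integer> evenNumbersGreaterThanThree(List<Integer> numbers) {
        return numbers.stream()
                .filter(number -> isEven(number))
                .filter(number -> isGreaterThanThree(number))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Integer> numberList = Arrays.asList(1, 2, 3, 4, 5, 2, 4);
        System.out.println(evenNumbersGreaterThanThree(numberList)); // prints [4]
    }
}
```

The 4 appears twice in the input list, but because the results are collected into a Set, it shows up only once in the output.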

Right about now, a skeptical Java dev concerned about performance might say, “Great, Streams are expressive, but are they efficient?”  I’m here to tell you to ‘keep calm and dev on’ because the answer is (generally) yes, Streams are efficient.  To understand how they are efficient we need to go under the hood a little bit.  In a nutshell, here’s how it works:

The Streams API is built around two kinds of operations: intermediate operations and terminal operations. The intermediate operations are functions like ‘filter’ and ‘map’. We can use these intermediate operations to solve a standard server-side use case, like when we need to change/process a collection of data in some way. Intermediate operations can be chained together, one after the other, because each one returns a new Stream.
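To make the chaining a bit more concrete, here is a small sketch that spells out each step as its own variable, so you can see that ‘filter’ and ‘map’ both hand back a Stream. The class name, method name and sample data are made up for illustration, and the collect call at the end is only there so we have something to print.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IntermediateOpsExample {

    // Hypothetical example: keep names longer than three characters, then upper-case them.
    static List<String> shoutLongNames(List<String> names) {
        Stream<String> all = names.stream();
        Stream<String> longNames = all.filter(name -> name.length() > 3); // filter returns a Stream
        Stream<String> upperCased = longNames.map(String::toUpperCase);   // map returns a Stream
        return upperCased.collect(Collectors.toList());                   // materialize the result
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Brian", "Cleo", "Di");
        System.out.println(shoutLongNames(names)); // prints [BRIAN, CLEO]
    }
}
```

In everyday code you would normally skip the intermediate variables and chain the calls directly. Splitting them out here just shows why the chaining works.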

Then, after all the intermediate operations, a Stream pipeline ends with a terminal operation like ‘count’, ‘findAny’ or ‘collect’. In other words, once we’re done processing our data, let’s get a count of the results, find a result of interest or return the results in the form of a new Collection.
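Here is a quick sketch of those three terminal operations applied to the same filtered data. The class and method names are hypothetical, chosen just for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class TerminalOpsExample {

    // count: how many results are there?
    static long countEven(List<Integer> numbers) {
        return numbers.stream().filter(n -> n % 2 == 0).count();
    }

    // findAny: give me some result, if there is one (hence the Optional).
    static Optional<Integer> findAnyEven(List<Integer> numbers) {
        return numbers.stream().filter(n -> n % 2 == 0).findAny();
    }

    // collect: return all the results as a new Collection.
    static List<Integer> collectEven(List<Integer> numbers) {
        return numbers.stream().filter(n -> n % 2 == 0).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 2, 4);
        System.out.println(countEven(numbers));   // prints 4
        System.out.println(findAnyEven(numbers)); // an Optional holding one of the even numbers
        System.out.println(collectEven(numbers)); // prints [2, 4, 2, 4]
    }
}
```

Note that each method builds a fresh Stream from the list. A Stream can only be consumed once, so a pipeline gets exactly one terminal operation.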

Now, the interesting thing about the intermediate and terminal operations is that in the code, the intermediate operations appear to be called first, and then the terminal operation appears to be called last. However, at run time, the intermediate operations don’t actually get executed until the terminal operation is invoked. This concept is called lazy evaluation. Lazy evaluation means data is processed in an ‘on-demand’ fashion, so the pipeline can skip work it doesn’t need. For example, a short-circuiting terminal operation like ‘findAny’ can stop as soon as it has a result, instead of pushing every element through every filter. That can save precious CPU clock cycles.
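You can watch lazy evaluation happen with a small experiment. The sketch below counts how many times the ‘filter’ lambda is actually called, both before and after the terminal operation. The class name, method name and counter are my own illustrative assumptions. The exact call count is what I would expect from a sequential stream on a typical JDK, not something the API documentation promises.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazyEvaluationExample {

    // Returns {filter calls before the terminal operation, filter calls after it}.
    static int[] countFilterCalls(List<Integer> numbers) {
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline only describes the work; nothing runs yet.
        Stream<Integer> pipeline = numbers.stream()
                .filter(n -> {
                    calls.incrementAndGet(); // record every time the filter is evaluated
                    return n % 2 == 0;
                });
        int before = calls.get();

        // The terminal operation triggers the work, and findFirst stops at the first match.
        pipeline.findFirst();
        int after = calls.get();

        return new int[] {before, after};
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 2, 4);
        System.out.println(Arrays.toString(countFilterCalls(numbers))); // prints [0, 2]
    }
}
```

The filter runs zero times until findFirst is invoked, and then only twice (for 1 and 2) because the first even number has been found and the remaining five elements are never examined.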

Lastly, I’d just like to say that I really love the addition of Lambdas to Java. Lambdas are great because they give Java that functional look and feel that it had always been missing. Lambdas and the Streams API have given us the power to create complex, composable data processing, while allowing us to keep our code readable, self-documenting and maintainable. Perhaps my favorite part of all is that anytime you do something with a Stream, regardless of complexity, it’s technically always a one-liner (a single chained expression)! Tell me who doesn’t love that?!?