Java Functional - How to use the groupingBy() Collector

dimitrilc

Introduction

When collecting a stream, the groupingBy() Collector (java.util.stream.Collector) can be used to put data into different groups based on a key that we specify. groupingBy() is slightly more complicated to use than partitioningBy() because its keys can be of any type, whereas the keys produced by partitioningBy() are always of type Boolean.

There are 3 overloaded variants of the groupingBy() method in the Collectors (java.util.stream.Collectors) class. In this tutorial, we are going to learn how to use all 3 of them.

Goals

At the end of the tutorial, you will have learned:

  1. How to use the groupingBy() Collector.

Prerequisite Knowledge

  1. Intermediate Java.
  2. Java Streams (java.util.stream.Stream).

Tools Required

  1. A Java IDE.

Project Setup

To follow along with the tutorial, perform the steps below:

  1. Create a new Java project.

  2. Create a package com.example.

  3. Create a class called Entry.

  4. Create the main() method inside Entry.java.

  5. Our examples will use Cake objects with different properties to demonstrate groupingBy(), so declare the package-private top-level record class Cake inside Entry.java using the code below.

     record Cake(Shape shape, Color color, BigDecimal price){}
  6. Then create an enum called Shape.

     enum Shape { TRIANGLE, CIRCLE, DIAMOND }
  7. We do not have to create our own class to encapsulate colors; we will use the existing Java class java.awt.Color.

groupingBy() Concept Overview

In main(), add the List of cakes using the code below:

var cakes = List.of(
   new Cake(Shape.TRIANGLE, Color.BLUE, BigDecimal.valueOf(4.99)),
   new Cake(Shape.CIRCLE, Color.RED, BigDecimal.valueOf(3.99)),
   new Cake(Shape.DIAMOND, Color.CYAN, BigDecimal.valueOf(5.25)),
   new Cake(Shape.CIRCLE, Color.GREEN, BigDecimal.valueOf(3.49)),
   new Cake(Shape.DIAMOND, Color.BLACK, BigDecimal.valueOf(5.99))
);

Whenever we collect a stream using groupingBy(), we have to tell groupingBy() the key that we want to group items by.

For example, if we group the Cakes by Shape alone, we will end up with a Map with 3 different keys, the keys being the Shapes themselves. The picture below simplifies the data structure returned by groupingBy().

[Image: groupingBy.png]

We do not necessarily tell groupingBy() "how" to group things; instead, we mostly tell it "what" to group things by. In the List of Cakes above, we can tell groupingBy() to group Cakes by their Shape, Color, or price. We are allowed to tell it how to group things, but if the classifier only performs a true/false test on each item, then we might as well use partitioningBy().
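To illustrate the "what" idea, here is a minimal sketch in which the classifier computes a key instead of just reading a property. The class name DerivedKeyExample, the helper method names, and the whole-dollar key are assumptions made for illustration, and Cake is trimmed down to two components to keep the example short.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DerivedKeyExample {
    enum Shape { TRIANGLE, CIRCLE, DIAMOND }

    // Simplified Cake (no Color component), used only for this sketch.
    record Cake(Shape shape, BigDecimal price) {}

    static List<Cake> sampleCakes() {
        return List.of(
                new Cake(Shape.TRIANGLE, BigDecimal.valueOf(4.99)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.99)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.25)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.49)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.99)));
    }

    // The classifier returns the whole-dollar part of the price,
    // so cakes priced 3.49 and 3.99 both land under the key 3.
    static Map<Integer, List<Cake>> groupByWholeDollars(List<Cake> cakes) {
        return cakes.stream()
                .collect(Collectors.groupingBy(cake -> cake.price().intValue()));
    }

    public static void main(String[] args) {
        groupByWholeDollars(sampleCakes()).forEach((dollars, group) ->
                System.out.println(dollars + " -> " + group.size() + " cake(s)"));
    }
}
```

With the sample data, the keys are the Integers 3, 4, and 5. We still only told groupingBy() what the key is; the grouping itself is done for us.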

The differences between groupingBy() and partitioningBy() are:

  1. The Map returned by partitioningBy() always has exactly two Boolean keys, true and false, whereas the Map returned by groupingBy() can have keys of any type and as many keys as the classifier produces.
  2. There is no version of partitioningBy() that allows you to specify a custom Map implementation of your choice, but there is a version of groupingBy() that allows you to do so.
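The first difference can be seen side by side in the sketch below. The class name PartitionVsGroupExample, the helper methods, and the 5.00 price threshold are made up for this illustration, and Cake is again simplified to two components.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionVsGroupExample {
    enum Shape { TRIANGLE, CIRCLE, DIAMOND }

    // Simplified Cake (no Color component), used only for this sketch.
    record Cake(Shape shape, BigDecimal price) {}

    static List<Cake> sampleCakes() {
        return List.of(
                new Cake(Shape.TRIANGLE, BigDecimal.valueOf(4.99)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.99)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.25)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.49)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.99)));
    }

    // partitioningBy() takes a Predicate, so the keys are always true and false.
    static Map<Boolean, List<Cake>> partitionByPriceAbove(List<Cake> cakes, BigDecimal threshold) {
        return cakes.stream()
                .collect(Collectors.partitioningBy(
                        cake -> cake.price().compareTo(threshold) > 0));
    }

    // groupingBy() takes a Function, so there is one key per distinct Shape found.
    static Map<Shape, List<Cake>> groupByShape(List<Cake> cakes) {
        return cakes.stream().collect(Collectors.groupingBy(Cake::shape));
    }

    public static void main(String[] args) {
        var cakes = sampleCakes();
        System.out.println(partitionByPriceAbove(cakes, new BigDecimal("5.00")).keySet());
        System.out.println(groupByShape(cakes).keySet());
    }
}
```

With the sample data, two cakes cost more than 5.00 and end up under true, and the other three end up under false.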

Simple groupingBy() Variant

Let us start with the simplest groupingBy() variant, where we only have to provide one argument. Its method signature is:

static <T, K> Collector<T,?,Map<K,List<T>>> groupingBy(Function<? super T,? extends K> classifier)

The only argument that we will have to provide is a Function object. It does not have to be anything fancy. A Function that extracts a property of each Cake will do. groupingBy() will automatically group the Cakes with the same property value for us.

In the Entry class, create a new method called groupByShape() from the code below:

private static void groupByShape(List<Cake> cakes){
   Map<Shape, List<Cake>> cakeGroups = cakes.stream()
           .collect(Collectors.groupingBy(Cake::shape)
           );

   cakeGroups.entrySet().forEach(System.out::println);
}

Then call it in main() with:

groupByShape(cakes);

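The output will look similar to the following (the Map returned by this variant does not guarantee any particular key order, so the lines may appear in a different order):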

DIAMOND=[Cake[shape=DIAMOND, color=java.awt.Color[r=0,g=255,b=255], price=5.25], Cake[shape=DIAMOND, color=java.awt.Color[r=0,g=0,b=0], price=5.99]]

TRIANGLE=[Cake[shape=TRIANGLE, color=java.awt.Color[r=0,g=0,b=255], price=4.99]]

CIRCLE=[Cake[shape=CIRCLE, color=java.awt.Color[r=255,g=0,b=0], price=3.99], Cake[shape=CIRCLE, color=java.awt.Color[r=0,g=255,b=0], price=3.49]]

The groupByShape() method groups Cakes by their Shape, which is why our Map has 3 keys, each being a Shape. Our List<Cake> only has one Cake with the TRIANGLE Shape, so we only see one Cake for the key TRIANGLE. Our List<Cake> has two Cakes for each of the other two Shapes, so there are two Cakes under each of those keys.
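Once the Map is built, each group can be read like any other Map entry. The sketch below is only an illustration: the class name GroupLookupExample, the helper cakesWithoutTriangle(), and the trimmed-down Cake record are assumptions. It also shows that a Shape with no Cakes simply has no key in the result.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupLookupExample {
    enum Shape { TRIANGLE, CIRCLE, DIAMOND }

    // Simplified Cake (no Color component), used only for this sketch.
    record Cake(Shape shape, BigDecimal price) {}

    // A hypothetical list that deliberately contains no TRIANGLE cake.
    static List<Cake> cakesWithoutTriangle() {
        return List.of(
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.99)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.49)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.25)));
    }

    static Map<Shape, List<Cake>> groupByShape(List<Cake> cakes) {
        return cakes.stream().collect(Collectors.groupingBy(Cake::shape));
    }

    public static void main(String[] args) {
        var groups = groupByShape(cakesWithoutTriangle());

        // Look up one group by its key.
        System.out.println(groups.get(Shape.CIRCLE).size());      // prints 2

        // Keys are only created for Shapes that actually occur in the stream.
        System.out.println(groups.containsKey(Shape.TRIANGLE));   // prints false

        // getOrDefault() avoids a null when a group is missing.
        System.out.println(groups.getOrDefault(Shape.TRIANGLE, List.of())); // prints []
    }
}
```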

groupingBy() with a downstream Collector

The second variant of groupingBy() allows you to pass a second, downstream Collector into it. The downstream Collector is applied to the Cakes inside each group, so if you want to group the Cakes by Shape and then by Color as well, this is the correct variant to use. Its method signature is:

static <T, K, A, D> Collector<T,?,Map<K,D>> groupingBy(Function<? super T,? extends K> classifier, Collector<? super T,A,D> downstream)

Add another method called groupByShapeThenColor() in the Entry class as well to see how the method is used:

private static void groupByShapeThenColor(List<Cake> cakes){
   Map<Shape, Map<Color, List<Cake>>> cakeGroups;
   cakeGroups = cakes.stream().collect(Collectors.groupingBy(
           Cake::shape,
           Collectors.groupingBy(Cake::color)
   ));

   cakeGroups.entrySet().forEach(System.out::println);
}

When calling it in main(), the output is:

CIRCLE={java.awt.Color[r=0,g=255,b=0]=[Cake[shape=CIRCLE, color=java.awt.Color[r=0,g=255,b=0], price=3.49]], java.awt.Color[r=255,g=0,b=0]=[Cake[shape=CIRCLE, color=java.awt.Color[r=255,g=0,b=0], price=3.99]]}

TRIANGLE={java.awt.Color[r=0,g=0,b=255]=[Cake[shape=TRIANGLE, color=java.awt.Color[r=0,g=0,b=255], price=4.99]]}

DIAMOND={java.awt.Color[r=0,g=0,b=0]=[Cake[shape=DIAMOND, color=java.awt.Color[r=0,g=0,b=0], price=5.99]], java.awt.Color[r=0,g=255,b=255]=[Cake[shape=DIAMOND, color=java.awt.Color[r=0,g=255,b=255], price=5.25]]}

Now we have nested Maps where the top-level keys are Shapes and the second-level keys are Colors. The Cakes are first grouped by Shape, and within each group, they are grouped again by Color.
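The downstream Collector does not have to be another groupingBy(). As a rough sketch, the example below uses Collectors.counting() and Collectors.reducing() to produce one summary value per Shape. The class name DownstreamExample and its helper methods are hypothetical, and Cake is simplified to two components.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DownstreamExample {
    enum Shape { TRIANGLE, CIRCLE, DIAMOND }

    // Simplified Cake (no Color component), used only for this sketch.
    record Cake(Shape shape, BigDecimal price) {}

    static List<Cake> sampleCakes() {
        return List.of(
                new Cake(Shape.TRIANGLE, BigDecimal.valueOf(4.99)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.99)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.25)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.49)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.99)));
    }

    // counting() replaces each List<Cake> with the number of Cakes in the group.
    static Map<Shape, Long> countByShape(List<Cake> cakes) {
        return cakes.stream()
                .collect(Collectors.groupingBy(Cake::shape, Collectors.counting()));
    }

    // reducing() adds up the prices of the Cakes in each group, starting from zero.
    static Map<Shape, BigDecimal> totalPriceByShape(List<Cake> cakes) {
        return cakes.stream().collect(Collectors.groupingBy(
                Cake::shape,
                Collectors.reducing(BigDecimal.ZERO, Cake::price, BigDecimal::add)));
    }

    public static void main(String[] args) {
        var cakes = sampleCakes();
        countByShape(cakes).entrySet().forEach(System.out::println);
        totalPriceByShape(cakes).entrySet().forEach(System.out::println);
    }
}
```

With the sample data, CIRCLE maps to a count of 2 and a total price of 7.48.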

groupingBy() with custom Map implementation

All of the groupingBy() Collectors that we have used so far returned a Map, but did you notice that we were not able to choose the implementation of Map (HashMap, TreeMap, etc.) that we want? Fortunately, the last variant of groupingBy() allows us to do just that. It requires a Supplier that creates the Map instance to be filled. Its method signature is:

static <T, K, D, A, M extends Map<K, D>> Collector<T,?,M> groupingBy(Function<? super T,? extends K> classifier, Supplier<M> mapFactory, Collector<? super T,A,D> downstream)

To see how it is used, add a new method called groupByShapeThenColorCustomMap() into the Entry class.

private static void groupByShapeThenColorCustomMap(List<Cake> cakes){
   var cakeGroups = cakes.stream().collect(Collectors.groupingBy(
           Cake::shape,
           TreeMap::new,
           Collectors.groupingBy(Cake::color)
   ));

   cakeGroups.entrySet().forEach(System.out::println);
}

Notice that as the second argument, we passed TreeMap::new, so the top-level Map is a TreeMap, which automatically keeps its keys sorted.

When we call it in main(), the output is:

TRIANGLE={java.awt.Color[r=0,g=0,b=255]=[Cake[shape=TRIANGLE, color=java.awt.Color[r=0,g=0,b=255], price=4.99]]}

CIRCLE={java.awt.Color[r=0,g=255,b=0]=[Cake[shape=CIRCLE, color=java.awt.Color[r=0,g=255,b=0], price=3.49]], java.awt.Color[r=255,g=0,b=0]=[Cake[shape=CIRCLE, color=java.awt.Color[r=255,g=0,b=0], price=3.99]]}

DIAMOND={java.awt.Color[r=0,g=0,b=0]=[Cake[shape=DIAMOND, color=java.awt.Color[r=0,g=0,b=0], price=5.99]], java.awt.Color[r=0,g=255,b=255]=[Cake[shape=DIAMOND, color=java.awt.Color[r=0,g=255,b=255], price=5.25]]}

A TreeMap sorts its keys by their natural ordering, and the natural ordering of enum constants is the order of their declaration, which is why the keys appear in this order.
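Any Map implementation can be supplied as the second argument. The sketch below is an illustration under a few assumptions: the class name MapFactoryExample and its helper methods are hypothetical, Cake is simplified to two components, and counting() is used as the downstream Collector to keep the output short. It supplies a TreeMap with a reversed Comparator and an EnumMap.

```java
import java.math.BigDecimal;
import java.util.Comparator;
import java.util.EnumMap;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MapFactoryExample {
    enum Shape { TRIANGLE, CIRCLE, DIAMOND }

    // Simplified Cake (no Color component), used only for this sketch.
    record Cake(Shape shape, BigDecimal price) {}

    static List<Cake> sampleCakes() {
        return List.of(
                new Cake(Shape.TRIANGLE, BigDecimal.valueOf(4.99)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.99)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.25)),
                new Cake(Shape.CIRCLE, BigDecimal.valueOf(3.49)),
                new Cake(Shape.DIAMOND, BigDecimal.valueOf(5.99)));
    }

    // A TreeMap with a reversed Comparator sorts the keys in reverse declaration order.
    static TreeMap<Shape, Long> countByShapeReversed(List<Cake> cakes) {
        return cakes.stream().collect(Collectors.groupingBy(
                Cake::shape,
                () -> new TreeMap<Shape, Long>(Comparator.reverseOrder()),
                Collectors.counting()));
    }

    // An EnumMap iterates its keys in declaration order and is specialized for enum keys.
    static EnumMap<Shape, Long> countByShapeEnumMap(List<Cake> cakes) {
        return cakes.stream().collect(Collectors.groupingBy(
                Cake::shape,
                () -> new EnumMap<Shape, Long>(Shape.class),
                Collectors.counting()));
    }

    public static void main(String[] args) {
        var cakes = sampleCakes();
        System.out.println(countByShapeReversed(cakes)); // prints {DIAMOND=2, CIRCLE=2, TRIANGLE=1}
        System.out.println(countByShapeEnumMap(cakes));  // prints {TRIANGLE=1, CIRCLE=2, DIAMOND=2}
    }
}
```

Because the method return types name the concrete Map classes, the caller can also use implementation-specific methods, such as firstKey() on the TreeMap.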

Solution Code

package com.example;

import java.awt.*;
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class Entry {
   public static void main(String[] args){
       var cakes = List.of(
           new Cake(Shape.TRIANGLE, Color.BLUE, BigDecimal.valueOf(4.99)),
           new Cake(Shape.CIRCLE, Color.RED, BigDecimal.valueOf(3.99)),
           new Cake(Shape.DIAMOND, Color.CYAN, BigDecimal.valueOf(5.25)),
           new Cake(Shape.CIRCLE, Color.GREEN, BigDecimal.valueOf(3.49)),
           new Cake(Shape.DIAMOND, Color.BLACK, BigDecimal.valueOf(5.99))
       );

       //groupByShape(cakes);
       //groupByShapeThenColor(cakes);
       groupByShapeThenColorCustomMap(cakes);
   }

   private static void groupByShape(List<Cake> cakes){
       Map<Shape, List<Cake>> cakeGroups = cakes.stream()
               .collect(Collectors.groupingBy(Cake::shape)
               );

       cakeGroups.entrySet().forEach(System.out::println);
   }

   private static void groupByShapeThenColor(List<Cake> cakes){
       Map<Shape, Map<Color, List<Cake>>> cakeGroups;
       cakeGroups = cakes.stream().collect(Collectors.groupingBy(
               Cake::shape,
               Collectors.groupingBy(Cake::color)
       ));

       cakeGroups.entrySet().forEach(System.out::println);
   }

   private static void groupByShapeThenColorCustomMap(List<Cake> cakes){
       var cakeGroups = cakes.stream().collect(Collectors.groupingBy(
               Cake::shape,
               TreeMap::new,
               Collectors.groupingBy(Cake::color)
       ));

       cakeGroups.entrySet().forEach(System.out::println);
   }

}

record Cake(Shape shape, Color color, BigDecimal price){}

enum Shape { TRIANGLE, CIRCLE, DIAMOND }

Summary

We have learned how to use the groupingBy() Collector in this tutorial. The full project code can be found here: https://github.com/dmitrilc/DaniwebJavaGroupingBy/tree/master