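Basic Usage of Stream

1. Introduction

JDK 8 introduced lambda expressions for functional-style programming together with the Stream API. A stream can run fairly complex lookup, filtering, mapping, and reduction operations over a collection. A stream pipeline has three parts: a data source -> zero or more intermediate operations -> at most one terminal operation. Intermediate operations transform the data and are lazy: they do not run when they are declared, and nothing executes until the terminal operation is invoked.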
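The laziness described above can be observed directly. The following is a minimal sketch; the class name LazyStreamDemo and the method run are made up for illustration. It records which elements the filter touched before and after the terminal operation is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original post.
class LazyStreamDemo {

    // Returns { elements touched before the terminal op,
    //           elements touched after it, number of even results }.
    static int[] run() {
        List<Integer> seen = new ArrayList<>();

        // Declaring the pipeline only registers the filter; it does not run it.
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4)
                .filter(n -> { seen.add(n); return n % 2 == 0; });
        int before = seen.size();

        // The terminal operation pulls every element through the filter.
        List<Integer> evens = pipeline.collect(Collectors.toList());
        int after = seen.size();

        return new int[] { before, after, evens.size() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("touched before terminal operation: " + r[0]);
        System.out.println("touched after terminal operation: " + r[1]);
        System.out.println("even elements collected: " + r[2]);
    }
}
```

If the collect call is removed, the filter lambda never runs at all.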

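Stream operations fall into two kinds: terminal operations and intermediate operations.

Terminal operation: also called a closing operation; once it runs, the stream is consumed and the data cannot be processed any further.
Intermediate operation: returns a new stream, so the result of the previous step can be processed again.

Terminal operations are further divided into short-circuiting and non-short-circuiting operations:

Short-circuiting: the operation can finish without processing every element. This is like a || b: as soon as some element makes the condition true, traversal stops.
Non-short-circuiting: every element must be visited before the operation finishes.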
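To make the difference concrete, the sketch below counts how many elements each kind of terminal operation actually visits on a sequential stream. ShortCircuitDemo and its method names are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the original post.
class ShortCircuitDemo {

    // anyMatch is short-circuiting: it stops at the first even number.
    static int visitedByAnyMatch(List<Integer> list) {
        AtomicInteger visited = new AtomicInteger();
        list.stream().anyMatch(n -> {
            visited.incrementAndGet();
            return n % 2 == 0;
        });
        return visited.get();
    }

    // forEach is not short-circuiting: it always visits every element.
    static int visitedByForEach(List<Integer> list) {
        AtomicInteger visited = new AtomicInteger();
        list.stream().forEach(n -> visited.incrementAndGet());
        return visited.get();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println("anyMatch visited: " + visitedByAnyMatch(list));
        System.out.println("forEach visited: " + visitedByForEach(list));
    }
}
```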

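Intermediate operations are further divided into stateful and stateless operations:

Stateful: the operation may have to receive all elements (or remember earlier ones) before it can pass results on.
Stateless: each element is processed independently of every other element.

For example, sorted() must receive every element in the stream before it can sort, whereas filter() can handle each element as soon as it arrives.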
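One way to see this is to log the order in which elements move through a sequential pipeline, with and without sorted(). This is only a sketch; StatefulDemo and trace are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original post.
class StatefulDemo {

    // Logs when each element passes filter() and when it reaches forEach().
    static List<String> trace(boolean withSorted) {
        List<String> log = new ArrayList<>();
        Stream<Integer> s = Stream.of(3, 1, 2)
                .filter(n -> { log.add("filter " + n); return true; });
        if (withSorted) {
            // sorted() buffers all elements before releasing any of them.
            s = s.sorted();
        }
        s.forEach(n -> log.add("forEach " + n));
        return log;
    }

    public static void main(String[] args) {
        // Stateless only: each element flows through the whole pipeline in turn.
        System.out.println(trace(false));
        // With sorted(): all filter calls happen before the first forEach call.
        System.out.println(trace(true));
    }
}
```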

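2. Usage

2.1 Terminal operations

2.1.1 Short-circuiting operations

2.1.1.1 Matching

1. find

findFirst: returns the first element of the stream
findAny: returns some element of the stream; which one is not specified (a sequential stream usually returns the first element, but a parallel stream may return any of them)

Both findFirst and findAny return an Optional<T>. It is empty when the stream has no elements, in which case calling get() throws NoSuchElementException.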

public static void find(List<Integer> list){
    System.out.println(list.stream().findFirst().get());
    System.out.println(list.stream().findAny().get());
}
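Because get() fails on an empty Optional, it is often safer to supply a fallback. The sketch below shows two common ways to do that; FindDemo and its method names are illustrative and not taken from the original code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class, not part of the original post.
class FindDemo {

    // Returns the first element, or the fallback when the list is empty.
    static int firstOrDefault(List<Integer> list, int fallback) {
        return list.stream().findFirst().orElse(fallback);
    }

    // Describes the result of findAny without ever calling get().
    static String describeAny(List<Integer> list) {
        Optional<Integer> any = list.stream().findAny();
        return any.map(v -> "found " + v).orElse("empty");
    }

    public static void main(String[] args) {
        System.out.println(firstOrDefault(Arrays.asList(5, 6), -1));
        System.out.println(firstOrDefault(Collections.<Integer>emptyList(), -1));
        System.out.println(describeAny(Collections.<Integer>emptyList()));
    }
}
```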

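2. match

anyMatch: returns true if at least one element satisfies the Predicate
allMatch: returns true only if every element satisfies the Predicate
noneMatch: returns true only if no element satisfies the Predicate

Note: when the list is empty, allMatch and noneMatch both return true, while anyMatch returns false.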

public static void match(List<Integer> list){
    System.out.println(list.stream().anyMatch(item -> item % 2 == 0));
    System.out.println(list.stream().allMatch(item -> item % 2 == 0));
    System.out.println(list.stream().noneMatch(item -> item % 2 == 0));
    // Special case: the list is empty
    List<Integer> items = new ArrayList<>();
    System.out.println(items.stream().anyMatch(item -> item % 2 == 0));
    System.out.println(items.stream().allMatch(item -> item % 2 == 0));
    System.out.println(items.stream().noneMatch(item -> item % 2 == 0));
}

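2.1.2 Non-short-circuiting operations

2.1.2.1 Iteration

The break and continue keywords cannot be used inside stream.forEach, but a return inside the lambda has the same effect as continue: it skips to the next element.
parallelStream.forEachOrdered processes elements in encounter order, at the cost of giving up much of the benefit of parallelism.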

public static void forEach(List<Integer> list){
    list.stream().forEach(item -> System.out.print(item));
    System.out.println();
    list.stream().forEach(item -> {
        if(item % 2 == 0){
            System.out.print(item);
            return ;
        }
    });
    System.out.println();
    list.parallelStream().forEach(item -> System.out.print(item));
    System.out.println();
    list.parallelStream().forEachOrdered(item -> System.out.print(item));
    System.out.println();
}
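The two points above can be checked with results that are easy to compare. This is a small sketch with made-up names (ForEachDemo, evensOnly, orderedParallel). It relies on forEachOrdered running its action for one element at a time in encounter order, which is why a plain StringBuilder is used there.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original post.
class ForEachDemo {

    // return inside the lambda skips the current element, like continue in a loop.
    static String evensOnly(List<Integer> list) {
        StringBuilder sb = new StringBuilder();
        list.stream().forEach(item -> {
            if (item % 2 != 0) {
                return; // move on to the next element
            }
            sb.append(item);
        });
        return sb.toString();
    }

    // forEachOrdered keeps encounter order even on a parallel stream.
    static String orderedParallel(List<Integer> list) {
        StringBuilder sb = new StringBuilder();
        list.parallelStream().forEachOrdered(item -> sb.append(item));
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(evensOnly(list));
        System.out.println(orderedParallel(list));
    }
}
```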

2.1.2.2 Aggregation

max/min/count

private static void aggregation(List<Integer> list){
    // Natural ordering
    System.out.println(list.stream().max(Integer::compareTo).get());
    // Custom ordering; Integer.compare avoids the overflow that o1 - o2 can cause
    Integer max = list.stream().max(new Comparator<Integer>() {
        @Override
        public int compare(Integer o1, Integer o2) {
            return Integer.compare(o1, o2);
        }
    }).get();
    System.out.println(max);

    System.out.println(list.stream().min(Integer::compareTo).get());

    System.out.println(list.stream().count());

}
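A comparator written as o1 - o2 looks harmless but can overflow when the operands are far apart, which makes max return the wrong element. The sketch below (illustrative names only) contrasts it with Integer.compare.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not part of the original post.
class ComparatorOverflowDemo {

    // Integer.MIN_VALUE - 1 wraps around to a positive number,
    // so this comparator wrongly reports MIN_VALUE as the larger value.
    static int maxWithSubtraction(List<Integer> list) {
        Comparator<Integer> bySubtraction = (a, b) -> a - b;
        return list.stream().max(bySubtraction).get();
    }

    // Integer.compare never overflows.
    static int maxWithCompare(List<Integer> list) {
        return list.stream().max(Integer::compare).get();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(Integer.MIN_VALUE, 1);
        System.out.println("subtraction comparator: " + maxWithSubtraction(list));
        System.out.println("Integer.compare: " + maxWithCompare(list));
    }
}
```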

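2.1.2.3 Reduction

reduce

Reduction: combine the elements of a stream into a single value through some computation or logic.

The third parameter (the combiner) is rarely needed. It matters for parallel streams (parallelStream), where it merges the partial results computed for each chunk of the stream into the final result.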

private static void reduce(List<Integer> list){
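    System.out.println("sum: " + list.stream().reduce(0, (a, b) -> a + b));
    System.out.println("product: " + list.stream().reduce(1, (a, b) -> a * b));
    System.out.println("max: " + list.stream().reduce((a, b) -> a > b ? a : b).get());
    // The three-argument form of reduce
    // This prints the maximum: in practice a sequential stream does not call the third argument (the combiner).
    // Note that the identity 0 gives a wrong answer if every element is negative.
    System.out.println("stream max: " + list.stream().reduce(0, (a, b) -> a > b ? a : b,  (a, b) -> null));
    // This typically prints null: each chunk of the parallel stream computes its own maximum, and the
    // combiner then merges the partial results. Returning null is for demonstration only;
    // a real combiner must be consistent with the accumulator (for example Integer::max).
    System.out.println("parallelStream max: " + list.parallelStream().reduce(0, (a, b) -> a > b ? a : b,  (a, b) -> null));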
}
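For comparison, here is a sketch of the three-argument reduce with a combiner that actually merges partial results, so sequential and parallel streams agree. The example (summing string lengths) and the names ReduceCombinerDemo and totalLength are chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original post.
class ReduceCombinerDemo {

    // The accumulator folds one String into the running Integer sum;
    // the combiner adds two partial sums produced by different chunks.
    static int totalLength(List<String> words, boolean parallel) {
        Stream<String> s = parallel ? words.parallelStream() : words.stream();
        return s.reduce(0, (sum, w) -> sum + w.length(), Integer::sum);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        System.out.println("sequential: " + totalLength(words, false));
        System.out.println("parallel: " + totalLength(words, true));
    }
}
```

The three-argument form is needed here because the result type (Integer) differs from the element type (String).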

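2.1.2.4 Collecting

collect
collect is the most versatile operation on a stream: it can gather the stream's data into a single value or into a collection. It relies mainly on the built-in static methods of the java.util.stream.Collectors class.

collect(Collector<? super T, A, R> collector) takes a Collector object, most often one produced by java.util.stream.Collectors, which ships with many static factory methods for specific kinds of collection.
These static methods ultimately rely on the static nested implementation class CollectorImpl<T, A, R>, so it is worth a brief look at CollectorImpl<T, A, R>.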

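// (T: input element type, A: accumulation type, R: final result type)
static class CollectorImpl<T, A, R> implements Collector<T, A, R> {
    // Creates the intermediate result container
    private final Supplier<A> supplier;
    // Accumulation logic: folds one element into the container
    private final BiConsumer<A, T> accumulator;
    // Combiner: merges partial containers (mainly used by parallel streams)
    private final BinaryOperator<A> combiner;
    // Finishing step (for example, joining first collects into a StringBuilder and then
    // returns the result via toString, so there finisher = StringBuilder::toString)
    private final Function<A, R> finisher;
    // Collector characteristics; there are three
    // UNORDERED: the result does not depend on the encounter order of the elements
    // CONCURRENT: the accumulator may be called concurrently from multiple threads on the same container
    // IDENTITY_FINISH: the finisher is the identity function and can be skipped (the accumulated container is the final result)
    private final Set<Characteristics> characteristics;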
}
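The same four parts can be supplied through the public factory method Collector.of, without touching CollectorImpl. The following is a minimal sketch of a comma-joining collector; commaJoining and CustomCollectorDemo are hypothetical names, and the separator logic assumes the elements are non-empty strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical demo class, not part of the original post.
class CustomCollectorDemo {

    // Assumes non-empty elements; an empty element would confuse the separator check.
    static Collector<String, StringBuilder, String> commaJoining() {
        return Collector.of(
                // supplier: creates the intermediate container
                StringBuilder::new,
                // accumulator: adds one element to the container
                (sb, s) -> {
                    if (sb.length() > 0) {
                        sb.append(',');
                    }
                    sb.append(s);
                },
                // combiner: merges two partial containers (parallel streams)
                (left, right) -> {
                    if (left.length() > 0 && right.length() > 0) {
                        left.append(',');
                    }
                    return left.append(right);
                },
                // finisher: converts the container into the final result
                StringBuilder::toString);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tom", "marry", "jack");
        System.out.println(names.stream().collect(commaJoining()));
        System.out.println(names.parallelStream().collect(commaJoining()));
    }
}
```

No characteristics are passed, so the finisher always runs and encounter order is preserved.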

Here are two examples:

// Input type is T, accumulation type is ?, and the final result type is List<T>.
// Assume the call is: students.stream().map(Student::getAge).collect(Collectors.toList())
public static <T> Collector<T, ?, List<T>> toList() {
    return new CollectorImpl<>(
        // Create the intermediate container
        (Supplier<List<T>>) ArrayList::new,
        // Rule for adding an element to the container
        List::add,
        // Merge rule for parallel streams
        (left, right) -> { left.addAll(right); return left; },
        // Collector characteristics
        CH_ID
        );
}

public static Collector<CharSequence, ?, String> joining() {
    return new CollectorImpl<CharSequence, StringBuilder, String>(
            // Create a new StringBuilder as the container
            StringBuilder::new,
            // Rule for adding an element
            StringBuilder::append,
            // Merge rule for parallel streams
            (r1, r2) -> { r1.append(r2); return r1; },
            // Final output rule
            StringBuilder::toString,
            // Collector characteristics
            CH_NOID
        );
}

Collecting into collections: gather the stream's data into a collection (List, Set, Map)

private static void collect(){
    Student tom = Student.builder().age(19).name("tom").number(10).build();
    Student marry = Student.builder().age(12).name("marry").number(15).build();
    Student jack = Student.builder().age(12).name("jack").number(20).build();
    List<Student> students = ListUtil.of(tom, marry, jack);
    System.out.println("======== collect ========");
    // Collect into collections
    System.out.println(students.stream().map(Student::getAge).collect(Collectors.toList()));
    System.out.println(students.stream().map(Student::getAge).collect(Collectors.toSet()));
    // A Map keyed by Student number, with the Student object as the value
    System.out.println(students.stream().collect(Collectors.toMap(Student::getNumber, p -> p)));
    // Function.identity() is equivalent to p -> p: it returns the object itself
    System.out.println(students.stream().collect(Collectors.toMap(Student::getNumber, Function.identity())));
    // A Map keyed by age with the name as the value. marry and jack are both 12, so one key maps to
    // two values; the third argument (the merge function) resolves such key collisions.
    // Simple handling here: on a duplicate key, keep the value that was seen first.
    System.out.println(students.stream().collect(Collectors.toMap(Student::getAge, Student::getName, (first, second) -> first)));
}
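The third argument of Collectors.toMap receives the two colliding values (not the keys) and decides which one survives. Without it, a duplicate key makes toMap throw IllegalStateException. The sketch below uses plain strings keyed by their length rather than the Student class, so it runs without Lombok; all names in it are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original post.
class ToMapMergeDemo {

    // Two-argument toMap: a duplicate key raises IllegalStateException.
    static boolean throwsWithoutMerge(List<String> names) {
        try {
            names.stream().collect(Collectors.toMap(String::length, Function.identity()));
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    // Keep the value that was encountered first for each key.
    static Map<Integer, String> keepFirst(List<String> names) {
        return names.stream().collect(
                Collectors.toMap(String::length, Function.identity(), (first, second) -> first));
    }

    // Keep the value that was encountered last for each key.
    static Map<Integer, String> keepLast(List<String> names) {
        return names.stream().collect(
                Collectors.toMap(String::length, Function.identity(), (first, second) -> second));
    }

    public static void main(String[] args) {
        // "jack" and "mary" both have length 4, so they collide.
        List<String> names = Arrays.asList("tom", "jack", "mary");
        System.out.println(throwsWithoutMerge(names));
        System.out.println(keepFirst(names));
        System.out.println(keepLast(names));
    }
}
```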

Statistics: essentially the aggregation operations above (maximum, minimum, average, count)

System.out.println(students.stream().collect(Collectors.counting()));
System.out.println(students.stream().collect(Collectors.averagingInt(Student::getAge)));
System.out.println(students.stream().map(Student::getAge).collect(Collectors.maxBy(Integer::compareTo)));
// Compute everything in one pass; returns an IntSummaryStatistics object (count, sum, min, average, max)
System.out.println(students.stream().collect(Collectors.summarizingInt(Student::getAge)));
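The individual values can be read back from the IntSummaryStatistics object through its getters, and Collectors.maxBy wraps its result in an Optional because the stream may be empty. The sketch below works on a plain list of ages instead of Student objects; SummaryDemo and its methods are made-up names.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original post.
class SummaryDemo {

    // One pass over the data yields count, sum, min, average, and max.
    static IntSummaryStatistics summarize(List<Integer> ages) {
        return ages.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    // maxBy returns Optional.empty() for an empty stream.
    static Optional<Integer> oldest(List<Integer> ages) {
        return ages.stream().collect(Collectors.maxBy(Integer::compareTo));
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(19, 12, 12);
        IntSummaryStatistics stats = summarize(ages);
        System.out.println("count: " + stats.getCount());
        System.out.println("sum: " + stats.getSum());
        System.out.println("min: " + stats.getMin());
        System.out.println("max: " + stats.getMax());
        System.out.println("average: " + stats.getAverage());
        System.out.println("oldest: " + oldest(ages).orElse(-1));
    }
}
```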

Grouping/partitioning: partitioning splits the stream's data into two groups according to some rule (Map<Boolean, List

xiaocainiaoya's blog
