What Is a Parallel Stream in Java

Parallelism

Parallel computing involves dividing a problem into subproblems, solving those problems simultaneously (in parallel, with each subproblem running in a separate thread), and then combining the results of the solutions to the subproblems. Java SE provides the fork/join framework, which enables you to more easily implement parallel computing in your applications. However, with this framework, you must specify how the problems are subdivided (partitioned). With aggregate operations, the Java runtime performs this partitioning and combining of solutions for you.

One difficulty in implementing parallelism in applications that use collections is that collections are not thread-safe, which means that multiple threads cannot manipulate a collection without introducing thread interference or memory consistency errors. The Collections Framework provides synchronization wrappers, which add automatic synchronization to an arbitrary collection, making it thread-safe. However, synchronization introduces thread contention. You want to avoid thread contention because it prevents threads from running in parallel. Aggregate operations and parallel streams enable you to implement parallelism with non-thread-safe collections provided that you do not modify the collection while you are operating on it.

Note that parallelism is not automatically faster than performing operations serially, although it can be if you have enough data and processor cores. While aggregate operations enable you to more easily implement parallelism, it is still your responsibility to determine if your application is suitable for parallelism.

This section covers the following topics:

  • Executing Streams in Parallel
  • Concurrent Reduction
  • Ordering
  • Side Effects
    • Laziness
    • Interference
    • Stateful Lambda Expressions

    You can find the code excerpts described in this section in the example ParallelismExamples.

    Executing Streams in Parallel

    You can execute streams in serial or in parallel. When a stream executes in parallel, the Java runtime partitions the stream into multiple substreams. Aggregate operations iterate over and process these substreams in parallel and then combine the results.

    When you create a stream, it is always a serial stream unless otherwise specified. To create a parallel stream, invoke the operation Collection.parallelStream. Alternatively, invoke the operation BaseStream.parallel. For example, the following statement calculates the average age of all male members in parallel:

    double average = roster
        .parallelStream()
        .filter(p -> p.getGender() == Person.Sex.MALE)
        .mapToInt(Person::getAge)
        .average()
        .getAsDouble();
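The statement above can be tried in isolation with a minimal sketch like the following. The Person class here is a hypothetical stand-in for the tutorial's class (only the gender and age accessors are modeled), and orElse is used instead of getAsDouble so that an empty result does not throw.

```java
import java.util.Arrays;
import java.util.List;

class ParallelAverageSketch {

    // Hypothetical stand-in for the tutorial's Person class.
    static class Person {
        enum Sex { MALE, FEMALE }

        private final Sex gender;
        private final int age;

        Person(Sex gender, int age) {
            this.gender = gender;
            this.age = age;
        }

        Sex getGender() { return gender; }
        int getAge() { return age; }
    }

    // Averages the ages of male members. The pipeline is the same as a
    // serial one except for the call to parallelStream().
    static double averageMaleAge(List<Person> roster) {
        return roster
            .parallelStream()
            .filter(p -> p.getGender() == Person.Sex.MALE)
            .mapToInt(Person::getAge)
            .average()
            .orElse(Double.NaN); // no male members: return NaN instead of throwing
    }

    public static void main(String[] args) {
        List<Person> roster = Arrays.asList(
            new Person(Person.Sex.MALE, 30),
            new Person(Person.Sex.FEMALE, 25),
            new Person(Person.Sex.MALE, 40));
        System.out.println(averageMaleAge(roster)); // prints 35.0
    }
}
```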

    Concurrent Reduction

    Consider again the following example (which is described in the section Reduction) that groups members by gender. This example invokes the collect operation, which reduces the collection roster into a Map:

    Map<Person.Sex, List<Person>> byGender = roster
        .stream()
        .collect(Collectors.groupingBy(Person::getGender));

    The following is the parallel equivalent:

    ConcurrentMap<Person.Sex, List<Person>> byGender = roster
        .parallelStream()
        .collect(Collectors.groupingByConcurrent(Person::getGender));

    This is called a concurrent reduction. The Java runtime performs a concurrent reduction if all of the following are true for a particular pipeline that contains the collect operation:

    • The stream is parallel.
    • The parameter of the collect operation, the collector, has the characteristic Collector.Characteristics.CONCURRENT. To determine the characteristics of a collector, invoke the Collector.characteristics method.
    • Either the stream is unordered, or the collector has the characteristic Collector.Characteristics.UNORDERED. To ensure that the stream is unordered, invoke the BaseStream.unordered operation.

    Note: This example returns an instance of ConcurrentMap instead of Map and invokes the groupingByConcurrent operation instead of groupingBy. (See the section Concurrent Collections for more information about ConcurrentMap.) Unlike the operation groupingByConcurrent, the operation groupingBy performs poorly with parallel streams. (This is because it operates by merging two maps by key, which is computationally expensive.) Similarly, the operation Collectors.toConcurrentMap performs better with parallel streams than the operation Collectors.toMap.
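As a rough illustration of the conditions listed above, the sketch below groups strings by length instead of Person objects by gender (the word list and the class name are assumptions made for the example) and checks the CONCURRENT characteristic with Collector.characteristics.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ConcurrentGroupingSketch {

    // Groups words by length with a concurrent collector: all threads insert
    // into one shared ConcurrentMap instead of merging per-thread maps.
    static ConcurrentMap<Integer, List<String>> byLength(List<String> words) {
        return words
            .parallelStream()
            .collect(Collectors.groupingByConcurrent(String::length));
    }

    // Reports whether a collector advertises the CONCURRENT characteristic.
    static boolean isConcurrent(Collector<?, ?, ?> collector) {
        return collector.characteristics()
            .contains(Collector.Characteristics.CONCURRENT);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "two", "three", "four", "five");
        // The order of elements inside each group is not guaranteed.
        System.out.println(byLength(words));
        System.out.println(isConcurrent(
            Collectors.<String, Integer>groupingByConcurrent(String::length))); // prints true
        System.out.println(isConcurrent(
            Collectors.<String, Integer>groupingBy(String::length)));           // prints false
    }
}
```

Because groupingByConcurrent is also UNORDERED, the third condition holds without a call to unordered.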

    Ordering

    The order in which a pipeline processes the elements of a stream depends on whether the stream is executed in serial or in parallel, the source of the stream, and intermediate operations. For example, consider the following example that prints the elements of an instance of ArrayList with the forEach operation several times:

    Integer[] intArray = {1, 2, 3, 4, 5, 6, 7, 8};
    List<Integer> listOfIntegers =
        new ArrayList<>(Arrays.asList(intArray));

    System.out.println("listOfIntegers:");
    listOfIntegers
        .stream()
        .forEach(e -> System.out.print(e + " "));
    System.out.println("");

    System.out.println("listOfIntegers sorted in reverse order:");
    Comparator<Integer> normal = Integer::compare;
    Comparator<Integer> reversed = normal.reversed();
    Collections.sort(listOfIntegers, reversed);
    listOfIntegers
        .stream()
        .forEach(e -> System.out.print(e + " "));
    System.out.println("");

    System.out.println("Parallel stream:");
    listOfIntegers
        .parallelStream()
        .forEach(e -> System.out.print(e + " "));
    System.out.println("");

    System.out.println("Another parallel stream:");
    listOfIntegers
        .parallelStream()
        .forEach(e -> System.out.print(e + " "));
    System.out.println("");

    System.out.println("With forEachOrdered:");
    listOfIntegers
        .parallelStream()
        .forEachOrdered(e -> System.out.print(e + " "));
    System.out.println("");

    This example consists of five pipelines. It prints output similar to the following:

    listOfIntegers:
    1 2 3 4 5 6 7 8
    listOfIntegers sorted in reverse order:
    8 7 6 5 4 3 2 1
    Parallel stream:
    3 4 1 6 2 5 7 8
    Another parallel stream:
    6 3 1 5 7 8 4 2
    With forEachOrdered:
    8 7 6 5 4 3 2 1

    This example does the following:

    • The first pipeline prints the elements of the list listOfIntegers in the order that they were added to the list.
    • The second pipeline prints the elements of listOfIntegers after it was sorted by the method Collections.sort .
    • The third and fourth pipelines print the elements of the list in an apparently random order. Remember that stream operations use internal iteration when processing elements of a stream. Consequently, when you execute a stream in parallel, the Java compiler and runtime determine the order in which to process the stream’s elements to maximize the benefits of parallel computing unless otherwise specified by the stream operation.
    • The fifth pipeline uses the method forEachOrdered, which processes the elements of the stream in the order specified by its source, regardless of whether you executed the stream in serial or parallel. Note that you may lose the benefits of parallelism if you use operations like forEachOrdered with parallel streams.
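The difference between forEach and forEachOrdered on a parallel stream can also be observed without printing. The following is a minimal sketch, with hypothetical method names, that records the visiting order in a collection so it can be inspected afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class ForEachOrderSketch {

    // forEachOrdered visits elements in encounter order, one at a time,
    // so appending to a plain ArrayList is safe here.
    static List<Integer> visitOrdered(List<Integer> source) {
        List<Integer> visited = new ArrayList<>();
        source.parallelStream().forEachOrdered(visited::add);
        return visited;
    }

    // forEach on a parallel stream may run on several threads in any order,
    // so a thread-safe queue records the visits.
    static List<Integer> visitUnordered(List<Integer> source) {
        Queue<Integer> visited = new ConcurrentLinkedQueue<>();
        source.parallelStream().forEach(visited::add);
        return new ArrayList<>(visited);
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(8, 7, 6, 5, 4, 3, 2, 1);
        System.out.println(visitOrdered(source));   // prints [8, 7, 6, 5, 4, 3, 2, 1]
        System.out.println(visitUnordered(source)); // same elements, order may vary
    }
}
```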

    Side Effects

    A method or an expression has a side effect if, in addition to returning or producing a value, it also modifies the state of the computer. Examples include mutable reductions (operations that use the collect operation; see the section Reduction for more information) as well as invoking the System.out.println method for debugging. The JDK handles certain side effects in pipelines well. In particular, the collect method is designed to perform the most common stream operations that have side effects in a parallel-safe manner. Operations like forEach and peek are designed for side effects; a lambda expression that returns void, such as one that invokes System.out.println, can do nothing but have side effects. Even so, you should use the forEach and peek operations with care; if you use one of these operations with a parallel stream, then the Java runtime may invoke the lambda expression that you specified as its parameter concurrently from multiple threads. In addition, never pass as parameters lambda expressions that have side effects in operations such as filter and map. The following sections discuss interference and stateful lambda expressions, both of which can be sources of side effects and can return inconsistent or unpredictable results, especially in parallel streams. However, the concept of laziness is discussed first, because it has a direct effect on interference.

    Laziness

    All intermediate operations are lazy. An expression, method, or algorithm is lazy if its value is evaluated only when it is required. (An algorithm is eager if it is evaluated or processed immediately.) Intermediate operations are lazy because they do not start processing the contents of the stream until the terminal operation commences. Processing streams lazily enables the Java compiler and runtime to optimize how they process streams. For example, in a pipeline such as the filter — mapToInt — average example described in the section Aggregate Operations, the average operation could obtain the first several integers from the stream created by the mapToInt operation, which obtains elements from the filter operation. The average operation would repeat this process until it had obtained all required elements from the stream, and then it would calculate the average.
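Laziness can be observed with a small experiment. The sketch below counts how often the filter predicate runs before and after the terminal operation; the counter is a deliberate side effect used only to make the evaluation visible, which is not something to do in production pipelines, and the class and method names are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazinessSketch {

    // Returns {predicate calls before the terminal operation, calls after it}.
    static int[] filterCallsBeforeAndAfter(List<Integer> numbers) {
        AtomicInteger filterCalls = new AtomicInteger();

        // Building the pipeline runs nothing: the filter is only recorded.
        Stream<Integer> evens = numbers.stream()
            .filter(n -> {
                filterCalls.incrementAndGet(); // observation only
                return n % 2 == 0;
            });
        int before = filterCalls.get();

        // The terminal operation pulls every element through the filter.
        evens.collect(Collectors.toList());
        int after = filterCalls.get();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] calls = filterCallsBeforeAndAfter(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println("Before terminal operation: " + calls[0]); // prints 0
        System.out.println("After terminal operation: " + calls[1]);  // prints 5
    }
}
```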

    Interference

    Lambda expressions in stream operations should not interfere. Interference occurs when the source of a stream is modified while a pipeline processes the stream. For example, the following code attempts to concatenate the strings contained in the List listOfStrings. However, it throws a ConcurrentModificationException:

    try {
        List<String> listOfStrings =
            new ArrayList<>(Arrays.asList("one", "two"));

        // This will fail as the peek operation will attempt to add the
        // string "three" to the source after the terminal operation has
        // commenced.

        String concatenatedString = listOfStrings
            .stream()

            // Don't do this! Interference occurs here.
            .peek(s -> listOfStrings.add("three"))

            .reduce((a, b) -> a + " " + b)
            .get();

        System.out.println("Concatenated string: " + concatenatedString);

    } catch (Exception e) {
        System.out.println("Exception caught: " + e.toString());
    }

    This example concatenates the strings contained in listOfStrings into an Optional<String> value with the reduce operation, which is a terminal operation. However, the pipeline here invokes the intermediate operation peek, which attempts to add a new element to listOfStrings. Remember, all intermediate operations are lazy. This means that the pipeline in this example begins execution when the terminal operation reduce is invoked, and ends execution when reduce completes; the subsequent get call only unwraps the resulting Optional. The argument of the peek operation attempts to modify the stream source during the execution of the pipeline, which causes the Java runtime to throw a ConcurrentModificationException.
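One way to avoid the interference is to finish all changes to the source before the terminal operation starts. The following sketch is not part of the original example; the method name and the use of Collectors.joining are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class NonInterferenceSketch {

    // Adds the extra element to a copy of the source before the stream is
    // created, then joins the elements. Nothing touches the source while
    // the pipeline runs, so no ConcurrentModificationException can occur.
    static String concatenate(List<String> source, String extra) {
        List<String> copy = new ArrayList<>(source); // leave the caller's list untouched
        copy.add(extra);
        return copy.stream().collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        List<String> listOfStrings = new ArrayList<>(Arrays.asList("one", "two"));
        System.out.println("Concatenated string: "
            + concatenate(listOfStrings, "three")); // prints Concatenated string: one two three
    }
}
```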

    Stateful Lambda Expressions

    Avoid using stateful lambda expressions as parameters in stream operations. A stateful lambda expression is one whose result depends on any state that might change during the execution of a pipeline. The following example adds elements from the List listOfIntegers to a new List instance with the map intermediate operation. It does this twice, first with a serial stream and then with a parallel stream:

    List<Integer> serialStorage = new ArrayList<>();

    System.out.println("Serial stream:");
    listOfIntegers
        .stream()

        // Don't do this! It uses a stateful lambda expression.
        .map(e -> { serialStorage.add(e); return e; })

        .forEachOrdered(e -> System.out.print(e + " "));
    System.out.println("");

    serialStorage
        .stream()
        .forEachOrdered(e -> System.out.print(e + " "));
    System.out.println("");

    System.out.println("Parallel stream:");
    List<Integer> parallelStorage = Collections.synchronizedList(
        new ArrayList<>());
    listOfIntegers
        .parallelStream()

        // Don't do this! It uses a stateful lambda expression.
        .map(e -> { parallelStorage.add(e); return e; })

        .forEachOrdered(e -> System.out.print(e + " "));
    System.out.println("");

    parallelStorage
        .stream()
        .forEachOrdered(e -> System.out.print(e + " "));
    System.out.println("");

    The lambda expression e -> { parallelStorage.add(e); return e; } is a stateful lambda expression. Its result can vary every time the code is run. This example prints the following:

    Serial stream:
    8 7 6 5 4 3 2 1
    8 7 6 5 4 3 2 1
    Parallel stream:
    8 7 6 5 4 3 2 1
    1 3 6 2 4 5 8 7

    The operation forEachOrdered processes elements in the order specified by the stream, regardless of whether the stream is executed in serial or parallel. However, when a stream is executed in parallel, the map operation processes elements of the stream in an order determined by the Java runtime and compiler. Consequently, the order in which the lambda expression e -> { parallelStorage.add(e); return e; } adds elements to the List parallelStorage can vary every time the code is run. For deterministic and predictable results, ensure that lambda expression parameters in stream operations are not stateful.

    Note: This example invokes the method synchronizedList so that the List parallelStorage is thread-safe. Remember that collections are not thread-safe. This means that multiple threads should not access a particular collection at the same time. Suppose that you do not invoke the method synchronizedList when creating parallelStorage :

    List<Integer> parallelStorage = new ArrayList<>();

    The example behaves erratically because multiple threads access and modify parallelStorage without a mechanism like synchronization to schedule when a particular thread may access the List instance. Consequently, the example could print output similar to the following:

    Parallel stream:
    8 7 6 5 4 3 2 1
    null 3 5 4 7 8 1 2
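A stateless alternative is to let the collect operation build the result instead of adding to an external list inside map. The sketch below is one possible rewrite, not the tutorial's own code; squaring the elements is an arbitrary transformation chosen so the map step does visible work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollectInsteadOfStateSketch {

    // The lambda only depends on its argument, and collect assembles the
    // list in a parallel-safe way that preserves encounter order.
    static List<Integer> squares(List<Integer> source) {
        return source
            .parallelStream()
            .map(e -> e * e)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> listOfIntegers = Arrays.asList(8, 7, 6, 5, 4, 3, 2, 1);
        System.out.println(squares(listOfIntegers)); // prints [64, 49, 36, 25, 16, 9, 4, 1]
    }
}
```

No synchronizedList is needed here because no lambda touches shared mutable state.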

    Parallel Stream in Java

    Parallel Stream is a feature introduced in Java 8 that allows programmers to perform data processing operations in a parallel and concurrent manner using the Stream API.

    In the Stream API, operations such as filtering, mapping, and collecting are normally performed on data elements sequentially. With Parallel Stream, these operations can be performed concurrently across multiple threads, which can improve performance and shorten execution time when the workload is large enough.

    In general, Parallel Stream in Java enables programmers to perform data processing operations in a concurrent and optimized manner using the Stream API.

    As an example, suppose we have a list of integers and we want to calculate the sum of all the integers using Parallel Stream. In this case, we can write the following code:

    List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    int sum = numbers.parallelStream()
        .mapToInt(Integer::intValue)
        .sum();

    In this code, by using parallelStream() instead of stream() , various operations on the list elements are performed concurrently and in parallel across multiple threads. Additionally, mapToInt() and sum() are used to calculate the sum of the integers in the list.
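As a quick check, the same pipeline can be run both ways. The sketch below wraps the code in two hypothetical helper methods; because addition is associative, the sequential and parallel versions return the same sum.

```java
import java.util.Arrays;
import java.util.List;

class ParallelSumSketch {

    static int sumSequential(List<Integer> numbers) {
        return numbers.stream()
            .mapToInt(Integer::intValue)
            .sum();
    }

    static int sumParallel(List<Integer> numbers) {
        return numbers.parallelStream()
            .mapToInt(Integer::intValue)
            .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sumSequential(numbers)); // prints 55
        System.out.println(sumParallel(numbers));   // prints 55
    }
}
```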

    Parallel Stream must be used with care, because incorrect usage can lead to issues such as race conditions and deadlocks. It is important to weigh its advantages and disadvantages before using it.

    The advantages of Parallel Stream in Java include:

    1. High speed: Parallel Stream lets operations on data elements run concurrently across multiple threads, which can shorten execution time when the data set is large enough and enough processor cores are available.

    2. Readable code: Because parallelism is requested with a single call on the Stream API, the code stays declarative and is easier to read than hand-written thread management.

    3. Easy implementation: Using Parallel Stream is very easy and straightforward, and programmers don’t need to write any boilerplate code for performing concurrent operations on data elements.

    However, the disadvantages of Parallel Stream include:

    1. High memory consumption: A parallel stream can use more memory than a sequential one, because the work is split into tasks that run on the threads of the common fork/join pool, and the partial results of those tasks must be held until they are combined.

    2. Non-determinism: The use of Parallel Stream can lead to non-deterministic behavior, meaning that the output of the program may vary with each execution. For example, forEach on a parallel stream may visit the elements in a different order on every run.

    3. Concurrency issues: Incorrect usage of Parallel Stream can lead to concurrency issues such as race conditions and deadlocks. Therefore, programmers should use this feature carefully and pay attention to concurrency issues.
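The non-determinism mentioned in the list above often comes down to which operation is chosen. As a minimal sketch with hypothetical method names: on an ordered source, findFirst returns the first match in encounter order even in parallel, while findAny may return any match.

```java
import java.util.Arrays;
import java.util.List;

class FindFirstVsFindAnySketch {

    // Deterministic: the first even number in encounter order, or -1 if none.
    static int firstEven(List<Integer> numbers) {
        return numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .findFirst()
            .orElse(-1);
    }

    // Possibly non-deterministic: whichever even number a thread finds first.
    static int anyEven(List<Integer> numbers) {
        return numbers.parallelStream()
            .filter(n -> n % 2 == 0)
            .findAny()
            .orElse(-1);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(firstEven(numbers)); // prints 2
        System.out.println(anyEven(numbers));   // 2, 4, or 6; may differ between runs
    }
}
```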

    In general, using Parallel Stream in Java can be very useful and efficient for processing large and complex data sets, but it’s important to be aware of the advantages and disadvantages and use it carefully.

    What is the difference between parallel() and parallelStream()?

    What is the difference between parallel() and parallelStream()? So far neither the documentation nor information on the internet has helped. If it matters, take an ordinary ArrayList with 100,000 elements as an example.

    Asked Dec 9, 2017 at 20:53

    @Suvitruf that is not clear at all, it is all filler without solid arguments. Well, apart from a couple of places))
    Dec 9, 2017 at 23:29

    There is no difference between the methods parallelStream() and stream().parallel() when they are called on collections. In both cases the only thing that happens is that the stream is marked with a flag saying that it may be executed on multiple threads. Whether it actually will be executed on multiple threads (and how many there can be) depends on the implementation of the spliterator from which the stream is created.

    Moreover, this flag can be cleared by calling the sequential() method on the stream, and then the stream will not be parallelized.

    If parallel() or sequential() is called on a stream several times in a row, the call that came last takes effect (that is, whichever flag value remains after the last call).
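The flag described in this answer can be inspected with BaseStream.isParallel. The following is a small sketch (the class and method names are assumptions) that shows both ways of creating a parallel stream from a list and the rule that the last call wins.

```java
import java.util.Arrays;
import java.util.List;

class ParallelFlagSketch {

    static boolean viaParallelStream(List<Integer> list) {
        return list.parallelStream().isParallel();
    }

    static boolean viaParallel(List<Integer> list) {
        return list.stream().parallel().isParallel();
    }

    // The last call clears the flag.
    static boolean parallelThenSequential(List<Integer> list) {
        return list.stream().parallel().sequential().isParallel();
    }

    // The last call sets the flag.
    static boolean sequentialThenParallel(List<Integer> list) {
        return list.stream().sequential().parallel().isParallel();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3);
        System.out.println(viaParallelStream(list));      // prints true
        System.out.println(viaParallel(list));            // prints true
        System.out.println(parallelThenSequential(list)); // prints false
        System.out.println(sequentialThenParallel(list)); // prints true
    }
}
```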

    Parallel Stream in Java

    JavaTpoint

    One of the prominent features of Java 8 (or higher) is the Java parallel stream. It is meant for utilizing multiple cores of the processor. Normally, Java code has a single stream of processing that is executed sequentially. With parallel streams, the work can be divided into several substreams that are executed in parallel on separate cores, and the final result is the combination of the individual results. The order in which they are executed is not under our control. Hence, a parallel stream is recommended when the order in which individual items are processed does not affect the final result.

    Analysis of Parallel Stream

    Parallel streams were introduced to improve the performance of a program. However, applying a parallel stream is not guaranteed to improve the result. For example, when the code must be executed in a certain order, a sequential stream is required instead of a parallel one.

    Different Ways to Create a Parallel Stream

    There are two ways to create a parallel stream, which are listed below and described in the sections that follow:

    1. Using the parallel() method on a stream
    2. Using parallelStream() on a Collection

    Using parallel() method on a stream

    The parallel() method of the BaseStream interface returns an equivalent parallel stream. Let’s understand its working through an example.

    FileName: ParallelStream.java

    Output 1:

    Vestibulum urna lacus, eleifend venenatis ipsum at, venenatis fringilla mauris. Fusce nulla augue, convallis at velit ac, pulvinar convallis eros. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Sed ut ipsum molestie dolor dictum luctus. Maecenas interdum erat feugiat rhoncus mattis. Phasellus facilisis ex non magna faucibus mollis. Vestibulum eu tellus nec lectus rutrum ornare ac tincidunt sem.

    Explanation: In the above program, we create a File object that points to a parallelstream.txt file that is already present in the system. After that, a stream is created that reads from the text file one line at a time. Then we invoke the parallel() method and display the content of parallelstream.txt on the console. Note that the order in which the lines are printed can differ each time the code is executed. When the code is executed again, the following output is displayed on the console.

    Output 2:

    Fusce nulla augue, convallis at velit ac, pulvinar convallis eros. Vestibulum urna lacus, eleifend venenatis ipsum at, venenatis fringilla mauris. Maecenas interdum erat feugiat rhoncus mattis. Phasellus facilisis ex non magna faucibus mollis. Vestibulum eu tellus nec lectus rutrum ornare ac tincidunt sem. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Sed ut ipsum molestie dolor dictum luctus.

    Using parallelStream() on a Collection

    The parallelStream() method is part of the Collection interface and returns a possibly parallel stream with the collection as its source. Its working is explained in the following example.

    FileName: ParallelStream.java

    Output 1:

    Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Sed ut ipsum molestie dolor dictum luctus. Vestibulum urna lacus, eleifend venenatis ipsum at, venenatis fringilla mauris. Maecenas interdum erat feugiat rhoncus mattis. Phasellus facilisis ex non magna faucibus mollis. Fusce nulla augue, convallis at velit ac, pulvinar convallis eros. Vestibulum eu tellus nec lectus rutrum ornare ac tincidunt sem.

    Explanation: In the above-mentioned code, we again use a parallel stream. However, this time the content of the parallelstream.txt file is first read into a List. Hence, the parallelStream() method is used.
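The original listings for ParallelStream.java are not reproduced here, so the following is only a rough sketch of the two approaches under stated assumptions: it uses java.nio.file.Files rather than a File object, writes a small temporary file in place of parallelstream.txt, and collects the lines into a list (which keeps them in file order) instead of printing them with forEach. All names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelFileReadSketch {

    // Approach 1: stream the lines, then switch the stream to parallel().
    static List<String> upperCaseViaParallel(Path file) {
        try (Stream<String> lines = Files.lines(file)) { // closes the file when done
            return lines.parallel()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Approach 2: read the lines into a List, then call parallelStream().
    static List<String> upperCaseViaParallelStream(Path file) {
        try {
            List<String> lines = Files.readAllLines(file);
            return lines.parallelStream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a small temporary file standing in for parallelstream.txt.
    static Path createSampleFile() {
        try {
            Path file = Files.createTempFile("parallelstream", ".txt");
            Files.write(file, Arrays.asList("first line", "second line", "third line"));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = createSampleFile();
        try {
            System.out.println(upperCaseViaParallel(file));
            System.out.println(upperCaseViaParallelStream(file));
        } finally {
            Files.delete(file);
        }
    }
}
```

Replacing collect with forEach(System.out::println) in either method would print the lines in an order that can change between runs, as in the outputs shown in this section.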

    Other Examples of Parallel Execution of Stream

    Let’s see a few more examples of parallel execution of streams.

    FileName: ParallelStream1.java

    Output:

    In Normal 1 2 3 4 5 In Parallel 3 5 4 2 1

    Let’s see another example.

    FileName: ParallelStream2.java

    Output:

    In Normal 1 2 3 4 5 In Parallel 3 5 4 2 1

    Java Program to Check Whether a Stream Is Running in Parallel

    We can also check whether the stream is running in parallel or not.

    FileName: ParallelStream3.java

    Output:

    In Normal The stream is not running parallelly. 1 2 3 4 5 In Parallel The stream is running parallelly. 3 5 4 1 2
