Summary Efficient Parallelization of Ubiquitous Sequential Computation arxiv.org
1,410 words - PDF document - View PDF document
One Line
Parallel computation of the equation x-t = a-t*x-(t-1) + b-t is both efficient and numerically stable, surpassing sequential computation when executed on parallel hardware.
Slides
Slide Presentation (8 slides)
Key Points
- Efficient parallelization of a ubiquitous sequential computation is achieved by finding a succinct expression for computing a sequence in parallel with two prefix sums.
- The computation of n elements on n parallel processors incurs O(log n) time and O(n) space.
- Sequences of the form x_t = a_t x_t-1 + b_t are common in science and engineering.
- The vector log x_t can be computed as a composition of two cumulative sums, each of which is parallelizable.
- Prefix sums are associative and can be efficiently computed in parallel.
- The proposed expression for computing the sequence x_t is implemented in software and tested on parallel hardware.
- The implementation executes faster than sequential computation by a factor of log n n.
- The formulation for computing first-order linear recurrences as a composition of prefix sums is more general, but the proposed formulation is more specific to the most common case of real numbers.
Summaries
19 word summary
Parallel computation of x-t = a-t*x-(t-1) + b-t is efficient and numerically stable, outperforming sequential computation on parallel hardware.
57 word summary
Parallel computation of sequences of the form x_t = a_t*x_{t-1} + b_t can be achieved using two prefix sums, resulting in efficient computation of n elements on n processors with O(log n) time and O(n) space. The author presents a concise and numerically stable expression for parallel computation, which outperforms sequential computation when tested on parallel hardware.
128 word summary
Sequences of the form x t = a t x t?1 + b t can be computed in parallel using two prefix sums, resulting in an efficient computation of n elements on n parallel processors with O(log n) time and O(n) space. The prefix sum operation is associative, making it useful for various applications. Efficient parallel algorithms have been developed for computing the prefix sum of a sequence with n elements, and many software frameworks offer efficient parallel implementations of this operation. The author presents a concise expression for computing the sequence x t = a t x t?1 + b t in parallel with two prefix sums, which is numerically stable and readily implementable. The implementation is tested on parallel hardware and shown to outperform sequential computation.
437 word summary
Sequences of the form x t = a t x t?1 + b t are common in science and engineering. They can model quantities or populations that decay or grow at varying rates between net inflows or outflows at each time step. These sequences are often computed one element at a time, but it is possible to compute all elements in parallel using two prefix sums. The computation of n elements incurs O(log n) time and O(n) space on n parallel processors. The prefix sum operation is associative, allowing for efficient parallel computation.
The computation of the prefix sum of a sequence with n elements has been well-studied and efficient parallel algorithms have been developed. Prefix sums can be used for any binary operation that is associative, making them a useful primitive for many applications. Many software frameworks for numerical computing provide efficient parallel implementations of the prefix sum.
In this paper, the author presents a succinct expression for computing the sequence x t = a t x t?1 + b t in parallel with two prefix sums. The vector log x t can be computed as a composition of two cumulative sums. The computation of n elements on n parallel processors incurs O(log n) time and O(n) space. The author implemented this expression in software, tested it on parallel hardware, and verified that it executes faster than sequential computation by a factor of log n n.
The author compares their formulation to Blelloch's formulation for computing first-order linear recurrences as a composition of prefix sums. Blelloch's formulation is more general, expressed in terms of two binary operators that are associative or can be transformed into an associative one. The author's formulation applies only to real numbers with scalar sum and multiplication as the operators, making each step non-associative. The author finds a concise expression that is numerically stable and readily implementable with widely available implementations of the prefix sum.
The author provides a proof of their expression and shows that it can compute all elements of vector x t efficiently. They also discuss the implementation of their expression in software, including modifications for numerical stability and improved efficiency. The implementation is tested on parallel hardware and shown to execute faster than sequential computation.
In conclusion, the author presents a succinct expression for computing the sequence x t = a t x t?1 + b t in parallel with two prefix sums. This expression allows for efficient parallel computation of sequences that are ubiquitous in science and engineering. The author's implementation of the expression is tested on parallel hardware and shown to be faster than sequential computation.