PPAM'03
- Parallel Processing and Applied Mathematics
Analytical Modeling of Optimized Sparse
Linear Code
Authors: Ivan Šimeček and Pavel Tvrdík
Keywords
High-performance, analytical cache model, sparse
matrix-vector multiplication, cache locality, Conjugate Gradient
algorithm.
Abstract
In this paper, we describe source code transformations based on
sw-pipelining, loop unrolling, and loop fusion for the sparse
matrix-vector multiplication and for the Conjugate Gradient algorithm
that enable data prefetching and overlapping of load and FPU arithmetic
instructions and improve the temporal cache locality. We develop a
probabilistic model for estimation of the numbers of cache misses for 3
types of data caches: direct mapped and s-way set associative with
random and with
LRU replacement strategies. Using HW cache monitoring tools, we compare
the predicted number of cache misses with real numbers on Intel x86
architecture with L1 and L2 caches. The accuracy of our analytical
model is around 97%. The errors in estimations are due to minor
simplifying assumptions in our model.
.
Download:
final version ( in .PDF format)
BibTex entry:
@Article{JA_PPAM,
author = "P. Tvrd\'{\i}k and
I. \v{S}ime\v{c}ek",
title = "Analytical
modeling of sparse linear code",
journal = "PPAM",
volume = "12",
number= "4",
pages = "617-629",
year = "2003",
Address = "Czestochova, Poland"
}