    A(N,N), B(N,N)
    DISTRIBUTE (*,BLOCK) A, B
    do i = 2, N-1
      do j = i, N-1
        A(i, j) = A(i-1, j-1)
      enddo
    enddo
For the code segment above, the previous two examples derived the
computation decomposition (see section Owner Computes Rule) and showed
that communication is needed when Ps > Pr (see section Find Existence of
Communication). This example generates separate loop nests for the
distributed computation loop and for the send and receive instructions.
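To make the communication condition concrete, the following Python sketch enumerates the iteration space for small sample values and records the iterations in which the written column j and the read column j-1 belong to different processors. The helper name `owner` and the sample values of N and blk are assumptions chosen for illustration; the ownership formula follows the block constraint blk*P <= column < blk*P + blk.

```python
# Sketch: which iterations of the loop nest cross a block boundary
# under DISTRIBUTE (*,BLOCK)? N and blk are arbitrary sample values.

def owner(col, blk):
    """Processor P with blk*P <= col < blk*P + blk (floor division)."""
    return col // blk

def crossing_iterations(N, blk):
    """Return (i, j, Ps, Pr) for iterations where the two owners differ."""
    result = []
    for i in range(2, N):           # 2 <= i <= N-1
        for j in range(i, N):       # i <= j <= N-1
            Ps = owner(j, blk)      # owner of the written column j
            Pr = owner(j - 1, blk)  # owner of the read column j-1
            if Ps != Pr:
                result.append((i, j, Ps, Pr))
    return result

if __name__ == "__main__":
    for i, j, Ps, Pr in crossing_iterations(N=10, blk=3):
        print(f"i={i} j={j} Ps={Ps} Pr={Pr}")
```

In every crossing iteration Ps = Pr + 1 and j is a multiple of blk, which agrees with the condition Ps > Pr.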
#iter = [2 <= i <= N - 1
i <= j <= N - 1]
#dists = [ blk*Ps <= bsy < blk*Ps + blk ]
#distr = [ blk*Pr <= bry < blk*Pr + blk ]
#lhs = [bsx = i
bsy = j ]
#rhs = [brx = i
bry = j - 1 ]
#loop = { iter, dists, lhs }
#loop = loop.order(N Ps i j bsx bsy)
#loop = loop.rename(N P i j bx by)
loop.code(2)
#comm = { iter, dists, distr, lhs, rhs
[ Ps > Pr ] }
#comms = comm.order(N Ps i Pr j bsx bsy brx bry)
#comms = comms.rename(N P i Pr j bx by brx bry)
comms.code(2)
#commr = comm.order(N Pr i Ps j brx bry bsx bsy)
#commr = commr.rename(N P i Ps j bx by bsx bsy)
commr.code(2)
end
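The constraint sets in the script can be read as ordinary predicates over integer tuples. As a rough illustration, the Python sketch below writes the `loop` set (the conjunction of `iter`, `dists` and `lhs`) as a brute-force enumeration and groups its points by processor. The function names, the search bounds, and the sample values of N and blk are assumptions made for this sketch; it does not reproduce how lic itself represents or computes the set.

```python
from collections import defaultdict

def in_iter(i, j, N):
    """#iter = [2 <= i <= N - 1, i <= j <= N - 1]"""
    return 2 <= i <= N - 1 and i <= j <= N - 1

def in_dists(Ps, bsy, blk):
    """#dists = [ blk*Ps <= bsy < blk*Ps + blk ]"""
    return blk * Ps <= bsy < blk * Ps + blk

def loop_points(N, blk):
    """All (Ps, i, j, bsx, bsy) in { iter, dists, lhs }, by brute force."""
    points = []
    for Ps in range(0, N):            # loose bound: bsy <= N-1 implies Ps < N
        for i in range(0, N + 1):
            for j in range(0, N + 1):
                bsx, bsy = i, j       # #lhs = [bsx = i, bsy = j]
                if in_iter(i, j, N) and in_dists(Ps, bsy, blk):
                    points.append((Ps, i, j, bsx, bsy))
    return points

def iterations_per_processor(N, blk):
    """Map each processor Ps to the (i, j) iterations it executes."""
    work = defaultdict(list)
    for Ps, i, j, _, _ in loop_points(N, blk):
        work[Ps].append((i, j))
    return dict(work)

if __name__ == "__main__":
    for P, its in sorted(iterations_per_processor(N=8, blk=3).items()):
        print(P, its)
```

Each iteration (i, j) lands on exactly one processor, because the block constraint determines Ps uniquely from j.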
The printout of the example session:
csh> lic -c
Rapid Prototyping System for Code Generation
> < example5
#iter = [2 <= i <= N - 1
i <= j <= N - 1]
#dists = [ blk*Ps <= bsy < blk*Ps + blk ]
#distr = [ blk*Pr <= bry < blk*Pr + blk ]
#lhs = [bsx = i
bsy = j ]
#rhs = [brx = i
bry = j - 1 ]
#loop = { iter, dists, lhs }
#loop = loop.order(N Ps i j bsx bsy)
#loop = loop.rename(N P i j bx by)
loop.code(2)
for(P = 2/blk; P <= (-1+N)/blk; P++)
  for(i = 2; i <= min(-1+N, -1+blk+blk*P); i++)
    for(j = max(blk*P, i); j <= min(-1+N, -1+blk+blk*P); j++) {
      bx = i;
      by = j;
    }
#comm = { iter, dists, distr, lhs, rhs
[ Ps > Pr ] }
#comms = comm.order(N Ps i Pr j bsx bsy brx bry)
#comms = comms.rename(N P i Pr j bx by brx bry)
comms.code(2)
for(P = (1+blk)/blk; P <= (-1+N)/blk; P++)
  for(i = 2; i <= blk*P; i++) {
    Pr = -1+P;
    j = blk+blk*Pr;
    bx = i;
    by = j;
    brx = i;
    bry = -1+j;
  }
#commr = comm.order(N Pr i Ps j brx bry bsx bsy)
#commr = commr.rename(N P i Ps j bx by bsx bsy)
commr.code(2)
for(P = 1/blk; P <= (-1-blk+N)/blk; P++)
  for(i = 2; i <= blk+blk*P; i++) {
    Ps = 1+P;
    j = blk*Ps;
    bx = i;
    by = -1+blk*Ps;
    bsx = bx;
    bsy = blk*Ps;
  }
end
> quit
done(0)
csh>
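As a sanity check on the printout, the three generated loop nests can be ported to Python and compared against a brute-force enumeration of the constraints. The sketch below is only an illustration: the function names are hypothetical, and it assumes that `/` in the generated bounds denotes floor division (Python `//`), which coincides with C integer division whenever the operands are non-negative.

```python
def gen_loop(N, blk):
    """Port of the computation loop nest: yields (P, i, j)."""
    out = []
    for P in range(2 // blk, (N - 1) // blk + 1):
        hi = min(N - 1, blk * P + blk - 1)
        for i in range(2, hi + 1):
            for j in range(max(blk * P, i), hi + 1):
                out.append((P, i, j))
    return out

def gen_send(N, blk):
    """Port of the comms loop nest: yields (P, i, Pr, j) with P = Ps."""
    out = []
    for P in range((1 + blk) // blk, (N - 1) // blk + 1):
        for i in range(2, blk * P + 1):
            Pr = P - 1
            j = blk + blk * Pr
            out.append((P, i, Pr, j))
    return out

def gen_recv(N, blk):
    """Port of the commr loop nest: yields (P, i, Ps, j) with P = Pr."""
    out = []
    for P in range(1 // blk, (N - 1 - blk) // blk + 1):
        for i in range(2, blk + blk * P + 1):
            Ps = P + 1
            j = blk * Ps
            out.append((P, i, Ps, j))
    return out

def brute(N, blk):
    """Enumerate the loop and comm sets directly from the constraints."""
    loop, comm = set(), set()
    for i in range(2, N):
        for j in range(i, N):
            Ps, Pr = j // blk, (j - 1) // blk  # owners of columns j and j-1
            loop.add((Ps, i, j))
            if Ps > Pr:
                comm.add((Ps, Pr, i, j))
    return loop, comm

def check(N, blk):
    """True if all three loop nests visit each point exactly once."""
    loop, comm = brute(N, blk)
    ok_loop = sorted(gen_loop(N, blk)) == sorted(loop)
    ok_send = sorted((P, Pr, i, j) for P, i, Pr, j in gen_send(N, blk)) == sorted(comm)
    ok_recv = sorted((Ps, P, i, j) for P, i, Ps, j in gen_recv(N, blk)) == sorted(comm)
    return ok_loop and ok_send and ok_recv

if __name__ == "__main__":
    print(all(check(N, blk) for N in range(2, 13) for blk in range(1, 6)))
```

Comparing sorted lists rather than sets also catches a point that would be visited twice. The send and receive nests enumerate the same communication set, once ordered by Ps and once by Pr.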