% $OpenXM: OpenXM/doc/issac2000/homogeneous-network.tex,v 1.2 2000/01/02 07:32:12 takayama Exp $
\section{Applications}

\subsection{Distributed computation with homogeneous servers}

OpenXM also aims at speedup through distributed computation
with homogeneous servers. Since the current specification of
OpenXM does not include communication between servers, one
cannot expect the maximal parallel speedup. However, several
types of distributed computation can be executed, as the
following examples show.
\subsubsection{Product of univariate polynomials}

Shoup \cite{Shoup} showed that the product of univariate
polynomials with large degrees and large coefficients can be
computed efficiently by FFTs over small finite fields combined
with the Chinese remainder theorem. The method is easily
parallelized:
\begin{tabbing}
Input :\= $f_1, f_2 \in Z[x]$\\
\> such that $\deg(f_1), \deg(f_2) < 2^M$\\
Output : $f = f_1f_2$\\
$P \leftarrow$ \= $\{m_1,\cdots,m_N\}$ where $m_i$ is a prime,\\
\> $2^{M+1}|m_i-1$ and $m=\prod m_i$ is sufficiently large.\\
Separate $P$ into disjoint subsets $P_1, \cdots, P_L$.\\
for \= $j=1$ to $L$ $M_j \leftarrow \prod_{m_i\in P_j} m_i$\\
Compute $F_j$ such that $F_j \equiv f_1f_2 \bmod M_j$\\
\> and $F_j \equiv 0 \bmod m/M_j$ in parallel.\\
\> ($f_1, f_2$ are regarded as integral.\\
\> The product is computed by FFT.)\\
return $\phi_m(\sum F_j)$\\
(For $a \in Z$, $\phi_m(a) \in (-m/2,m/2)$ and $\phi_m(a)\equiv a \bmod m$.)
\end{tabbing}
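
To make the recombination step concrete, here is a small
sketch in Python (an illustration, not the {\tt Risa/Asir}
implementation): the per-modulus products, which the servers
compute by FFTs over small finite fields, are replaced by
naive modular multiplication, and each subset $P_j$ contains a
single prime. Each $F_j$ is congruent to $f_1f_2$ modulo $M_j$
and to $0$ modulo $m/M_j$, so $\phi_m(\sum F_j)$ recovers the
integer product once $m$ exceeds twice the largest coefficient
magnitude.

\begin{verbatim}
from math import prod

def poly_mul_mod(f1, f2, mod):
    # naive product of coefficient lists modulo mod;
    # the servers would use an FFT here instead
    res = [0] * (len(f1) + len(f2) - 1)
    for i, a in enumerate(f1):
        for j, b in enumerate(f2):
            res[i + j] = (res[i + j] + a * b) % mod
    return res

def phi(a, m):
    # phi_m(a): representative of a mod m in (-m/2, m/2)
    a %= m
    return a - m if a > m // 2 else a

def crt_product(f1, f2, moduli):
    m = prod(moduli)
    total = [0] * (len(f1) + len(f2) - 1)
    for Mj in moduli:  # each pass would run on a server
        cj = m // Mj
        # ej is 1 mod Mj and 0 mod m/Mj (Python 3.8+)
        ej = cj * pow(cj, -1, Mj)
        Fj = poly_mul_mod(f1, f2, Mj)
        total = [t + c * ej for t, c in zip(total, Fj)]
    return [phi(c, m) for c in total]

# coefficient lists in ascending order of degree:
# (3 - 2x + 5x^2)(7 + x) -> [21, -11, 33, 5]
print(crt_product([3, -2, 5], [7, 1], [101, 103, 107]))
\end{verbatim}
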
Figure \ref{speedup} shows the speedup factor of this
distributed computation on {\tt Risa/Asir}. For each $n$, two
polynomials of degree $n$ with 3000-bit coefficients are
generated and their product is computed. The machine is a
Fujitsu AP3000, a cluster of Sun workstations connected by a
high-speed network, and MPI over this network is used to
implement OpenXM.
\begin{figure}[htbp]
\epsfxsize=8.5cm
\epsffile{speedup.ps}
\caption{Speedup factor}
\label{speedup}
\end{figure}
The task of the client is the generation and partition of $P$,
the sending and receiving of polynomials, and the synthesis of
the result. If the number of servers is $L$ and the inputs are
fixed, then the time to compute the $F_j$ in parallel is
proportional to $1/L$, whereas the time for sending and
receiving the polynomials is proportional to $L$, because
broadcast and reduce operations are not available. Therefore
the speedup is limited, and its upper bound depends on the
communication cost and the degree of the inputs. Figure
\ref{speedup} shows that the speedup is satisfactory if the
degree is large and the number of servers is not large, say up
to 10.
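
This tradeoff admits a rough cost model (an illustrative
estimate, not a fit to the measured data): if one round of
communication with a single server costs time $a$ and the
whole product costs time $b$ on one server, the elapsed time
with $L$ servers is approximately
\[
T(L) = aL + \frac{b}{L},
\]
which is minimized at $L = \sqrt{b/a}$. Adding servers beyond
this point slows the computation down, which is consistent
with the saturation observed in Figure \ref{speedup}.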
\subsubsection{Order counting of an elliptic curve}

\subsubsection{Gr\"obner basis computation by various methods}
Singular \cite{Singular} implements the {\tt MP} interface for
distributed computation, and a competitive Gr\"obner basis
computation is illustrated as an example of distributed
computation. However, interruption has not been implemented
yet, and the losing process has to be killed explicitly. As
stated in Section \ref{secsession}, OpenXM provides such a
function, so one can safely reset a server and continue to use
it. Furthermore, if a client provides synchronous I/O
multiplexing by {\tt select()}, polling is not necessary. The
following {\tt Risa/Asir} function computes a Gr\"obner basis
by starting the computations simultaneously from the
homogenized input and the input itself. The client watches the
streams by {\tt ox\_select()}, takes the result that is
returned first, and then resets the remaining server.
\begin{verbatim}
/* G:set of polys; V:list of variables */
/* O:type of order; P0,P1: id's of servers */
def dgr(G,V,O,P0,P1)
{
  P = [P0,P1];       /* server list */
  map(ox_reset,P);   /* reset servers */
  /* P0 executes non-homogenized computation */
  ox_cmo_rpc(P0,"dp_gr_main",G,V,0,1,O);
  /* P1 executes homogenized computation */
  ox_cmo_rpc(P1,"dp_gr_main",G,V,1,1,O);
  map(ox_push_cmd,P,262); /* 262 = OX_popCMO */
  F = ox_select(P);  /* wait for data */
  /* F[0] is a server's id which is ready */
  R = ox_get(F[0]);
  if ( F[0] == P0 ) {
    Win = "nonhomo"; Lose = P1;
  } else {
    Win = "homo"; Lose = P0;
  }
  ox_reset(Lose);    /* reset the loser */
  return [Win,R];
}
\end{verbatim}
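
Racing the homogenized and the non-homogenized computations is
a natural strategy here: homogenization sometimes speeds up a
Gr\"obner basis computation considerably and sometimes slows
it down, and it is hard to predict which will happen, so
running both and resetting the loser yields the better of the
two times at the cost of one extra server. The same pattern
can be sketched generically in Python (an illustration only;
{\tt solve\_homo} and {\tt solve\_nonhomo} are hypothetical
stand-ins for the two server computations):

\begin{verbatim}
import multiprocessing as mp

def _run(fn, data, q, tag):
    # worker: compute and report a tagged result
    q.put((tag, fn(data)))

def race(solve_homo, solve_nonhomo, data):
    q = mp.Queue()
    ps = [mp.Process(target=_run,
                     args=(fn, data, q, tag))
          for fn, tag in [(solve_nonhomo, "nonhomo"),
                          (solve_homo, "homo")]]
    for p in ps:
        p.start()
    tag, result = q.get()  # first finisher wins,
                           # like ox_select()
    for p in ps:
        p.terminate()      # like ox_reset on the loser
    return [tag, result]
\end{verbatim}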