[BACK]Return to projects.html CVS log [TXT][DIR] Up to [local] / OpenXM_contrib / gmp / doc

Diff for /OpenXM_contrib/gmp/doc/Attic/projects.html between version 1.1 and 1.1.1.2

version 1.1, 2000/09/09 14:12:20 version 1.1.1.2, 2003/08/25 16:06:11
Line 13 
Line 13 
   </h1>    </h1>
 </center>  </center>
   
   <font size=-1>
   Copyright 2000, 2001, 2002 Free Software Foundation, Inc. <br><br>
   This file is part of the GNU MP Library. <br><br>
   The GNU MP Library is free software; you can redistribute it and/or modify
   it under the terms of the GNU Lesser General Public License as published
   by the Free Software Foundation; either version 2.1 of the License, or (at
   your option) any later version. <br><br>
   The GNU MP Library is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
   License for more details. <br><br>
   You should have received a copy of the GNU Lesser General Public License
   along with the GNU MP Library; see the file COPYING.LIB.  If not, write to
   the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston,
   MA 02111-1307, USA.
   </font>
   
   <hr>
   <!-- NB. timestamp updated automatically by emacs -->
 <comment>  <comment>
   An up-to-date version of this file is available at    This file current as of 14 May 2002.  An up-to-date version is available at
   <a href="http://www.swox.com/gmp/projects.html">http://www.swox.com/gmp/projects.html</a>.    <a href="http://www.swox.com/gmp/projects.html">http://www.swox.com/gmp/projects.html</a>.
     Please send comments about this page to
     <a href="mailto:bug-gmp@gnu.org">bug-gmp@gnu.org</a>.
 </comment>  </comment>
   
 <p> This file lists projects suitable for volunteers.  Please see the  <p> This file lists projects suitable for volunteers.  Please see the
Line 28 
Line 49 
     problems.)      problems.)
   
 <ul>  <ul>
 <li> <strong>C++ wrapper</strong>  <li> <strong>Faster multiplication</strong>
   
   <p> A C++ wrapper for GMP is highly desirable.  Several people have started  
   writing one, but nobody has so far finished it.  For a wrapper to be useful,  
   one needs to pay attention to efficiency.  
   
   <ol>  
     <li> Write arithmetic functions for direct applications of most  
          elementary C++ types.  This might be necessary to avoid  
          ambiguities, but it is also desirable from an efficiency  
          standpoint.  
     <li> Avoid running constructors/destructors when not necessary.  
          For example, implement <code>a&nbsp+=&nbsp;b</code> by directly applying mpz_add.  
   </ol>  
   
 <p> <li> <strong>Faster multiplication</strong>  
   
   <p> The current multiplication code uses Karatsuba, 3-way Toom-Cook,    <p> The current multiplication code uses Karatsuba, 3-way Toom-Cook,
       or Fermat FFT.  Several new developments are desirable:        or Fermat FFT.  Several new developments are desirable:
   
Line 62 
Line 68 
     <li> It's possible CPU dependent effects like cache locality will      <li> It's possible CPU dependent effects like cache locality will
          have a greater impact on speed than algorithmic improvements.           have a greater impact on speed than algorithmic improvements.
   
       <li> Add support for partial products, either a given number of low limbs
            or high limbs of the result.  A high partial product can be used by
            <code>mpf_mul</code> now, a low half partial product might be of use
            in a future sub-quadratic REDC.  On small sizes a partial product
            will be faster simply through fewer cross-products, similar to the
            way squaring is faster.  But work by Thom Mulders shows that for
            Karatsuba and higher order algorithms the advantage is progressively
            lost, so for large sizes partial products turn out to be no faster.
   
   </ol>    </ol>
   
     <p> Another possibility would be an optimized cube.  In the basecase that
         should definitely be able to save cross-products in a similar fashion to
         squaring, but some investigation might be needed for how best to adapt
         the higher-order algorithms.  Not sure whether cubing or further small
         powers have any particularly important uses though.
   
 <p> <li> <strong>Assembly routines</strong>  <p> <li> <strong>Assembly routines</strong>
   
   <p> Write new and tune existing assembly routines.  The programs in mpn/tests    <p> Write new and improve existing assembly routines.  The tests/devel
   and the tune/speed.c program are useful for testing and timing the routines    programs and the tune/speed.c and tune/many.pl programs are useful for
   you write.  See the README files in those directories for more information.    testing and timing the routines you write.  See the README files in those
     directories for more information.
   
   <p> Please make sure your new routines are fast for these three situations:    <p> Please make sure your new routines are fast for these three situations:
   <ol>    <ol>
Line 106 
Line 127 
   current code can mostly be reused.  It should be possible to share code    current code can mostly be reused.  It should be possible to share code
   between GCD and GCDEXT, and probably with Jacobi symbols too.    between GCD and GCDEXT, and probably with Jacobi symbols too.
   
     <p> Paul Zimmermann has worked on sub-quadratic GCD and GCDEXT, but it seems
     that the most likely algorithm doesn't kick in until about 3000 limbs.
   
 <p> <li> <strong>Math functions for the mpf layer</strong>  <p> <li> <strong>Math functions for the mpf layer</strong>
   
   <p> Implement the functions of math.h for the GMP mpf layer!  Check the book    <p> Implement the functions of math.h for the GMP mpf layer!  Check the book
Line 113 
Line 137 
   functions are desirable: acos, acosh, asin, asinh, atan, atanh, atan2, cos,    functions are desirable: acos, acosh, asin, asinh, atan, atanh, atan2, cos,
   cosh, exp, log, log10, pow, sin, sinh, tan, tanh.    cosh, exp, log, log10, pow, sin, sinh, tan, tanh.
   
     <p> These are implemented in mpfr, redoing them in mpf might not be worth
     bothering with, if the long term plan is to bring mpfr in as the new mpf.
   
 <p> <li> <strong>Faster sqrt</strong>  <p> <li> <strong>Faster sqrt</strong>
   
   <p> The current code for computing square roots use a Newton iteration that    <p> The current code uses divisions, which are reasonably fast, but it'd be
   rely on division.  It is possible to avoid using division by computing    possible to use only multiplications by computing 1/sqrt(A) using this
   1/sqrt(A) using this formula:    formula:
   <pre>    <pre>
                                     2                                      2
                    x   = x  (3 - A x )/2.                     x   = x  (3 - A x )/2.
                     i+1   i         i  </pre>                      i+1   i         i  </pre>
   The final result is then computed without division using,    And the final result
   <pre>    <pre>
                      sqrt(A) = A x .                       sqrt(A) = A x .
                                   n  </pre>                                    n  </pre>
   The resulting code should be substantially faster than the current code.    That final multiply might be the full size of the input (though it might
     only need the high half of that), so there may or may not be any speedup
     overall.
   
 <p> <li> <strong>Nth root</strong>  <p> <li> <strong>Nth root</strong>
   
Line 137 
Line 166 
   <p> If the routine becomes fast enough, perhaps square roots could be computed    <p> If the routine becomes fast enough, perhaps square roots could be computed
   using this function.    using this function.
   
 <p> <li> <strong>More random number generators</strong>  <p> <li> <strong>Quotient-Only Division</strong>
   
   <p> Implement some more pseudo random number generator algorithms.    <p> Some work can be saved when only the quotient is required in a division,
       Today there's only Linear Congruential.        basically the necessary correction -0, -1 or -2 to the estimated
         quotient can almost always be determined from only a few limbs of
         multiply and subtract, rather than forming a complete remainder.  The
         greatest savings are when the quotient is small compared to the dividend
         and divisor.
   
     <p> Some code along these lines can be found in the current
         <code>mpn_tdiv_qr</code>, though perhaps calculating bigger chunks of
         remainder than might be strictly necessary.  That function in its
         current form actually then always goes on to calculate a full remainder.
         Burnikel and Zeigler describe a similar approach for the divide and
         conquer case.
   
   <p> <li> <strong>Sub-Quadratic REDC and Exact Division</strong>
   
     <p> <code>mpn_bdivmod</code> and the <code>redc</code> in
         <code>mpz_powm</code> should use some sort of divide and conquer
         algorithm.  This would benefit <code>mpz_divexact</code>, and
         <code>mpn_gcd</code> on large unequal size operands.  See "Exact
         Division with Karatsuba Complexity" by Jebelean for a (brief)
         description.
   
     <p> Failing that, some sort of <code>DIVEXACT_THRESHOLD</code> could be
         added to control whether <code>mpz_divexact</code> uses
         <code>mpn_bdivmod</code> or <code>mpn_tdiv_qr</code>, since the latter
         is faster on large divisors.
   
     <p> For the REDC, basically all that's needed is Montgomery's algorithm done
         in multi-limb integers.  R is multiple limbs, and the inverse and
         operands are multi-precision.
   
     <p> For exact division the time to calculate a multi-limb inverse is not
         amortized across many modular operations, but instead will probably
         create a threshold below which the current style
         <code>mpn_bdivmod</code> is best.  There's also Krandick and Jebelean,
         "Bidirectional Exact Integer Division" to basically use a low to high
         exact division for the low half quotient, and a quotient-only division
         for the high half.
   
     <p> It will be noted that low-half and high-half multiplies, and a low-half
         square, can be used.  These ought to each take as little as half the
         time of a full multiply, or square, though work by Thom Mulders shows
         the advantage is progressively lost as Karatsuba and higher algorithms
         are applied.
   
   <p> <li> <strong>Exceptions</strong>
   
     <p> Some sort of scheme for exceptions handling would be desirable.
         Presently the only thing documented is that divide by zero in GMP
         functions provokes a deliberate machine divide by zero (on those systems
         where such a thing exists at least).  The global <code>gmp_errno</code>
         is not actually documented, except for the old <code>gmp_randinit</code>
         function.  Being currently just a plain global means it's not
         thread-safe.
   
     <p> The basic choices for exceptions are returning an error code or having
         a handler function to be called.  The disadvantage of error returns is
         they have to be checked, leading to tedious and rarely executed code,
         and strictly speaking such a scheme wouldn't be source or binary
         compatible.  The disadvantage of a handler function is that a
         <code>longjmp</code> or similar recovery from it may be difficult.  A
         combination would be possible, for instance by allowing the handler to
         return an error code.
   
     <p> Divide-by-zero, sqrt-of-negative, and similar operand range errors can
         normally be detected at the start of functions, so exception handling
         would have a clean state.  What's worth considering though is that the
         GMP function detecting the exception may have been called via some third
         party library or self contained application module, and hence have
         various bits of state to be cleaned up above it.  It'd be highly
         desirable for an exceptions scheme to allow for such cleanups.
   
     <p> The C++ destructor mechanism could help with cleanups both internally
         and externally, but being a plain C library we don't want to depend on
         that.
   
     <p> A C++ <code>throw</code> might be a good optional extra exceptions
         mechanism, perhaps under a build option.  For GCC
         <code>-fexceptions</code> will add the necessary frame information to
         plain C code, or GMP could be compiled as C++.
   
     <p> Out-of-memory exceptions are expected to be handled by the
         <code>mp_set_memory_functions</code> routines, rather than being a
         prospective part of divide-by-zero etc.  Some similar considerations
         apply but what differs is that out-of-memory can arise deep within GMP
         internals.  Even fundamental routines like <code>mpn_add_n</code> and
         <code>mpn_addmul_1</code> can use temporary memory (for instance on Cray
         vector systems).  Allowing for an error code return would require an
         awful lot of checking internally.  Perhaps it'd still be worthwhile, but
         it'd be a lot of changes and the extra code would probably be rather
         rarely executed in normal usages.
   
     <p> A <code>longjmp</code> recovery for out-of-memory will currently, in
         general, lead to memory leaks and may leave GMP variables operated on in
         inconsistent states.  Maybe it'd be possible to record recovery
         information for use by the relevant allocate or reallocate function, but
         that too would be a lot of changes.
   
     <p> One scheme for out-of-memory would be to note that all GMP allocations
         go through the <code>mp_set_memory_functions</code> routines.  So if the
         application has an intended <code>setjmp</code> recovery point it can
         record memory activity by GMP and abandon space allocated and variables
         initialized after that point.  This might be as simple as directing the
         allocation functions to a separate pool, but in general would have the
         disadvantage of needing application-level bookkeeping on top of the
         normal system <code>malloc</code>.  An advantage however is that it
         needs nothing from GMP itself and on that basis doesn't burden
         applications not needing recovery.  Note that there's probably some
         details to be worked out here about reallocs of existing variables, and
         perhaps about copying or swapping between "permanent" and "temporary"
         variables.
   
     <p> Applications desiring a fine-grained error control, for instance a
         language interpreter, would very possibly not be well served by a scheme
         requiring <code>longjmp</code>.  Wrapping every GMP function call with a
         <code>setjmp</code> would be very inconvenient.
   
     <p> Stack overflow is another possible exception, but perhaps not one that
         can be easily detected in general.  On i386 GNU/Linux for instance GCC
         normally doesn't generate stack probes for an <code>alloca</code>, but
         merely adjusts <code>%esp</code>.  A big enough <code>alloca</code> can
         miss the stack redzone and hit arbitrary data.  GMP stack usage is
         normally a function of operand size, knowing that might suffice for some
         applications.  Otherwise a fixed maximum usage can probably be obtained
         by building with <code>--enable-alloca=malloc-reentrant</code> (or
         <code>notreentrant</code>).
   
     <p> Actually recovering from stack overflow is of course another problem.
         It might be possible to catch a <code>SIGSEGV</code> in the stack
         redzone and do something in a <code>sigaltstack</code>, on systems which
         have that, but recovery might otherwise not be possible.  This is worth
         bearing in mind because there's no point worrying about tight and
         careful out-of-memory recovery if an out-of-stack is fatal.
   
   
 <p> <li> <strong>Test Suite</strong>  <p> <li> <strong>Test Suite</strong>
   
   <p> Add a test suite for old bugs.  These tests wouldn't loop or use    <p> Add a test suite for old bugs.  These tests wouldn't loop or use
Line 172 
Line 333 
       seeds used, and perhaps to snapshot operands before performing        seeds used, and perhaps to snapshot operands before performing
       each test, so any problem exposed can be reproduced.        each test, so any problem exposed can be reproduced.
   
 </ul>  
   
 <hr>  <p> <li> <strong>Performance Tool</strong>
   
 <table width="100%">    <p> It'd be nice to have some sort of tool for getting an overview of
   <tr>        performance.  Clearly a great many things could be done, but some
     <td>        primary uses would be,
       <font size=2>  
       Please send comments about this page to  
       <a href="mailto:tege@swox.com">tege@swox.com</a>.<br>  
       Copyright (C) 1999, 2000 Torbjörn Granlund.  
       </font>  
     </td>  
     <td align=right>  
     </td>  
   </tr>  
 </table>  
   
         <ol>
           <li> Checking speed variations between compilers.
           <li> Checking relative performance between systems or CPUs.
         </ol>
   
     <p> A combination of measuring some fundamental routines and some
         representative application routines might satisfy these.
   
     <p> The tune/time.c routines would be the easiest way to get good
         accurate measurements on lots of different systems.  The high level
         <code>speed_measure</code> may or may not suit, but the basic
         <code>speed_starttime</code> and <code>speed_endtime</code> would cover
         lots of portability and accuracy questions.
   
   
   </ul>
   <hr>
   
 </body>  </body>
 </html>  </html>
   
   <!--
   Local variables:
   eval: (add-hook 'write-file-hooks 'time-stamp)
   time-stamp-start: "This file current as of "
   time-stamp-format: "%:d %3b %:y"
   time-stamp-end: "\\."
   time-stamp-line-limit: 50
   End:
   -->

Legend:
Removed from v.1.1  
changed lines
  Added in v.1.1.1.2

FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>