Finalized documentation for SoftFloat Release 3.

John Hauser 2014-12-17 19:08:03 -08:00
parent 437d9b9fb2
commit 7276b0022e
6 changed files with 362 additions and 320 deletions


@ -2,7 +2,7 @@
License for Berkeley SoftFloat Release 3 License for Berkeley SoftFloat Release 3
John R. Hauser John R. Hauser
2014 ________ 2014 Dec 17
The following applies to the whole of SoftFloat Release 3 as well as to each The following applies to the whole of SoftFloat Release 3 as well as to each
source file individually. source file individually.


@ -11,7 +11,7 @@
<P> <P>
John R. Hauser<BR> John R. Hauser<BR>
2014 ________<BR> 2014 Dec 17<BR>
</P> </P>
<P> <P>
@ -19,7 +19,8 @@ Berkeley SoftFloat is a software implementation of binary floating-point that
conforms to the IEEE Standard for Floating-Point Arithmetic. conforms to the IEEE Standard for Floating-Point Arithmetic.
SoftFloat is distributed in the form of C source code. SoftFloat is distributed in the form of C source code.
Building the SoftFloat sources generates a library file (typically Building the SoftFloat sources generates a library file (typically
<CODE>"softfloat.a"</CODE>) containing the floating-point subroutines. <CODE>softfloat.a</CODE> or <CODE>libsoftfloat.a</CODE>) containing the
floating-point subroutines.
</P> </P>
<P> <P>


@ -2,13 +2,13 @@
Package Overview for Berkeley SoftFloat Release 3 Package Overview for Berkeley SoftFloat Release 3
John R. Hauser John R. Hauser
2014 ________ 2014 Dec 17
Berkeley SoftFloat is a software implementation of binary floating-point Berkeley SoftFloat is a software implementation of binary floating-point
that conforms to the IEEE Standard for Floating-Point Arithmetic. SoftFloat that conforms to the IEEE Standard for Floating-Point Arithmetic. SoftFloat
is distributed in the form of C source code. Building the SoftFloat sources is distributed in the form of C source code. Building the SoftFloat sources
generates a library file (typically "softfloat.a") containing the floating- generates a library file (typically "softfloat.a" or "libsoftfloat.a")
point subroutines. containing the floating-point subroutines.
The SoftFloat package is documented in the following files in the "doc" The SoftFloat package is documented in the following files in the "doc"
subdirectory: subdirectory:


@ -11,11 +11,7 @@
<P> <P>
John R. Hauser<BR> John R. Hauser<BR>
2014 _____<BR> 2014 Dec 17<BR>
</P>
<P>
*** CONTENT DONE.
</P> </P>
@ -24,7 +20,8 @@ John R. Hauser<BR>
<UL> <UL>
<LI> <LI>
Complete rewrite, funded by the University of California, Berkeley. Complete rewrite, funded by the University of California, Berkeley, and
consequently having a different use license than earlier releases.
Major changes included renaming most types and functions, upgrading some Major changes included renaming most types and functions, upgrading some
algorithms, restructuring the source files, and making SoftFloat into a true algorithms, restructuring the source files, and making SoftFloat into a true
library. library.
@ -54,8 +51,9 @@ TestFloat package).
<UL> <UL>
<LI> <LI>
Further improved wording for the legal restrictions on using SoftFloat releases Further improved the wording for the legal restrictions on using SoftFloat
<NOBR>through 2c</NOBR>. releases <NOBR>through 2c</NOBR> (not applicable to <NOBR>Release 3</NOBR> or
later).
</UL> </UL>
@ -134,7 +132,8 @@ tininess is detected before or after rounding.
<UL> <UL>
<LI> <LI>
Original release. Original release, based on work done for the International Computer Science
Institute (ICSI) in Berkeley, California.
</UL> </UL>


@ -11,36 +11,39 @@
<P> <P>
John R. Hauser<BR> John R. Hauser<BR>
2014 _____<BR> 2014 Dec 17<BR>
</P>
<P>
*** REPLACE QUOTATION MARKS.
</P> </P>
<H2>Contents</H2> <H2>Contents</H2>
<P> <BLOCKQUOTE>
*** CHECK.<BR> <TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
*** FIX FORMATTING. <COL WIDTH=25>
</P> <COL WIDTH=*>
<TR><TD COLSPAN=2>1. Introduction</TD></TR>
<PRE> <TR><TD COLSPAN=2>2. Limitations</TD></TR>
Introduction <TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
Limitations <TR><TD COLSPAN=2>4. SoftFloat Package Directory Structure</TD></TR>
Acknowledgments and License <TR><TD COLSPAN=2>5. Issues for Porting SoftFloat to a New Target</TD></TR>
SoftFloat Package Directory Structure <TR>
Issues for Porting SoftFloat to a New Target <TD></TD>
Standard Headers &lt;stdbool.h&gt; and &lt;stdint.h&gt; <TD>5.1. Standard Headers <CODE>&lt;stdbool.h&gt;</CODE> and
Specializing Floating-Point Behavior <CODE>&lt;stdint.h&gt;</CODE></TD>
Macros for Build Options </TR>
Adapting a Template Target Directory <TR><TD></TD><TD>5.2. Specializing Floating-Point Behavior</TD></TR>
Target-Specific Optimization of Primitive Functions <TR><TD></TD><TD>5.3. Macros for Build Options</TD></TR>
Testing SoftFloat <TR><TD></TD><TD>5.4. Adapting a Template Target Directory</TD></TR>
Providing SoftFloat as a Common Library for Applications <TR>
Contact Information <TD></TD><TD>5.5. Target-Specific Optimization of Primitive Functions</TD>
</PRE> </TR>
<TR><TD COLSPAN=2>6. Testing SoftFloat</TD></TR>
<TR>
<TD COLSPAN=2>7. Providing SoftFloat as a Common Library for Applications</TD>
</TR>
<TR><TD COLSPAN=2>8. Contact Information</TD></TR>
</TABLE>
</BLOCKQUOTE>
<H2>1. Introduction</H2> <H2>1. Introduction</H2>
@ -98,7 +101,7 @@ strictly required.
integer types. integer types.
If these headers are not supplied with the C compiler, minimal substitutes must If these headers are not supplied with the C compiler, minimal substitutes must
be provided. be provided.
SoftFloat's dependence on these headers is detailed later in SoftFloat&rsquo;s dependence on these headers is detailed later in
<NOBR>section 5.1</NOBR>, <I>Standard Headers &lt;stdbool.h&gt; and <NOBR>section 5.1</NOBR>, <I>Standard Headers &lt;stdbool.h&gt; and
&lt;stdint.h&gt;</I>. &lt;stdint.h&gt;</I>.
</P> </P>
@ -110,15 +113,20 @@ SoftFloat's dependence on these headers is detailed later in
The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser. The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
<NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation <NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
supplanting earlier releases. supplanting earlier releases.
This project was done in the employ of the University of California, Berkeley, This project (<NOBR>Release 3</NOBR> only, not earlier releases) was done in
within the Department of Electrical Engineering and Computer Sciences, first the employ of the University of California, Berkeley, within the Department of
for the Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. Electrical Engineering and Computer Sciences, first for the Parallel Computing
Laboratory (Par Lab) and then for the ASPIRE Lab.
The work was officially overseen by Prof. Krste Asanovic, with funding provided The work was officially overseen by Prof. Krste Asanovic, with funding provided
by these sources: by these sources:
<BLOCKQUOTE> <BLOCKQUOTE>
<TABLE> <TABLE>
<COL WIDTH=*>
<COL WIDTH=10>
<COL WIDTH=*>
<TR> <TR>
<TD><NOBR>Par Lab:</NOBR></TD> <TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
<TD></TD>
<TD> <TD>
Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia, (Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
@ -126,7 +134,8 @@ NVIDIA, Oracle, and Samsung.
</TD> </TD>
</TR> </TR>
<TR> <TR>
<TD><NOBR>ASPIRE Lab:</NOBR></TD> <TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
<TD></TD>
<TD> <TD>
DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA, ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
@ -185,27 +194,40 @@ Because SoftFloat is targeted to multiple platforms, its source code is
slightly scattered between target-specific and target-independent directories slightly scattered between target-specific and target-independent directories
and files. and files.
The supplied directory structure is as follows: The supplied directory structure is as follows:
<BLOCKQUOTE>
<PRE> <PRE>
doc doc
source source
include include
8086 8086
build 8086-SSE
build
template-FAST_INT64 template-FAST_INT64
template-not-FAST_INT64 template-not-FAST_INT64
Linux-386-GCC Linux-386-GCC
Linux-386-SSE2-GCC
Linux-x86_64-GCC Linux-x86_64-GCC
Win32-MinGW Win32-MinGW
Win32-SSE2-MinGW
Win64-MinGW-w64 Win64-MinGW-w64
</PRE> </PRE>
</BLOCKQUOTE>
The majority of the SoftFloat sources are provided in the <CODE>source</CODE> The majority of the SoftFloat sources are provided in the <CODE>source</CODE>
directory. directory.
The <CODE>include</CODE> subdirectory of <CODE>source</CODE> contains several The <CODE>include</CODE> subdirectory of <CODE>source</CODE> contains several
header files (unsurprisingly), while the <CODE>8086</CODE> subdirectory header files (unsurprisingly), while the <CODE>8086</CODE> and
contains source files that specialize the floating-point behavior to match the <NOBR><CODE>8086-SSE</CODE></NOBR> subdirectories contain source files that
Intel x86 line of processors. specialize the floating-point behavior to match the Intel x86 line of
processors.
The files in directory <CODE>8086</CODE> give floating-point behavior
consistent solely with Intel&rsquo;s older, 8087-derived floating-point, while
those in <NOBR><CODE>8086-SSE</CODE></NOBR> update the behavior of the
non-extended formats (<CODE>float32_t</CODE>, <CODE>float64_t</CODE>, and
<CODE>float128_t</CODE>) to mirror Intel&rsquo;s more recent Streaming SIMD
Extensions (SSE) and other compatible extensions.
If other specializations are attempted, these would be expected to be other If other specializations are attempted, these would be expected to be other
subdirectories of <CODE>source</CODE> alongside <CODE>8086</CODE>. subdirectories of <CODE>source</CODE> alongside <CODE>8086</CODE> and
<NOBR><CODE>8086-SSE</CODE></NOBR>.
Specialization is covered later, in <NOBR>section 5.2</NOBR>, <I>Specializing Specialization is covered later, in <NOBR>section 5.2</NOBR>, <I>Specializing
Floating-Point Behavior</I>. Floating-Point Behavior</I>.
</P> </P>
@ -213,9 +235,9 @@ Floating-Point Behavior</I>.
<P> <P>
The <CODE>build</CODE> directory is intended to contain a subdirectory for each The <CODE>build</CODE> directory is intended to contain a subdirectory for each
target platform for which a build of the SoftFloat library may be created. target platform for which a build of the SoftFloat library may be created.
For each build target, the target's subdirectory is where all derived object For each build target, the target&rsquo;s subdirectory is where all derived
files and the completed SoftFloat library (typically <CODE>softfloat.a</CODE> object files and the completed SoftFloat library (typically
or <CODE>libsoftfloat.a</CODE>) are created. <CODE>softfloat.a</CODE> or <CODE>libsoftfloat.a</CODE>) are created.
The two <CODE>template</CODE> subdirectories are not actual build targets but The two <CODE>template</CODE> subdirectories are not actual build targets but
contain sample files for creating new target directories. contain sample files for creating new target directories.
(The meaning of <CODE>FAST_INT64</CODE> will be explained later.) (The meaning of <CODE>FAST_INT64</CODE> will be explained later.)
@ -227,18 +249,21 @@ are intended to follow a naming system of
<NOBR><CODE>&lt;execution-environment&gt;-&lt;compiler&gt;</CODE></NOBR>. <NOBR><CODE>&lt;execution-environment&gt;-&lt;compiler&gt;</CODE></NOBR>.
For the example targets, For the example targets,
<NOBR><CODE>&lt;execution-environment&gt;</CODE></NOBR> is <NOBR><CODE>&lt;execution-environment&gt;</CODE></NOBR> is
<NOBR><CODE>Linux-386</CODE></NOBR>, <NOBR><CODE>Linux-x86_64</CODE></NOBR>, <NOBR><CODE>Linux-386</CODE></NOBR>, <NOBR><CODE>Linux-386-SSE2</CODE></NOBR>,
<CODE>Win32</CODE>, or <CODE>Win64</CODE>, and <NOBR><CODE>Linux-x86_64</CODE></NOBR>, <CODE>Win32</CODE>,
<NOBR><CODE>Win32-SSE2</CODE></NOBR>, or <CODE>Win64</CODE>, and
<NOBR><CODE>&lt;compiler&gt;</CODE></NOBR> is <CODE>GCC</CODE>, <NOBR><CODE>&lt;compiler&gt;</CODE></NOBR> is <CODE>GCC</CODE>,
<CODE>MinGW</CODE>, or <NOBR><CODE>MinGW-w64</CODE></NOBR>. <CODE>MinGW</CODE>, or <NOBR><CODE>MinGW-w64</CODE></NOBR>.
</P> </P>
<P> <P>
As supplied, each target directory contains two files: As supplied, each target directory contains two files:
<BLOCKQUOTE>
<PRE> <PRE>
Makefile Makefile
platform.h platform.h
</PRE> </PRE>
</BLOCKQUOTE>
The provided <CODE>Makefile</CODE> is written for GNU <CODE>make</CODE>. The provided <CODE>Makefile</CODE> is written for GNU <CODE>make</CODE>.
A build of SoftFloat for the specific target is begun by executing the A build of SoftFloat for the specific target is begun by executing the
<CODE>make</CODE> command with the target directory as the current directory. <CODE>make</CODE> command with the target directory as the current directory.
@ -258,10 +283,10 @@ desirable to include in header <CODE>platform.h</CODE> (directly or via
<CODE>#include</CODE>) declarations for numerous target-specific optimizations. <CODE>#include</CODE>) declarations for numerous target-specific optimizations.
Such possibilities are discussed in the next section, <I>Issues for Porting Such possibilities are discussed in the next section, <I>Issues for Porting
SoftFloat to a New Target</I>. SoftFloat to a New Target</I>.
If the target's compiler or library has bugs or other shortcomings, workarounds If the target&rsquo;s compiler or library has bugs or other shortcomings,
for these issues may also be possible with target-specific declarations in workarounds for these issues may also be possible with target-specific
<CODE>platform.h</CODE>, avoiding the need to modify the main SoftFloat declarations in <CODE>platform.h</CODE>, avoiding the need to modify the main
sources. SoftFloat sources.
</P> </P>
@ -280,30 +305,34 @@ For older or nonstandard compilers, substitutes for
<CODE>&lt;stdbool.h&gt;</CODE> and <CODE>&lt;stdint.h&gt;</CODE> may need to be <CODE>&lt;stdbool.h&gt;</CODE> and <CODE>&lt;stdint.h&gt;</CODE> may need to be
created. created.
SoftFloat depends on these names from <CODE>&lt;stdbool.h&gt;</CODE>: SoftFloat depends on these names from <CODE>&lt;stdbool.h&gt;</CODE>:
<BLOCKQUOTE>
<PRE> <PRE>
bool bool
true true
false false
</PRE> </PRE>
</BLOCKQUOTE>
and on these names from <CODE>&lt;stdint.h&gt;</CODE>: and on these names from <CODE>&lt;stdint.h&gt;</CODE>:
<BLOCKQUOTE>
<PRE> <PRE>
uint16_t uint16_t
uint32_t uint32_t
uint64_t uint64_t
int32_t int32_t
int64_t int64_t
UINT64_C UINT64_C
INT64_C INT64_C
uint_least8_t uint_least8_t
uint_fast8_t uint_fast8_t
uint_fast16_t uint_fast16_t
uint_fast32_t uint_fast32_t
uint_fast64_t uint_fast64_t
int_fast8_t int_fast8_t
int_fast16_t int_fast16_t
int_fast32_t int_fast32_t
int_fast64_t int_fast64_t
</PRE> </PRE>
</BLOCKQUOTE>
</P> </P>
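When substitute headers have to be written by hand, a small compile-and-run check can catch a missing or wrongly sized name early. The following is only a sketch: the function names `stdHeadersLookUsable` and `checkStdHeaders` are hypothetical and not part of SoftFloat, and the checks cover only the names listed above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of SoftFloat.  It touches every name the
   text lists, so a substitute header that omits one fails to compile. */
bool stdHeadersLookUsable(void)
{
    bool ok = true;
    uint16_t u16 = 0xFFFF;
    uint32_t u32 = 0xFFFFFFFF;
    uint64_t u64 = UINT64_C(0xFFFFFFFFFFFFFFFF);
    int32_t i32 = -1;
    int64_t i64 = INT64_C(-1);

    /* Exact-width unsigned types must wrap to zero at their stated width. */
    if ((uint16_t)(u16 + 1) != 0) ok = false;
    if ((uint32_t)(u32 + 1) != 0) ok = false;
    if ((uint64_t)(u64 + 1) != 0) ok = false;

    /* Exact-width signed types must be able to hold negative values. */
    if (!(i32 < 0) || !(i64 < 0)) ok = false;

    /* "least" and "fast" types must be at least as wide as named. */
    if (sizeof(uint_least8_t) * CHAR_BIT < 8) ok = false;
    if (sizeof(uint_fast8_t) * CHAR_BIT < 8) ok = false;
    if (sizeof(uint_fast16_t) * CHAR_BIT < 16) ok = false;
    if (sizeof(uint_fast32_t) * CHAR_BIT < 32) ok = false;
    if (sizeof(uint_fast64_t) * CHAR_BIT < 64) ok = false;
    if (sizeof(int_fast8_t) * CHAR_BIT < 8) ok = false;
    if (sizeof(int_fast16_t) * CHAR_BIT < 16) ok = false;
    if (sizeof(int_fast32_t) * CHAR_BIT < 32) ok = false;
    if (sizeof(int_fast64_t) * CHAR_BIT < 64) ok = false;

    return ok;
}

/* Example use: abort early if the headers are not usable. */
void checkStdHeaders(void)
{
    assert(stdHeadersLookUsable());
}
```

Running such a check once with the target's compiler, before attempting a full build, is usually cheaper than diagnosing a failure inside the library sources.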
@ -312,12 +341,12 @@ and on these names from <CODE>&lt;stdint.h&gt;</CODE>:
<P> <P>
The IEEE Floating-Point Standard allows for some flexibility in a conforming The IEEE Floating-Point Standard allows for some flexibility in a conforming
implementation, particularly concerning NaNs. implementation, particularly concerning NaNs.
The SoftFloat <CODE>source</CODE> directory is supplied with one or more The SoftFloat <CODE>source</CODE> directory is supplied with some
<I>specialization</I> subdirectories containing possible definitions for this <I>specialization</I> subdirectories containing possible definitions for this
implementation-specific behavior. implementation-specific behavior.
For example, the <CODE>8086</CODE> subdirectory has source files that For example, the <CODE>8086</CODE> and <NOBR><CODE>8086-SSE</CODE></NOBR>
specialize SoftFloat's behavior to match that of Intel's x86 line of subdirectories have source files that specialize SoftFloat&rsquo;s behavior to
processors. match that of Intel&rsquo;s x86 line of processors.
The files in a specialization subdirectory must determine: The files in a specialization subdirectory must determine:
<UL> <UL>
<LI> <LI>
@ -343,8 +372,9 @@ source files are needed to complete the specialization.
</P> </P>
<P> <P>
A new build target may use an existing specialization, such as the one provided A new build target may use an existing specialization, such as the ones
by the <CODE>8086</CODE> subdirectory. provided by the <CODE>8086</CODE> and <NOBR><CODE>8086-SSE</CODE></NOBR>
subdirectories.
If a build target needs a new specialization, different from any existing ones, If a build target needs a new specialization, different from any existing ones,
it is recommended that a new specialization subdirectory be created in the it is recommended that a new specialization subdirectory be created in the
<CODE>source</CODE> directory for this purpose. <CODE>source</CODE> directory for this purpose.
@ -367,18 +397,18 @@ Must be defined for little-endian machines; must not be defined for big-endian
machines. machines.
<DT><CODE>SOFTFLOAT_FAST_INT64</CODE> <DT><CODE>SOFTFLOAT_FAST_INT64</CODE>
<DD> <DD>
Can be defined to indicate that the build target's implementation of Can be defined to indicate that the build target&rsquo;s implementation of
<CODE>64-bit</CODE> arithmetic is efficient. <NOBR>64-bit</NOBR> arithmetic is efficient.
For newer <CODE>64-bit</CODE> processors, this macro should usually be defined. For newer <NOBR>64-bit</NOBR> processors, this macro should usually be defined.
For very small microprocessors whose buses and registers are <CODE>8-bit</CODE> For very small microprocessors whose buses and registers are <NOBR>8-bit</NOBR>
or <CODE>16-bit</CODE> in size, this macro should usually not be defined. or <NOBR>16-bit</NOBR> in size, this macro should usually not be defined.
Whether this macro should be defined for a <CODE>32-bit</CODE> processor may Whether this macro should be defined for a <NOBR>32-bit</NOBR> processor may
depend on the target machine and the applications that will use SoftFloat. depend on the target machine and the applications that will use SoftFloat.
<DT><CODE>SOFTFLOAT_FAST_DIV64TO32</CODE> <DT><CODE>SOFTFLOAT_FAST_DIV64TO32</CODE>
<DD> <DD>
Can be defined to indicate that the target's division operator Can be defined to indicate that the target&rsquo;s division operator
<NOBR>in C</NOBR> (written as <CODE>/</CODE>) is reasonably efficient for <NOBR>in C</NOBR> (written as <CODE>/</CODE>) is reasonably efficient for
dividing a <CODE>64-bit</CODE> unsigned integer by a <CODE>32-bit</CODE> dividing a <NOBR>64-bit</NOBR> unsigned integer by a <NOBR>32-bit</NOBR>
unsigned integer. unsigned integer.
Setting this macro may affect the performance of division, remainder, and Setting this macro may affect the performance of division, remainder, and
square root operations. square root operations.
@ -411,16 +441,16 @@ defined to <CODE>extern</CODE> <CODE>inline</CODE>.
Following the usual custom <NOBR>for C</NOBR>, for the first three macros (all Following the usual custom <NOBR>for C</NOBR>, for the first three macros (all
except <CODE>INLINE_LEVEL</CODE> and <CODE>INLINE</CODE>), the content of any except <CODE>INLINE_LEVEL</CODE> and <CODE>INLINE</CODE>), the content of any
definition is irrelevant; definition is irrelevant;
what matters is a macro's effect on <CODE>#ifdef</CODE> directives. what matters is a macro&rsquo;s effect on <CODE>#ifdef</CODE> directives.
</P> </P>
<P> <P>
It is recommended that any definitions of macros <CODE>LITTLEENDIAN</CODE> and It is recommended that any definitions of macros <CODE>LITTLEENDIAN</CODE> and
<CODE>INLINE</CODE> be made in a build target's <CODE>platform.h</CODE> header <CODE>INLINE</CODE> be made in a build target&rsquo;s <CODE>platform.h</CODE>
file, because these macros are expected to be determined inflexibly by the header file, because these macros are expected to be determined inflexibly by
target machine and compiler. the target machine and compiler.
The other three macros control optimization and might be better located in the The other three macros control optimization and might be better located in the
target's Makefile (or its equivalent). target&rsquo;s Makefile (or its equivalent).
</P> </P>
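As an illustration of how a definition of LITTLEENDIAN acts only through #ifdef, here is a minimal sketch of what might sit in a target's platform.h, together with a run-time cross-check. It rests on assumptions: the predefined macros `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__` are specific to GCC and Clang, and the function names are hypothetical. A real port would normally hard-code the definition for its known target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a target's platform.h.  Assumption: the compiler predefines
   __BYTE_ORDER__ (GCC and Clang do). */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define LITTLEENDIAN 1   /* the value is irrelevant; only #ifdef is tested */
#endif

/* Reports how the sources were configured at compile time. */
bool configuredLittleEndian(void)
{
#ifdef LITTLEENDIAN
    return true;
#else
    return false;
#endif
}

/* Inspects the byte order of the machine actually running the code. */
bool hostIsLittleEndian(void)
{
    uint32_t word = 1;
    unsigned char firstByte;
    memcpy(&firstByte, &word, 1);   /* lowest-addressed byte of the word */
    return firstByte == 1;
}

/* A mismatch would mean platform.h was written for the wrong target. */
void checkEndianConfig(void)
{
    assert(configuredLittleEndian() == hostIsLittleEndian());
}
```

The optimization macros, by contrast, could be passed on the compiler command line from the Makefile (for example with a `-D` option under GCC), so they can be varied without editing any header.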
@ -433,8 +463,9 @@ Two different templates exist because different functions are needed in the
SoftFloat library depending on whether macro <CODE>SOFTFLOAT_FAST_INT64</CODE> SoftFloat library depending on whether macro <CODE>SOFTFLOAT_FAST_INT64</CODE>
is defined. is defined.
If macro <CODE>SOFTFLOAT_FAST_INT64</CODE> will be defined, If macro <CODE>SOFTFLOAT_FAST_INT64</CODE> will be defined,
<CODE>template-FAST_INT64</CODE> is the template to use; <NOBR><CODE>template-FAST_INT64</CODE></NOBR> is the template to use;
otherwise, <CODE>template-not-FAST_INT64</CODE> is the appropriate template. otherwise, <NOBR><CODE>template-not-FAST_INT64</CODE></NOBR> is the appropriate
template.
A new target directory can be created by copying the correct template directory A new target directory can be created by copying the correct template directory
and editing the files inside. and editing the files inside.
To avoid confusion, it would be wise to refrain from editing the files within a To avoid confusion, it would be wise to refrain from editing the files within a
@ -447,12 +478,12 @@ template directory directly.
<P> <P>
Header file <CODE>primitives.h</CODE> (in directory Header file <CODE>primitives.h</CODE> (in directory
<CODE>source/include</CODE>) declares macros and functions for numerous <CODE>source/include</CODE>) declares macros and functions for numerous
underlying arithmetic operations upon which many of SoftFloat's floating-point underlying arithmetic operations upon which many of SoftFloat&rsquo;s
functions are ultimately built. floating-point functions are ultimately built.
The SoftFloat sources include implementations of all of these functions/macros, The SoftFloat sources include implementations of all of these functions/macros,
written as standard C code, so a complete and correct SoftFloat library can be written as standard C code, so a complete and correct SoftFloat library can be
built using only the supplied code for all functions. built using only the supplied code for all functions.
However, for many targets, SoftFloat's performance can be improved by However, for many targets, SoftFloat&rsquo;s performance can be improved by
substituting target-specific implementations of some of the functions/macros substituting target-specific implementations of some of the functions/macros
declared in <CODE>primitives.h</CODE>. declared in <CODE>primitives.h</CODE>.
</P> </P>
@ -461,7 +492,7 @@ declared in <CODE>primitives.h</CODE>.
For example, <CODE>primitives.h</CODE> declares a function called For example, <CODE>primitives.h</CODE> declares a function called
<CODE>softfloat_countLeadingZeros32</CODE> that takes an unsigned <CODE>softfloat_countLeadingZeros32</CODE> that takes an unsigned
<NOBR>32-bit</NOBR> integer as an argument and returns the maximal number of <NOBR>32-bit</NOBR> integer as an argument and returns the maximal number of
the integer's most-significant bits that are all zeros. the integer&rsquo;s most-significant bits that are all zeros.
While the SoftFloat sources include an implementation of this function written While the SoftFloat sources include an implementation of this function written
in <NOBR>standard C</NOBR>, many processors can perform this same function in <NOBR>standard C</NOBR>, many processors can perform this same function
directly in only one or two machine instructions. directly in only one or two machine instructions.
@ -473,19 +504,22 @@ package.
<P> <P>
A build target can replace the supplied version of any function or macro of A build target can replace the supplied version of any function or macro of
<CODE>primitives.h</CODE> by defining a macro with the same name in the <CODE>primitives.h</CODE> by defining a macro with the same name in the
target's <CODE>platform.h</CODE> header file. target&rsquo;s <CODE>platform.h</CODE> header file.
For this purpose, it may be helpful for <CODE>platform.h</CODE> to For this purpose, it may be helpful for <CODE>platform.h</CODE> to
<CODE>#include</CODE> header file <CODE>primitiveTypes.h</CODE>, which defines <CODE>#include</CODE> header file <CODE>primitiveTypes.h</CODE>, which defines
types used for arguments and results of functions declared in types used for arguments and results of functions declared in
<CODE>primitives.h</CODE>. <CODE>primitives.h</CODE>.
When a desired replacement implementation is a function, not a macro, it is When a desired replacement implementation is a function, not a macro, it is
sufficient for <CODE>platform.h</CODE> to include the line sufficient for <CODE>platform.h</CODE> to include the line
<BLOCKQUOTE>
<PRE> <PRE>
#define &lt;function-name&gt; &lt;function-name&gt; #define &lt;function-name&gt; &lt;function-name&gt;
</PRE> </PRE>
where <CODE>&lt;function-name&gt;</CODE> is the name of the function. </BLOCKQUOTE>
This technically defines <CODE>&lt;function-name&gt;</CODE> as a macro, but one where <NOBR><CODE>&lt;function-name&gt;</CODE></NOBR> is the name of the
that resolves to the same name, which may then be a function. function.
This technically defines <NOBR><CODE>&lt;function-name&gt;</CODE></NOBR> as a
macro, but one that resolves to the same name, which may then be a function.
(A preprocessor conforming to the C Standard must limit recursive macro (A preprocessor conforming to the C Standard must limit recursive macro
expansion from being applied more than once.) expansion from being applied more than once.)
</P> </P>
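The replacement technique can be sketched for softfloat_countLeadingZeros32 as follows. This is an illustration under stated assumptions, not the code shipped with SoftFloat: the return type `uint_fast8_t` is assumed, `countLeadingZeros32_portable` is a hypothetical stand-in for the standard C version, and `__builtin_clz` is a GCC/Clang builtin that takes an `unsigned int`, so the sketch also assumes that `unsigned int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference in standard C (illustrative stand-in, not SoftFloat's
   own source).  Returns 32 when the argument is zero. */
uint_fast8_t countLeadingZeros32_portable(uint32_t a)
{
    uint_fast8_t count = 0;
    if (!a) return 32;
    while (!(a & UINT32_C(0x80000000))) {
        a <<= 1;
        ++count;
    }
    return count;
}

/* What a target's platform.h might contain.  The #define makes the name
   visible as a macro while still expanding to itself, so the identifier
   below is declared as an ordinary function. */
#define softfloat_countLeadingZeros32 softfloat_countLeadingZeros32
static inline uint_fast8_t softfloat_countLeadingZeros32(uint32_t a)
{
#if defined(__GNUC__)
    /* __builtin_clz is undefined for zero, so that case is handled here. */
    return a ? (uint_fast8_t)__builtin_clz(a) : 32;
#else
    return countLeadingZeros32_portable(a);
#endif
}

/* Cross-check the replacement against the portable version. */
void checkReplacement(void)
{
    uint32_t samples[] = { 0, 1, 0x00010000, 0x7FFFFFFF, 0xFFFFFFFF };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        assert(softfloat_countLeadingZeros32(samples[i])
               == countLeadingZeros32_portable(samples[i]));
    }
}
```

Checking a replacement against the portable version on a handful of edge values (zero, one, all ones) is a cheap safeguard before running the full test suite.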
@ -500,46 +534,34 @@ This program is part of the Berkeley TestFloat package available at the Web
page page
<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></A>. <A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></A>.
The TestFloat package also has a program called <CODE>timesoftfloat</CODE> that The TestFloat package also has a program called <CODE>timesoftfloat</CODE> that
measures the speed of SoftFloat's floating-point functions. measures the speed of SoftFloat&rsquo;s floating-point functions.
</P> </P>
<H2>7. Providing SoftFloat as a Common Library for Applications</H2> <H2>7. Providing SoftFloat as a Common Library for Applications</H2>
<P> <P>
Supplied <CODE>softfloat.h</CODE> depends on <CODE>softfloat_types.h</CODE>. Header file <CODE>softfloat.h</CODE> defines the SoftFloat interface as seen by
clients.
If the SoftFloat library will be made a common library for programs on a
particular system, the supplied <CODE>softfloat.h</CODE> has a couple of
deficiencies for this purpose:
<UL>
<LI>
As supplied, <CODE>softfloat.h</CODE> depends on another header,
<CODE>softfloat_types.h</CODE>, that is not intended for public use but which
must also be visible to the programmer&rsquo;s compiler.
<LI>
More troubling, at the time <CODE>softfloat.h</CODE> is included in a C
source file, macro <CODE>SOFTFLOAT_FAST_INT64</CODE> must be defined, or not
defined, consistent with whether this macro was defined when the SoftFloat
library was built.
</UL>
In the situation that new programs may regularly <CODE>#include</CODE> header
file <CODE>softfloat.h</CODE>, it is recommended that a custom, self-contained
version of this header file be created that eliminates these issues.
</P> </P>
<PRE>
The target-specific `softfloat.h' header file defines the SoftFloat
interface as seen by clients.
Unlike the actual function definitions in `softfloat.c', the declarations
in `softfloat.h' do not use any of the types defined by the `processors'
header file. This is done so that clients will not have to include the
`processors' header file in order to use SoftFloat. Nevertheless, the
target-specific declarations in `softfloat.h' must match what `softfloat.c'
expects. For example, if `int32' is defined as `int' in the `processors'
header file, then in `softfloat.h' the output of `float32_to_int32' should
be stated as `int', although in `softfloat.c' it is given in target-
independent form as `int32'.
</PRE>
<PRE>
*** HERE
Porting and/or compiling SoftFloat involves the following steps:
4. In the target-specific subdirectory, edit the files `softfloat-specialize'
and `softfloat.h' to define the desired exception handling functions
and mode control values. In the `softfloat.h' header file, ensure also
that all declarations give the proper target-specific type (such as
`int' or `long') corresponding to the target-independent type used in
`softfloat.c' (such as `int32'). None of the type names declared in the
`processors' header file should appear in `softfloat.h'.
</PRE>
<H2>8. Contact Information</H2> <H2>8. Contact Information</H2>


@ -11,66 +11,59 @@
<P> <P>
John R. Hauser<BR> John R. Hauser<BR>
2014 ______<BR> 2014 Dec 17<BR>
</P>
<P>
*** CONTENT DONE.
</P>
<P>
*** REPLACE QUOTATION MARKS.
<BR>
*** REPLACE APOSTROPHES.
<BR>
*** REPLACE EM DASH.
</P> </P>
<H2>Contents</H2> <H2>Contents</H2>
<P> <BLOCKQUOTE>
*** CHECK.<BR> <TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
*** FIX FORMATTING. <COL WIDTH=25>
</P> <COL WIDTH=*>
<TR><TD COLSPAN=2>1. Introduction</TD></TR>
<PRE> <TR><TD COLSPAN=2>2. Limitations</TD></TR>
Introduction <TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
Limitations <TR><TD COLSPAN=2>4. Types and Functions</TD></TR>
Acknowledgments and License <TR><TD></TD><TD>4.1. Boolean and Integer Types</TD></TR>
Types and Functions <TR><TD></TD><TD>4.2. Floating-Point Types</TD></TR>
Boolean and Integer Types <TR><TD></TD><TD>4.3. Supported Floating-Point Functions</TD></TR>
Floating-Point Types <TR>
Supported Floating-Point Functions <TD></TD>
Non-canonical Representations in extFloat80_t <TD>4.4. Non-canonical Representations in <CODE>extFloat80_t</CODE></TD>
Conventions for Passing Arguments and Results </TR>
Reserved Names <TR><TD></TD><TD>4.5. Conventions for Passing Arguments and Results</TD></TR>
Mode Variables <TR><TD COLSPAN=2>5. Reserved Names</TD></TR>
Rounding Mode <TR><TD COLSPAN=2>6. Mode Variables</TD></TR>
Underflow Detection <TR><TD></TD><TD>6.1. Rounding Mode</TD></TR>
Rounding Precision for 80-Bit Extended Format <TR><TD></TD><TD>6.2. Underflow Detection</TD></TR>
Exceptions and Exception Flags <TR>
Function Details <TD></TD>
Conversions from Integer to Floating-Point <TD>6.3. Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</TD>
Conversions from Floating-Point to Integer </TR>
Conversions Among Floating-Point Types <TR><TD COLSPAN=2>7. Exceptions and Exception Flags</TD></TR>
Basic Arithmetic Functions <TR><TD COLSPAN=2>8. Function Details</TD></TR>
Fused Multiply-Add Functions <TR><TD></TD><TD>8.1. Conversions from Integer to Floating-Point</TD></TR>
Remainder Functions <TR><TD></TD><TD>8.2. Conversions from Floating-Point to Integer</TD></TR>
Round-to-Integer Functions <TR><TD></TD><TD>8.3. Conversions Among Floating-Point Types</TD></TR>
Comparison Functions <TR><TD></TD><TD>8.4. Basic Arithmetic Functions</TD></TR>
Signaling NaN Test Functions <TR><TD></TD><TD>8.5. Fused Multiply-Add Functions</TD></TR>
Raise-Exception Function <TR><TD></TD><TD>8.6. Remainder Functions</TD></TR>
Changes from SoftFloat Release 2 <TR><TD></TD><TD>8.7. Round-to-Integer Functions</TD></TR>
Name Changes <TR><TD></TD><TD>8.8. Comparison Functions</TD></TR>
Changes to Function Arguments <TR><TD></TD><TD>8.9. Signaling NaN Test Functions</TD></TR>
Added Capabilities <TR><TD></TD><TD>8.10. Raise-Exception Function</TD></TR>
Better Compatibility with the C Language <TR><TD COLSPAN=2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></TD></TR>
New Organization as a Library <TR><TD></TD><TD>9.1. Name Changes</TD></TR>
Optimization Gains (and Losses) <TR><TD></TD><TD>9.2. Changes to Function Arguments</TD></TR>
Future Directions <TR><TD></TD><TD>9.3. Added Capabilities</TD></TR>
Contact Information <TR><TD></TD><TD>9.4. Better Compatibility with the C Language</TD></TR>
</PRE> <TR><TD></TD><TD>9.5. New Organization as a Library</TD></TR>
<TR><TD></TD><TD>9.6. Optimization Gains (and Losses)</TD></TR>
<TR><TD COLSPAN=2>10. Future Directions</TD></TR>
<TR><TD COLSPAN=2>11. Contact Information</TD></TR>
</TABLE>
</BLOCKQUOTE>
<H2>1. Introduction</H2> <H2>1. Introduction</H2>
@ -156,15 +149,20 @@ SoftFloat <NOBR>Release 3</NOBR>.
The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser. The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
<NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation <NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
supplanting earlier releases. supplanting earlier releases.
This project was done in the employ of the University of California, Berkeley, This project (<NOBR>Release 3</NOBR> only, not earlier releases) was done in
within the Department of Electrical Engineering and Computer Sciences, first the employ of the University of California, Berkeley, within the Department of
for the Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. Electrical Engineering and Computer Sciences, first for the Parallel Computing
Laboratory (Par Lab) and then for the ASPIRE Lab.
The work was officially overseen by Prof. Krste Asanovic, with funding provided The work was officially overseen by Prof. Krste Asanovic, with funding provided
by these sources: by these sources:
<BLOCKQUOTE> <BLOCKQUOTE>
<TABLE> <TABLE>
<COL WIDTH=*>
<COL WIDTH=10>
<COL WIDTH=*>
<TR> <TR>
<TD><NOBR>Par Lab:</NOBR></TD> <TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
<TD></TD>
<TD> <TD>
Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia, (Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
@ -172,7 +170,8 @@ NVIDIA, Oracle, and Samsung.
</TD> </TD>
</TR> </TR>
<TR> <TR>
<TD><NOBR>ASPIRE Lab:</NOBR></TD> <TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
<TD></TD>
<TD> <TD>
DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA, ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
@ -245,16 +244,18 @@ for these headers.
Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
<CODE>&lt;stdbool.h&gt;</CODE> and on these type names from <CODE>&lt;stdbool.h&gt;</CODE> and on these type names from
<CODE>&lt;stdint.h&gt;</CODE>: <CODE>&lt;stdint.h&gt;</CODE>:
<BLOCKQUOTE>
<PRE> <PRE>
uint16_t uint16_t
uint32_t uint32_t
uint64_t uint64_t
int32_t int32_t
int64_t int64_t
uint_fast8_t uint_fast8_t
uint_fast32_t uint_fast32_t
uint_fast64_t uint_fast64_t
</PRE> </PRE>
</BLOCKQUOTE>
</P> </P>
@ -263,26 +264,22 @@ Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
<P> <P>
The <CODE>softfloat.h</CODE> header defines four floating-point types: The <CODE>softfloat.h</CODE> header defines four floating-point types:
<BLOCKQUOTE> <BLOCKQUOTE>
<TABLE> <TABLE CELLSPACING=0 CELLPADDING=0>
<TR> <TR>
<TD><CODE>float32_t</CODE></TD> <TD><CODE>float32_t</CODE></TD>
<TD>&nbsp;</TD>
<TD><NOBR>32-bit</NOBR> single-precision binary format</TD> <TD><NOBR>32-bit</NOBR> single-precision binary format</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>float64_t</CODE></TD> <TD><CODE>float64_t</CODE></TD>
<TD>&nbsp;</TD>
<TD><NOBR>64-bit</NOBR> double-precision binary format</TD> <TD><NOBR>64-bit</NOBR> double-precision binary format</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>extFloat80_t</CODE></TD> <TD><CODE>extFloat80_t&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD>&nbsp;</TD>
<TD><NOBR>80-bit</NOBR> double-extended-precision binary format (old Intel or <TD><NOBR>80-bit</NOBR> double-extended-precision binary format (old Intel or
Motorola format)</TD> Motorola format)</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>float128_t</CODE></TD> <TD><CODE>float128_t</CODE></TD>
<TD>&nbsp;</TD>
<TD><NOBR>128-bit</NOBR> quadruple-precision binary format</TD> <TD><NOBR>128-bit</NOBR> quadruple-precision binary format</TD>
</TR> </TR>
</TABLE> </TABLE>
@ -304,10 +301,10 @@ Header file <CODE>softfloat.h</CODE> also defines a structure,
This structure is the same size as type <CODE>extFloat80_t</CODE> and contains This structure is the same size as type <CODE>extFloat80_t</CODE> and contains
at least these two fields (not necessarily in this order): at least these two fields (not necessarily in this order):
<BLOCKQUOTE> <BLOCKQUOTE>
<TABLE> <PRE>
<TR><TD><CODE>uint16_t signExp;</CODE></TD></TR> uint16_t signExp;
<TR><TD><CODE>uint64_t signif;</CODE></TD></TR> uint64_t signif;
</TABLE> </PRE>
</BLOCKQUOTE> </BLOCKQUOTE>
Field <CODE>signExp</CODE> contains the sign and exponent of the floating-point Field <CODE>signExp</CODE> contains the sign and exponent of the floating-point
value, with the sign in the most significant bit (<NOBR>bit 15</NOBR>) and the value, with the sign in the most significant bit (<NOBR>bit 15</NOBR>) and the
@ -339,8 +336,8 @@ operation defined by the IEEE Standard;
for each format, the floating-point remainder operation defined by the IEEE for each format, the floating-point remainder operation defined by the IEEE
Standard; Standard;
<LI> <LI>
for each format, a ``round to integer'' operation that rounds to the nearest for each format, a &ldquo;round to integer&rdquo; operation that rounds to the
integer value in the same format; and nearest integer value in the same format; and
<LI> <LI>
comparisons between two values in the same floating-point format. comparisons between two values in the same floating-point format.
</UL> </UL>
@ -357,12 +354,12 @@ not supported in SoftFloat <NOBR>Release 3</NOBR>:
conversions between floating-point formats and decimal or hexadecimal character conversions between floating-point formats and decimal or hexadecimal character
sequences; sequences;
<LI> <LI>
all ``quiet-computation'' operations (<B>copy</B>, <B>negate</B>, <B>abs</B>, all &ldquo;quiet-computation&rdquo; operations (<B>copy</B>, <B>negate</B>,
and <B>copySign</B>, which all involve only simple copying and/or manipulation <B>abs</B>, and <B>copySign</B>, which all involve only simple copying and/or
of the floating-point sign bit); and manipulation of the floating-point sign bit); and
<LI> <LI>
all ``non-computational'' operations other than <B>isSignaling</B> (which is all &ldquo;non-computational&rdquo; operations other than <B>isSignaling</B>
supported). (which is supported).
</UL> </UL>
</P> </P>
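Because the quiet-computation operations only copy or manipulate the sign bit, an application that needs them can supply its own. The following is a minimal sketch, assuming the standard 32-bit single-precision layout with the sign in bit 31. The `f32bits_` helper names are hypothetical and are not part of SoftFloat; they operate on raw `uint32_t` encodings rather than on `float32_t`.

```c
#include <stdint.h>

/* Hypothetical helpers (not SoftFloat functions). Each one works on the raw
   32-bit encoding of a single-precision value, where bit 31 is the sign. */
#define F32BITS_SIGN_MASK UINT32_C(0x80000000)

/* negate: flip the sign bit, leaving exponent and fraction untouched. */
uint32_t f32bits_negate(uint32_t a)
{
    return a ^ F32BITS_SIGN_MASK;
}

/* abs: clear the sign bit. */
uint32_t f32bits_abs(uint32_t a)
{
    return a & ~F32BITS_SIGN_MASK;
}

/* copySign: magnitude of a combined with the sign of b. */
uint32_t f32bits_copySign(uint32_t a, uint32_t b)
{
    return (a & ~F32BITS_SIGN_MASK) | (b & F32BITS_SIGN_MASK);
}
```

None of these helpers inspects the exponent or fraction, so they treat NaNs like any other encoding and raise no exception flags.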
@ -393,9 +390,9 @@ leading significand bit must <NOBR>be 1</NOBR> unless it is required to
For <NOBR>Release 3</NOBR> of SoftFloat, functions are not guaranteed to For <NOBR>Release 3</NOBR> of SoftFloat, functions are not guaranteed to
operate as expected when inputs of type <CODE>extFloat80_t</CODE> are operate as expected when inputs of type <CODE>extFloat80_t</CODE> are
non-canonical. non-canonical.
Assuming all of a function's <CODE>extFloat80_t</CODE> inputs (if any) are Assuming all of a function&rsquo;s <CODE>extFloat80_t</CODE> inputs (if any)
canonical, function outputs of type <CODE>extFloat80_t</CODE> will always be are canonical, function outputs of type <CODE>extFloat80_t</CODE> will always
canonical. be canonical.
</P> </P>
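As an illustration of the field layout and of one kind of non-canonical input, the sketch below unpacks the `signExp` and `signif` fields described earlier and flags an encoding whose exponent is nonzero but whose leading significand bit is 0. The `demo_` function names are hypothetical and not part of SoftFloat, and the check covers only this one case, not every non-canonical encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers (not SoftFloat functions) for the two fields of the
   80-bit extended format: a 16-bit signExp and a 64-bit signif. */

/* Sign is the most significant bit (bit 15) of signExp. */
bool demo_extF80_sign(uint16_t signExp)
{
    return (signExp >> 15) != 0;
}

/* Exponent is the remaining 15 bits of signExp. */
uint16_t demo_extF80_exp(uint16_t signExp)
{
    return (uint16_t)(signExp & 0x7FFF);
}

/* Leading (explicit) significand bit is bit 63 of signif. */
bool demo_extF80_leadingBit(uint64_t signif)
{
    return (signif >> 63) != 0;
}

/* True when the exponent is nonzero but the leading significand bit is 0,
   which is one way for an encoding to be non-canonical. */
bool demo_extF80_missingLeadingBit(uint16_t signExp, uint64_t signif)
{
    return demo_extF80_exp(signExp) != 0 && !demo_extF80_leadingBit(signif);
}
```

A caller that cannot guarantee canonical inputs could run a check of this kind before passing values to SoftFloat functions.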
<H3>4.5. Conventions for Passing Arguments and Results</H3> <H3>4.5. Conventions for Passing Arguments and Results</H3>
@ -426,8 +423,8 @@ SoftFloat supplies this function:
The first two arguments point to the values to be added, and the last argument The first two arguments point to the values to be added, and the last argument
points to the location where the sum will be stored. points to the location where the sum will be stored.
The <CODE>M</CODE> in the name <CODE>f128M_add</CODE> is mnemonic for the fact The <CODE>M</CODE> in the name <CODE>f128M_add</CODE> is mnemonic for the fact
that the <NOBR>128-bit</NOBR> inputs and outputs are ``in memory'', pointed to that the <NOBR>128-bit</NOBR> inputs and outputs are &ldquo;in memory&rdquo;,
by pointer arguments. pointed to by pointer arguments.
</P> </P>
<P> <P>
@ -464,10 +461,11 @@ platforms of interest, programmers can use whichever version they prefer.
<P> <P>
In addition to the variables and functions documented here, SoftFloat defines In addition to the variables and functions documented here, SoftFloat defines
some symbol names for its own private use. some symbol names for its own private use.
These private names always begin with the prefix `<CODE>softfloat_</CODE>'. These private names always begin with the prefix
&lsquo;<CODE>softfloat_</CODE>&rsquo;.
When a program includes header <CODE>softfloat.h</CODE> or links with the When a program includes header <CODE>softfloat.h</CODE> or links with the
SoftFloat library, all names with prefix `<CODE>softfloat_</CODE>' are reserved SoftFloat library, all names with prefix &lsquo;<CODE>softfloat_</CODE>&rsquo;
for possible use by SoftFloat. are reserved for possible use by SoftFloat.
Applications that use SoftFloat should not define their own names with this Applications that use SoftFloat should not define their own names with this
prefix, and should reference only such names as are documented. prefix, and should reference only such names as are documented.
</P> </P>
@ -477,7 +475,7 @@ prefix, and should reference only such names as are documented.
<P> <P>
The following variables control rounding mode, underflow detection, and the The following variables control rounding mode, underflow detection, and the
<NOBR>80-bit</NOBR> extended format's rounding precision: <NOBR>80-bit</NOBR> extended format&rsquo;s rounding precision:
<BLOCKQUOTE> <BLOCKQUOTE>
<CODE>softfloat_roundingMode</CODE><BR> <CODE>softfloat_roundingMode</CODE><BR>
<CODE>softfloat_detectTininess</CODE><BR> <CODE>softfloat_detectTininess</CODE><BR>
@ -497,30 +495,25 @@ The rounding mode is selected by the global variable
</BLOCKQUOTE> </BLOCKQUOTE>
This variable may be set to one of the values This variable may be set to one of the values
<BLOCKQUOTE> <BLOCKQUOTE>
<TABLE> <TABLE CELLSPACING=0 CELLPADDING=0>
<TR> <TR>
<TD><CODE>softfloat_round_near_even</CODE></TD> <TD><CODE>softfloat_round_near_even</CODE></TD>
<TD>&nbsp;</TD>
<TD>round to nearest, with ties to even</TD> <TD>round to nearest, with ties to even</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>softfloat_round_near_maxMag</CODE></TD> <TD><CODE>softfloat_round_near_maxMag&nbsp;&nbsp;</CODE></TD>
<TD>&nbsp;</TD>
<TD>round to nearest, with ties to maximum magnitude (away from zero)</TD> <TD>round to nearest, with ties to maximum magnitude (away from zero)</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>softfloat_round_minMag</CODE></TD> <TD><CODE>softfloat_round_minMag</CODE></TD>
<TD>&nbsp;</TD>
<TD>round to minimum magnitude (toward zero)</TD> <TD>round to minimum magnitude (toward zero)</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>softfloat_round_min</CODE></TD> <TD><CODE>softfloat_round_min</CODE></TD>
<TD>&nbsp;</TD>
<TD>round to minimum (down)</TD> <TD>round to minimum (down)</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>softfloat_round_max</CODE></TD> <TD><CODE>softfloat_round_max</CODE></TD>
<TD>&nbsp;</TD>
<TD>round to maximum (up)</TD> <TD>round to maximum (up)</TD>
</TR> </TR>
</TABLE> </TABLE>
@ -550,7 +543,7 @@ Like most systems (and as required by the newer 2008 IEEE Standard), SoftFloat
always detects loss of accuracy for underflow as an inexact result. always detects loss of accuracy for underflow as an inexact result.
</P> </P>
<H3>6.3. Rounding Precision for 80-Bit Extended Format</H3> <H3>6.3. Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</H3>
<P> <P>
For <CODE>extFloat80_t</CODE> only, the rounding precision of the basic For <CODE>extFloat80_t</CODE> only, the rounding precision of the basic
@ -639,7 +632,7 @@ It does always raise the <I>inexact</I> exception flag as required.
In this section, <CODE>&lt;<I>float</I>&gt;</CODE> appears in function names as In this section, <CODE>&lt;<I>float</I>&gt;</CODE> appears in function names as
a substitute for one of these abbreviations: a substitute for one of these abbreviations:
<BLOCKQUOTE> <BLOCKQUOTE>
<TABLE> <TABLE CELLSPACING=0 CELLPADDING=0>
<TR> <TR>
<TD><CODE>f32</CODE></TD> <TD><CODE>f32</CODE></TD>
<TD>indicates <CODE>float32_t</CODE>, passed by value</TD> <TD>indicates <CODE>float32_t</CODE>, passed by value</TD>
@ -696,11 +689,14 @@ Each conversion function takes one input of the appropriate type and generates
one output. one output.
The following illustrates the signatures of these functions in cases when the The following illustrates the signatures of these functions in cases when the
floating-point result is passed either by value or via pointers: floating-point result is passed either by value or via pointers:
<BLOCKQUOTE>
<PRE> <PRE>
float64_t i32_to_f64( int32_t <I>a</I> ); float64_t i32_to_f64( int32_t <I>a</I> );
void i32_to_f128M( int32_t <I>a</I>, float128_t *<I>destPtr</I> );
</PRE> </PRE>
<PRE>
void i32_to_f128M( int32_t <I>a</I>, float128_t *<I>destPtr</I> );
</PRE>
</BLOCKQUOTE>
</P> </P>
<H3>8.2. Conversions from Floating-Point to Integer</H3> <H3>8.2. Conversions from Floating-Point to Integer</H3>
@ -717,12 +713,15 @@ functions:
</BLOCKQUOTE> </BLOCKQUOTE>
The functions have signatures as follows, depending on whether the The functions have signatures as follows, depending on whether the
floating-point input is passed by value or via pointers: floating-point input is passed by value or via pointers:
<BLOCKQUOTE>
<PRE> <PRE>
int32_t f64_to_i32( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> ); int32_t f64_to_i32( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
</PRE>
int32_t <PRE>
int32_t
f128M_to_i32( const float128_t *<I>aPtr</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> ); f128M_to_i32( const float128_t *<I>aPtr</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
</PRE> </PRE>
</BLOCKQUOTE>
The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode for The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode for
the conversion. the conversion.
The variable that usually indicates rounding mode, The variable that usually indicates rounding mode,
@ -768,12 +767,14 @@ and convenience:
These functions round only toward zero (to minimum magnitude). These functions round only toward zero (to minimum magnitude).
The signatures for these functions are the same as above without the redundant The signatures for these functions are the same as above without the redundant
<CODE><I>roundingMode</I></CODE> argument: <CODE><I>roundingMode</I></CODE> argument:
<BLOCKQUOTE>
<PRE> <PRE>
int32_t f64_to_i32_r_minMag( float64_t <I>a</I>, bool <I>exact</I> ); int32_t f64_to_i32_r_minMag( float64_t <I>a</I>, bool <I>exact</I> );
</PRE> </PRE>
<PRE> <PRE>
int32_t f128M_to_i32_r_minMag( const float128_t *<I>aPtr</I>, bool <I>exact</I> ); int32_t f128M_to_i32_r_minMag( const float128_t *<I>aPtr</I>, bool <I>exact</I> );
</PRE> </PRE>
</BLOCKQUOTE>
</P> </P>
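To make "round toward zero" concrete, here is a rough analogue written with C's native `double` rather than `float64_t`, so that it compiles without the SoftFloat library. The type `demo_conv_result` and the function `demo_to_i32_toward_zero` are hypothetical. The sketch reports inexactness and out-of-range inputs through struct fields instead of SoftFloat's exception flags, and the value 0 returned for invalid inputs is only a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the converted value plus two status bits that
   stand in for the inexact and invalid exception flags. */
typedef struct {
    int32_t value;
    bool inexact;
    bool invalid;
} demo_conv_result;

/* Convert a native double to int32_t, rounding toward zero. */
demo_conv_result demo_to_i32_toward_zero(double a)
{
    demo_conv_result r = { 0, false, false };

    /* Reject NaN (every comparison with NaN is false) and any value whose
       truncation would not fit in int32_t. */
    if (!(a > -2147483649.0 && a < 2147483648.0)) {
        r.invalid = true;
        return r;
    }

    /* A C cast from double to an integer type discards the fraction,
       which is rounding toward zero. */
    r.value = (int32_t)a;

    /* The conversion is inexact when a fractional part was discarded. */
    r.inexact = ((double)r.value != a);
    return r;
}
```

For example, 2.75 converts to 2 and -2.75 converts to -2, both marked inexact, whereas 3.0 converts to 3 exactly.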
<H3>8.3. Conversions Among Floating-Point Types</H3> <H3>8.3. Conversions Among Floating-Point Types</H3>
@ -789,18 +790,20 @@ result are different formats.
There are four different styles of signature for these functions, depending on There are four different styles of signature for these functions, depending on
whether the input and the output floating-point values are passed by value or whether the input and the output floating-point values are passed by value or
via pointers: via pointers:
<BLOCKQUOTE>
<PRE> <PRE>
float32_t f64_to_f32( float64_t <I>a</I> ); float32_t f64_to_f32( float64_t <I>a</I> );
</PRE> </PRE>
<PRE> <PRE>
float32_t f128M_to_f32( const float128_t *<I>aPtr</I> ); float32_t f128M_to_f32( const float128_t *<I>aPtr</I> );
</PRE> </PRE>
<PRE> <PRE>
void f32_to_f128M( float32_t <I>a</I>, float128_t *<I>destPtr</I> ); void f32_to_f128M( float32_t <I>a</I>, float128_t *<I>destPtr</I> );
</PRE> </PRE>
<PRE> <PRE>
void extF80M_to_f128M( const extFloat80_t *<I>aPtr</I>, float128_t *<I>destPtr</I> ); void extF80M_to_f128M( const extFloat80_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
</PRE> </PRE>
</BLOCKQUOTE>
</P> </P>
<P> <P>
@ -823,22 +826,22 @@ Each floating-point operation takes two operands, except for <CODE>sqrt</CODE>
(square root) which takes only one. (square root) which takes only one.
The operands and result are all of the same floating-point format. The operands and result are all of the same floating-point format.
Signatures for these functions take the following forms: Signatures for these functions take the following forms:
<BLOCKQUOTE>
<PRE> <PRE>
float64_t f64_add( float64_t <I>a</I>, float64_t <I>b</I> ); float64_t f64_add( float64_t <I>a</I>, float64_t <I>b</I> );
</PRE> </PRE>
<PRE> <PRE>
void void
f128M_add( f128M_add(
const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> ); const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
</PRE> </PRE>
</P>
<P>
<PRE> <PRE>
float64_t f64_sqrt( float64_t <I>a</I> ); float64_t f64_sqrt( float64_t <I>a</I> );
</PRE> </PRE>
<PRE> <PRE>
void f128M_sqrt( const float128_t *<I>aPtr</I>, float128_t *<I>destPtr</I> ); void f128M_sqrt( const float128_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
</PRE> </PRE>
</BLOCKQUOTE>
When floating-point values are passed indirectly through pointers, arguments When floating-point values are passed indirectly through pointers, arguments
<CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to the input <CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to the input
operands, and the last argument, <CODE><I>destPtr</I></CODE>, points to the operands, and the last argument, <CODE><I>destPtr</I></CODE>, points to the
@ -850,7 +853,7 @@ Rounding of the <NOBR>80-bit</NOBR> double-extended-precision
(<CODE>extFloat80_t</CODE>) functions is affected by variable (<CODE>extFloat80_t</CODE>) functions is affected by variable
<CODE>extF80_roundingPrecision</CODE>, as explained earlier in <CODE>extF80_roundingPrecision</CODE>, as explained earlier in
<NOBR>section 6.3</NOBR>, <NOBR>section 6.3</NOBR>,
<I>Rounding Precision for <NOBR>80-Bit</NOBR> Extended Format</I>. <I>Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</I>.
</P> </P>
<H3>8.5. Fused Multiply-Add Functions</H3> <H3>8.5. Fused Multiply-Add Functions</H3>
@ -873,11 +876,12 @@ No fused multiply-add function is currently provided for the
<P> <P>
Depending on whether floating-point values are passed by value or via pointers, Depending on whether floating-point values are passed by value or via pointers,
the fused multiply-add functions have signatures of these forms: the fused multiply-add functions have signatures of these forms:
<BLOCKQUOTE>
<PRE> <PRE>
float64_t f64_mulAdd( float64_t <I>a</I>, float64_t <I>b</I>, float64_t <I>c</I> ); float64_t f64_mulAdd( float64_t <I>a</I>, float64_t <I>b</I>, float64_t <I>c</I> );
</PRE> </PRE>
<PRE> <PRE>
void void
f128M_mulAdd( f128M_mulAdd(
const float128_t *<I>aPtr</I>, const float128_t *<I>aPtr</I>,
const float128_t *<I>bPtr</I>, const float128_t *<I>bPtr</I>,
@ -885,6 +889,7 @@ the fused multiply-add functions have signatures of these forms:
float128_t *<I>destPtr</I> float128_t *<I>destPtr</I>
); );
</PRE> </PRE>
</BLOCKQUOTE>
The functions compute The functions compute
<NOBR>(<CODE><I>a</I></CODE> &times; <CODE><I>b</I></CODE>) <NOBR>(<CODE><I>a</I></CODE> &times; <CODE><I>b</I></CODE>)
+ <CODE><I>c</I></CODE></NOBR> + <CODE><I>c</I></CODE></NOBR>
@ -915,14 +920,16 @@ Each remainder operation takes two floating-point operands of the same format
and returns a result in the same format. and returns a result in the same format.
Depending on whether floating-point values are passed by value or via pointers, Depending on whether floating-point values are passed by value or via pointers,
the remainder functions have signatures of these forms: the remainder functions have signatures of these forms:
<BLOCKQUOTE>
<PRE> <PRE>
float64_t f64_rem( float64_t <I>a</I>, float64_t <I>b</I> ); float64_t f64_rem( float64_t <I>a</I>, float64_t <I>b</I> );
</PRE> </PRE>
<PRE> <PRE>
void void
f128M_rem( f128M_rem(
const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> ); const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
</PRE> </PRE>
</BLOCKQUOTE>
When floating-point values are passed indirectly through pointers, arguments When floating-point values are passed indirectly through pointers, arguments
<CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to operands <CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to operands
<CODE><I>a</I></CODE> and <CODE><I>b</I></CODE> respectively, and <CODE><I>a</I></CODE> and <CODE><I>b</I></CODE> respectively, and
@ -938,8 +945,8 @@ where <I>n</I> is the integer closest to
If <NOBR><CODE><I>a</I></CODE> &divide; <CODE><I>b</I></CODE></NOBR> is exactly If <NOBR><CODE><I>a</I></CODE> &divide; <CODE><I>b</I></CODE></NOBR> is exactly
halfway between two integers, <I>n</I> is the <EM>even</EM> integer closest to halfway between two integers, <I>n</I> is the <EM>even</EM> integer closest to
<NOBR><CODE><I>a</I></CODE> &divide; <CODE><I>b</I></CODE></NOBR>. <NOBR><CODE><I>a</I></CODE> &divide; <CODE><I>b</I></CODE></NOBR>.
The IEEE Standard's remainder operation is always exact and so requires no The IEEE Standard&rsquo;s remainder operation is always exact and so requires
rounding. no rounding.
</P> </P>
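The tie-to-even choice of n is easy to get wrong, so a small integer analogue may help. The sketch below is not SoftFloat code: the hypothetical function `demo_rem_nearest_even` applies the same rule (n is the integer nearest to a ÷ b, with exact ties going to the even integer) to integer operands, assuming `b` is nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical integer analogue of the IEEE remainder: returns a - n*b,
   where n is the integer nearest to a/b and ties go to the even n. */
int64_t demo_rem_nearest_even(int32_t a, int32_t b)
{
    assert(b != 0);  /* remainder by zero is not handled in this sketch */

    int64_t q = (int64_t)a / b;   /* quotient truncated toward zero */
    int64_t r = (int64_t)a % b;   /* a - q*b, so |r| < |b| */
    int64_t abs_r = r < 0 ? -r : r;
    int64_t abs_b = b < 0 ? -(int64_t)b : (int64_t)b;

    /* Move n one step away from zero when r is more than half of |b|,
       or exactly half and the truncated quotient is odd. */
    bool past_half = 2 * abs_r > abs_b;
    bool tie_on_odd = (2 * abs_r == abs_b) && (q % 2 != 0);
    if (past_half || tie_on_odd) {
        bool quotient_nonnegative = (a < 0) == (b < 0);
        r -= quotient_nonnegative ? (int64_t)b : -(int64_t)b;
    }
    return r;
}
```

With a = 5 and b = 2 the quotient 2.5 is a tie, n is the even integer 2, and the result is 1. With a = 7 and b = 2 the quotient 3.5 is also a tie, n is 4, and the result is -1, which shows that this remainder can be negative even when both operands are positive.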
<P> <P>
@ -968,11 +975,12 @@ and the resulting integer value is returned in the same floating-point format.
<P> <P>
The signatures of the round-to-integer functions are similar to those for The signatures of the round-to-integer functions are similar to those for
conversions to an integer type: conversions to an integer type:
<BLOCKQUOTE>
<PRE> <PRE>
float64_t f64_roundToInt( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> ); float64_t f64_roundToInt( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
</PRE> </PRE>
<PRE> <PRE>
void void
f128M_roundToInt( f128M_roundToInt(
const float128_t *<I>aPtr</I>, const float128_t *<I>aPtr</I>,
uint_fast8_t <I>roundingMode</I>, uint_fast8_t <I>roundingMode</I>,
@ -980,6 +988,7 @@ conversions to an integer type:
float128_t *<I>destPtr</I> float128_t *<I>destPtr</I>
); );
</PRE> </PRE>
</BLOCKQUOTE>
The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode to The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode to
apply. apply.
The variable that usually indicates rounding mode, The variable that usually indicates rounding mode,
@ -1005,17 +1014,19 @@ provided:
<CODE>&lt;<I>float</I>&gt;_lt</CODE> <CODE>&lt;<I>float</I>&gt;_lt</CODE>
</BLOCKQUOTE> </BLOCKQUOTE>
Each comparison takes two operands of the same type and returns a Boolean. Each comparison takes two operands of the same type and returns a Boolean.
The abbreviation <CODE>eq</CODE> stands for ``equal'' (=); The abbreviation <CODE>eq</CODE> stands for &ldquo;equal&rdquo; (=);
<CODE>le</CODE> stands for ``less than or equal'' (&le;); <CODE>le</CODE> stands for &ldquo;less than or equal&rdquo; (&le;);
and <CODE>lt</CODE> stands for ``less than'' (&lt;). and <CODE>lt</CODE> stands for &ldquo;less than&rdquo; (&lt;).
Depending on whether the floating-point operands are passed by value or via Depending on whether the floating-point operands are passed by value or via
pointers, the comparison functions have signatures of these forms: pointers, the comparison functions have signatures of these forms:
<BLOCKQUOTE>
<PRE> <PRE>
bool f64_eq( float64_t <I>a</I>, float64_t <I>b</I> ); bool f64_eq( float64_t <I>a</I>, float64_t <I>b</I> );
</PRE> </PRE>
<PRE> <PRE>
bool f128M_eq( const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I> ); bool f128M_eq( const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I> );
</PRE> </PRE>
</BLOCKQUOTE>
</P> </P>
<P> <P>
@ -1058,21 +1069,25 @@ provided with these names:
The functions take one floating-point operand and return a Boolean indicating The functions take one floating-point operand and return a Boolean indicating
whether the operand is a signaling NaN. whether the operand is a signaling NaN.
Accordingly, the functions have the forms Accordingly, the functions have the forms
<BLOCKQUOTE>
<PRE> <PRE>
bool f64_isSignalingNaN( float64_t <I>a</I> ); bool f64_isSignalingNaN( float64_t <I>a</I> );
</PRE> </PRE>
<PRE> <PRE>
bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> ); bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> );
</PRE> </PRE>
</BLOCKQUOTE>
</P> </P>
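For intuition about what such a test involves, the sketch below classifies a raw 32-bit single-precision encoding. It assumes the common convention in which a NaN is signaling when the most significant fraction bit (bit 22) is 0; the actual rule can differ between platforms, and the function name `f32bits_isSignalingNaN` is hypothetical rather than a SoftFloat name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a SoftFloat function): tests a raw 32-bit
   single-precision encoding, assuming bit 22 set means "quiet". */
bool f32bits_isSignalingNaN(uint32_t a)
{
    uint32_t exp = (a >> 23) & 0xFF;               /* 8-bit exponent field */
    uint32_t frac = a & UINT32_C(0x007FFFFF);      /* 23-bit fraction field */
    bool quietBit = (a & UINT32_C(0x00400000)) != 0;

    /* A NaN has an all-ones exponent and a nonzero fraction; under the
       assumed convention it is signaling when the quiet bit is clear. */
    return exp == 0xFF && frac != 0 && !quietBit;
}
```

Under this assumption, 0x7FA00000 is a signaling NaN, while 0x7FC00000 (a quiet NaN) and 0x7F800000 (infinity) are not.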
<H3>8.10. Raise-Exception Function</H3> <H3>8.10. Raise-Exception Function</H3>
<P> <P>
SoftFloat provides a single function for raising floating-point exceptions: SoftFloat provides a single function for raising floating-point exceptions:
<BLOCKQUOTE>
<PRE> <PRE>
void softfloat_raise( uint_fast8_t <I>exceptions</I> ); void softfloat_raise( uint_fast8_t <I>exceptions</I> );
</PRE> </PRE>
</BLOCKQUOTE>
The <CODE><I>exceptions</I></CODE> argument is a mask indicating the set of The <CODE><I>exceptions</I></CODE> argument is a mask indicating the set of
exceptions to raise. exceptions to raise.
(See earlier section 7, <I>Exceptions and Exception Flags</I>.) (See earlier section 7, <I>Exceptions and Exception Flags</I>.)
@ -1084,6 +1099,11 @@ function may cause a trap or abort appropriate for the current system.
<H2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></H2> <H2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></H2>
<P>
Apart from the change in the legal use license, there are numerous technical
differences between <NOBR>Release 3</NOBR> of SoftFloat and earlier releases.
</P>
<H3>9.1. Name Changes</H3> <H3>9.1. Name Changes</H3>
<P> <P>
@ -1214,17 +1234,17 @@ Lastly, there are a few other changes to function names:
<TR> <TR>
<TD><CODE>_round_to_zero</CODE></TD> <TD><CODE>_round_to_zero</CODE></TD>
<TD><CODE>_r_minMag</CODE></TD> <TD><CODE>_r_minMag</CODE></TD>
<TD>conversions from floating-point to integer, section 8.2</TD> <TD>conversions from floating-point to integer (<NOBR>section 8.2</NOBR>)</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>round_to_int</CODE></TD> <TD><CODE>round_to_int</CODE></TD>
<TD><CODE>roundToInt</CODE></TD> <TD><CODE>roundToInt</CODE></TD>
<TD>round-to-integer functions, section 8.7</TD> <TD>round-to-integer functions (<NOBR>section 8.7</NOBR>)</TD>
</TR> </TR>
<TR> <TR>
<TD><CODE>is_signaling_nan&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD> <TD><CODE>is_signaling_nan&nbsp;&nbsp;&nbsp;&nbsp;</CODE></TD>
<TD><CODE>isSignalingNaN</CODE></TD> <TD><CODE>isSignalingNaN</CODE></TD>
<TD>signaling NaN test functions, section 8.9</TD> <TD>signaling NaN test functions (<NOBR>section 8.9</NOBR>)</TD>
</TR> </TR>
</TABLE> </TABLE>
</BLOCKQUOTE> </BLOCKQUOTE>
@ -1296,7 +1316,7 @@ argument <CODE><I>exact</I></CODE>.
<P> <P>
With <NOBR>Release 3</NOBR>, a port of SoftFloat can now define any of the With <NOBR>Release 3</NOBR>, a port of SoftFloat can now define any of the
floating-point types <CODE>float32_t</CODE>, <CODE>float64_t</CODE>, floating-point types <CODE>float32_t</CODE>, <CODE>float64_t</CODE>,
<CODE>extFloat80_t</CODE>, and <CODE>float128_t</CODE> as aliases for C's <CODE>extFloat80_t</CODE>, and <CODE>float128_t</CODE> as aliases for C&rsquo;s
standard floating-point types <CODE>float</CODE>, <CODE>double</CODE>, and standard floating-point types <CODE>float</CODE>, <CODE>double</CODE>, and
<CODE>long</CODE> <CODE>double</CODE>, using either <CODE>#define</CODE> or <CODE>long</CODE> <CODE>double</CODE>, using either <CODE>#define</CODE> or
<CODE>typedef</CODE>. <CODE>typedef</CODE>.
@ -1304,9 +1324,9 @@ This potential convenience was not supported under <NOBR>Release 2</NOBR>.
</P> </P>
<P> <P>
(Note, however, that there may be a performance cost to defining SoftFloat's (Note, however, that there may be a performance cost to defining
floating-point types this way, depending on the platform and the applications SoftFloat&rsquo;s floating-point types this way, depending on the platform and
using SoftFloat. the applications using SoftFloat.
Ports of SoftFloat may choose to forgo the convenience in favor of better Ports of SoftFloat may choose to forgo the convenience in favor of better
speed.) speed.)
</P> </P>
@ -1338,7 +1358,7 @@ Fused multiply-add functions have been added for the non-extended formats,
<P> <P>
<NOBR>Release 3</NOBR> of SoftFloat is written to conform better to the ISO C <NOBR>Release 3</NOBR> of SoftFloat is written to conform better to the ISO C
Standard's rules for portability. Standard&rsquo;s rules for portability.
For example, older releases of SoftFloat employed type conversions in ways For example, older releases of SoftFloat employed type conversions in ways
that, while commonly practiced, are not fully defined by the C Standard. that, while commonly practiced, are not fully defined by the C Standard.
Such problematic type conversions have generally been replaced by the use of Such problematic type conversions have generally been replaced by the use of
@ -1387,8 +1407,8 @@ Some loss of speed has been observed due to this change.
The following improvements are anticipated for future releases of SoftFloat: The following improvements are anticipated for future releases of SoftFloat:
<UL> <UL>
<LI> <LI>
support for the common <NOBR>16-bit</NOBR> ``half-precision'' floating-point support for the common <NOBR>16-bit</NOBR> &ldquo;half-precision&rdquo;
format; floating-point format;
<LI> <LI>
more functions from the 2008 version of the IEEE Floating-Point Standard; more functions from the 2008 version of the IEEE Floating-Point Standard;
<LI> <LI>