Finalized documentation for SoftFloat Release 3.
parent 437d9b9fb2
commit 7276b0022e
|
@ -2,7 +2,7 @@
|
||||||
License for Berkeley SoftFloat Release 3
|
License for Berkeley SoftFloat Release 3
|
||||||
|
|
||||||
John R. Hauser
|
John R. Hauser
|
||||||
2014 ________
|
2014 Dec 17
|
||||||
|
|
||||||
The following applies to the whole of SoftFloat Release 3 as well as to each
|
The following applies to the whole of SoftFloat Release 3 as well as to each
|
||||||
source file individually.
|
source file individually.
|
||||||
|
|
|
@ -11,7 +11,7 @@
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
John R. Hauser<BR>
|
John R. Hauser<BR>
|
||||||
2014 ________<BR>
|
2014 Dec 17<BR>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
|
@ -19,7 +19,8 @@ Berkeley SoftFloat is a software implementation of binary floating-point that
|
||||||
conforms to the IEEE Standard for Floating-Point Arithmetic.
|
conforms to the IEEE Standard for Floating-Point Arithmetic.
|
||||||
SoftFloat is distributed in the form of C source code.
|
SoftFloat is distributed in the form of C source code.
|
||||||
Building the SoftFloat sources generates a library file (typically
|
Building the SoftFloat sources generates a library file (typically
|
||||||
<CODE>"softfloat.a"</CODE>) containing the floating-point subroutines.
|
<CODE>softfloat.a</CODE> or <CODE>libsoftfloat.a</CODE>) containing the
|
||||||
|
floating-point subroutines.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
|
|
|
@ -2,13 +2,13 @@
|
||||||
Package Overview for Berkeley SoftFloat Release 3
|
Package Overview for Berkeley SoftFloat Release 3
|
||||||
|
|
||||||
John R. Hauser
|
John R. Hauser
|
||||||
2014 ________
|
2014 Dec 17
|
||||||
|
|
||||||
Berkeley SoftFloat is a software implementation of binary floating-point
|
Berkeley SoftFloat is a software implementation of binary floating-point
|
||||||
that conforms to the IEEE Standard for Floating-Point Arithmetic. SoftFloat
|
that conforms to the IEEE Standard for Floating-Point Arithmetic. SoftFloat
|
||||||
is distributed in the form of C source code. Building the SoftFloat sources
|
is distributed in the form of C source code. Building the SoftFloat sources
|
||||||
generates a library file (typically "softfloat.a") containing the floating-
|
generates a library file (typically "softfloat.a" or "libsoftfloat.a")
|
||||||
point subroutines.
|
containing the floating-point subroutines.
|
||||||
|
|
||||||
The SoftFloat package is documented in the following files in the "doc"
|
The SoftFloat package is documented in the following files in the "doc"
|
||||||
subdirectory:
|
subdirectory:
|
||||||
|
|
|
@ -11,11 +11,7 @@
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
John R. Hauser<BR>
|
John R. Hauser<BR>
|
||||||
2014 _____<BR>
|
2014 Dec 17<BR>
|
||||||
</P>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
*** CONTENT DONE.
|
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
|
@ -24,7 +20,8 @@ John R. Hauser<BR>
|
||||||
<UL>
|
<UL>
|
||||||
|
|
||||||
<LI>
|
<LI>
|
||||||
Complete rewrite, funded by the University of California, Berkeley.
|
Complete rewrite, funded by the University of California, Berkeley, and
|
||||||
|
consequently having a different use license than earlier releases.
|
||||||
Major changes included renaming most types and functions, upgrading some
|
Major changes included renaming most types and functions, upgrading some
|
||||||
algorithms, restructuring the source files, and making SoftFloat into a true
|
algorithms, restructuring the source files, and making SoftFloat into a true
|
||||||
library.
|
library.
|
||||||
|
@ -54,8 +51,9 @@ TestFloat package).
|
||||||
<UL>
|
<UL>
|
||||||
|
|
||||||
<LI>
|
<LI>
|
||||||
Further improved wording for the legal restrictions on using SoftFloat releases
|
Further improved the wording for the legal restrictions on using SoftFloat
|
||||||
<NOBR>through 2c</NOBR>.
|
releases <NOBR>through 2c</NOBR> (not applicable to <NOBR>Release 3</NOBR> or
|
||||||
|
later).
|
||||||
|
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
|
@ -134,7 +132,8 @@ tininess is detected before or after rounding.
|
||||||
<UL>
|
<UL>
|
||||||
|
|
||||||
<LI>
|
<LI>
|
||||||
Original release.
|
Original release, based on work done for the International Computer Science
|
||||||
|
Institute (ICSI) in Berkeley, California.
|
||||||
|
|
||||||
</UL>
|
</UL>
|
||||||
|
|
||||||
|
|
|
@ -11,36 +11,39 @@
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
John R. Hauser<BR>
|
John R. Hauser<BR>
|
||||||
2014 _____<BR>
|
2014 Dec 17<BR>
|
||||||
</P>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
*** REPLACE QUOTATION MARKS.
|
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
<H2>Contents</H2>
|
<H2>Contents</H2>
|
||||||
|
|
||||||
<P>
|
<BLOCKQUOTE>
|
||||||
*** CHECK.<BR>
|
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
|
||||||
*** FIX FORMATTING.
|
<COL WIDTH=25>
|
||||||
</P>
|
<COL WIDTH=*>
|
||||||
|
<TR><TD COLSPAN=2>1. Introduction</TD></TR>
|
||||||
<PRE>
|
<TR><TD COLSPAN=2>2. Limitations</TD></TR>
|
||||||
Introduction
|
<TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
|
||||||
Limitations
|
<TR><TD COLSPAN=2>4. SoftFloat Package Directory Structure</TD></TR>
|
||||||
Acknowledgments and License
|
<TR><TD COLSPAN=2>5. Issues for Porting SoftFloat to a New Target</TD></TR>
|
||||||
SoftFloat Package Directory Structure
|
<TR>
|
||||||
Issues for Porting SoftFloat to a New Target
|
<TD></TD>
|
||||||
Standard Headers <stdbool.h> and <stdint.h>
|
<TD>5.1. Standard Headers <CODE><stdbool.h></CODE> and
|
||||||
Specializing Floating-Point Behavior
|
<CODE><stdint.h></CODE></TD>
|
||||||
Macros for Build Options
|
</TR>
|
||||||
Adapting a Template Target Directory
|
<TR><TD></TD><TD>5.2. Specializing Floating-Point Behavior</TD></TR>
|
||||||
Target-Specific Optimization of Primitive Functions
|
<TR><TD></TD><TD>5.3. Macros for Build Options</TD></TR>
|
||||||
Testing SoftFloat
|
<TR><TD></TD><TD>5.4. Adapting a Template Target Directory</TD></TR>
|
||||||
Providing SoftFloat as a Common Library for Applications
|
<TR>
|
||||||
Contact Information
|
<TD></TD><TD>5.5. Target-Specific Optimization of Primitive Functions</TD>
|
||||||
</PRE>
|
</TR>
|
||||||
|
<TR><TD COLSPAN=2>6. Testing SoftFloat</TD></TR>
|
||||||
|
<TR>
|
||||||
|
<TD COLSPAN=2>7. Providing SoftFloat as a Common Library for Applications</TD>
|
||||||
|
</TR>
|
||||||
|
<TR><TD COLSPAN=2>8. Contact Information</TD></TR>
|
||||||
|
</TABLE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
|
|
||||||
|
|
||||||
<H2>1. Introduction</H2>
|
<H2>1. Introduction</H2>
|
||||||
|
@ -98,7 +101,7 @@ strictly required.
|
||||||
integer types.
|
integer types.
|
||||||
If these headers are not supplied with the C compiler, minimal substitutes must
|
If these headers are not supplied with the C compiler, minimal substitutes must
|
||||||
be provided.
|
be provided.
|
||||||
SoftFloat's dependence on these headers is detailed later in
|
SoftFloat’s dependence on these headers is detailed later in
|
||||||
<NOBR>section 5.1</NOBR>, <I>Standard Headers <stdbool.h> and
|
<NOBR>section 5.1</NOBR>, <I>Standard Headers <stdbool.h> and
|
||||||
<stdint.h></I>.
|
<stdint.h></I>.
|
||||||
</P>
|
</P>
|
||||||
|
@ -110,15 +113,20 @@ SoftFloat's dependence on these headers is detailed later in
|
||||||
The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
|
The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
|
||||||
<NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
|
<NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
|
||||||
supplanting earlier releases.
|
supplanting earlier releases.
|
||||||
This project was done in the employ of the University of California, Berkeley,
|
This project (<NOBR>Release 3</NOBR> only, not earlier releases) was done in
|
||||||
within the Department of Electrical Engineering and Computer Sciences, first
|
the employ of the University of California, Berkeley, within the Department of
|
||||||
for the Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab.
|
Electrical Engineering and Computer Sciences, first for the Parallel Computing
|
||||||
|
Laboratory (Par Lab) and then for the ASPIRE Lab.
|
||||||
The work was officially overseen by Prof. Krste Asanovic, with funding provided
|
The work was officially overseen by Prof. Krste Asanovic, with funding provided
|
||||||
by these sources:
|
by these sources:
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<TABLE>
|
<TABLE>
|
||||||
|
<COL WIDTH=*>
|
||||||
|
<COL WIDTH=10>
|
||||||
|
<COL WIDTH=*>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><NOBR>Par Lab:</NOBR></TD>
|
<TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
|
||||||
|
<TD></TD>
|
||||||
<TD>
|
<TD>
|
||||||
Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
|
Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
|
||||||
(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
|
(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
|
||||||
|
@ -126,7 +134,8 @@ NVIDIA, Oracle, and Samsung.
|
||||||
</TD>
|
</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><NOBR>ASPIRE Lab:</NOBR></TD>
|
<TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
|
||||||
|
<TD></TD>
|
||||||
<TD>
|
<TD>
|
||||||
DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
|
DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
|
||||||
ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
|
ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
|
||||||
|
@ -185,27 +194,40 @@ Because SoftFloat is targeted to multiple platforms, its source code is
|
||||||
slightly scattered between target-specific and target-independent directories
|
slightly scattered between target-specific and target-independent directories
|
||||||
and files.
|
and files.
|
||||||
The supplied directory structure is as follows:
|
The supplied directory structure is as follows:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
doc
|
doc
|
||||||
source
|
source
|
||||||
include
|
include
|
||||||
8086
|
8086
|
||||||
build
|
8086-SSE
|
||||||
|
build
|
||||||
template-FAST_INT64
|
template-FAST_INT64
|
||||||
template-not-FAST_INT64
|
template-not-FAST_INT64
|
||||||
Linux-386-GCC
|
Linux-386-GCC
|
||||||
|
Linux-386-SSE2-GCC
|
||||||
Linux-x86_64-GCC
|
Linux-x86_64-GCC
|
||||||
Win32-MinGW
|
Win32-MinGW
|
||||||
|
Win32-SSE2-MinGW
|
||||||
Win64-MinGW-w64
|
Win64-MinGW-w64
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
The majority of the SoftFloat sources are provided in the <CODE>source</CODE>
|
The majority of the SoftFloat sources are provided in the <CODE>source</CODE>
|
||||||
directory.
|
directory.
|
||||||
The <CODE>include</CODE> subdirectory of <CODE>source</CODE> contains several
|
The <CODE>include</CODE> subdirectory of <CODE>source</CODE> contains several
|
||||||
header files (unsurprisingly), while the <CODE>8086</CODE> subdirectory
|
header files (unsurprisingly), while the <CODE>8086</CODE> and
|
||||||
contains source files that specialize the floating-point behavior to match the
|
<NOBR><CODE>8086-SSE</CODE></NOBR> subdirectories contain source files that
|
||||||
Intel x86 line of processors.
|
specialize the floating-point behavior to match the Intel x86 line of
|
||||||
|
processors.
|
||||||
|
The files in directory <CODE>8086</CODE> give floating-point behavior
|
||||||
|
consistent solely with Intel’s older, 8087-derived floating-point, while
|
||||||
|
those in <NOBR><CODE>8086-SSE</CODE></NOBR> update the behavior of the
|
||||||
|
non-extended formats (<CODE>float32_t</CODE>, <CODE>float64_t</CODE>, and
|
||||||
|
<CODE>float128_t</CODE>) to mirror Intel’s more recent Streaming SIMD
|
||||||
|
Extensions (SSE) and other compatible extensions.
|
||||||
If other specializations are attempted, these would be expected to be other
|
If other specializations are attempted, these would be expected to be other
|
||||||
subdirectories of <CODE>source</CODE> alongside <CODE>8086</CODE>.
|
subdirectories of <CODE>source</CODE> alongside <CODE>8086</CODE> and
|
||||||
|
<NOBR><CODE>8086-SSE</CODE></NOBR>.
|
||||||
Specialization is covered later, in <NOBR>section 5.2</NOBR>, <I>Specializing
|
Specialization is covered later, in <NOBR>section 5.2</NOBR>, <I>Specializing
|
||||||
Floating-Point Behavior</I>.
|
Floating-Point Behavior</I>.
|
||||||
</P>
|
</P>
|
||||||
|
@ -213,9 +235,9 @@ Floating-Point Behavior</I>.
|
||||||
<P>
|
<P>
|
||||||
The <CODE>build</CODE> directory is intended to contain a subdirectory for each
|
The <CODE>build</CODE> directory is intended to contain a subdirectory for each
|
||||||
target platform for which a build of the SoftFloat library may be created.
|
target platform for which a build of the SoftFloat library may be created.
|
||||||
For each build target, the target's subdirectory is where all derived object
|
For each build target, the target’s subdirectory is where all derived
|
||||||
files and the completed SoftFloat library (typically <CODE>softfloat.a</CODE>
|
object files and the completed SoftFloat library (typically
|
||||||
or <CODE>libsoftfloat.a</CODE>) are created.
|
<CODE>softfloat.a</CODE> or <CODE>libsoftfloat.a</CODE>) are created.
|
||||||
The two <CODE>template</CODE> subdirectories are not actual build targets but
|
The two <CODE>template</CODE> subdirectories are not actual build targets but
|
||||||
contain sample files for creating new target directories.
|
contain sample files for creating new target directories.
|
||||||
(The meaning of <CODE>FAST_INT64</CODE> will be explained later.)
|
(The meaning of <CODE>FAST_INT64</CODE> will be explained later.)
|
||||||
|
@ -227,18 +249,21 @@ are intended to follow a naming system of
|
||||||
<NOBR><CODE><execution-environment>-<compiler></CODE></NOBR>.
|
<NOBR><CODE><execution-environment>-<compiler></CODE></NOBR>.
|
||||||
For the example targets,
|
For the example targets,
|
||||||
<NOBR><CODE><execution-environment></CODE></NOBR> is
|
<NOBR><CODE><execution-environment></CODE></NOBR> is
|
||||||
<NOBR><CODE>Linux-386</CODE></NOBR>, <NOBR><CODE>Linux-x86_64</CODE></NOBR>,
|
<NOBR><CODE>Linux-386</CODE></NOBR>, <NOBR><CODE>Linux-386-SSE2</CODE></NOBR>,
|
||||||
<CODE>Win32</CODE>, or <CODE>Win64</CODE>, and
|
<NOBR><CODE>Linux-x86_64</CODE></NOBR>, <CODE>Win32</CODE>,
|
||||||
|
<NOBR><CODE>Win32-SSE2</CODE></NOBR>, or <CODE>Win64</CODE>, and
|
||||||
<NOBR><CODE><compiler></CODE></NOBR> is <CODE>GCC</CODE>,
|
<NOBR><CODE><compiler></CODE></NOBR> is <CODE>GCC</CODE>,
|
||||||
<CODE>MinGW</CODE>, or <NOBR><CODE>MinGW-w64</CODE></NOBR>.
|
<CODE>MinGW</CODE>, or <NOBR><CODE>MinGW-w64</CODE></NOBR>.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
As supplied, each target directory contains two files:
|
As supplied, each target directory contains two files:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
Makefile
|
Makefile
|
||||||
platform.h
|
platform.h
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
The provided <CODE>Makefile</CODE> is written for GNU <CODE>make</CODE>.
|
The provided <CODE>Makefile</CODE> is written for GNU <CODE>make</CODE>.
|
||||||
A build of SoftFloat for the specific target is begun by executing the
|
A build of SoftFloat for the specific target is begun by executing the
|
||||||
<CODE>make</CODE> command with the target directory as the current directory.
|
<CODE>make</CODE> command with the target directory as the current directory.
|
||||||
|
@ -258,10 +283,10 @@ desirable to include in header <CODE>platform.h</CODE> (directly or via
|
||||||
<CODE>#include</CODE>) declarations for numerous target-specific optimizations.
|
<CODE>#include</CODE>) declarations for numerous target-specific optimizations.
|
||||||
Such possibilities are discussed in the next section, <I>Issues for Porting
|
Such possibilities are discussed in the next section, <I>Issues for Porting
|
||||||
SoftFloat to a New Target</I>.
|
SoftFloat to a New Target</I>.
|
||||||
If the target's compiler or library has bugs or other shortcomings, workarounds
|
If the target’s compiler or library has bugs or other shortcomings,
|
||||||
for these issues may also be possible with target-specific declarations in
|
workarounds for these issues may also be possible with target-specific
|
||||||
<CODE>platform.h</CODE>, avoiding the need to modify the main SoftFloat
|
declarations in <CODE>platform.h</CODE>, avoiding the need to modify the main
|
||||||
sources.
|
SoftFloat sources.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
|
@ -280,30 +305,34 @@ For older or nonstandard compilers, substitutes for
|
||||||
<CODE><stdbool.h></CODE> and <CODE><stdint.h></CODE> may need to be
|
<CODE><stdbool.h></CODE> and <CODE><stdint.h></CODE> may need to be
|
||||||
created.
|
created.
|
||||||
SoftFloat depends on these names from <CODE><stdbool.h></CODE>:
|
SoftFloat depends on these names from <CODE><stdbool.h></CODE>:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
bool
|
bool
|
||||||
true
|
true
|
||||||
false
|
false
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
and on these names from <CODE><stdint.h></CODE>:
|
and on these names from <CODE><stdint.h></CODE>:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
uint16_t
|
uint16_t
|
||||||
uint32_t
|
uint32_t
|
||||||
uint64_t
|
uint64_t
|
||||||
int32_t
|
int32_t
|
||||||
int64_t
|
int64_t
|
||||||
UINT64_C
|
UINT64_C
|
||||||
INT64_C
|
INT64_C
|
||||||
uint_least8_t
|
uint_least8_t
|
||||||
uint_fast8_t
|
uint_fast8_t
|
||||||
uint_fast16_t
|
uint_fast16_t
|
||||||
uint_fast32_t
|
uint_fast32_t
|
||||||
uint_fast64_t
|
uint_fast64_t
|
||||||
int_fast8_t
|
int_fast8_t
|
||||||
int_fast16_t
|
int_fast16_t
|
||||||
int_fast32_t
|
int_fast32_t
|
||||||
int_fast64_t
|
int_fast64_t
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
|
@ -312,12 +341,12 @@ and on these names from <CODE><stdint.h></CODE>:
|
||||||
<P>
|
<P>
|
||||||
The IEEE Floating-Point Standard allows for some flexibility in a conforming
|
The IEEE Floating-Point Standard allows for some flexibility in a conforming
|
||||||
implementation, particularly concerning NaNs.
|
implementation, particularly concerning NaNs.
|
||||||
The SoftFloat <CODE>source</CODE> directory is supplied with one or more
|
The SoftFloat <CODE>source</CODE> directory is supplied with some
|
||||||
<I>specialization</I> subdirectories containing possible definitions for this
|
<I>specialization</I> subdirectories containing possible definitions for this
|
||||||
implementation-specific behavior.
|
implementation-specific behavior.
|
||||||
For example, the <CODE>8086</CODE> subdirectory has source files that
|
For example, the <CODE>8086</CODE> and <NOBR><CODE>8086-SSE</CODE></NOBR>
|
||||||
specialize SoftFloat's behavior to match that of Intel's x86 line of
|
subdirectories have source files that specialize SoftFloat’s behavior to
|
||||||
processors.
|
match that of Intel’s x86 line of processors.
|
||||||
The files in a specialization subdirectory must determine:
|
The files in a specialization subdirectory must determine:
|
||||||
<UL>
|
<UL>
|
||||||
<LI>
|
<LI>
|
||||||
|
@ -343,8 +372,9 @@ source files are needed to complete the specialization.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
A new build target may use an existing specialization, such as the one provided
|
A new build target may use an existing specialization, such as the ones
|
||||||
by the <CODE>8086</CODE> subdirectory.
|
provided by the <CODE>8086</CODE> and <NOBR><CODE>8086-SSE</CODE></NOBR>
|
||||||
|
subdirectories.
|
||||||
If a build target needs a new specialization, different from any existing ones,
|
If a build target needs a new specialization, different from any existing ones,
|
||||||
it is recommended that a new specialization subdirectory be created in the
|
it is recommended that a new specialization subdirectory be created in the
|
||||||
<CODE>source</CODE> directory for this purpose.
|
<CODE>source</CODE> directory for this purpose.
|
||||||
|
@ -367,18 +397,18 @@ Must be defined for little-endian machines; must not be defined for big-endian
|
||||||
machines.
|
machines.
|
||||||
<DT><CODE>SOFTFLOAT_FAST_INT64</CODE>
|
<DT><CODE>SOFTFLOAT_FAST_INT64</CODE>
|
||||||
<DD>
|
<DD>
|
||||||
Can be defined to indicate that the build target's implementation of
|
Can be defined to indicate that the build target’s implementation of
|
||||||
<CODE>64-bit</CODE> arithmetic is efficient.
|
<NOBR>64-bit</NOBR> arithmetic is efficient.
|
||||||
For newer <CODE>64-bit</CODE> processors, this macro should usually be defined.
|
For newer <NOBR>64-bit</NOBR> processors, this macro should usually be defined.
|
||||||
For very small microprocessors whose buses and registers are <CODE>8-bit</CODE>
|
For very small microprocessors whose buses and registers are <NOBR>8-bit</NOBR>
|
||||||
or <CODE>16-bit</CODE> in size, this macro should usually not be defined.
|
or <NOBR>16-bit</NOBR> in size, this macro should usually not be defined.
|
||||||
Whether this macro should be defined for a <CODE>32-bit</CODE> processor may
|
Whether this macro should be defined for a <NOBR>32-bit</NOBR> processor may
|
||||||
depend on the target machine and the applications that will use SoftFloat.
|
depend on the target machine and the applications that will use SoftFloat.
|
||||||
<DT><CODE>SOFTFLOAT_FAST_DIV64TO32</CODE>
|
<DT><CODE>SOFTFLOAT_FAST_DIV64TO32</CODE>
|
||||||
<DD>
|
<DD>
|
||||||
Can be defined to indicate that the target's division operator
|
Can be defined to indicate that the target’s division operator
|
||||||
<NOBR>in C</NOBR> (written as <CODE>/</CODE>) is reasonably efficient for
|
<NOBR>in C</NOBR> (written as <CODE>/</CODE>) is reasonably efficient for
|
||||||
dividing a <CODE>64-bit</CODE> unsigned integer by a <CODE>32-bit</CODE>
|
dividing a <NOBR>64-bit</NOBR> unsigned integer by a <NOBR>32-bit</NOBR>
|
||||||
unsigned integer.
|
unsigned integer.
|
||||||
Setting this macro may affect the performance of division, remainder, and
|
Setting this macro may affect the performance of division, remainder, and
|
||||||
square root operations.
|
square root operations.
|
||||||
|
@ -411,16 +441,16 @@ defined to <CODE>extern</CODE> <CODE>inline</CODE>.
|
||||||
Following the usual custom <NOBR>for C</NOBR>, for the first three macros (all
|
Following the usual custom <NOBR>for C</NOBR>, for the first three macros (all
|
||||||
except <CODE>INLINE_LEVEL</CODE> and <CODE>INLINE</CODE>), the content of any
|
except <CODE>INLINE_LEVEL</CODE> and <CODE>INLINE</CODE>), the content of any
|
||||||
definition is irrelevant;
|
definition is irrelevant;
|
||||||
what matters is a macro's effect on <CODE>#ifdef</CODE> directives.
|
what matters is a macro’s effect on <CODE>#ifdef</CODE> directives.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
It is recommended that any definitions of macros <CODE>LITTLEENDIAN</CODE> and
|
It is recommended that any definitions of macros <CODE>LITTLEENDIAN</CODE> and
|
||||||
<CODE>INLINE</CODE> be made in a build target's <CODE>platform.h</CODE> header
|
<CODE>INLINE</CODE> be made in a build target’s <CODE>platform.h</CODE>
|
||||||
file, because these macros are expected to be determined inflexibly by the
|
header file, because these macros are expected to be determined inflexibly by
|
||||||
target machine and compiler.
|
the target machine and compiler.
|
||||||
The other three macros control optimization and might be better located in the
|
The other three macros control optimization and might be better located in the
|
||||||
target's Makefile (or its equivalent).
|
target’s Makefile (or its equivalent).
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
|
@ -433,8 +463,9 @@ Two different templates exist because different functions are needed in the
|
||||||
SoftFloat library depending on whether macro <CODE>SOFTFLOAT_FAST_INT64</CODE>
|
SoftFloat library depending on whether macro <CODE>SOFTFLOAT_FAST_INT64</CODE>
|
||||||
is defined.
|
is defined.
|
||||||
If macro <CODE>SOFTFLOAT_FAST_INT64</CODE> will be defined,
|
If macro <CODE>SOFTFLOAT_FAST_INT64</CODE> will be defined,
|
||||||
<CODE>template-FAST_INT64</CODE> is the template to use;
|
<NOBR><CODE>template-FAST_INT64</CODE></NOBR> is the template to use;
|
||||||
otherwise, <CODE>template-not-FAST_INT64</CODE> is the appropriate template.
|
otherwise, <NOBR><CODE>template-not-FAST_INT64</CODE></NOBR> is the appropriate
|
||||||
|
template.
|
||||||
A new target directory can be created by copying the correct template directory
|
A new target directory can be created by copying the correct template directory
|
||||||
and editing the files inside.
|
and editing the files inside.
|
||||||
To avoid confusion, it would be wise to refrain from editing the files within a
|
To avoid confusion, it would be wise to refrain from editing the files within a
|
||||||
|
@ -447,12 +478,12 @@ template directory directly.
|
||||||
<P>
|
<P>
|
||||||
Header file <CODE>primitives.h</CODE> (in directory
|
Header file <CODE>primitives.h</CODE> (in directory
|
||||||
<CODE>source/include</CODE>) declares macros and functions for numerous
|
<CODE>source/include</CODE>) declares macros and functions for numerous
|
||||||
underlying arithmetic operations upon which many of SoftFloat's floating-point
|
underlying arithmetic operations upon which many of SoftFloat’s
|
||||||
functions are ultimately built.
|
floating-point functions are ultimately built.
|
||||||
The SoftFloat sources include implementations of all of these functions/macros,
|
The SoftFloat sources include implementations of all of these functions/macros,
|
||||||
written as standard C code, so a complete and correct SoftFloat library can be
|
written as standard C code, so a complete and correct SoftFloat library can be
|
||||||
built using only the supplied code for all functions.
|
built using only the supplied code for all functions.
|
||||||
However, for many targets, SoftFloat's performance can be improved by
|
However, for many targets, SoftFloat’s performance can be improved by
|
||||||
substituting target-specific implementations of some of the functions/macros
|
substituting target-specific implementations of some of the functions/macros
|
||||||
declared in <CODE>primitives.h</CODE>.
|
declared in <CODE>primitives.h</CODE>.
|
||||||
</P>
|
</P>
|
||||||
|
@ -461,7 +492,7 @@ declared in <CODE>primitives.h</CODE>.
|
||||||
For example, <CODE>primitives.h</CODE> declares a function called
|
For example, <CODE>primitives.h</CODE> declares a function called
|
||||||
<CODE>softfloat_countLeadingZeros32</CODE> that takes an unsigned
|
<CODE>softfloat_countLeadingZeros32</CODE> that takes an unsigned
|
||||||
<NOBR>32-bit</NOBR> integer as an argument and returns the maximal number of
|
<NOBR>32-bit</NOBR> integer as an argument and returns the maximal number of
|
||||||
the integer's most-significant bits that are all zeros.
|
the integer’s most-significant bits that are all zeros.
|
||||||
While the SoftFloat sources include an implementation of this function written
|
While the SoftFloat sources include an implementation of this function written
|
||||||
in <NOBR>standard C</NOBR>, many processors can perform this same function
|
in <NOBR>standard C</NOBR>, many processors can perform this same function
|
||||||
directly in only one or two machine instructions.
|
directly in only one or two machine instructions.
|
||||||
|
@ -473,19 +504,22 @@ package.
|
||||||
<P>
|
<P>
|
||||||
A build target can replace the supplied version of any function or macro of
|
A build target can replace the supplied version of any function or macro of
|
||||||
<CODE>primitives.h</CODE> by defining a macro with the same name in the
|
<CODE>primitives.h</CODE> by defining a macro with the same name in the
|
||||||
target's <CODE>platform.h</CODE> header file.
|
target’s <CODE>platform.h</CODE> header file.
|
||||||
For this purpose, it may be helpful for <CODE>platform.h</CODE> to
|
For this purpose, it may be helpful for <CODE>platform.h</CODE> to
|
||||||
<CODE>#include</CODE> header file <CODE>primitiveTypes.h</CODE>, which defines
|
<CODE>#include</CODE> header file <CODE>primitiveTypes.h</CODE>, which defines
|
||||||
types used for arguments and results of functions declared in
|
types used for arguments and results of functions declared in
|
||||||
<CODE>primitives.h</CODE>.
|
<CODE>primitives.h</CODE>.
|
||||||
When a desired replacement implementation is a function, not a macro, it is
|
When a desired replacement implementation is a function, not a macro, it is
|
||||||
sufficient for <CODE>platform.h</CODE> to include the line
|
sufficient for <CODE>platform.h</CODE> to include the line
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
#define <function-name> <function-name>
|
#define <function-name> <function-name>
|
||||||
</PRE>
|
</PRE>
|
||||||
where <CODE><function-name></CODE> is the name of the function.
|
</BLOCKQUOTE>
|
||||||
This technically defines <CODE><function-name></CODE> as a macro, but one
|
where <NOBR><CODE><function-name></CODE></NOBR> is the name of the
|
||||||
that resolves to the same name, which may then be a function.
|
function.
|
||||||
|
This technically defines <NOBR><CODE><function-name></CODE></NOBR> as a
|
||||||
|
macro, but one that resolves to the same name, which may then be a function.
|
||||||
(A preprocessor conforming to the C Standard must limit recursive macro
|
(A preprocessor conforming to the C Standard must limit recursive macro
|
||||||
expansion from being applied more than once.)
|
expansion from being applied more than once.)
|
||||||
</P>
|
</P>
|
||||||
|
@ -500,46 +534,34 @@ This program is part of the Berkeley TestFloat package available at the Web
|
||||||
page
|
page
|
||||||
<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></A>.
|
<A HREF="http://www.jhauser.us/arithmetic/TestFloat.html"><CODE>http://www.jhauser.us/arithmetic/TestFloat.html</CODE></A>.
|
||||||
The TestFloat package also has a program called <CODE>timesoftfloat</CODE> that
|
The TestFloat package also has a program called <CODE>timesoftfloat</CODE> that
|
||||||
measures the speed of SoftFloat's floating-point functions.
|
measures the speed of SoftFloat’s floating-point functions.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
<H2>7. Providing SoftFloat as a Common Library for Applications</H2>
|
<H2>7. Providing SoftFloat as a Common Library for Applications</H2>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
Supplied <CODE>softfloat.h</CODE> depends on <CODE>softfloat_types.h</CODE>.
|
Header file <CODE>softfloat.h</CODE> defines the SoftFloat interface as seen by
|
||||||
|
clients.
|
||||||
|
If the SoftFloat library will be made a common library for programs on a
|
||||||
|
particular system, the supplied <CODE>softfloat.h</CODE> has a couple of
|
||||||
|
deficiencies for this purpose:
|
||||||
|
<UL>
|
||||||
|
<LI>
|
||||||
|
As supplied, <CODE>softfloat.h</CODE> depends on another header,
|
||||||
|
<CODE>softfloat_types.h</CODE>, that is not intended for public use but which
|
||||||
|
must also be visible to the programmer’s compiler.
|
||||||
|
<LI>
|
||||||
|
More troubling, at the time <CODE>softfloat.h</CODE> is included in a C
|
||||||
|
source file, macro <CODE>SOFTFLOAT_FAST_INT64</CODE> must be defined, or not
|
||||||
|
defined, consistent with whether this macro was defined when the SoftFloat
|
||||||
|
library was built.
|
||||||
|
</UL>
|
||||||
|
In the situation that new programs may regularly <CODE>#include</CODE> header
|
||||||
|
file <CODE>softfloat.h</CODE>, it is recommended that a custom, self-contained
|
||||||
|
version of this header file be created that eliminates these issues.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<PRE>
|
|
||||||
The target-specific `softfloat.h' header file defines the SoftFloat
|
|
||||||
interface as seen by clients.
|
|
||||||
|
|
||||||
Unlike the actual function definitions in `softfloat.c', the declarations
|
|
||||||
in `softfloat.h' do not use any of the types defined by the `processors'
|
|
||||||
header file. This is done so that clients will not have to include the
|
|
||||||
`processors' header file in order to use SoftFloat. Nevertheless, the
|
|
||||||
target-specific declarations in `softfloat.h' must match what `softfloat.c'
|
|
||||||
expects. For example, if `int32' is defined as `int' in the `processors'
|
|
||||||
header file, then in `softfloat.h' the output of `float32_to_int32' should
|
|
||||||
be stated as `int', although in `softfloat.c' it is given in target-
|
|
||||||
independent form as `int32'.
|
|
||||||
</PRE>
|
|
||||||
|
|
||||||
<PRE>
|
|
||||||
*** HERE
|
|
||||||
|
|
||||||
Porting and/or compiling SoftFloat involves the following steps:
|
|
||||||
|
|
||||||
4. In the target-specific subdirectory, edit the files `softfloat-specialize'
|
|
||||||
and `softfloat.h' to define the desired exception handling functions
|
|
||||||
and mode control values. In the `softfloat.h' header file, ensure also
|
|
||||||
that all declarations give the proper target-specific type (such as
|
|
||||||
`int' or `long') corresponding to the target-independent type used in
|
|
||||||
`softfloat.c' (such as `int32'). None of the type names declared in the
|
|
||||||
`processors' header file should appear in `softfloat.h'.
|
|
||||||
|
|
||||||
</PRE>
|
|
||||||
|
|
||||||
|
|
||||||
<H2>8. Contact Information</H2>
|
<H2>8. Contact Information</H2>
|
||||||
|
|
||||||
|
|
|
@ -11,66 +11,59 @@
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
John R. Hauser<BR>
|
John R. Hauser<BR>
|
||||||
2014 ______<BR>
|
2014 Dec 17<BR>
|
||||||
</P>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
*** CONTENT DONE.
|
|
||||||
</P>
|
|
||||||
|
|
||||||
<P>
|
|
||||||
*** REPLACE QUOTATION MARKS.
|
|
||||||
<BR>
|
|
||||||
*** REPLACE APOSTROPHES.
|
|
||||||
<BR>
|
|
||||||
*** REPLACE EM DASH.
|
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
<H2>Contents</H2>
|
<H2>Contents</H2>
|
||||||
|
|
||||||
<P>
|
<BLOCKQUOTE>
|
||||||
*** CHECK.<BR>
|
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
|
||||||
*** FIX FORMATTING.
|
<COL WIDTH=25>
|
||||||
</P>
|
<COL WIDTH=*>
|
||||||
|
<TR><TD COLSPAN=2>1. Introduction</TD></TR>
|
||||||
<PRE>
|
<TR><TD COLSPAN=2>2. Limitations</TD></TR>
|
||||||
Introduction
|
<TR><TD COLSPAN=2>3. Acknowledgments and License</TD></TR>
|
||||||
Limitations
|
<TR><TD COLSPAN=2>4. Types and Functions</TD></TR>
|
||||||
Acknowledgments and License
|
<TR><TD></TD><TD>4.1. Boolean and Integer Types</TD></TR>
|
||||||
Types and Functions
|
<TR><TD></TD><TD>4.2. Floating-Point Types</TD></TR>
|
||||||
Boolean and Integer Types
|
<TR><TD></TD><TD>4.3. Supported Floating-Point Functions</TD></TR>
|
||||||
Floating-Point Types
|
<TR>
|
||||||
Supported Floating-Point Functions
|
<TD></TD>
|
||||||
Non-canonical Representations in extFloat80_t
|
<TD>4.4. Non-canonical Representations in <CODE>extFloat80_t</CODE></TD>
|
||||||
Conventions for Passing Arguments and Results
|
</TR>
|
||||||
Reserved Names
|
<TR><TD></TD><TD>4.5. Conventions for Passing Arguments and Results</TD></TR>
|
||||||
Mode Variables
|
<TR><TD COLSPAN=2>5. Reserved Names</TD></TR>
|
||||||
Rounding Mode
|
<TR><TD COLSPAN=2>6. Mode Variables</TD></TR>
|
||||||
Underflow Detection
|
<TR><TD></TD><TD>6.1. Rounding Mode</TD></TR>
|
||||||
Rounding Precision for 80-Bit Extended Format
|
<TR><TD></TD><TD>6.2. Underflow Detection</TD></TR>
|
||||||
Exceptions and Exception Flags
|
<TR>
|
||||||
Function Details
|
<TD></TD>
|
||||||
Conversions from Integer to Floating-Point
|
<TD>6.3. Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</TD>
|
||||||
Conversions from Floating-Point to Integer
|
</TR>
|
||||||
Conversions Among Floating-Point Types
|
<TR><TD COLSPAN=2>7. Exceptions and Exception Flags</TD></TR>
|
||||||
Basic Arithmetic Functions
|
<TR><TD COLSPAN=2>8. Function Details</TD></TR>
|
||||||
Fused Multiply-Add Functions
|
<TR><TD></TD><TD>8.1. Conversions from Integer to Floating-Point</TD></TR>
|
||||||
Remainder Functions
|
<TR><TD></TD><TD>8.2. Conversions from Floating-Point to Integer</TD></TR>
|
||||||
Round-to-Integer Functions
|
<TR><TD></TD><TD>8.3. Conversions Among Floating-Point Types</TD></TR>
|
||||||
Comparison Functions
|
<TR><TD></TD><TD>8.4. Basic Arithmetic Functions</TD></TR>
|
||||||
Signaling NaN Test Functions
|
<TR><TD></TD><TD>8.5. Fused Multiply-Add Functions</TD></TR>
|
||||||
Raise-Exception Function
|
<TR><TD></TD><TD>8.6. Remainder Functions</TD></TR>
|
||||||
Changes from SoftFloat Release 2
|
<TR><TD></TD><TD>8.7. Round-to-Integer Functions</TD></TR>
|
||||||
Name Changes
|
<TR><TD></TD><TD>8.8. Comparison Functions</TD></TR>
|
||||||
Changes to Function Arguments
|
<TR><TD></TD><TD>8.9. Signaling NaN Test Functions</TD></TR>
|
||||||
Added Capabilities
|
<TR><TD></TD><TD>8.10. Raise-Exception Function</TD></TR>
|
||||||
Better Compatibility with the C Language
|
<TR><TD COLSPAN=2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></TD></TR>
|
||||||
New Organization as a Library
|
<TR><TD></TD><TD>9.1. Name Changes</TD></TR>
|
||||||
Optimization Gains (and Losses)
|
<TR><TD></TD><TD>9.2. Changes to Function Arguments</TD></TR>
|
||||||
Future Directions
|
<TR><TD></TD><TD>9.3. Added Capabilities</TD></TR>
|
||||||
Contact Information
|
<TR><TD></TD><TD>9.4. Better Compatibility with the C Language</TD></TR>
|
||||||
</PRE>
|
<TR><TD></TD><TD>9.5. New Organization as a Library</TD></TR>
|
||||||
|
<TR><TD></TD><TD>9.6. Optimization Gains (and Losses)</TD></TR>
|
||||||
|
<TR><TD COLSPAN=2>10. Future Directions</TD></TR>
|
||||||
|
<TR><TD COLSPAN=2>11. Contact Information</TD></TR>
|
||||||
|
</TABLE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
|
|
||||||
|
|
||||||
<H2>1. Introduction</H2>
|
<H2>1. Introduction</H2>
|
||||||
|
@ -156,15 +149,20 @@ SoftFloat <NOBR>Release 3</NOBR>.
|
||||||
The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
|
The SoftFloat package was written by me, <NOBR>John R.</NOBR> Hauser.
|
||||||
<NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
|
<NOBR>Release 3</NOBR> of SoftFloat is a completely new implementation
|
||||||
supplanting earlier releases.
|
supplanting earlier releases.
|
||||||
This project was done in the employ of the University of California, Berkeley,
|
This project (<NOBR>Release 3</NOBR> only, not earlier releases) was done in
|
||||||
within the Department of Electrical Engineering and Computer Sciences, first
|
the employ of the University of California, Berkeley, within the Department of
|
||||||
for the Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab.
|
Electrical Engineering and Computer Sciences, first for the Parallel Computing
|
||||||
|
Laboratory (Par Lab) and then for the ASPIRE Lab.
|
||||||
The work was officially overseen by Prof. Krste Asanovic, with funding provided
|
The work was officially overseen by Prof. Krste Asanovic, with funding provided
|
||||||
by these sources:
|
by these sources:
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<TABLE>
|
<TABLE>
|
||||||
|
<COL WIDTH=*>
|
||||||
|
<COL WIDTH=10>
|
||||||
|
<COL WIDTH=*>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><NOBR>Par Lab:</NOBR></TD>
|
<TD VALIGN=TOP><NOBR>Par Lab:</NOBR></TD>
|
||||||
|
<TD></TD>
|
||||||
<TD>
|
<TD>
|
||||||
Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
|
Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
|
||||||
(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
|
(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
|
||||||
|
@ -172,7 +170,8 @@ NVIDIA, Oracle, and Samsung.
|
||||||
</TD>
|
</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><NOBR>ASPIRE Lab:</NOBR></TD>
|
<TD VALIGN=TOP><NOBR>ASPIRE Lab:</NOBR></TD>
|
||||||
|
<TD></TD>
|
||||||
<TD>
|
<TD>
|
||||||
DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
|
DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
|
||||||
ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
|
ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
|
||||||
|
@ -245,16 +244,18 @@ for these headers.
|
||||||
Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
|
Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
|
||||||
<CODE>&lt;stdbool.h&gt;</CODE> and on these type names from
|
<CODE>&lt;stdbool.h&gt;</CODE> and on these type names from
|
||||||
<CODE>&lt;stdint.h&gt;</CODE>:
|
<CODE>&lt;stdint.h&gt;</CODE>:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
uint16_t
|
uint16_t
|
||||||
uint32_t
|
uint32_t
|
||||||
uint64_t
|
uint64_t
|
||||||
int32_t
|
int32_t
|
||||||
int64_t
|
int64_t
|
||||||
uint_fast8_t
|
uint_fast8_t
|
||||||
uint_fast32_t
|
uint_fast32_t
|
||||||
uint_fast64_t
|
uint_fast64_t
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
|
||||||
|
@ -263,26 +264,22 @@ Header <CODE>softfloat.h</CODE> depends only on the name <CODE>bool</CODE> from
|
||||||
<P>
|
<P>
|
||||||
The <CODE>softfloat.h</CODE> header defines four floating-point types:
|
The <CODE>softfloat.h</CODE> header defines four floating-point types:
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<TABLE>
|
<TABLE CELLSPACING=0 CELLPADDING=0>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>float32_t</CODE></TD>
|
<TD><CODE>float32_t</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD><NOBR>32-bit</NOBR> single-precision binary format</TD>
|
<TD><NOBR>32-bit</NOBR> single-precision binary format</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>float64_t</CODE></TD>
|
<TD><CODE>float64_t</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD><NOBR>64-bit</NOBR> double-precision binary format</TD>
|
<TD><NOBR>64-bit</NOBR> double-precision binary format</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>extFloat80_t</CODE></TD>
|
<TD><CODE>extFloat80_t </CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD><NOBR>80-bit</NOBR> double-extended-precision binary format (old Intel or
|
<TD><NOBR>80-bit</NOBR> double-extended-precision binary format (old Intel or
|
||||||
Motorola format)</TD>
|
Motorola format)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>float128_t</CODE></TD>
|
<TD><CODE>float128_t</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD><NOBR>128-bit</NOBR> quadruple-precision binary format</TD>
|
<TD><NOBR>128-bit</NOBR> quadruple-precision binary format</TD>
|
||||||
</TR>
|
</TR>
|
||||||
</TABLE>
|
</TABLE>
|
||||||
|
@ -304,10 +301,10 @@ Header file <CODE>softfloat.h</CODE> also defines a structure,
|
||||||
This structure is the same size as type <CODE>extFloat80_t</CODE> and contains
|
This structure is the same size as type <CODE>extFloat80_t</CODE> and contains
|
||||||
at least these two fields (not necessarily in this order):
|
at least these two fields (not necessarily in this order):
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<TABLE>
|
<PRE>
|
||||||
<TR><TD><CODE>uint16_t signExp;</CODE></TD></TR>
|
uint16_t signExp;
|
||||||
<TR><TD><CODE>uint64_t signif;</CODE></TD></TR>
|
uint64_t signif;
|
||||||
</TABLE>
|
</PRE>
|
||||||
</BLOCKQUOTE>
|
</BLOCKQUOTE>
|
||||||
Field <CODE>signExp</CODE> contains the sign and exponent of the floating-point
|
Field <CODE>signExp</CODE> contains the sign and exponent of the floating-point
|
||||||
value, with the sign in the most significant bit (<NOBR>bit 15</NOBR>) and the
|
value, with the sign in the most significant bit (<NOBR>bit 15</NOBR>) and the
|
||||||
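A minimal sketch of how the <CODE>signExp</CODE> field described above could be unpacked. It assumes that the 15 bits below the sign bit hold the encoded exponent, which the truncated sentence does not show here. The struct <CODE>ext80_fields</CODE> and both helper functions are hypothetical names used for illustration only; they are not part of <CODE>softfloat.h</CODE>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in holding the two documented fields.
   The real structure may order or pad its fields differently. */
struct ext80_fields {
    uint16_t signExp;
    uint64_t signif;
};

/* The sign is the most significant bit (bit 15) of signExp. */
bool ext80_sign(const struct ext80_fields *p)
{
    return (p->signExp >> 15) != 0;
}

/* Assumption: the remaining 15 bits are the encoded exponent. */
uint16_t ext80_exp(const struct ext80_fields *p)
{
    return (uint16_t) (p->signExp & 0x7FFF);
}
```

Under that assumption, a <CODE>signExp</CODE> of <CODE>0xBFFF</CODE> would give a set sign bit and an encoded exponent of <CODE>0x3FFF</CODE>.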
|
@ -339,8 +336,8 @@ operation defined by the IEEE Standard;
|
||||||
for each format, the floating-point remainder operation defined by the IEEE
|
for each format, the floating-point remainder operation defined by the IEEE
|
||||||
Standard;
|
Standard;
|
||||||
<LI>
|
<LI>
|
||||||
for each format, a ``round to integer'' operation that rounds to the nearest
|
for each format, a “round to integer” operation that rounds to the
|
||||||
integer value in the same format; and
|
nearest integer value in the same format; and
|
||||||
<LI>
|
<LI>
|
||||||
comparisons between two values in the same floating-point format.
|
comparisons between two values in the same floating-point format.
|
||||||
</UL>
|
</UL>
|
||||||
|
@ -357,12 +354,12 @@ not supported in SoftFloat <NOBR>Release 3</NOBR>:
|
||||||
conversions between floating-point formats and decimal or hexadecimal character
|
conversions between floating-point formats and decimal or hexadecimal character
|
||||||
sequences;
|
sequences;
|
||||||
<LI>
|
<LI>
|
||||||
all ``quiet-computation'' operations (<B>copy</B>, <B>negate</B>, <B>abs</B>,
|
all “quiet-computation” operations (<B>copy</B>, <B>negate</B>,
|
||||||
and <B>copySign</B>, which all involve only simple copying and/or manipulation
|
<B>abs</B>, and <B>copySign</B>, which all involve only simple copying and/or
|
||||||
of the floating-point sign bit); and
|
manipulation of the floating-point sign bit); and
|
||||||
<LI>
|
<LI>
|
||||||
all ``non-computational'' operations other than <B>isSignaling</B> (which is
|
all “non-computational” operations other than <B>isSignaling</B>
|
||||||
supported).
|
(which is supported).
|
||||||
</UL>
|
</UL>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
|
@ -393,9 +390,9 @@ leading significand bit must <NOBR>be 1</NOBR> unless it is required to
|
||||||
For <NOBR>Release 3</NOBR> of SoftFloat, functions are not guaranteed to
|
For <NOBR>Release 3</NOBR> of SoftFloat, functions are not guaranteed to
|
||||||
operate as expected when inputs of type <CODE>extFloat80_t</CODE> are
|
operate as expected when inputs of type <CODE>extFloat80_t</CODE> are
|
||||||
non-canonical.
|
non-canonical.
|
||||||
Assuming all of a function's <CODE>extFloat80_t</CODE> inputs (if any) are
|
Assuming all of a function’s <CODE>extFloat80_t</CODE> inputs (if any)
|
||||||
canonical, function outputs of type <CODE>extFloat80_t</CODE> will always be
|
are canonical, function outputs of type <CODE>extFloat80_t</CODE> will always
|
||||||
canonical.
|
be canonical.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<H3>4.5. Conventions for Passing Arguments and Results</H3>
|
<H3>4.5. Conventions for Passing Arguments and Results</H3>
|
||||||
|
@ -426,8 +423,8 @@ SoftFloat supplies this function:
|
||||||
The first two arguments point to the values to be added, and the last argument
|
The first two arguments point to the values to be added, and the last argument
|
||||||
points to the location where the sum will be stored.
|
points to the location where the sum will be stored.
|
||||||
The <CODE>M</CODE> in the name <CODE>f128M_add</CODE> is mnemonic for the fact
|
The <CODE>M</CODE> in the name <CODE>f128M_add</CODE> is mnemonic for the fact
|
||||||
that the <NOBR>128-bit</NOBR> inputs and outputs are ``in memory'', pointed to
|
that the <NOBR>128-bit</NOBR> inputs and outputs are “in memory”,
|
||||||
by pointer arguments.
|
pointed to by pointer arguments.
|
||||||
</P>
|
</P>
|
||||||
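The pointer-passing convention described above (two input pointers followed by a destination pointer) can be sketched without the library itself. The type <CODE>demo128</CODE> and the function <CODE>demo128M_add</CODE> below are hypothetical; they perform unsigned integer addition, not floating-point addition, and only mirror the argument order of <CODE>f128M_add</CODE>.

```c
#include <stdint.h>

/* Hypothetical 128-bit container, standing in for a value
   that is passed "in memory" rather than by value. */
struct demo128 {
    uint64_t lo;
    uint64_t hi;
};

/* Same argument order as in the text: aPtr and bPtr point to the
   inputs, and destPtr points to where the sum is stored. */
void demo128M_add(
    const struct demo128 *aPtr,
    const struct demo128 *bPtr,
    struct demo128 *destPtr
)
{
    uint64_t lo = aPtr->lo + bPtr->lo;         /* wraps modulo 2^64 */
    uint64_t carry = (lo < aPtr->lo) ? 1 : 0;  /* carry out of the low half */
    uint64_t hi = aPtr->hi + bPtr->hi + carry;

    /* Both inputs are fully read before the result is written, so in
       this sketch destPtr may alias either input. */
    destPtr->lo = lo;
    destPtr->hi = hi;
}
```

A caller declares the operands and the result as ordinary objects and passes their addresses, for example <CODE>demo128M_add( &amp;a, &amp;b, &amp;sum )</CODE>.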
|
|
||||||
<P>
|
<P>
|
||||||
|
@ -464,10 +461,11 @@ platforms of interest, programmers can use whichever version they prefer.
|
||||||
<P>
|
<P>
|
||||||
In addition to the variables and functions documented here, SoftFloat defines
|
In addition to the variables and functions documented here, SoftFloat defines
|
||||||
some symbol names for its own private use.
|
some symbol names for its own private use.
|
||||||
These private names always begin with the prefix `<CODE>softfloat_</CODE>'.
|
These private names always begin with the prefix
|
||||||
|
‘<CODE>softfloat_</CODE>’.
|
||||||
When a program includes header <CODE>softfloat.h</CODE> or links with the
|
When a program includes header <CODE>softfloat.h</CODE> or links with the
|
||||||
SoftFloat library, all names with prefix `<CODE>softfloat_</CODE>' are reserved
|
SoftFloat library, all names with prefix ‘<CODE>softfloat_</CODE>’
|
||||||
for possible use by SoftFloat.
|
are reserved for possible use by SoftFloat.
|
||||||
Applications that use SoftFloat should not define their own names with this
|
Applications that use SoftFloat should not define their own names with this
|
||||||
prefix, and should reference only such names as are documented.
|
prefix, and should reference only such names as are documented.
|
||||||
</P>
|
</P>
|
||||||
|
@ -477,7 +475,7 @@ prefix, and should reference only such names as are documented.
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
The following variables control rounding mode, underflow detection, and the
|
The following variables control rounding mode, underflow detection, and the
|
||||||
<NOBR>80-bit</NOBR> extended format's rounding precision:
|
<NOBR>80-bit</NOBR> extended format’s rounding precision:
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<CODE>softfloat_roundingMode</CODE><BR>
|
<CODE>softfloat_roundingMode</CODE><BR>
|
||||||
<CODE>softfloat_detectTininess</CODE><BR>
|
<CODE>softfloat_detectTininess</CODE><BR>
|
||||||
|
@ -497,30 +495,25 @@ The rounding mode is selected by the global variable
|
||||||
</BLOCKQUOTE>
|
</BLOCKQUOTE>
|
||||||
This variable may be set to one of the values
|
This variable may be set to one of the values
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<TABLE>
|
<TABLE CELLSPACING=0 CELLPADDING=0>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>softfloat_round_near_even</CODE></TD>
|
<TD><CODE>softfloat_round_near_even</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD>round to nearest, with ties to even</TD>
|
<TD>round to nearest, with ties to even</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>softfloat_round_near_maxMag</CODE></TD>
|
<TD><CODE>softfloat_round_near_maxMag </CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD>round to nearest, with ties to maximum magnitude (away from zero)</TD>
|
<TD>round to nearest, with ties to maximum magnitude (away from zero)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>softfloat_round_minMag</CODE></TD>
|
<TD><CODE>softfloat_round_minMag</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD>round to minimum magnitude (toward zero)</TD>
|
<TD>round to minimum magnitude (toward zero)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>softfloat_round_min</CODE></TD>
|
<TD><CODE>softfloat_round_min</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD>round to minimum (down)</TD>
|
<TD>round to minimum (down)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>softfloat_round_max</CODE></TD>
|
<TD><CODE>softfloat_round_max</CODE></TD>
|
||||||
<TD> </TD>
|
|
||||||
<TD>round to maximum (up)</TD>
|
<TD>round to maximum (up)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
</TABLE>
|
</TABLE>
|
||||||
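As an illustration of how the five rounding modes in the table differ, the following sketch rounds a value given in quarters (<CODE>q</CODE>/4) to an integer. The enumeration <CODE>demo_round</CODE> and the function <CODE>demo_round_quarters</CODE> are hypothetical; the enumerator values are not SoftFloat's constants, and the code does not touch <CODE>softfloat_roundingMode</CODE>.

```c
/* Hypothetical mode names mirroring the table rows; the numeric
   values are arbitrary and not those used by SoftFloat. */
enum demo_round {
    DEMO_NEAR_EVEN,
    DEMO_NEAR_MAXMAG,
    DEMO_MINMAG,
    DEMO_MIN,
    DEMO_MAX
};

/* Rounds the exact value q/4 to an integer under the given mode. */
long demo_round_quarters(long q, enum demo_round mode)
{
    long fl = (q >= 0) ? q / 4 : -((-q + 3) / 4);  /* floor(q/4) */
    long rem = q - 4 * fl;                         /* 0..3 quarters above floor */

    if (rem == 0) return fl;                       /* already an integer */
    switch (mode) {
    case DEMO_MIN:    return fl;                   /* down */
    case DEMO_MAX:    return fl + 1;               /* up */
    case DEMO_MINMAG: return (q >= 0) ? fl : fl + 1;  /* toward zero */
    case DEMO_NEAR_EVEN:
        if (rem != 2) return (rem < 2) ? fl : fl + 1;
        return (fl % 2 == 0) ? fl : fl + 1;        /* tie: pick the even neighbor */
    case DEMO_NEAR_MAXMAG:
        if (rem != 2) return (rem < 2) ? fl : fl + 1;
        return (q >= 0) ? fl + 1 : fl;             /* tie: away from zero */
    }
    return fl;
}
```

For the tie case 2.5 (<CODE>q</CODE> = 10), the near-even, minimum-magnitude, and minimum modes give 2, while near-maximum-magnitude and maximum give 3. For -2.5 (<CODE>q</CODE> = -10), near-even, minimum-magnitude, and maximum give -2, while near-maximum-magnitude and minimum give -3.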
|
@ -550,7 +543,7 @@ Like most systems (and as required by the newer 2008 IEEE Standard), SoftFloat
|
||||||
always detects loss of accuracy for underflow as an inexact result.
|
always detects loss of accuracy for underflow as an inexact result.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<H3>6.3. Rounding Precision for 80-Bit Extended Format</H3>
|
<H3>6.3. Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</H3>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
For <CODE>extFloat80_t</CODE> only, the rounding precision of the basic
|
For <CODE>extFloat80_t</CODE> only, the rounding precision of the basic
|
||||||
|
@ -639,7 +632,7 @@ It does always raise the <I>inexact</I> exception flag as required.
|
||||||
In this section, <CODE>&lt;<I>float</I>&gt;</CODE> appears in function names as
|
In this section, <CODE>&lt;<I>float</I>&gt;</CODE> appears in function names as
|
||||||
a substitute for one of these abbreviations:
|
a substitute for one of these abbreviations:
|
||||||
<BLOCKQUOTE>
|
<BLOCKQUOTE>
|
||||||
<TABLE>
|
<TABLE CELLSPACING=0 CELLPADDING=0>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>f32</CODE></TD>
|
<TD><CODE>f32</CODE></TD>
|
||||||
<TD>indicates <CODE>float32_t</CODE>, passed by value</TD>
|
<TD>indicates <CODE>float32_t</CODE>, passed by value</TD>
|
||||||
|
@ -696,11 +689,14 @@ Each conversion function takes one input of the appropriate type and generates
|
||||||
one output.
|
one output.
|
||||||
The following illustrates the signatures of these functions in cases when the
|
The following illustrates the signatures of these functions in cases when the
|
||||||
floating-point result is passed either by value or via pointers:
|
floating-point result is passed either by value or via pointers:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float64_t i32_to_f64( int32_t <I>a</I> );
|
float64_t i32_to_f64( int32_t <I>a</I> );
|
||||||
|
|
||||||
void i32_to_f128M( int32_t <I>a</I>, float128_t *<I>destPtr</I> );
|
|
||||||
</PRE>
|
</PRE>
|
||||||
|
<PRE>
|
||||||
|
void i32_to_f128M( int32_t <I>a</I>, float128_t *<I>destPtr</I> );
|
||||||
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<H3>8.2. Conversions from Floating-Point to Integer</H3>
|
<H3>8.2. Conversions from Floating-Point to Integer</H3>
|
||||||
|
@ -717,12 +713,15 @@ functions:
|
||||||
</BLOCKQUOTE>
|
</BLOCKQUOTE>
|
||||||
The functions have signatures as follows, depending on whether the
|
The functions have signatures as follows, depending on whether the
|
||||||
floating-point input is passed by value or via pointers:
|
floating-point input is passed by value or via pointers:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
int32_t f64_to_i32( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
|
int32_t f64_to_i32( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
|
||||||
|
</PRE>
|
||||||
int32_t
|
<PRE>
|
||||||
|
int32_t
|
||||||
f128M_to_i32( const float128_t *<I>aPtr</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
|
f128M_to_i32( const float128_t *<I>aPtr</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode for
|
The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode for
|
||||||
the conversion.
|
the conversion.
|
||||||
The variable that usually indicates rounding mode,
|
The variable that usually indicates rounding mode,
|
||||||
|
@ -768,12 +767,14 @@ and convenience:
|
||||||
These functions round only toward zero (to minimum magnitude).
|
These functions round only toward zero (to minimum magnitude).
|
||||||
The signatures for these functions are the same as above without the redundant
|
The signatures for these functions are the same as above without the redundant
|
||||||
<CODE><I>roundingMode</I></CODE> argument:
|
<CODE><I>roundingMode</I></CODE> argument:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
int32_t f64_to_i32_r_minMag( float64_t <I>a</I>, bool <I>exact</I> );
|
int32_t f64_to_i32_r_minMag( float64_t <I>a</I>, bool <I>exact</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
int32_t f128M_to_i32_r_minMag( const float128_t *<I>aPtr</I>, bool <I>exact</I> );
|
int32_t f128M_to_i32_r_minMag( const float128_t *<I>aPtr</I>, bool <I>exact</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<H3>8.3. Conversions Among Floating-Point Types</H3>
|
<H3>8.3. Conversions Among Floating-Point Types</H3>
|
||||||
|
@ -789,18 +790,20 @@ result are different formats.
|
||||||
There are four different styles of signature for these functions, depending on
|
There are four different styles of signature for these functions, depending on
|
||||||
whether the input and the output floating-point values are passed by value or
|
whether the input and the output floating-point values are passed by value or
|
||||||
via pointers:
|
via pointers:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float32_t f64_to_f32( float64_t <I>a</I> );
|
float32_t f64_to_f32( float64_t <I>a</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float32_t f128M_to_f32( const float128_t *<I>aPtr</I> );
|
float32_t f128M_to_f32( const float128_t *<I>aPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void f32_to_f128M( float32_t <I>a</I>, float128_t *<I>destPtr</I> );
|
void f32_to_f128M( float32_t <I>a</I>, float128_t *<I>destPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void extF80M_to_f128M( const extFloat80_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
|
void extF80M_to_f128M( const extFloat80_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
|
@ -823,22 +826,22 @@ Each floating-point operation takes two operands, except for <CODE>sqrt</CODE>
|
||||||
(square root) which takes only one.
|
(square root) which takes only one.
|
||||||
The operands and result are all of the same floating-point format.
|
The operands and result are all of the same floating-point format.
|
||||||
Signatures for these functions take the following forms:
|
Signatures for these functions take the following forms:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float64_t f64_add( float64_t <I>a</I>, float64_t <I>b</I> );
|
float64_t f64_add( float64_t <I>a</I>, float64_t <I>b</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void
|
void
|
||||||
f128M_add(
|
f128M_add(
|
||||||
const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
|
const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
</P>
|
|
||||||
<P>
|
|
||||||
<PRE>
|
<PRE>
|
||||||
float64_t f64_sqrt( float64_t <I>a</I> );
|
float64_t f64_sqrt( float64_t <I>a</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void f128M_sqrt( const float128_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
|
void f128M_sqrt( const float128_t *<I>aPtr</I>, float128_t *<I>destPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
When floating-point values are passed indirectly through pointers, arguments
|
When floating-point values are passed indirectly through pointers, arguments
|
||||||
<CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to the input
|
<CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to the input
|
||||||
operands, and the last argument, <CODE><I>destPtr</I></CODE>, points to the
|
operands, and the last argument, <CODE><I>destPtr</I></CODE>, points to the
|
||||||
|
@ -850,7 +853,7 @@ Rounding of the <NOBR>80-bit</NOBR> double-extended-precision
|
||||||
(<CODE>extFloat80_t</CODE>) functions is affected by variable
|
(<CODE>extFloat80_t</CODE>) functions is affected by variable
|
||||||
<CODE>extF80_roundingPrecision</CODE>, as explained earlier in
|
<CODE>extF80_roundingPrecision</CODE>, as explained earlier in
|
||||||
<NOBR>section 6.3</NOBR>,
|
<NOBR>section 6.3</NOBR>,
|
||||||
<I>Rounding Precision for <NOBR>80-Bit</NOBR> Extended Format</I>.
|
<I>Rounding Precision for the <NOBR>80-Bit</NOBR> Extended Format</I>.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<H3>8.5. Fused Multiply-Add Functions</H3>
|
<H3>8.5. Fused Multiply-Add Functions</H3>
|
||||||
|
@ -873,11 +876,12 @@ No fused multiple-add function is currently provided for the
|
||||||
<P>
|
<P>
|
||||||
Depending on whether floating-point values are passed by value or via pointers,
|
Depending on whether floating-point values are passed by value or via pointers,
|
||||||
the fused multiply-add functions have signatures of these forms:
|
the fused multiply-add functions have signatures of these forms:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float64_t f64_mulAdd( float64_t <I>a</I>, float64_t <I>b</I>, float64_t <I>c</I> );
|
float64_t f64_mulAdd( float64_t <I>a</I>, float64_t <I>b</I>, float64_t <I>c</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void
|
void
|
||||||
f128M_mulAdd(
|
f128M_mulAdd(
|
||||||
const float128_t *<I>aPtr</I>,
|
const float128_t *<I>aPtr</I>,
|
||||||
const float128_t *<I>bPtr</I>,
|
const float128_t *<I>bPtr</I>,
|
||||||
|
@ -885,6 +889,7 @@ the fused multiply-add functions have signatures of these forms:
|
||||||
float128_t *<I>destPtr</I>
|
float128_t *<I>destPtr</I>
|
||||||
);
|
);
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
The functions compute
|
The functions compute
|
||||||
<NOBR>(<CODE><I>a</I></CODE> × <CODE><I>b</I></CODE>)
|
<NOBR>(<CODE><I>a</I></CODE> × <CODE><I>b</I></CODE>)
|
||||||
+ <CODE><I>c</I></CODE></NOBR>
|
+ <CODE><I>c</I></CODE></NOBR>
|
||||||
|
@ -915,14 +920,16 @@ Each remainder operation takes two floating-point operands of the same format
|
||||||
and returns a result in the same format.
|
and returns a result in the same format.
|
||||||
Depending on whether floating-point values are passed by value or via pointers,
|
Depending on whether floating-point values are passed by value or via pointers,
|
||||||
the remainder functions have signatures of these forms:
|
the remainder functions have signatures of these forms:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float64_t f64_rem( float64_t <I>a</I>, float64_t <I>b</I> );
|
float64_t f64_rem( float64_t <I>a</I>, float64_t <I>b</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void
|
void
|
||||||
f128M_rem(
|
f128M_rem(
|
||||||
const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
|
const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I>, float128_t *<I>destPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
When floating-point values are passed indirectly through pointers, arguments
|
When floating-point values are passed indirectly through pointers, arguments
|
||||||
<CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to operands
|
<CODE><I>aPtr</I></CODE> and <CODE><I>bPtr</I></CODE> point to operands
|
||||||
<CODE><I>a</I></CODE> and <CODE><I>b</I></CODE> respectively, and
|
<CODE><I>a</I></CODE> and <CODE><I>b</I></CODE> respectively, and
|
||||||
|
@ -938,8 +945,8 @@ where <I>n</I> is the integer closest to
|
||||||
If <NOBR><CODE><I>a</I></CODE> ÷ <CODE><I>b</I></CODE></NOBR> is exactly
|
If <NOBR><CODE><I>a</I></CODE> ÷ <CODE><I>b</I></CODE></NOBR> is exactly
|
||||||
halfway between two integers, <I>n</I> is the <EM>even</EM> integer closest to
|
halfway between two integers, <I>n</I> is the <EM>even</EM> integer closest to
|
||||||
<NOBR><CODE><I>a</I></CODE> ÷ <CODE><I>b</I></CODE></NOBR>.
|
<NOBR><CODE><I>a</I></CODE> ÷ <CODE><I>b</I></CODE></NOBR>.
|
||||||
The IEEE Standard's remainder operation is always exact and so requires no
|
The IEEE Standard’s remainder operation is always exact and so requires
|
||||||
rounding.
|
no rounding.
|
||||||
</P>
|
</P>
|
||||||
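The rule for choosing <I>n</I> (nearest integer to the quotient, with exact ties going to the even integer) can be sketched on plain integers, where every step is exact. The function <CODE>ieee_style_rem</CODE> is a hypothetical illustration and not a SoftFloat function; it assumes <CODE>b</CODE> is nonzero and that the operands are small enough that the intermediate products do not overflow.

```c
#include <stdint.h>

/* Returns a - n*b, where n is the integer nearest to a/b and an
   exact tie selects the even n. Assumes b != 0. */
int64_t ieee_style_rem(int64_t a, int64_t b)
{
    int64_t n = a / b;           /* quotient truncated toward zero */
    int64_t r = a - n * b;       /* remainder with the sign of a */
    int64_t twice_abs_r = 2 * (r < 0 ? -r : r);
    int64_t abs_b = (b < 0) ? -b : b;

    /* Step n one further from zero when the true quotient is past the
       halfway point, or exactly halfway with the truncated n odd. */
    if (twice_abs_r > abs_b || (twice_abs_r == abs_b && n % 2 != 0)) {
        n += ((a < 0) != (b < 0)) ? -1 : 1;
        r = a - n * b;
    }
    return r;
}
```

For example, 7 divided by 2 is exactly halfway between 3 and 4, so <I>n</I> is 4 and the result is -1, whereas 5 divided by 2 lies halfway between 2 and 3, so <I>n</I> is 2 and the result is 1. Unlike the C <CODE>%</CODE> operator, the result can have the opposite sign from <CODE>a</CODE>.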
|
|
||||||
<P>
|
<P>
|
||||||
|
@ -968,11 +975,12 @@ and the resulting integer value is returned in the same floating-point format.
|
||||||
<P>
|
<P>
|
||||||
The signatures of the round-to-integer functions are similar to those for
|
The signatures of the round-to-integer functions are similar to those for
|
||||||
conversions to an integer type:
|
conversions to an integer type:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
float64_t f64_roundToInt( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
|
float64_t f64_roundToInt( float64_t <I>a</I>, uint_fast8_t <I>roundingMode</I>, bool <I>exact</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void
|
void
|
||||||
f128M_roundToInt(
|
f128M_roundToInt(
|
||||||
const float128_t *<I>aPtr</I>,
|
const float128_t *<I>aPtr</I>,
|
||||||
uint_fast8_t <I>roundingMode</I>,
|
uint_fast8_t <I>roundingMode</I>,
|
||||||
|
@ -980,6 +988,7 @@ conversions to an integer type:
|
||||||
float128_t *<I>destPtr</I>
|
float128_t *<I>destPtr</I>
|
||||||
);
|
);
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode to
|
The <CODE><I>roundingMode</I></CODE> argument specifies the rounding mode to
|
||||||
apply.
|
apply.
|
||||||
The variable that usually indicates rounding mode,
|
The variable that usually indicates rounding mode,
|
||||||
|
@ -1005,17 +1014,19 @@ provided:
|
||||||
<CODE><<I>float</I>>_lt</CODE>
|
<CODE><<I>float</I>>_lt</CODE>
|
||||||
</BLOCKQUOTE>
|
</BLOCKQUOTE>
|
||||||
Each comparison takes two operands of the same type and returns a Boolean.
|
Each comparison takes two operands of the same type and returns a Boolean.
|
||||||
The abbreviation <CODE>eq</CODE> stands for ``equal'' (=);
|
The abbreviation <CODE>eq</CODE> stands for “equal” (=);
|
||||||
<CODE>le</CODE> stands for ``less than or equal'' (≤);
|
<CODE>le</CODE> stands for “less than or equal” (≤);
|
||||||
and <CODE>lt</CODE> stands for ``less than'' (<).
|
and <CODE>lt</CODE> stands for “less than” (<).
|
||||||
Depending on whether the floating-point operands are passed by value or via
|
Depending on whether the floating-point operands are passed by value or via
|
||||||
pointers, the comparison functions have signatures of these forms:
|
pointers, the comparison functions have signatures of these forms:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
bool f64_eq( float64_t <I>a</I>, float64_t <I>b</I> );
|
bool f64_eq( float64_t <I>a</I>, float64_t <I>b</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
bool f128M_eq( const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I> );
|
bool f128M_eq( const float128_t *<I>aPtr</I>, const float128_t *<I>bPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
|
@ -1058,21 +1069,25 @@ provided with these names:
|
||||||
The functions take one floating-point operand and return a Boolean indicating
|
The functions take one floating-point operand and return a Boolean indicating
|
||||||
whether the operand is a signaling NaN.
|
whether the operand is a signaling NaN.
|
||||||
Accordingly, the functions have the forms
|
Accordingly, the functions have the forms
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
bool f64_isSignalingNaN( float64_t <I>a</I> );
|
bool f64_isSignalingNaN( float64_t <I>a</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>
|
<PRE>
|
||||||
bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> );
|
bool f128M_isSignalingNaN( const float128_t *<I>aPtr</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<H3>8.10. Raise-Exception Function</H3>
|
<H3>8.10. Raise-Exception Function</H3>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
SoftFloat provides a single function for raising floating-point exceptions:
|
SoftFloat provides a single function for raising floating-point exceptions:
|
||||||
|
<BLOCKQUOTE>
|
||||||
<PRE>
|
<PRE>
|
||||||
void softfloat_raise( uint_fast8_t <I>exceptions</I> );
|
void softfloat_raise( uint_fast8_t <I>exceptions</I> );
|
||||||
</PRE>
|
</PRE>
|
||||||
|
</BLOCKQUOTE>
|
||||||
The <CODE><I>exceptions</I></CODE> argument is a mask indicating the set of
|
The <CODE><I>exceptions</I></CODE> argument is a mask indicating the set of
|
||||||
exceptions to raise.
|
exceptions to raise.
|
||||||
(See earlier section 7, <I>Exceptions and Exception Flags</I>.)
|
(See earlier section 7, <I>Exceptions and Exception Flags</I>.)
|
||||||
|
@ -1084,6 +1099,11 @@ function may cause a trap or abort appropriate for the current system.
|
||||||
|
|
||||||
<H2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></H2>
|
<H2>9. Changes from SoftFloat <NOBR>Release 2</NOBR></H2>
|
||||||
|
|
||||||
|
<P>
|
||||||
|
Apart from the change in the legal use license, there are numerous technical
|
||||||
|
differences between <NOBR>Release 3</NOBR> of SoftFloat and earlier releases.
|
||||||
|
</P>
|
||||||
|
|
||||||
<H3>9.1. Name Changes</H3>
|
<H3>9.1. Name Changes</H3>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
|
@ -1214,17 +1234,17 @@ Lastly, there are a few other changes to function names:
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>_round_to_zero</CODE></TD>
|
<TD><CODE>_round_to_zero</CODE></TD>
|
||||||
<TD><CODE>_r_minMag</CODE></TD>
|
<TD><CODE>_r_minMag</CODE></TD>
|
||||||
<TD>conversions from floating-point to integer, section 8.2</TD>
|
<TD>conversions from floating-point to integer (<NOBR>section 8.2</NOBR>)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>round_to_int</CODE></TD>
|
<TD><CODE>round_to_int</CODE></TD>
|
||||||
<TD><CODE>roundToInt</CODE></TD>
|
<TD><CODE>roundToInt</CODE></TD>
|
||||||
<TD>round-to-integer functions, section 8.7</TD>
|
<TD>round-to-integer functions (<NOBR>section 8.7</NOBR>)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
<TR>
|
<TR>
|
||||||
<TD><CODE>is_signaling_nan </CODE></TD>
|
<TD><CODE>is_signaling_nan </CODE></TD>
|
||||||
<TD><CODE>isSignalingNaN</CODE></TD>
|
<TD><CODE>isSignalingNaN</CODE></TD>
|
||||||
<TD>signaling NaN test functions, section 8.9</TD>
|
<TD>signaling NaN test functions (<NOBR>section 8.9</NOBR>)</TD>
|
||||||
</TR>
|
</TR>
|
||||||
</TABLE>
|
</TABLE>
|
||||||
</BLOCKQUOTE>
|
</BLOCKQUOTE>
|
||||||
|
@ -1296,7 +1316,7 @@ argument <CODE><I>exact</I></CODE>.
|
||||||
<P>
|
<P>
|
||||||
With <NOBR>Release 3</NOBR>, a port of SoftFloat can now define any of the
|
With <NOBR>Release 3</NOBR>, a port of SoftFloat can now define any of the
|
||||||
floating-point types <CODE>float32_t</CODE>, <CODE>float64_t</CODE>,
|
floating-point types <CODE>float32_t</CODE>, <CODE>float64_t</CODE>,
|
||||||
<CODE>extFloat80_t</CODE>, and <CODE>float128_t</CODE> as aliases for C's
|
<CODE>extFloat80_t</CODE>, and <CODE>float128_t</CODE> as aliases for C’s
|
||||||
standard floating-point types <CODE>float</CODE>, <CODE>double</CODE>, and
|
standard floating-point types <CODE>float</CODE>, <CODE>double</CODE>, and
|
||||||
<CODE>long</CODE> <CODE>double</CODE>, using either <CODE>#define</CODE> or
|
<CODE>long</CODE> <CODE>double</CODE>, using either <CODE>#define</CODE> or
|
||||||
<CODE>typedef</CODE>.
|
<CODE>typedef</CODE>.
|
||||||
|
@ -1304,9 +1324,9 @@ This potential convenience was not supported under <NOBR>Release 2</NOBR>.
|
||||||
</P>
|
</P>
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
(Note, however, that there may be a performance cost to defining SoftFloat's
|
(Note, however, that there may be a performance cost to defining
|
||||||
floating-point types this way, depending on the platform and the applications
|
SoftFloat’s floating-point types this way, depending on the platform and
|
||||||
using SoftFloat.
|
the applications using SoftFloat.
|
||||||
Ports of SoftFloat may choose to forgo the convenience in favor of better
|
Ports of SoftFloat may choose to forgo the convenience in favor of better
|
||||||
speed.)
|
speed.)
|
||||||
</P>
|
</P>
|
||||||
|
@ -1338,7 +1358,7 @@ Fused multiply-add functions have been added for the non-extended formats,
|
||||||
|
|
||||||
<P>
|
<P>
|
||||||
<NOBR>Release 3</NOBR> of SoftFloat is written to conform better to the ISO C
|
<NOBR>Release 3</NOBR> of SoftFloat is written to conform better to the ISO C
|
||||||
Standard's rules for portability.
|
Standard’s rules for portability.
|
||||||
For example, older releases of SoftFloat employed type conversions in ways
|
For example, older releases of SoftFloat employed type conversions in ways
|
||||||
that, while commonly practiced, are not fully defined by the C Standard.
|
that, while commonly practiced, are not fully defined by the C Standard.
|
||||||
Such problematic type conversions have generally been replaced by the use of
|
Such problematic type conversions have generally been replaced by the use of
|
||||||
|
@ -1387,8 +1407,8 @@ Some loss of speed has been observed due to this change.
|
||||||
The following improvements are anticipated for future releases of SoftFloat:
|
The following improvements are anticipated for future releases of SoftFloat:
|
||||||
<UL>
|
<UL>
|
||||||
<LI>
|
<LI>
|
||||||
support for the common <NOBR>16-bit</NOBR> ``half-precision'' floating-point
|
support for the common <NOBR>16-bit</NOBR> “half-precision”
|
||||||
format;
|
floating-point format;
|
||||||
<LI>
|
<LI>
|
||||||
more functions from the 2008 version of the IEEE Floating-Point Standard;
|
more functions from the 2008 version of the IEEE Floating-Point Standard;
|
||||||
<LI>
|
<LI>
|
||||||
|
|