浮動小数点数の10進表記についてまとめてみた

リソースをまとめたのみで、これといってアルゴリズムを解説しているわけではない。後々時間をみつけて気になったものは記事化するかもしれない(しないかもしれない)。

はじめに

以下の前提をおく。

IEEE754準拠

IEEE 754-2008

IEEE754とかなにかというのは、IEEE 754(wikipedia)をみるとだいたいわかる。

符号化形式

D言語でdouble型の値から指数部・仮数部・符号を抜き出す疑似コードは以下のようになる。

// 浮動小数点数を整数表現で持つ.
ulong ival = *cast(ulong*) &val;

// 指数部は11ビットで表される.
enum log2_max_exp = cast(int) log2(1024);

// 下位52ビットは仮数部を表す.
ulong mantissa = ival & ((1L << 52) - 1);

// 真ん中11ビットは指数部:
//   52ビット右にシフトして仮数部を除いたものと,
//   (1 << 11) - 1
//      = 1 << 11 - 1
//      = 1023
//   でorをとって指数部を取り出す.
int exponent = (ival >> 52) & ((1L << 11) - 1);

// 先頭の1ビットは符号を表す.
bool negative = (ival >> (53 + 10)) & 1;

// 指数部が0のとき(0 or 非正規化数),もしくは2047のとき(無限大 or NaN)は
// 仮数部はケチ表現であり(正規化数), 先頭の1を省略している.
// そのため先頭の1を復活させる.
if (exponent != 0 && exponent != 2 * 1024 - 1)
    mantissa |= 1L << 52;
// 非正規化数の場合は指数を1にする.
else if (exponent == 0)
    exponent = 1;
// バイアスを引く. doubleの場合は1023固定.
exponent -= 1023;

double: 1.23の場合を考える。

まず単純にdoubleの整数から1ビット・11ビット・52ビットを取り出すと以下のような値となる。

s0: 0
e0: 1023
m0: 10011101011100001010001111010111000010100011110101110

s0は符号ビットで計算に関係ないので今後無視する。

次にバイアスや正規化を行う。

上記で示したようにexponentはバイアス(doubleの場合は-1023だが、exponentが0の場合は非正規化数となるので-1022になる)がかかり、正規化数であるので仮数部も1.10011101011100001010001111010111000010100011110101110となる。

e1: 0
m1: 1.10011101011100001010001111010111000010100011110101110

m * 2 ^^ (e-bias) は 1.10011101011100001010001111010111000010100011110101110 * 2 ^^ 0 となる。

これを整数表現のみにすると

11001110101110000101000111101011100001010001111010111 * (2 ^^ -51)

のようになる。

アルゴリズム

浮動小数点数の10進表記変換を行うアルゴリズムはかなり昔から存在している。

Dragon4 (1990)

論文: How to print floating-point numbers accurately

実際にはこのアルゴリズム(の元になったもの)はずいぶん以前から存在していたようだが、論文として体系化されたものは1990年の上述した論文が最初のようだ。

Grisu(Grisu1/Grisu2/Grisu3) (2010)

論文: Printing Floating-Point Numbers Quickly and Accurately with Integers
実装例(double-conversion): https://github.com/google/double-conversion

長らくDragon4(もしくは近しいアルゴリズム)が用いられてきた中で、2010年のGrisuの登場は当時としては画期的だったようだ。正確な情報はわからないが、内部的にfloatを多用するJavaScriptの普及がこういった最適化研究の後押しとなったようだ。

Grisu2/Grisu3は高速だが、正しくない結果となるようなケースもあるためフォールバックとしてDragon4を用いる実装が一般的なようだ。

Errol (2016)

論文: Printing floating-point numbers: a faster, always correct method

こちらはDragon4よりははやいがGrisu3よりは遅いらしい。

Ryu (2018)

論文: Ryū: fast float-to-string conversion
リファレンス実装(C/Java): https://github.com/ulfjack/ryu

基本的な手法はGrisu3に近い(10進リテラルへの置き換えを行う際に上限と下限から最も近似している最短の10進表示ものを探す)が、よりシンプルな整数の演算への置き換えでパフォーマンスが上がり正確な結果を得ることができるらしい。

Grisu-Exact (2020)

論文: Grisu-Exact: A Fast and Exact Floating-Point Printing Algorithm
リファレンス実装(C): https://github.com/jk-jeon/Grisu-Exact
作者のreddit投稿: https://www.reddit.com/r/cpp/comments/elhev9/grisuexact_a_variant_of_grisu_always_producing/

Grisu2 + Ryu のアプローチのようだ。作者は後述するDragonboxと同一の人。

Schubfach (2017~2018?)

論文: The Schubfach way to render doubles
実装(C): https://github.com/abolz/Drachennest/blob/master/src/schubfach_64.cc

Dragonboxのpaperで絶賛されている。

The beauty of Schubfach is that it does not perform any iterative search for the shortest representation. Rather, it just magically finds the "optimal number" at only one trial.

という点が画期的であるとのこと。

元々Javaのアルゴリズム(不正確な結果になってしまう問題が以前から知られていた)の改善案として提案されていたもので、2022年にようやく本家に取り込まれた。

Dragonbox (2020)

論文: https://github.com/jk-jeon/dragonbox/blob/master/other_files/Dragonbox.pdf
リファレンス実装(C++): https://github.com/jk-jeon/dragonbox
作者のReddit投稿: https://www.reddit.com/r/cpp/comments/ika6ml/dragonbox_yet_another_floattostring_conversion/

おそらく現在最も新規の手法。Schubfachの弱点にGrisuの手法を組み合わせることで改良したアルゴリズム。

Schubfachの弱点として以下をあげている。

there are three 128bit x 64bit -> high 64bit multiplications which are expensive. (This is for the case of double; for float, we instead have 64bit x 32bit -> high 32bit multiplications, but this is still expensive.)

Grisuの手法を使うことでより計算効率のよい形に変換可能

by applying some Grisu-like method, it is possible to reduce the amortized number of 128bit x 64bit -> high 64bit multiplications into one, at the cost of having some branches and divisions-by-constants.

言語・ライブラリ実装

実際にどういった言語で標準ライブラリであったり3rd patry製ライブラリであったりで用いられているかを調べてみた。

C++

fmtlib/fmt: Dragonboxを使用
- Implement Dragonbox
- tweet: almost ~18x faster than sprintf on macOS/clang

Java

openjdk: Schubfachを使用
- 4511638: Double.toString(double) sometimes produces incorrect results
  - commit
https://github.com/openjdk/jdk/pull/3402#issuecomment-935036651
- 作者のコメント。Dragonbox的なアプローチは考えて実際に試したが、性能が悪化しコードも複雑になったので断念したとこのこと。

Go

標準ライブラリ: Ryuベースのものを使用 / 多倍長整数(for big)
- strconv: implement Ryū-like algorithm for fixed precision ftoa

Rust

標準ライブラリ: Grisu3 + Dragon4(fallback)
- New floating-to-decimal formatting routine
serde_json: Ryuを使用
- Switch from dtoa to ryu
- 実装: https://github.com/dtolnay/ryu
lexical: Dragonbox(default) / Grisu(compact-feature) / power-of-two(power-of-two feature, for formatting) / Radix (rare-case, slow but good for full round-trip)
- 実装: https://github.com/Alexhuszagh/rust-lexical/blob/main/lexical-write-float/docs/Algorithm.md
  - Dragonboxの実装は本家からさらに改良されている

Swift

標準ライブラリ: SwiftDtoa v2 (based on Grisu2,Ryu,Errol3)
- SwiftDtoa v2: Better, Smaller, Faster floating-point formatting

Zig

標準ライブラリ: Errol3を使用
- partial implementation of printing floating point numbers with errol3

Nim

標準ライブラリ: Dragonbox(既存のC実装をc2nimで変換)
- use dragonbox algorithm; alternative to #18008

Python

pyjson5: Dragonboxを使用
- Faster double encoding

Hare

標準ライブラリ: Ryuを使用
- strconv: implement f32 to string conversion

Crystal

標準ライブラリ: Dragonboxを仕様
- Implement the Dragonbox algorithm for Float#to_s

その他参考

各種アルゴリズム実装(Dragonbox/Ryu/Schubfach/Grisu/Dragon4)
- https://github.com/abolz/Drachennest
Russ Cox氏のブログ
- https://research.swtch.com/ftoa
BlenderでDragonboxを使って速くなった話
- https://aras-p.info/blog/2022/02/03/Speeding-up-Blender-.obj-export/
Printing Floating-Point Numbers シリーズ
mod_poppo氏による浮動小数点数の文字列化手法まとめ
- https://zenn.dev/mod_poppo/scraps/eee7c774216cbb

kubo39's blog

ただの雑記です。