Age | Commit message (Collapse) | Author |
|
|
|
Just by sheer luck this had no actual effect because the decimal format
prefix has the same length as the percent format prefix.
|
|
Most locales have a single grouping size (the number of integer digits
to be written before inserting a grouping separator). However some have
a primary and secondary size. We parse the primary size as the size used
for the least significant integer digits, and the secondary size for the
most significant.
|
|
In order to implement Intl.NumberFormat.prototype.formatToParts, do not
replace {currency} keys in the format pattern before ECMA-402 tells us
to. Otherwise, the array return by formatToParts will not contain the
expected currency key.
Early replacement was done to avoid resolving the currency display more
than once, as it involves a couple of round trips to search through
LibUnicode data. So this adds a non-standard method to NumberFormat to
do this resolution and cache the result.
Another side effect of this change is that LibUnicode must replace unit
format patterns of the form "{0} {1}" during code generation. These were
previously skipped during code generation because LibJS would just
replace the keys with the currency display at runtime. But now that the
currency display injection is delayed, any {0} or {1} keys in the format
pattern will cause PartitionNumberPattern to abort.
|
|
Currencies are a bit strange; the layout of currency data in the CLDR is
not particularly compatible with what ECMA-402 expects. For example, the
currency format in the "en" and "ar" locales for the Latin script are:
en: "¤#,##0.00"
ar: "¤\u00A0#,##0.00"
Note how the "ar" locale has a non-breaking space after the currency
symbol (¤), but "en" does not. This does not mean that this space will
appear in the "ar"-formatted string, nor does it mean that a space won't
appear in the "en"-formatted string. This is a runtime decision based on
the currency display chosen by the user ("$" vs. "USD" vs. "US dollar")
and other rules in the Unicode TR-35 spec.
ECMA-402 shies away from the nuances here with "implementation-defined"
steps. LibUnicode will store the data parsed from the CLDR however it is
presented; making decisions about spacing, etc. will occur at runtime
based on user input.
|
|
For example, there isn't a unique set of data for the en-US locale;
rather, it defaults to the data for the en locale. See this commit for
much more detail: 357c97dfa864dbd779d517bac502858aa2618b96
|
|
These are used when formatting a number as currency with a display
option of "name" (e.g. for USD, the name is "US Dollars" in en-US).
These patterns appear in the CLDR in a different manner than other
number formats that are pluralized. They are of the form "{0} {1}",
therefore do not undergo subpattern replacements.
|
|
Currently, LibUnicode is only parsing and generating the "long" style of
currency display names. However, the CLDR contains "short" and "narrow"
forms as well that need to be handled. Parse these, and update LibJS to
actually respect the "style" option provided by the user for displaying
currencies with Intl.DisplayNames.
Note: There are some discrepencies between the engines on how style is
handled. In particular, running:
new Intl.DisplayNames('en', {type:'currency', style:'narrow'}).of('usd')
Gives:
SpiderMoney: "USD"
V8: "US Dollar"
LibJS: "$"
And running:
new Intl.DisplayNames('en', {type:'currency', style:'short'}).of('usd')
Gives:
SpiderMonkey: "$"
V8: "US Dollar"
LibJS: "$"
My best guess is V8 isn't handling style, and just returning the long
form (which is what LibJS did before this commit). And SpiderMoney can
handle some styles, but if they don't have a value for the requested
style, they fall back to the canonicalized code passed into of().
|
|
The parser was previously expecting number sections within a pattern to
start with "#", but they may also begin with "0".
|
|
libc++ uses a Pthread condition variable in one of its initialization
functions. This means that Pthread forwarding has to be set up in LibC
before libc++ can be initialized. Also, because LibPthread is written in
C++, (at least some) parts of the C++ standard library have to be linked
against it.
This is a circular dependency, which means that the order in which these
two libraries' initialization functions are called is undefined. In some
cases, libc++ will come first, which will then trigger an assert due to
the missing Pthread forwarding.
This issue isn't necessarily unique to LibPthread, as all libraries that
libc++ depends on exhibit the same circular dependency issue.
The reason why this issue didn't affect the GNU toolchain is that
libstdc++ is always linked statically. If we were to change that, I
believe that we would run into the same issue.
|
|
|
|
|
|
The data used for number formatting is going to grow quite a bit when
the cldr-units package is parsed. To prevent the generated UnicodeLocale
file from growing outrageously large, the number formatting data can go
into its own file. To prepare for this, move code that will be common
between the generators for UnicodeLocale and UnicodeNumberFormat to the
utility header.
|
|
|
|
This will be needed for the ComputeExponentForMagnitude AO for compact
formatting, namely step 5b:
Let exponent be an implementation- and locale-dependent (ILD) integer
by which to scale a number of the given magnitude in compact notation
for the current locale.
|
|
A number formatting pattern in the CLDR contains one or two entries,
delimited by a semi-colon. Previously, LibUnicode was just storing the
entire pattern as one string. This changes the generator to split the
pattern on that delimiter and generate the 3 unique patterns expected by
ECMA-402.
The rules for generating the 3 patterns are as follows:
* If the pattern contains 1 entry, it is the zero pattern. The positive
pattern is the zero pattern prepended with {plusSign}. The negative
pattern is the zero pattern prepended with {minusSign}.
* If the pattern contains 2 entries, the first is the zero pattern, and
the second is the negative pattern. The positive pattern is the zero
pattern prepended with {plusSign}.
|
|
Also known as "currency-accounting" in some CLDR documentation.
|
|
|
|
|
|
|
|
|
|
|
|
The number system data in the CLDR contains information on how to format
numbers in a locale-dependent manner. Start parsing this data, beginning
with numeric symbol strings. For example the symbol NaN maps to "NaN" in
the en-US locale, and "非數值" in the zh-Hant locale.
|
|
Some locales in the CLDR have alternate default numbering systems listed
under "defaultNumberingSystem-alt-*", e.g.:
"defaultNumberingSystem": "arab",
"defaultNumberingSystem-alt-latn": "latn",
"otherNumberingSystems": {
"native": "arab"
},
We were previously only parsing "defaultNumberingSystem" and
"otherNumberingSystems". This odd format appears to be an artifact of
converting from XML.
|
|
This isn't particularly important because this generates code that is
quite hidden from outside callers. But when viewing the generated code,
it's a bit nicer to read e.g. enum identifiers such as "MinusSign"
rather than "Minussign".
|
|
|
|
The 'master' branch is no longer updated, they've switched to 'main'.
|
|
|
|
The finale! Users can now be sure that the value is valid, which makes
things simpler.
|
|
First off, this verifies that an initial value is always provided in
Properties.json for each property.
Second, it verifies that parsing that initial value succeeds.
This means that a call to `property_initial_value()` will always return
a valid StyleValue. :^)
|
|
This allows libraries and binaries to explicitly link against
`<library>.so.serenity`, which avoids some confusion if there are other
libraries with the same name, such as OpenSSL's `libcrypto`.
|
|
|
|
We don't need them any more, so they're gone. :^)
|
|
Release notes:
https://github.com/unicode-org/cldr-json/releases/tag/40.0.0
|
|
This file contains the list of locales which default to their parent
locale's values. In the core CLDR dataset, these locales have their own
files, but they are empty (except for identity data). For example:
https://github.com/unicode-org/cldr/blob/main/common/main/en_US.xml
In the JSON export, these files are excluded, so we currently are not
recognizing these locales just by iterating the locale files.
This is a prerequisite for upgrading to CLDR version 40. One of these
default-content locales is the popular "en-US" locale, which defaults to
"en" values. We were previously inferring the existence of this locale
from the "en-US-POSIX" locale (many implementations, including ours,
strip variants such as POSIX). However, v40 removes the "en-US-POSIX"
locale entirely, meaning that without this change, we wouldn't know that
"en-US" exists (we would default to "en").
For more detail on this and other v40 changes, see:
https://cldr.unicode.org/index/downloads/cldr-40#h.nssoo2lq3cba
|
|
This script was silently broken in commit
62af6cd4f9637b8937e59832f5af00f4556c496b.
|
|
|
|
|
|
|
|
grep -P does not work on macOS, but grep -E does.
|
|
macOS's find requires a leading search scope. Without this change this
lint step fails.
|
|
|
|
macOS's `find` does not support the '-executable' flag, nor does it
support the '-perm /' syntax, but we can make it work with a special
case.
|
|
Using `xargs -i <cmd> {}` is just doing the default behavior of xargs,
but with extra steps that also don't work on macOS.
|
|
grep -E and -F are POSIX standard, and meets all our matching needs.
|
|
|
|
This is just silly :^)
$ serenity run lagom js
WARNING: unknown toolchain 'js'. Defaulting to GNU.
Valid values are 'Clang', 'GNU' (default)
|
|
This is the last usage of old-style exceptions in the WrapperGenerator.
|
|
|
|
|