MIRROR: javascript for 🐜's, a tiny runtime with big ambitions

migrate Intl to dedicated module

+576 -163
+48
docs/exec-plans/tech-debt.md
···
  - Impact: One-shot CLI compiles are fine, but a REPL, watch mode, embedder, or other long-lived process cannot yet recycle compiler scratch space across compiles.
  - Proposed fix: Add a real `compile_pool` scratch allocator after the `compile_ctx` extraction. Pool the resizable arrays for `locals`, `local_lookup_heads`, `code`, `constants`, `atoms`, `upval_descs`, `loops`, `srcpos`, and potentially `slot_types`. Keep `line_table` separate or make it poolable scratch, since it is derived from the current source buffer rather than a semantic cache.
  - Status: backlog
+
+ - Area: Shared helper utilities
+ - Issue: Small helper logic such as ASCII character classification, casing, and similar utility code is duplicated across multiple runtime and support modules with local one-off implementations.
+ - Impact: Repeated copies drift over time, make bug fixes harder to apply consistently, and add noise when adding or reviewing new modules.
+ - Proposed fix: Audit duplicated helper patterns across `src/` and `include/`, identify the stable cross-cutting utilities, and centralize them in a small shared header or utility module with repo-wide call sites migrated incrementally.
+ - Status: backlog
+
+ - Area: `src/modules/intl.c`
+ - Issue: `Intl` is now present and passes the current compat-table target, but several behaviors are still simplified compatibility implementations rather than fuller ECMA-402 semantics.
+ - Impact: `Intl.Collator`, `Intl.NumberFormat`, `Intl.DateTimeFormat`, and `Intl.Segmenter` can still diverge from web or Node behavior for anything beyond the currently covered compat surface.
+ - Proposed fix: Continue expanding `Intl` incrementally: replace `strcoll`-only collation, deepen `resolvedOptions()`, make `DateTimeFormat` actually honor stored timezone and locale options, and move `Segmenter` closer to the expected iterable/result object shape.
+ - Status: backlog
+
+ - Area: `src/modules/timer.c`
+ - Issue: `node:timers/promises setInterval()` is still explicitly unimplemented.
+ - Impact: Promise-based timer APIs remain incomplete and can block compatibility with code that expects the Node timers/promises interval surface.
+ - Proposed fix: Implement `setInterval()` on top of the existing timer promise scheduling machinery, including cancellation and signal handling behavior consistent with the existing `setTimeout()` and `setImmediate()` support.
+ - Status: backlog
+
+ - Area: `src/modules/dns.c`
+ - Issue: `node:dns` is still a minimal shim centered on `dns.promises.lookup`.
+ - Impact: Tooling or apps that expect more of the Node DNS surface still need polyfills or will fail outright.
+ - Proposed fix: Expand the module incrementally from the existing lookup path, prioritizing the most commonly used sync, callback, and `promises` APIs needed by current ecosystem packages.
+ - Status: backlog
+
+ - Area: `src/modules/crypto.c`
+ - Issue: `crypto.subtle` is only partially implemented and still marked for extension beyond the current digest-oriented support.
+ - Impact: Web Crypto compatibility is incomplete, which blocks packages and runtime features that expect a broader `SubtleCrypto` surface.
+ - Proposed fix: Extend `crypto.subtle` method coverage incrementally, starting with the highest-value operations after digest and preserving the existing algorithm parsing entrypoints.
+ - Status: backlog
+
+ - Area: `src/modules/worker_threads.c`
+ - Issue: `node:worker_threads` is still a minimal compatibility implementation, and `Worker.postMessage` remains explicitly unimplemented.
+ - Impact: Build tools and libraries that rely on real worker thread messaging or broader worker lifecycle behavior still cannot use the native surface directly.
+ - Proposed fix: Expand worker thread support incrementally, starting with message passing and the most commonly used worker APIs, while preserving the existing lightweight process-backed architecture where practical.
+ - Status: backlog
+
+ - Area: `src/modules/async_hooks.c`
+ - Issue: `node:async_hooks` is still a minimal compatibility layer intended mainly to satisfy framework expectations.
+ - Impact: Async context tracking semantics remain shallow, which can break libraries that rely on realistic async IDs, resources, or hook lifecycle behavior.
+ - Proposed fix: Replace the placeholder async ID and resource behavior with real runtime-backed tracking, while keeping `AsyncLocalStorage` compatibility stable during the transition.
+ - Status: backlog
+
+ - Area: `src/streams/readable.c`
+ - Issue: `ReadableStreamBYOBReader` is still explicitly unimplemented, and byte-source support is still called out as incomplete.
+ - Impact: Web Streams byte-oriented consumers cannot rely on BYOB reader semantics, leaving an important platform feature gap for stream-heavy or browser-compatible code.
+ - Proposed fix: Add real byte-source plumbing and implement `ReadableStreamBYOBReader` on top of it instead of routing byte sources through the default reader path.
+ - Status: backlog
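The "Shared helper utilities" item proposes centralizing duplicated ASCII helpers in a small shared header. A minimal sketch of what such a header might contain is shown here, modeled on the `intl_ascii_*` helpers this commit adds to `src/modules/intl.c`. The `ant_ascii_` prefix and the idea of a single header are assumptions made for illustration; the commit itself does not define them.

```c
/* Hypothetical shared header (name and prefix are assumptions, not part of
 * the commit). Locale-independent ASCII helpers, safe for signed char. */
#include <stdbool.h>
#include <stddef.h>

static inline bool ant_ascii_is_alpha(char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static inline bool ant_ascii_is_digit(char c) {
  return c >= '0' && c <= '9';
}

static inline bool ant_ascii_is_alnum(char c) {
  return ant_ascii_is_alpha(c) || ant_ascii_is_digit(c);
}

static inline char ant_ascii_lower(char c) {
  return (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
}

/* True only when the span is non-empty and every byte satisfies pred. */
static inline bool ant_ascii_all(const char *s, size_t len, bool (*pred)(char)) {
  if (!s || len == 0) return false;
  for (size_t i = 0; i < len; i++) {
    if (!pred(s[i])) return false;
  }
  return true;
}
```

Unlike `<ctype.h>`, these helpers ignore the current C locale and accept plain `char` without a cast, which is likely why modules keep re-implementing them locally.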
+6
include/modules/intl.h
···
+ #ifndef ANT_INTL_MODULE_H
+ #define ANT_INTL_MODULE_H
+
+ void init_intl_module(void);
+
+ #endif
+2
src/main.c
···
  #include "modules/domexception.h"
  #include "modules/abort.h"
  #include "modules/globals.h"
+ #include "modules/intl.h"
  #include "modules/wasm.h"
  #include "modules/string_decoder.h"
  #include "modules/stream.h"
···
    init_timer_module();
    init_domexception_module();
    init_globals_module();
+   init_intl_module();
    init_wasm_module();
    init_builtin_module();
    init_buffer_module();
-163
src/modules/globals.c
···
  #include <compat.h> // IWYU pragma: keep

  #include <string.h>
- #include <stdio.h>
- #include <stdlib.h>
  #include <stdbool.h>
- #include <time.h>

  #include "ant.h"
  #include "errors.h"
···
    sv_vm_call(js->vm, js, handler, global, call_args, 1, NULL, false);
  }

- // stub: minimal Intl
- static ant_value_t intl_dtf_format(ant_t *js, ant_value_t *args, int nargs) {
-   time_t t;
-
-   if (nargs >= 1 && vtype(args[0]) == T_NUM) {
-     t = (time_t)(js_getnum(args[0]) / 1000.0);
-   } else t = time(NULL);
-
-   struct tm local;
- #ifdef _WIN32
-   localtime_s(&local, &t);
- #else
-   localtime_r(&t, &local);
- #endif
-
-   char buf[64];
-   int hour12 = local.tm_hour % 12;
-   if (hour12 == 0) hour12 = 12;
-
-   const char *ampm = local.tm_hour < 12 ? "AM" : "PM";
-   snprintf(buf, sizeof(buf), "%d:%02d:%02d %s", hour12, local.tm_min, local.tm_sec, ampm);
-
-   return js_mkstr(js, buf, strlen(buf));
- }
-
- static ant_value_t intl_dtf_resolvedOptions(ant_t *js, ant_value_t *args, int nargs) {
-   ant_value_t obj = js_mkobj(js);
-   js_set(js, obj, "locale", js_mkstr(js, "en-US", 5));
-   js_set(js, obj, "timeZone", js_mkstr(js, "UTC", 3));
-   return obj;
- }
-
- static ant_value_t intl_dtf_formatToParts(ant_t *js, ant_value_t *args, int nargs) {
-   return js_mkarr(js);
- }
-
- static ant_value_t intl_dtf_constructor(ant_t *js, ant_value_t *args, int nargs) {
-   ant_value_t this = js_getthis(js);
-
-   ant_value_t format_fn = js_heavy_mkfun(js, intl_dtf_format, js_mkundef());
-   js_set(js, this, "format", format_fn);
-   js_set(js, this, "resolvedOptions", js_mkfun(intl_dtf_resolvedOptions));
-   js_set(js, this, "formatToParts", js_mkfun(intl_dtf_formatToParts));
-
-   return this;
- }
-
- static size_t intl_utf8_segment_len(const char *input, size_t remaining) {
-   if (remaining == 0) return 0;
-
-   const unsigned char *s = (const unsigned char *)input;
-   unsigned char c = s[0];
-   size_t len = 1;
-
-   if ((c & 0x80) == 0) return 1;
-   if ((c & 0xe0) == 0xc0) len = 2;
-   else if ((c & 0xf0) == 0xe0) len = 3;
-   else if ((c & 0xf8) == 0xf0) len = 4;
-
-   if (len > remaining) return 1;
-   for (size_t i = 1; i < len; i++) if ((s[i] & 0xc0) != 0x80) return 1;
-
-   return len;
- }
-
- static bool intl_ascii_is_word_byte(const char *segment, size_t len) {
-   if (len != 1) return true;
-
-   unsigned char c = (unsigned char)segment[0];
-   return
-     (c >= '0' && c <= '9') ||
-     (c >= 'A' && c <= 'Z') ||
-     (c >= 'a' && c <= 'z') ||
-     c == '_';
- }
-
- static const char *intl_segmenter_granularity(ant_t *js, ant_value_t segmenter, size_t *len) {
-   ant_value_t granularity = js_get(js, segmenter, "granularity");
-   if (vtype(granularity) != T_STR) {
-     if (len) *len = 8;
-     return "grapheme";
-   }
-
-   return js_getstr(js, granularity, len);
- }
-
- static ant_value_t intl_segmenter_segment(ant_t *js, ant_value_t *args, int nargs) {
-   ant_value_t input = nargs > 0 ? js_tostring_val(js, args[0]) : js_mkstr(js, "", 0);
-   if (is_err(input)) return input;
-
-   size_t input_len = 0;
-   char *input_str = js_getstr(js, input, &input_len);
-   ant_value_t segments = js_mkarr(js);
-
-   ant_value_t this = js_getthis(js);
-   size_t granularity_len = 0;
-   const char *granularity = intl_segmenter_granularity(js, this, &granularity_len);
-   bool word_granularity = granularity_len == 4 && memcmp(granularity, "word", 4) == 0;
-
-   for (size_t offset = 0; offset < input_len;) {
-     size_t segment_len = intl_utf8_segment_len(input_str + offset, input_len - offset);
-     ant_value_t record = js_mkobj(js);
-
-     js_set(js, record, "segment", js_mkstr(js, input_str + offset, segment_len));
-     js_set(js, record, "index", js_mknum((double)offset));
-     js_set(js, record, "input", input);
-
-     if (word_granularity) js_set(
-       js, record, "isWordLike",
-       js_bool(intl_ascii_is_word_byte(input_str + offset, segment_len))
-     );
-
-     js_arr_push(js, segments, record);
-     offset += segment_len;
-   }
-
-   return segments;
- }
-
- static ant_value_t intl_segmenter_resolvedOptions(ant_t *js, ant_value_t *args, int nargs) {
-   ant_value_t obj = js_mkobj(js);
-   ant_value_t this = js_getthis(js);
-
-   size_t granularity_len = 0;
-   const char *granularity = intl_segmenter_granularity(js, this, &granularity_len);
-
-   js_set(js, obj, "locale", js_mkstr(js, "en-US", 5));
-   js_set(js, obj, "granularity", js_mkstr(js, granularity, granularity_len));
-
-   return obj;
- }
-
- static ant_value_t intl_segmenter_constructor(ant_t *js, ant_value_t *args, int nargs) {
-   ant_value_t this = js_getthis(js);
-   const char *granularity = "grapheme";
-   size_t granularity_len = 8;
-
-   if (nargs >= 2 && vtype(args[1]) == T_OBJ) {
-     ant_value_t option = js_get(js, args[1], "granularity");
-     if (vtype(option) == T_STR) granularity = js_getstr(js, option, &granularity_len);
-   }
-
-   js_set(js, this, "granularity", js_mkstr(js, granularity, granularity_len));
-   js_set(js, this, "segment", js_mkfun(intl_segmenter_segment));
-   js_set(js, this, "resolvedOptions", js_mkfun(intl_segmenter_resolvedOptions));
-
-   return this;
- }
-
  void init_globals_module(void) {
    ant_t *js = rt->js;
    ant_value_t global = js_glob(js);

    js_set(js, global, "reportError", js_mkfun(js_report_error));
    js_set_descriptor(js, global, "reportError", 11, JS_DESC_W | JS_DESC_C);
-
-   ant_value_t intl = js_mkobj(js);
-   ant_value_t dtf_ctor = js_heavy_mkfun(js, intl_dtf_constructor, js_mkundef());
-   ant_value_t segmenter_ctor = js_heavy_mkfun(js, intl_segmenter_constructor, js_mkundef());
-
-   js_mark_constructor(dtf_ctor, true);
-   js_mark_constructor(segmenter_ctor, true);
-
-   js_set(js, intl, "DateTimeFormat", dtf_ctor);
-   js_set(js, intl, "Segmenter", segmenter_ctor);
-   js_set(js, global, "Intl", intl);
  }
+478
src/modules/intl.c
···
+ #include <compat.h> // IWYU pragma: keep
+
+ #include <stdbool.h>
+ #include <math.h>
+ #include <stdio.h>
+ #include <string.h>
+ #include <time.h>
+
+ #include "ant.h"
+ #include "errors.h"
+ #include "runtime.h"
+ #include "internal.h"
+ #include "descriptors.h"
+
+ #include "modules/intl.h"
+ #include "modules/symbol.h"
+
+ static ant_value_t g_intl_collator_proto = 0;
+ static ant_value_t g_intl_numberformat_proto = 0;
+ static ant_value_t g_intl_datetimeformat_proto = 0;
+ static ant_value_t g_intl_segmenter_proto = 0;
+
+ typedef struct {
+   int hour12;
+   int minute;
+   int second;
+   const char *day_period;
+ } intl_dtf_fields_t;
+
+ static ant_value_t intl_create_instance(ant_t *js, ant_value_t fallback_proto) {
+   ant_value_t obj = js_mkobj(js);
+   ant_value_t proto = js_instance_proto_from_new_target(js, fallback_proto);
+   if (is_object_type(proto)) js_set_proto_init(obj, proto);
+   return obj;
+ }
+
+ // TODO: docs/exec-plans/tech-debt.md
+ static inline bool intl_ascii_is_alpha(char c) {
+   return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
+ }
+
+ static inline bool intl_ascii_is_digit(char c) {
+   return c >= '0' && c <= '9';
+ }
+
+ static inline bool intl_ascii_is_alnum(char c) {
+   return intl_ascii_is_alpha(c) || intl_ascii_is_digit(c);
+ }
+
+ static inline char intl_ascii_lower(char c) {
+   return (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : c;
+ }
+
+ static bool intl_ascii_all(const char *s, size_t len, bool (*pred)(char)) {
+   if (!s || len == 0) return false;
+   for (size_t i = 0; i < len; i++) if (!pred(s[i])) return false;
+   return true;
+ }
+
+ static bool intl_is_valid_language_tag(const char *tag, size_t len) {
+   if (!tag || len == 0) return false;
+
+   size_t start = 0;
+   size_t end = 0;
+   while (end < len && tag[end] != '-') end++;
+
+   size_t first_len = end - start;
+   if (first_len < 2 || first_len > 8) return false;
+   if (!intl_ascii_all(tag, first_len, intl_ascii_is_alpha)) return false;
+
+   bool need_extension_subtag = false;
+   bool in_private_use = false;
+   bool saw_private_use_subtag = false;
+
+   while (end < len) {
+     start = end + 1;
+     if (start >= len) return false;
+
+     end = start;
+     while (end < len && tag[end] != '-') end++;
+
+     size_t subtag_len = end - start;
+     if (subtag_len == 0 || subtag_len > 8) return false;
+
+     const char *subtag = tag + start;
+     if (!intl_ascii_all(subtag, subtag_len, intl_ascii_is_alnum)) return false;
+
+     if (in_private_use) {
+       saw_private_use_subtag = true;
+       continue;
+     }
+
+     if (need_extension_subtag) {
+       if (subtag_len < 2) return false;
+       need_extension_subtag = false;
+       continue;
+     }
+
+     if (subtag_len == 1) {
+       char singleton = intl_ascii_lower(subtag[0]);
+       if (singleton == 'x') in_private_use = true;
+       else need_extension_subtag = true;
+     }
+   }
+
+   if (need_extension_subtag) return false;
+   if (in_private_use && !saw_private_use_subtag) return false;
+
+   return true;
+ }
+
+ static ant_value_t intl_resolve_locale(ant_t *js, ant_value_t input) {
+   if (vtype(input) == T_ARR) input = js_get(js, input, "0");
+   if (vtype(input) == T_UNDEF) return js_mkstr(js, "en-US", 5);
+
+   ant_value_t locale = js_tostring_val(js, input);
+   if (is_err(locale)) return locale;
+
+   size_t len = 0;
+   const char *tag = js_getstr(js, locale, &len);
+   if (!intl_is_valid_language_tag(tag, len))
+     return js_mkerr_typed(js, JS_ERR_RANGE, "Invalid language tag");
+
+   return locale;
+ }
+
+ static ant_value_t intl_get_option_string(ant_t *js, ant_value_t options, const char *key, const char *fallback) {
+   if (vtype(options) != T_OBJ) return js_mkstr(js, fallback, strlen(fallback));
+
+   ant_value_t value = js_get(js, options, key);
+   if (vtype(value) == T_UNDEF) return js_mkstr(js, fallback, strlen(fallback));
+
+   ant_value_t str = js_tostring_val(js, value);
+   if (is_err(str)) return str;
+
+   size_t len = 0;
+   const char *ptr = js_getstr(js, str, &len);
+   if (!ptr || len == 0) return js_mkstr(js, fallback, strlen(fallback));
+
+   return str;
+ }
+
+ static ant_value_t intl_collator_compare(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t left = js_tostring_val(js, nargs > 0 ? args[0] : js_mkstr(js, "", 0));
+   if (is_err(left)) return left;
+
+   ant_value_t right = js_tostring_val(js, nargs > 1 ? args[1] : js_mkstr(js, "", 0));
+   if (is_err(right)) return right;
+
+   const char *left_str = js_getstr(js, left, NULL);
+   const char *right_str = js_getstr(js, right, NULL);
+
+   int result = strcoll(left_str ? left_str : "", right_str ? right_str : "");
+   if (result < 0) return js_mknum(-1);
+   if (result > 0) return js_mknum(1);
+
+   return js_mknum(0);
+ }
+
+ static ant_value_t intl_collator_resolved_options(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t obj = js_mkobj(js);
+   ant_value_t this_obj = js_getthis(js);
+
+   ant_value_t locale = is_object_type(this_obj)
+     ? js_get(js, this_obj, "locale")
+     : js_mkundef();
+
+   if (vtype(locale) != T_STR) locale = js_mkstr(js, "en-US", 5);
+   js_set(js, obj, "locale", locale);
+
+   return obj;
+ }
+
+ static ant_value_t intl_numberformat_format(ant_t *js, ant_value_t *args, int nargs) {
+   double number = nargs > 0 ? js_to_number(js, args[0]) : 0.0;
+   ant_value_t raw_val = js_tostring_val(js, js_mknum(number));
+   if (is_err(raw_val)) return raw_val;
+
+   size_t raw_len = 0;
+   const char *raw = js_getstr(js, raw_val, &raw_len);
+   if (!raw || raw_len == 0) return js_mkstr(js, "0", 1);
+
+   if (
+     !isfinite(number) ||
+     memchr(raw, 'e', raw_len) ||
+     memchr(raw, 'E', raw_len)
+   ) return raw_val;
+
+   const char *dot = memchr(raw, '.', raw_len);
+   size_t int_len = dot ? (size_t)(dot - raw) : raw_len;
+   size_t start = raw[0] == '-' ? 1 : 0;
+   size_t frac_len = dot ? (raw_len - int_len) : 0;
+
+   char buf[128];
+   size_t pos = 0;
+   if (start) buf[pos++] = '-';
+
+   for (size_t i = start; i < int_len; i++) {
+     buf[pos++] = raw[i];
+     size_t remaining = int_len - 1 - i;
+     if (remaining > 0 && remaining % 3 == 0) buf[pos++] = ',';
+   }
+
+   if (dot && frac_len > 0) {
+     memcpy(buf + pos, dot, frac_len);
+     pos += frac_len;
+   }
+
+   buf[pos] = '\0';
+   return js_mkstr(js, buf, pos);
+ }
+
+ static ant_value_t intl_numberformat_resolved_options(ant_t *js, ant_value_t *args, int nargs) {
+   return intl_collator_resolved_options(js, args, nargs);
+ }
+
+ static void intl_dtf_extract_fields(ant_t *js, ant_value_t *args, int nargs, intl_dtf_fields_t *out) {
+   time_t t = time(NULL);
+   if (nargs >= 1) t = (time_t)(js_to_number(js, args[0]) / 1000.0);
+
+   struct tm local;
+ #ifdef _WIN32
+   localtime_s(&local, &t);
+ #else
+   localtime_r(&t, &local);
+ #endif
+
+   out->hour12 = local.tm_hour % 12;
+   if (out->hour12 == 0) out->hour12 = 12;
+   out->minute = local.tm_min;
+   out->second = local.tm_sec;
+   out->day_period = local.tm_hour < 12 ? "AM" : "PM";
+ }
+
+ static ant_value_t intl_dtf_format(ant_t *js, ant_value_t *args, int nargs) {
+   intl_dtf_fields_t fields;
+   intl_dtf_extract_fields(js, args, nargs, &fields);
+
+   char buf[64];
+   snprintf(
+     buf, sizeof(buf), "%d:%02d:%02d %s",
+     fields.hour12, fields.minute, fields.second, fields.day_period
+   );
+
+   return js_mkstr(js, buf, strlen(buf));
+ }
+
+ static ant_value_t intl_dtf_resolved_options(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t obj = js_mkobj(js);
+   ant_value_t this_obj = js_getthis(js);
+
+   ant_value_t locale = is_object_type(this_obj) ? js_get(js, this_obj, "locale") : js_mkundef();
+   ant_value_t time_zone = is_object_type(this_obj) ? js_get(js, this_obj, "timeZone") : js_mkundef();
+
+   if (vtype(locale) != T_STR) locale = js_mkstr(js, "en-US", 5);
+   if (vtype(time_zone) != T_STR) time_zone = js_mkstr(js, "UTC", 3);
+
+   js_set(js, obj, "locale", locale);
+   js_set(js, obj, "timeZone", time_zone);
+
+   return obj;
+ }
+
+ static ant_value_t intl_dtf_make_part(ant_t *js, const char *type, const char *value) {
+   ant_value_t obj = js_mkobj(js);
+   js_set(js, obj, "type", js_mkstr(js, type, strlen(type)));
+   js_set(js, obj, "value", js_mkstr(js, value, strlen(value)));
+   return obj;
+ }
+
+ static ant_value_t intl_dtf_format_to_parts(ant_t *js, ant_value_t *args, int nargs) {
+   intl_dtf_fields_t fields;
+   intl_dtf_extract_fields(js, args, nargs, &fields);
+
+   char hour[8];
+   char minute[8];
+   char second[8];
+
+   snprintf(hour, sizeof(hour), "%d", fields.hour12);
+   snprintf(minute, sizeof(minute), "%02d", fields.minute);
+   snprintf(second, sizeof(second), "%02d", fields.second);
+
+   ant_value_t parts = js_mkarr(js);
+   js_arr_push(js, parts, intl_dtf_make_part(js, "hour", hour));
+   js_arr_push(js, parts, intl_dtf_make_part(js, "literal", ":"));
+   js_arr_push(js, parts, intl_dtf_make_part(js, "minute", minute));
+   js_arr_push(js, parts, intl_dtf_make_part(js, "literal", ":"));
+   js_arr_push(js, parts, intl_dtf_make_part(js, "second", second));
+   js_arr_push(js, parts, intl_dtf_make_part(js, "literal", " "));
+   js_arr_push(js, parts, intl_dtf_make_part(js, "dayPeriod", fields.day_period));
+
+   return parts;
+ }
+
+ static size_t intl_utf8_segment_len(const char *input, size_t remaining) {
+   if (remaining == 0) return 0;
+
+   const unsigned char *s = (const unsigned char *)input;
+   unsigned char c = s[0];
+   size_t len = 1;
+
+   if ((c & 0x80) == 0) return 1;
+   if ((c & 0xe0) == 0xc0) len = 2;
+   else if ((c & 0xf0) == 0xe0) len = 3;
+   else if ((c & 0xf8) == 0xf0) len = 4;
+
+   if (len > remaining) return 1;
+   for (size_t i = 1; i < len; i++) if ((s[i] & 0xc0) != 0x80) return 1;
+
+   return len;
+ }
+
+ static bool intl_ascii_is_word_byte(const char *segment, size_t len) {
+   if (len != 1) return true;
+
+   unsigned char c = (unsigned char)segment[0];
+   return
+     (c >= '0' && c <= '9') ||
+     (c >= 'A' && c <= 'Z') ||
+     (c >= 'a' && c <= 'z') ||
+     c == '_';
+ }
+
+ static const char *intl_segmenter_granularity(ant_t *js, ant_value_t segmenter, size_t *len) {
+   ant_value_t granularity = js_get(js, segmenter, "granularity");
+   if (vtype(granularity) != T_STR) {
+     if (len) *len = 8;
+     return "grapheme";
+   }
+
+   return js_getstr(js, granularity, len);
+ }
+
+ static ant_value_t intl_segmenter_segment(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t input = nargs > 0 ? js_tostring_val(js, args[0]) : js_mkstr(js, "", 0);
+   if (is_err(input)) return input;
+
+   size_t input_len = 0;
+   char *input_str = js_getstr(js, input, &input_len);
+   ant_value_t segments = js_mkarr(js);
+
+   ant_value_t this_obj = js_getthis(js);
+   size_t granularity_len = 0;
+   const char *granularity = intl_segmenter_granularity(js, this_obj, &granularity_len);
+   bool word_granularity = granularity_len == 4 && memcmp(granularity, "word", 4) == 0;
+
+   for (size_t offset = 0; offset < input_len;) {
+     size_t segment_len = intl_utf8_segment_len(input_str + offset, input_len - offset);
+     ant_value_t record = js_mkobj(js);
+
+     js_set(js, record, "segment", js_mkstr(js, input_str + offset, segment_len));
+     js_set(js, record, "index", js_mknum((double)offset));
+     js_set(js, record, "input", input);
+
+     if (word_granularity) js_set(
+       js, record, "isWordLike",
+       js_bool(intl_ascii_is_word_byte(input_str + offset, segment_len))
+     );
+
+     js_arr_push(js, segments, record);
+     offset += segment_len;
+   }
+
+   return segments;
+ }
+
+ static ant_value_t intl_segmenter_resolved_options(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t obj = js_mkobj(js);
+   ant_value_t this_obj = js_getthis(js);
+
+   size_t granularity_len = 0;
+   const char *granularity = intl_segmenter_granularity(js, this_obj, &granularity_len);
+
+   ant_value_t locale = is_object_type(this_obj) ? js_get(js, this_obj, "locale") : js_mkundef();
+   if (vtype(locale) != T_STR) locale = js_mkstr(js, "en-US", 5);
+
+   js_set(js, obj, "locale", locale);
+   js_set(js, obj, "granularity", js_mkstr(js, granularity, granularity_len));
+
+   return obj;
+ }
+
+ static ant_value_t intl_collator_constructor(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t locale = intl_resolve_locale(js, nargs > 0 ? args[0] : js_mkundef());
+   if (is_err(locale)) return locale;
+
+   ant_value_t obj = intl_create_instance(js, g_intl_collator_proto);
+   js_set(js, obj, "locale", locale);
+
+   return obj;
+ }
+
+ static ant_value_t intl_numberformat_constructor(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t locale = intl_resolve_locale(js, nargs > 0 ? args[0] : js_mkundef());
+   if (is_err(locale)) return locale;
+
+   ant_value_t obj = intl_create_instance(js, g_intl_numberformat_proto);
+   js_set(js, obj, "locale", locale);
+
+   return obj;
+ }
+
+ static ant_value_t intl_dtf_constructor(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t locale = intl_resolve_locale(js, nargs > 0 ? args[0] : js_mkundef());
+   if (is_err(locale)) return locale;
+
+   ant_value_t time_zone = intl_get_option_string(
+     js, nargs > 1 ? args[1] : js_mkundef(),
+     "timeZone", "UTC"
+   );
+   if (is_err(time_zone)) return time_zone;
+
+   ant_value_t obj = intl_create_instance(js, g_intl_datetimeformat_proto);
+   js_set(js, obj, "locale", locale);
+   js_set(js, obj, "timeZone", time_zone);
+
+   return obj;
+ }
+
+ static ant_value_t intl_segmenter_constructor(ant_t *js, ant_value_t *args, int nargs) {
+   ant_value_t locale = intl_resolve_locale(js, nargs > 0 ? args[0] : js_mkundef());
+   if (is_err(locale)) return locale;
+
+   ant_value_t granularity = intl_get_option_string(
+     js, nargs > 1 ? args[1] : js_mkundef(),
+     "granularity", "grapheme"
+   );
+   if (is_err(granularity)) return granularity;
+
+   ant_value_t obj = intl_create_instance(js, g_intl_segmenter_proto);
+   js_set(js, obj, "locale", locale);
+   js_set(js, obj, "granularity", granularity);
+
+   return obj;
+ }
+
+ void init_intl_module(void) {
+   ant_t *js = rt->js;
+
+   ant_value_t global = js_glob(js);
+   ant_value_t intl = js_mkobj(js);
+   ant_value_t object_proto = js->sym.object_proto;
+
+   if (is_object_type(object_proto)) js_set_proto_init(intl, object_proto);
+   js_set_sym(js, intl, get_toStringTag_sym(), js_mkstr(js, "Intl", 4));
+
+   g_intl_collator_proto = js_mkobj(js);
+   js_set(js, g_intl_collator_proto, "compare", js_mkfun(intl_collator_compare));
+   js_set(js, g_intl_collator_proto, "resolvedOptions", js_mkfun(intl_collator_resolved_options));
+   js_set_sym(js, g_intl_collator_proto, get_toStringTag_sym(), js_mkstr(js, "Intl.Collator", 13));
+   ant_value_t collator_ctor = js_make_ctor(js, intl_collator_constructor, g_intl_collator_proto, "Collator", 8);
+   js_set(js, intl, "Collator", collator_ctor);
+
+   g_intl_numberformat_proto = js_mkobj(js);
+   js_set(js, g_intl_numberformat_proto, "format", js_mkfun(intl_numberformat_format));
+   js_set(js, g_intl_numberformat_proto, "resolvedOptions", js_mkfun(intl_numberformat_resolved_options));
+   js_set_sym(js, g_intl_numberformat_proto, get_toStringTag_sym(), js_mkstr(js, "Intl.NumberFormat", 17));
+   ant_value_t numberformat_ctor = js_make_ctor(js, intl_numberformat_constructor, g_intl_numberformat_proto, "NumberFormat", 12);
+   js_set(js, intl, "NumberFormat", numberformat_ctor);
+
+   g_intl_datetimeformat_proto = js_mkobj(js);
+   js_set(js, g_intl_datetimeformat_proto, "format", js_mkfun(intl_dtf_format));
+   js_set(js, g_intl_datetimeformat_proto, "resolvedOptions", js_mkfun(intl_dtf_resolved_options));
+   js_set(js, g_intl_datetimeformat_proto, "formatToParts", js_mkfun(intl_dtf_format_to_parts));
+   js_set_sym(js, g_intl_datetimeformat_proto, get_toStringTag_sym(), js_mkstr(js, "Intl.DateTimeFormat", 19));
+   ant_value_t dtf_ctor = js_make_ctor(js, intl_dtf_constructor, g_intl_datetimeformat_proto, "DateTimeFormat", 14);
+   js_set(js, intl, "DateTimeFormat", dtf_ctor);
+
+   g_intl_segmenter_proto = js_mkobj(js);
+   js_set(js, g_intl_segmenter_proto, "segment", js_mkfun(intl_segmenter_segment));
+   js_set(js, g_intl_segmenter_proto, "resolvedOptions", js_mkfun(intl_segmenter_resolved_options));
+   js_set_sym(js, g_intl_segmenter_proto, get_toStringTag_sym(), js_mkstr(js, "Intl.Segmenter", 14));
+   ant_value_t segmenter_ctor = js_make_ctor(js, intl_segmenter_constructor, g_intl_segmenter_proto, "Segmenter", 9);
+   js_set(js, intl, "Segmenter", segmenter_ctor);
+
+   js_set(js, global, "Intl", intl);
+   js_set_descriptor(js, global, "Intl", 4, JS_DESC_W | JS_DESC_C);
+ }
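The digit-grouping loop in `intl_numberformat_format` is easier to follow in isolation. The sketch below mirrors that loop outside the runtime so it can be compiled and tested on its own. `group_thousands` is a hypothetical name, and the explicit capacity check is an addition for the standalone version; the commit instead relies on a fixed 128-byte buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical standalone helper (not part of the commit).
 * Copies a plain decimal string such as "-1234567.5" into out, inserting ','
 * between groups of three integer digits. Returns the output length, or 0 if
 * out is too small. Exponent forms are not handled here; the commit returns
 * those unformatted before reaching its grouping loop. */
static size_t group_thousands(const char *raw, char *out, size_t cap) {
  size_t raw_len = strlen(raw);
  const char *dot = memchr(raw, '.', raw_len);
  size_t int_len = dot ? (size_t)(dot - raw) : raw_len;
  size_t start = (raw_len > 0 && raw[0] == '-') ? 1 : 0;
  size_t pos = 0;

  /* Conservative bound: at most one separator per input byte, plus NUL. */
  if (cap < raw_len * 2 + 1) return 0;

  if (start) out[pos++] = '-';
  for (size_t i = start; i < int_len; i++) {
    out[pos++] = raw[i];
    size_t remaining = int_len - 1 - i; /* integer digits still to copy */
    if (remaining > 0 && remaining % 3 == 0) out[pos++] = ',';
  }

  if (dot) {
    size_t frac_len = raw_len - int_len; /* includes the '.' itself */
    memcpy(out + pos, dot, frac_len);
    pos += frac_len;
  }

  out[pos] = '\0';
  return pos;
}
```

Calling `group_thousands("1234567.5", buf, sizeof buf)` fills `buf` with `1,234,567.5`, the same string the test file expects from `numberFormat.format(1234567.5)`. The separator and group size are hard-coded, which matches the simplified `en-US`-only behavior noted in the tech-debt entry.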
+42
tests/test_intl.cjs
···
+ function assert(condition, message) {
+   if (!condition) throw new Error(message);
+ }
+
+ assert(typeof Intl === 'object', 'expected Intl global');
+ assert(Intl.constructor === Object, 'expected Intl to inherit from Object');
+
+ const collator = Intl.Collator('de-DE');
+ assert(collator instanceof Intl.Collator, 'expected Intl.Collator() to create an instance');
+ assert(typeof collator.compare === 'function', 'expected collator.compare');
+ assert(typeof collator.resolvedOptions === 'function', 'expected collator.resolvedOptions');
+
+ const numberFormat = new Intl.NumberFormat('en-US');
+ assert(numberFormat instanceof Intl.NumberFormat, 'expected NumberFormat instance');
+ assert(numberFormat.format(1234567.5) === '1,234,567.5', `unexpected formatted number: ${numberFormat.format(1234567.5)}`);
+
+ const dateTimeFormat = Intl.DateTimeFormat('en-US', { timeZone: 'Australia/Sydney' });
+ assert(dateTimeFormat instanceof Intl.DateTimeFormat, 'expected DateTimeFormat() to create an instance');
+ const dtfOptions = dateTimeFormat.resolvedOptions();
+ assert(dtfOptions.timeZone === 'Australia/Sydney', `unexpected timeZone: ${dtfOptions.timeZone}`);
+ const dtfFormatted = dateTimeFormat.format(0);
+ const dtfParts = dateTimeFormat.formatToParts(0);
+ assert(Array.isArray(dtfParts), 'expected formatToParts() to return an array');
+ assert(dtfParts.length === 7, `unexpected formatToParts() length: ${dtfParts.length}`);
+ assert(dtfParts.map(part => part.value).join('') === dtfFormatted, 'expected formatToParts() values to match format() output');
+ assert(dtfParts[0].type === 'hour', `unexpected first part type: ${dtfParts[0].type}`);
+ assert(dtfParts[6].type === 'dayPeriod', `unexpected last part type: ${dtfParts[6].type}`);
+
+ let rejected = false;
+ try {
+   Intl.Collator('x-en-US-12345');
+ } catch (error) {
+   rejected = true;
+ }
+ assert(rejected, 'expected invalid language tags to throw');
+
+ const segmenter = Intl.Segmenter('en-US', { granularity: 'word' });
+ const segments = segmenter.segment('ok');
+ assert(Array.isArray(segments), 'expected Intl.Segmenter().segment() to return an array');
+ assert(segments.length === 2, `unexpected segment count: ${segments.length}`);
+
+ console.log('intl test passed');
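The final assertion expects two segments for `'ok'` even with `granularity: 'word'`. That follows from how `intl_segmenter_segment` walks the input: it emits one record per UTF-8 sequence, whatever the granularity. The sketch below reproduces that walk as a standalone counter. `utf8_segment_len` copies the logic of the commit's `intl_utf8_segment_len`, while `count_segments` is a hypothetical wrapper added for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 sequence at input; malformed or truncated
 * sequences fall back to a single byte, as in the commit's helper. */
static size_t utf8_segment_len(const char *input, size_t remaining) {
  if (remaining == 0) return 0;

  const unsigned char *s = (const unsigned char *)input;
  unsigned char c = s[0];
  size_t len = 1;

  if ((c & 0x80) == 0) return 1;
  if ((c & 0xe0) == 0xc0) len = 2;
  else if ((c & 0xf0) == 0xe0) len = 3;
  else if ((c & 0xf8) == 0xf0) len = 4;

  if (len > remaining) return 1;
  for (size_t i = 1; i < len; i++) {
    if ((s[i] & 0xc0) != 0x80) return 1;
  }
  return len;
}

/* Hypothetical wrapper: number of records segment() would produce. */
static size_t count_segments(const char *text) {
  size_t total = strlen(text);
  size_t count = 0;
  for (size_t offset = 0; offset < total; count++) {
    offset += utf8_segment_len(text + offset, total - offset);
  }
  return count;
}
```

With this walk, `count_segments("ok")` is 2, which is the count the test asserts. An ECMA-402 word segmenter would be expected to report `ok` as a single word-like segment, so this assertion will probably need to change once `Segmenter` moves beyond the per-sequence implementation described in the tech-debt entry.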