The unpac monorepo manager self-hosting as a monorepo using unpac

Harden startup of -custom executables

By default, ocamlrun first tries to resolve argv[0] to determine where
the bytecode image is and then tries opening the executable image
itself. This is obviously correct for ocamlrun, when being called using
a shebang or executable header, but it's not correct for -custom
executables where we _know_ that the bytecode image should be with the
executable. To achieve this, a new mode is added to
caml_byte_program_mode and the former COMPLETE_EXE mode is renamed, so
that caml_byte_program_mode is now one of STANDARD (for ocamlrun - the
existing behaviour), APPENDED (for -custom executables - the new
behaviour) or EMBEDDED (for -output-complete-exe/-output-obj - the
original use of it).

The mode is also set directly by the linker, rather than having a
default in libcamlrun which is then overridden by the startup code for
-output-complete-exe.

In the new APPENDED mode, if caml_executable_name is implemented (i.e.
it returns a string) then this file _must_ contain the bytecode image
and no other mechanisms are used. On platforms where
caml_executable_name is not implemented, APPENDED falls back to STANDARD
for compatibility.
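
As a rough illustration of the lookup order described above, here is a
minimal C sketch. It is not the runtime's code: the enum is a local copy,
and `struct lookup_plan` and `plan_bytecode_lookup` are hypothetical names
introduced only for this example. `self_exe` stands for the result of
caml_executable_name (NULL where it is not implemented). The
`fatal_on_failure` flag reflects the startup_byt.c change on this page,
where APPENDED mode reports an error rather than trying further
mechanisms.

```c
#include <stddef.h>

/* Local copy of the three modes, for illustration only. */
enum byte_program_mode { STANDARD, APPENDED, EMBEDDED };

/* Hypothetical summary of which paths would be tried, and in what order. */
struct lookup_plan {
  const char *first;     /* path tried first, or NULL if none */
  const char *second;    /* path tried if the first fails, or NULL */
  int fatal_on_failure;  /* 1 if failing to open these is an error */
};

struct lookup_plan plan_bytecode_lookup(enum byte_program_mode mode,
                                        const char *argv0,
                                        const char *self_exe)
{
  struct lookup_plan plan = { NULL, NULL, 0 };

  if (mode == EMBEDDED)
    return plan;  /* bytecode is linked in as C data; no file lookup */

  if (mode == APPENDED && self_exe != NULL) {
    /* -custom: only the running executable is considered. */
    plan.first = self_exe;
  } else {
    /* ocamlrun, or -custom on a platform without an executable name:
       argv[0] first, then the running executable if it is known. */
    plan.first = argv0;
    plan.second = self_exe;
  }
  plan.fatal_on_failure = (mode == APPENDED);
  return plan;
}
```

With this model, `plan_bytecode_lookup(APPENDED, argv[0], self_exe)` never
names argv[0] when `self_exe` is available, which is the property the
commit relies on.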

Technically, this stops an argv[0] injection attack on setuid/setgid
-custom bytecode executables, although setuid should be used with
-output-complete-exe, if at all.

+51 -25
+5
Changes
···
   normalised on both Windows and Unix.
   (David Allsopp, review by Jonah Beckford, Damien Doligez and Hugo Heuzard)

+- #14244: Executables linked with `ocamlc -custom` now always attempt to load
+  bytecode from the executable itself, rather than first trying `argv[0]`.
+  (David Allsopp, review by Jonah Beckford, Antonin Décimo, Damien Doligez and
+  Samuel Hym)
+
 - #12269, #12410, #13063: Fix unsafety, deadlocks, and/or leaks should
   rare errors happen during domain creation and thread
   creation/registration.
+7 -1
bytecomp/bytelink.ml
···
 #include <caml/sys.h>
 #include <caml/misc.h>

+const enum caml_byte_program_mode caml_byte_program_mode = EMBEDDED;
+
 static int caml_code[] = {
 |};
     Symtable.init();
···
     output_string outchan {|
 int main_os(int argc, char_os **argv)
 {
-  caml_byte_program_mode = COMPLETE_EXE;
   caml_startup_code(caml_code, sizeof(caml_code),
                     caml_data, sizeof(caml_data),
                     caml_sections, sizeof(caml_sections),
···
 extern "C" {
 #endif

+#define CAML_INTERNALS
 #define CAML_INTERNALS_NO_PRIM_DECLARATIONS
+
 #include <caml/mlvalues.h>
+#include <caml/startup.h>
+
+const enum caml_byte_program_mode caml_byte_program_mode = APPENDED;

 |};
     Symtable.output_primitive_table poc;
+2 -2
runtime/backtrace_byt.c
···
   CAMLassert(di->already_read == 0);
   di->already_read = 1;

-  /* At the moment, bytecode programs built with --output-complete-exe
+  /* At the moment, bytecode programs built with -output-complete-exe
      do not contain any debug info.

      See https://github.com/ocaml/ocaml/issues/9344 for details.
   */
-  if (caml_params->cds_file == NULL && caml_byte_program_mode == COMPLETE_EXE)
+  if (caml_params->cds_file == NULL && caml_byte_program_mode == EMBEDDED)
     CAMLreturn0;

   if (caml_params->cds_file != NULL) {
+6 -6
runtime/caml/startup.h
···
 extern int32_t caml_seek_section(int fd, struct exec_trailer *trail,
                                  const char *name);

-enum caml_byte_program_mode
-  {
-    STANDARD /* normal bytecode program requiring "ocamlrun" */,
-    COMPLETE_EXE /* embedding the vm, i.e. compiled with --output-complete-exe */
-  };
+enum caml_byte_program_mode {
+  STANDARD, /* Default mode for ocamlrun */
+  APPENDED, /* bytecode must be appended (i.e. -custom) */
+  EMBEDDED  /* bytecode embedded in C (e.g. -output-complete-exe/-output-obj) */
+};

-extern enum caml_byte_program_mode caml_byte_program_mode;
+extern const enum caml_byte_program_mode caml_byte_program_mode;

 #endif /* CAML_INTERNALS */

+7
runtime/gen_primsc.sh
···
 #define CAML_INTERNALS
 #include "caml/mlvalues.h"
 #include "caml/prims.h"
+#include "caml/startup.h"

 EOF

···
 echo 'const char * const caml_names_of_builtin_cprim[] = {'
 sed -e 's/.*/ "&",/' "$primitives"
 echo ' 0 };'
+
+# ocamlrun is able to use any of the mechanisms to load the bytecode
+cat <<'EOF'
+
+const enum caml_byte_program_mode caml_byte_program_mode = STANDARD;
+EOF
+22 -10
runtime/startup_byt.c
···
     ? 0 : WRONG_MAGIC;
 }

-enum caml_byte_program_mode caml_byte_program_mode = STANDARD;
-
 int caml_attempt_open(char_os **name, struct exec_trailer *trail,
                       int do_open_script)
 {
···

 CAMLexport void caml_main(char_os **argv)
 {
-  int fd, pos;
+  int fd = -1, pos;
   struct exec_trailer trail;
   struct channel * chan;
   value res;
···
   /* Determine position of bytecode file */
   pos = 0;

-  /* First, try argv[0] (when ocamlrun is called by a bytecode program) */
-  exe_name = argv[0];
-  fd = caml_attempt_open(&exe_name, &trail, 0);
-
   proc_self_exe = caml_executable_name();

+  /* In APPENDED mode (i.e. with -custom), we always want to load the bytecode
+     from the running executable, and argv[0] should never be used. However,
+     some platforms still don't implement caml_executable_name, so there is an
+     escape hatch here to fallback to checking argv[0] if proc_self_exe is
+     NULL.
+     For STANDARD mode (i.e. the current executable is ocamlrun), argv[0] is
+     tried first, as this should be the path to shebang-script/executable
+     originally executed by the user. */
+  CAMLassert(caml_byte_program_mode != EMBEDDED);
+  if (caml_byte_program_mode != APPENDED || proc_self_exe == NULL) {
+    exe_name = argv[0];
+    fd = caml_attempt_open(&exe_name, &trail, 0);
+  }
+
   /* Little grasshopper wonders why we do that at all, since
      "The current executable is ocamlrun itself, it's never a bytecode
      program". Little grasshopper "ocamlc -custom" in mind should keep.
      With -custom, we have an executable that is ocamlrun itself
      concatenated with the bytecode. So, if the attempt with argv[0]
      failed, it is worth trying again with executable_name.
   */
-  if (fd < 0 && proc_self_exe != NULL) {
-    exe_name = proc_self_exe;
-    fd = caml_attempt_open(&exe_name, &trail, 0);
+  if (caml_byte_program_mode == APPENDED || fd < 0) {
+    if (proc_self_exe != NULL) {
+      exe_name = proc_self_exe;
+      fd = caml_attempt_open(&exe_name, &trail, 0);
+    }
+    if (fd < 0 && caml_byte_program_mode == APPENDED)
+      error("unable to open file '%s'", caml_stat_strdup_of_os(exe_name));
   }

   if (fd < 0) {
+2 -6
testsuite/tools/testLinkModes.ml
···
       else
         Success {executable_name = argv0_resolved; argv0}
     else
-      if Sys.win32 || argv0_not_ocaml then
-        (* SearchPath will resolve the relative/implicit arguments to
-           absolute paths *)
-        Success {executable_name = test_program_path; argv0}
-      else
-        Success {executable_name = argv0_resolved; argv0}
+      (* -custom executables use caml_executable_name *)
+      Success {executable_name = test_program_path; argv0}
   | Vanilla ->
     if Harness.no_caml_executable_name then
       Success {executable_name = argv0_resolved; argv0}