+(*
+ * ISC License
+ *
+ * Copyright (c) 2025 Anil Madhavapeddy <anil@recoil.org>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ *
+ *)
+163-1
README.md
-A sha256 experiment
+# Oxsha - Fast SHA256 Hashing for OCaml
+
+A high-performance SHA256 hashing library for OCaml with zero-copy C bindings using bigarrays.
+
+## Features
+
+- **Zero-copy performance**: Uses bigarrays for efficient data transfer to C
+- **Hardware acceleration**: Detects the CPU architecture at build time and enables the compiler flags for SHA extensions (Intel SHA-NI, ARM Crypto)
+- **Streaming API**: Incremental hashing with the create/update/final pattern
+- **Multiple interfaces**: Support for bigarrays, bytes, and strings (bytes and strings are copied into a temporary bigarray)
+- **Memory-mapped files**: sha256sum example uses `Unix.map_file` for true zero-copy file hashing
+- **Minimal dependencies**: Standalone library with no external dependencies
+- **Well-documented**: Comprehensive API documentation
+
+## Installation
+
+```bash
+opam install oxsha
+```
+
+Or build from source:
+
+```bash
+dune build
+dune install
+```
+
+## Quick Start
+
+```ocaml
+(* [hex_of_bytes : bytes -> string] is a user-supplied hex encoder;
+   Oxsha itself does not export one. *)
+let () =
+  (* One-shot hashing *)
+  let digest = Oxsha.hash_string "Hello, World!" in
+  Printf.printf "SHA256: %s\n" (hex_of_bytes digest);
+
+  (* Streaming API for large data *)
+  let ctx = Oxsha.create () in
+  Oxsha.update_string ctx "Hello, ";
+  Oxsha.update_string ctx "World!";
+  let digest = Oxsha.final ctx in
+  Printf.printf "SHA256: %s\n" (hex_of_bytes digest)
+```
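Review note on the Quick Start: the snippet calls `hex_of_bytes`, which is not among the functions declared in `lib/oxsha.mli`. A minimal sketch of such a helper, using only the standard library, might look like the following. The name and signature are taken from the README's usage and are otherwise an assumption.

```ocaml
(* Hypothetical helper, not part of Oxsha: lowercase hex encoding of a digest. *)
let hex_of_bytes (b : bytes) : string =
  let buf = Buffer.create (2 * Bytes.length b) in
  (* Each byte becomes exactly two hexadecimal characters. *)
  Bytes.iter
    (fun c -> Buffer.add_string buf (Printf.sprintf "%02x" (Char.code c)))
    b;
  Buffer.contents buf
```

A 32-byte digest therefore renders as a 64-character string.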
+
+## API Overview
+
+### Low-Level API (`Oxsha.Raw`)
+
+The Raw module provides direct access to the C implementation:
+
+- `create : unit -> t` - Create a new SHA256 context
+- `update : t -> bigarray -> unit` - Update with bigarray data (zero-copy)
+- `update_bytes : t -> bytes -> unit` - Update with bytes (copied into a temporary bigarray)
+- `update_string : t -> string -> unit` - Update with string (copied into a temporary bigarray)
+- `final : t -> bytes` - Finalize and get 32-byte digest
+- `hash : bigarray -> bytes` - One-shot hash for bigarrays
+- `hash_bytes : bytes -> bytes` - One-shot hash for bytes
+- `hash_string : string -> bytes` - One-shot hash for strings
+
+### High-Level API
+
+All Raw module functions are re-exported at the top level for convenience:
+
+```ocaml
+Oxsha.create ()
+Oxsha.update_string ctx "data"
+Oxsha.final ctx
+```
+
+## Performance Considerations
+
+For maximum performance:
+1. Use the bigarray API directly when possible to avoid copying
+2. Use the streaming API for large files to avoid loading everything in memory
+3. Use `Unix.map_file` for hashing large files (see sha256sum example)
+4. The C hashing routines do not allocate; the bytes and string wrappers allocate one temporary bigarray per call
+5. Compiler flags for hardware SHA extensions are added at build time when the target architecture supports them
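Review note on items 1 and 2 of the list above: one way to combine the bigarray and streaming APIs is to walk a large buffer in fixed-size windows. The sketch below is self-contained and uses only the standard library; `iter_chunks` and its `chunk_size` parameter are hypothetical names, and the callback `f` stands in for something like `Oxsha.update ctx`.

```ocaml
open Bigarray

type buf = (char, int8_unsigned_elt, c_layout) Array1.t

(* Call [f] on successive windows of at most [chunk_size] bytes.
   [Array1.sub] shares storage with [data], so no bytes are copied. *)
let iter_chunks ~chunk_size (data : buf) (f : buf -> unit) : unit =
  if chunk_size <= 0 then invalid_arg "iter_chunks: chunk_size must be positive";
  let len = Array1.dim data in
  let rec go off =
    if off < len then begin
      let n = min chunk_size (len - off) in
      f (Array1.sub data off n);
      go (off + n)
    end
  in
  go 0
```

Because each window is a view rather than a copy, the only per-chunk cost on the OCaml side is the small bigarray header that `Array1.sub` allocates.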
+
+### Hardware Acceleration
+
+The library automatically detects your CPU architecture at build time and enables hardware SHA acceleration:
+
+- **x86/x86_64**: Uses Intel SHA Extensions (`-msse4.1 -msha`)
+- **ARM64/AArch64**: Uses ARM Crypto Extensions (`-march=armv8-a+crypto`)
+
+This is handled transparently by a dune configurator script in `lib/discover/`.
+
+## Examples
+
+### Basic Usage
+
+See `examples/basic_usage.ml` for complete examples:
+
+```bash
+dune exec examples/basic_usage.exe
+```
+
+### SHA256sum Utility
+
+A drop-in replacement for the `sha256sum` command that uses memory-mapped files for zero-copy hashing:
+
+```bash
+# Hash one or more files
+dune exec examples/sha256sum.exe -- README.md lib/oxsha.mli
+
+# Output format is identical to sha256sum
+5663d62d52903366546603da52d18ccbf36ef7265653b641b980ec36891c7afe README.md
+b4cbb3d0d18b90cc63c0e3e8c95f4e933d1a361a5eae142e64caf17724a1447f lib/oxsha.mli
+```
+
+The sha256sum example demonstrates true zero-copy file hashing by memory-mapping files directly into bigarrays.
+
+## Building
+
+```bash
+# Clean build
+opam exec -- dune clean
+
+# Build library
+opam exec -- dune build @check
+
+# Build documentation
+opam exec -- dune build @doc
+
+# Build ignoring warnings (release mode)
+opam exec -- dune build @check --profile=release
+```
+
+## Project Structure
+
+```
+oxsha/
+├── lib/
+│   ├── oxsha.ml           # OCaml implementation
+│   ├── oxsha.mli          # Public interface
+│   ├── oxsha_stubs.c      # C FFI bindings
+│   ├── sha256.c           # SHA256 C implementation
+│   ├── sha256.h           # C header
+│   ├── dune               # Build rules
+│   └── discover/
+│       ├── discover.ml    # Architecture detection for C flags
+│       └── dune           # Configurator build rules
+├── examples/
+│   ├── basic_usage.ml     # API usage examples
+│   ├── sha256sum.ml       # sha256sum utility with mmap
+│   └── dune
+├── dune-project           # Project metadata
+└── README.md
+```
+
+## C Implementation
+
+The library uses Brad Conte's public domain SHA256 implementation. The C context is stored inline in an OCaml custom block (allocated with `caml_alloc_custom`), and the block's finalizer zeroes the context before the memory is reclaimed.
+
+### Build-Time Configuration
+
+The build system uses `dune-configurator` to detect the CPU architecture and automatically add the appropriate compiler flags for hardware SHA acceleration. The configurator script (`lib/discover/discover.ml`) runs during the build and generates a `c_flags.sexp` file that dune includes in the C compilation flags.
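Review note on the build-time configuration: `lib/discover/discover.ml` is not part of this diff, so the following is only a sketch of the decision it is described as making. The flag values come from the Hardware Acceleration section of the README; the function names and the set of architecture strings are assumptions, and the real script presumably obtains the architecture through `dune-configurator` rather than taking it as an argument.

```ocaml
(* Hypothetical mapping from an architecture name to extra C flags. *)
let c_flags_for_arch (arch : string) : string list =
  match String.lowercase_ascii arch with
  | "x86" | "i386" | "i686" | "x86_64" | "amd64" -> [ "-msse4.1"; "-msha" ]
  | "arm64" | "aarch64" -> [ "-march=armv8-a+crypto" ]
  | _ -> [] (* unknown architecture: add no flags *)

(* Render a flag list as the s-expression body dune expects in an
   included flags file such as c_flags.sexp. *)
let to_sexp (flags : string list) : string =
  "(" ^ String.concat " " flags ^ ")"
```

Returning an empty list for unrecognised architectures keeps the build working on platforms without SHA extensions.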
+
+## License
+
+ISC License
+
+## Contributing
+
+Contributions welcome! Please ensure all tests pass before submitting PRs.
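Review note on "ensure all tests pass": no test directory appears in the project structure above, so a known-answer check may be a useful starting point. The sketch below lists two standard SHA-256 test vectors (the empty string and `"abc"`) and a checker parameterised over any `string -> string` function that returns a lowercase hex digest. `known_vectors` and `failures` are hypothetical names; wiring in `Oxsha.hash_string` plus a hex encoder is left as an assumption.

```ocaml
(* Standard SHA-256 known-answer vectors: (input, lowercase hex digest). *)
let known_vectors : (string * string) list = [
  ("", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855");
  ("abc", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad");
]

(* [failures hash_hex] returns the inputs whose digest does not match. *)
let failures (hash_hex : string -> string) : string list =
  List.filter_map
    (fun (input, expected) ->
      if String.equal (hash_hex input) expected then None else Some input)
    known_vectors
```

An empty result from `failures` means every vector matched.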
-173
bench/bench_sha256.ml
-open Sha256
-
-(* Memory allocation tracking *)
-let measure_allocations f =
-  let before = Gc.allocated_bytes () in
-  let result = f () in
-  let after = Gc.allocated_bytes () in
-  (result, after -. before)
-
-(* Benchmark different scenarios *)
-let bench_sizes () =
-  print_endline "Benchmarking various input sizes:";
-  print_endline "Size (B) | Iterations | Time (s) | Throughput (MB/s) | Allocations (B)";
-  print_endline "---------|------------|----------|-------------------|----------------";
-
-  let sizes = [
-    (16, 100000);
-    (64, 100000);
-    (256, 50000);
-    (1024, 20000);
-    (4096, 5000);
-    (16384, 1000);
-    (65536, 250);
-    (262144, 60);
-    (1048576, 15);
-  ] in
-
-  List.iter (fun (size, iterations) ->
-    let data = String.make size 'x' in
-
-    (* Warmup *)
-    for _ = 1 to 10 do
-      ignore (hash_string data)
-    done;
-
-    (* Benchmark *)
-    let start = Unix.gettimeofday () in
-    let _, allocs = measure_allocations (fun () ->
-      for _ = 1 to iterations do
-        ignore (hash_string data)
-      done
-    ) in
-    let elapsed = Unix.gettimeofday () -. start in
-
-    let throughput = (float_of_int (size * iterations)) /. elapsed /. 1_000_000.0 in
-    let allocs_per_op = allocs /. float_of_int iterations in
-
-    Printf.printf "%8d | %10d | %8.3f | %17.1f | %14.0f\n"
-      size iterations elapsed throughput allocs_per_op
-  ) sizes
-
-let bench_parallel_scaling () =
-  print_endline "\nParallel scaling benchmark:";
-  print_endline "Threads | Hashes | Time (s) | Hashes/sec | Speedup";
-  print_endline "--------|--------|----------|------------|--------";
-
-  let num_hashes = 10000 in
-  let data_size = 1024 in
-  let inputs = List.init num_hashes (fun i ->
-    Bytes.of_string (String.make data_size (Char.chr (65 + (i mod 26))))
-  ) in
-
-  (* Sequential baseline *)
-  let start_seq = Unix.gettimeofday () in
-  let _ = List.map hash_bytes inputs in
-  let time_seq = Unix.gettimeofday () -. start_seq in
-  let hashes_per_sec_seq = float_of_int num_hashes /. time_seq in
-
-  Printf.printf "%7d | %6d | %8.3f | %10.0f | %7.2fx\n"
-    1 num_hashes time_seq hashes_per_sec_seq 1.0;
-
-  (* Parallel with different thread counts *)
-  let thread_counts = [2; 4; 8] in
-  List.iter (fun threads ->
-    (* Simulate parallel execution with multiple Parallel.fork_join2 calls *)
-    let par = Parallel.create () in
-    let chunk_size = num_hashes / threads in
-
-    let start_par = Unix.gettimeofday () in
-
-    (* Process in parallel chunks *)
-    let rec process_chunks remaining acc =
-      match remaining with
-      | [] -> acc
-      | chunk :: [] -> (List.map hash_bytes chunk) :: acc
-      | chunk1 :: chunk2 :: rest ->
-        let r1, r2 = Parallel.fork_join2 par
-          (fun _ -> List.map hash_bytes chunk1)
-          (fun _ -> List.map hash_bytes chunk2)
-        in
-        process_chunks rest (r2 :: r1 :: acc)
-    in
-
-    (* Split inputs into chunks *)
-    let rec split_into_chunks lst n acc =
-      if n <= 0 || lst = [] then List.rev acc
-      else
-        let rec take k lst acc =
-          if k = 0 || lst = [] then (List.rev acc, lst)
-          else match lst with
-            | h::t -> take (k-1) t (h::acc)
-            | [] -> (List.rev acc, [])
-        in
-        let (chunk, rest) = take chunk_size lst [] in
-        split_into_chunks rest (n-1) (chunk :: acc)
-    in
-
-    let chunks = split_into_chunks inputs threads [] in
-    let _ = process_chunks chunks [] in
-
-    let time_par = Unix.gettimeofday () -. start_par in
-    let hashes_per_sec_par = float_of_int num_hashes /. time_par in
-    let speedup = time_seq /. time_par in
-
-    Printf.printf "%7d | %6d | %8.3f | %10.0f | %7.2fx\n"
-      threads num_hashes time_par hashes_per_sec_par speedup
-  ) thread_counts
-
-let bench_zero_allocation () =
-  print_endline "\nZero-allocation verification:";
-
-  (* Create aligned buffer *)
-  let size = 1024 in
-  let buffer = Bigarray.Array1.create Bigarray.int8_unsigned Bigarray.c_layout size in
-  for i = 0 to size - 1 do
-    Bigarray.Array1.set buffer i (65 + (i mod 26))
-  done;
-
-  (* Measure allocations for direct oneshot call *)
-  Gc.full_major ();
-  let before = Gc.allocated_bytes () in
-
-  for _ = 1 to 1000 do
-    ignore (oneshot buffer (Int64.of_int size))
-  done;
-
-  let after = Gc.allocated_bytes () in
-  let allocs_per_hash = (after -. before) /. 1000.0 in
-
-  Printf.printf " Direct oneshot (bigarray): %.1f bytes/hash\n" allocs_per_hash;
-
-  (* Compare with string version *)
-  let str = String.make size 'x' in
-  Gc.full_major ();
-  let before_str = Gc.allocated_bytes () in
-
-  for _ = 1 to 1000 do
-    ignore (hash_string str)
-  done;
-
-  let after_str = Gc.allocated_bytes () in
-  let allocs_per_hash_str = (after_str -. before_str) /. 1000.0 in
-
-  Printf.printf " String wrapper: %.1f bytes/hash\n" allocs_per_hash_str;
-
-  if allocs_per_hash < 100.0 then
-    print_endline " ✓ Near-zero allocation achieved!"
-  else
-    print_endline " ⚠ Higher than expected allocations"
-
-let () =
-  print_endline "SHA256 Performance Benchmark Suite";
-  print_endline "===================================\n";
-
-  (* Check CPU support *)
-  print_endline "System Information:";
-  Printf.printf " OCaml version: %s\n" Sys.ocaml_version;
-  Printf.printf " Word size: %d bits\n" Sys.word_size;
-  Printf.printf " OS: %s\n\n" Sys.os_type;
-
-  bench_sizes ();
-  bench_parallel_scaling ();
-  bench_zero_allocation ()
+(** Fast SHA256 hashing library with zero-copy C bindings. *)
+
+module Raw = struct
+  (** The SHA256 context type wrapping the C SHA256_CTX structure. *)
+  type t
+
+  (** External C functions *)
+  external create : unit -> t = "oxsha_create"
+
+  external update :
+    t ->
+    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t ->
+    unit
+    = "oxsha_update"
+
+  external final : t -> bytes = "oxsha_final"
+
+  (** Convenience function: update with bytes *)
+  let update_bytes ctx data =
+    let len = Bytes.length data in
+    let ba = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
+    for i = 0 to len - 1 do
+      Bigarray.Array1.unsafe_set ba i (Bytes.unsafe_get data i)
+    done;
+    update ctx ba
+
+  (** Convenience function: update with string *)
+  let update_string ctx data =
+    let len = String.length data in
+    let ba = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
+    for i = 0 to len - 1 do
+      Bigarray.Array1.unsafe_set ba i (String.unsafe_get data i)
+    done;
+    update ctx ba
+
+  (** One-shot hash function for bigarrays *)
+  let hash data =
+    let ctx = create () in
+    update ctx data;
+    final ctx
+
+  (** One-shot hash function for bytes *)
+  let hash_bytes data =
+    let ctx = create () in
+    update_bytes ctx data;
+    final ctx
+
+  (** One-shot hash function for strings *)
+  let hash_string data =
+    let ctx = create () in
+    update_string ctx data;
+    final ctx
+end
+
+(** Re-export Raw module contents at top level *)
+include Raw
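Review note on `update_bytes` and `update_string`: both bodies repeat the same copy loop. One possible refactoring, sketched here with standard-library calls only, is a single conversion helper shared by both. `bigarray_of_string` and `bigarray_of_bytes` are hypothetical names, not functions in this diff.

```ocaml
(* Copy a string into a freshly allocated char bigarray. *)
let bigarray_of_string (s : string) =
  let len = String.length s in
  let ba = Bigarray.Array1.create Bigarray.char Bigarray.c_layout len in
  String.iteri (fun i c -> Bigarray.Array1.unsafe_set ba i c) s;
  ba

(* Bytes can reuse the string path. [Bytes.unsafe_to_string] avoids an
   intermediate copy and is safe here only because the bytes value is
   not mutated while the conversion runs. *)
let bigarray_of_bytes (b : bytes) =
  bigarray_of_string (Bytes.unsafe_to_string b)
```

With such helpers, each wrapper would reduce to `update ctx (bigarray_of_string data)` or the bytes equivalent, still at the cost of one copy per call.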
+89
lib/oxsha.mli
+(** Fast SHA256 hashing library with zero-copy C bindings.
+
+    This library provides OCaml bindings to a C SHA256 implementation
+    using bigarrays for efficient, zero-copy hashing. *)
+
+(** {1 Raw C Bindings} *)
+
+module Raw : sig
+  (** Low-level bindings to the C SHA256 implementation.
+
+      This module provides direct access to the C functions with minimal
+      overhead. Bigarray inputs are hashed in place without copying; the
+      bytes and string variants copy their input into a temporary bigarray. *)
+
+  (** The SHA256 context type. This is an abstract type wrapping the C
+      SHA256_CTX structure. *)
+  type t
+
+  (** [create ()] allocates and initializes a new SHA256 context.
+
+      @return A fresh context ready for hashing. *)
+  val create : unit -> t
+
+  (** [update ctx data] updates the hash state with new data.
+
+      This function processes the input data incrementally. It can be called
+      multiple times to hash data in chunks.
+
+      @param ctx The SHA256 context to update
+      @param data A bigarray containing the data to hash. Uses bigarrays for
+      zero-copy access from the C side. *)
+  val update :
+    t ->
+    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t ->
+    unit
+
+  (** [update_bytes ctx data] updates the hash state with bytes data.
+
+      This is a convenience function that copies the bytes into a temporary
+      bigarray before hashing.
+
+      @param ctx The SHA256 context to update
+      @param data Bytes to hash *)
+  val update_bytes : t -> bytes -> unit
+
+  (** [update_string ctx data] updates the hash state with string data.
+
+      This is a convenience function that copies the string into a temporary
+      bigarray before hashing.
+
+      @param ctx The SHA256 context to update
+      @param data String to hash *)
+  val update_string : t -> string -> unit
+
+  (** [final ctx] finalizes the hash computation and returns the digest.
+
+      After calling this function, the context should not be used again.
+
+      @param ctx The SHA256 context to finalize
+      @return A 32-byte digest as a bytes value *)
+  val final : t -> bytes
+
+  (** [hash data] is a convenience function that performs a complete hash
+      in one operation: create, update, and final.
+
+      @param data The bigarray data to hash
+      @return A 32-byte digest *)
+  val hash :
+    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t ->
+    bytes
+
+  (** [hash_bytes data] hashes bytes data in one operation.
+
+      @param data The bytes to hash
+      @return A 32-byte digest *)
+  val hash_bytes : bytes -> bytes
+
+  (** [hash_string data] hashes string data in one operation.
+
+      @param data The string to hash
+      @return A 32-byte digest *)
+  val hash_string : string -> bytes
+end
+
+(** {1 High-Level Interface} *)
+
+(** Re-export the Raw module as the main interface.
+
+    The Raw module provides the most efficient interface using bigarrays.
+    Higher-level abstractions can be added in the future if needed. *)
+
+include module type of Raw
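Review note on `final`: the interface says the context "should not be used again" after finalization, but nothing in the types enforces that. A future higher-level layer could track the state at runtime. The sketch below is generic over the wrapped context so that it runs without the library; `guarded`, `guard`, `update` and `finalize` are hypothetical names, and a `Buffer.t` stands in for an `Oxsha.t` in the usage example.

```ocaml
(* A context paired with a flag recording whether it has been finalized. *)
type 'ctx guarded = { ctx : 'ctx; mutable finished : bool }

let guard (ctx : 'ctx) : 'ctx guarded = { ctx; finished = false }

(* Run [f] on the context, refusing if it was already finalized. *)
let update (g : 'ctx guarded) (f : 'ctx -> 'a) : 'a =
  if g.finished then invalid_arg "context already finalized";
  f g.ctx

(* Mark the context as finished, then run the finalizing function once. *)
let finalize (g : 'ctx guarded) (f : 'ctx -> 'a) : 'a =
  if g.finished then invalid_arg "context already finalized";
  g.finished <- true;
  f g.ctx

(* Usage with a stand-in context: *)
let () =
  let g = guard (Buffer.create 16) in
  update g (fun b -> Buffer.add_string b "ab");
  ignore (finalize g Buffer.contents)
```

A runtime flag is the simplest option; a phantom-typed or abstract-state design could reject misuse at compile time instead, at the cost of a heavier interface.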
+92
lib/oxsha_stubs.c
+/*
+ * OCaml bindings for SHA256 C implementation.
+ * Uses custom blocks and bigarrays for zero-copy performance.
+ */
+
+#include <string.h>
+#include <caml/mlvalues.h>
+#include <caml/memory.h>
+#include <caml/alloc.h>
+#include <caml/custom.h>
+#include <caml/fail.h>
+#include <caml/bigarray.h>
+
+#include "sha256.h"
+
+/* Custom block operations for SHA256_CTX */
+
+static void oxsha_ctx_finalize(value v_ctx)
+{
+    SHA256_CTX *ctx = (SHA256_CTX *)Data_custom_val(v_ctx);
+    /* Just clear the memory for security */
+    memset(ctx, 0, sizeof(SHA256_CTX));
+}
+
+static struct custom_operations oxsha_ctx_ops = {
+    "com.oxsha.sha256_ctx",
+    oxsha_ctx_finalize,
+    custom_compare_default,
+    custom_hash_default,
+    custom_serialize_default,
+    custom_deserialize_default,
+    custom_compare_ext_default,
+    custom_fixed_length_default
+};
+
+/* Allocate a custom block with the SHA256_CTX stored inline */
+static value alloc_oxsha_ctx(void)
+{
+    value v_ctx = caml_alloc_custom(&oxsha_ctx_ops, sizeof(SHA256_CTX), 0, 1);
+    return v_ctx;
+}
+
+/* Extract SHA256_CTX pointer from OCaml value */
+#define Oxsha_ctx_val(v) ((SHA256_CTX *)Data_custom_val(v))
+
+/* FFI Functions */
+
+/* oxsha_create : unit -> t */
+CAMLprim value oxsha_create(value unit)
+{
+    CAMLparam1(unit);
+    CAMLlocal1(v_ctx);
+
+    v_ctx = alloc_oxsha_ctx();
+    SHA256_CTX *ctx = Oxsha_ctx_val(v_ctx);
+    sha256_init(ctx);
+
+    CAMLreturn(v_ctx);
+}
+
+/* oxsha_update : t -> bigarray -> unit */
+CAMLprim value oxsha_update(value v_ctx, value v_data)
+{
+    CAMLparam2(v_ctx, v_data);
+
+    SHA256_CTX *ctx = Oxsha_ctx_val(v_ctx);
+
+    /* Extract bigarray data pointer and length */
+    unsigned char *data = (unsigned char *)Caml_ba_data_val(v_data);
+    size_t len = Caml_ba_array_val(v_data)->dim[0];
+
+    sha256_update(ctx, data, len);
+
+    CAMLreturn(Val_unit);
+}
+
+/* oxsha_final : t -> bytes */
+CAMLprim value oxsha_final(value v_ctx)
+{
+    CAMLparam1(v_ctx);
+    CAMLlocal1(v_digest);
+
+    SHA256_CTX *ctx = Oxsha_ctx_val(v_ctx);
+
+    /* Allocate bytes for the 32-byte digest (SHA256_BLOCK_SIZE is the digest length) */
+    v_digest = caml_alloc_string(SHA256_BLOCK_SIZE);
+    /* Bytes_val gives a writable pointer; String_val is const in recent OCaml */
+    unsigned char *digest = Bytes_val(v_digest);
+
+    sha256_final(ctx, digest);
+
+    CAMLreturn(v_digest);
+}
+/*********************************************************************
+* Filename: sha256.h
+* Author: Brad Conte (brad AT bradconte.com)
+* Copyright:
+* Disclaimer: This code is presented "as is" without any guarantees.
+* Details: Defines the API for the corresponding SHA256 implementation.
+*********************************************************************/
+
+#ifndef SHA256_H
+#define SHA256_H
+
+/*************************** HEADER FILES ***************************/
+#include <stddef.h>
+#include <stdint.h>
+
+/****************************** MACROS ******************************/
+#define SHA256_BLOCK_SIZE 32 // SHA256 outputs a 32 byte digest
+
+/**************************** DATA TYPES ****************************/
+typedef uint8_t BYTE; // 8-bit byte
+typedef uint32_t WORD; // 32-bit word, change to "long" for 16-bit machines
+
+typedef struct {
+    BYTE data[64];
+    WORD datalen;
+    unsigned long long bitlen;
+    WORD state[8];
+} SHA256_CTX;
+
+/*********************** FUNCTION DECLARATIONS **********************/
+void sha256_init(SHA256_CTX *ctx);
+void sha256_update(SHA256_CTX *ctx, const BYTE data[], size_t len);
+void sha256_final(SHA256_CTX *ctx, BYTE hash[]);
+
+#endif // SHA256_H
-96
lib/sha256.ml
-open Bigarray
-
-type state = (int32, int32_elt, c_layout) Array1.t
-type digest = (int, int8_unsigned_elt, c_layout) Array1.t
-type buffer = (int, int8_unsigned_elt, c_layout) Array1.t
-
-(* External C functions *)
-external init : unit -> state = "oxcaml_sha256_init"
-external process_block : state -> buffer -> unit = "oxcaml_sha256_process_block" [@@noalloc]
-external finalize : state -> buffer -> int64 -> digest = "oxcaml_sha256_finalize"
-external oneshot : buffer -> int64 -> digest = "oxcaml_sha256_oneshot"
-
-(* High-level interface *)
-
-let hash_bytes bytes =
-  let len = Bytes.length bytes in
-  let buffer = Array1.create int8_unsigned c_layout len in
-  for i = 0 to len - 1 do
-    Array1.set buffer i (Char.code (Bytes.get bytes i))
-  done;
-  oneshot buffer (Int64.of_int len)
-
-let hash_string str =
-  let len = String.length str in
-  let buffer = Array1.create int8_unsigned c_layout len in
-  for i = 0 to len - 1 do
-    Array1.set buffer i (Char.code str.[i])
-  done;
-  oneshot buffer (Int64.of_int len)
-
-(* Utilities *)
-
-let digest_to_hex digest =
-  let hex_of_byte b =
-    Printf.sprintf "%02x" b
-  in
-  let buf = Buffer.create 64 in
-  for i = 0 to 31 do
-    Buffer.add_string buf (hex_of_byte (Array1.get digest i))
-  done;
-  Buffer.contents buf
-
-let digest_to_bytes digest =
-  let bytes = Bytes.create 32 in
-  for i = 0 to 31 do
-    Bytes.set bytes i (Char.chr (Array1.get digest i))
-  done;
-  bytes
-
-let digest_equal d1 d2 =
-  let rec compare i =
-    if i >= 32 then true
-    else if Array1.get d1 i <> Array1.get d2 i then false
-    else compare (i + 1)
-  in
-  compare 0
-
-(* Zero-allocation variants using OxCaml features *)
-
-module Fast = struct
-  (* Stack-allocated processing for temporary computations *)
-  let[@inline] [@zero_alloc assume] process_block_local state block =
-    process_block state block
-
-  (* Process multiple blocks efficiently *)
-  let[@zero_alloc assume] process_blocks state blocks num_blocks =
-    for i = 0 to num_blocks - 1 do
-      let offset = i * 64 in
-      let block = Array1.sub blocks offset 64 in
-      process_block state block
-    done
-
-  (* Parallel hashing for multiple inputs *)
-  let parallel_hash_many par inputs =
-    match inputs with
-    | [] -> []
-    | [x] -> [hash_bytes x]
-    | _ ->
-      let process_batch batch =
-        List.map hash_bytes batch
-      in
-      let mid = List.length inputs / 2 in
-      let rec split n lst =
-        if n = 0 then ([], lst)
-        else match lst with
-          | [] -> ([], [])
-          | h::t -> let (l1, l2) = split (n-1) t in (h::l1, l2)
-      in
-      let (left, right) = split mid inputs in
-      let left_results, right_results =
-        Parallel.fork_join2 par
-          (fun _ -> process_batch left)
-          (fun _ -> process_batch right)
-      in
-      left_results @ right_results
-end
-47
lib/sha256.mli
-(** SHA256 hardware-accelerated implementation using AMD SHA-NI instructions *)
-
-open Bigarray
-
-(** {1 Types} *)
-
-(** SHA256 state (8 x int32) *)
-type state = (int32, int32_elt, c_layout) Array1.t
-
-(** SHA256 digest (32 bytes) *)
-type digest = (int, int8_unsigned_elt, c_layout) Array1.t
-
-(** Input data buffer *)
-type buffer = (int, int8_unsigned_elt, c_layout) Array1.t
-
-(** {1 Low-level interface} *)
-
-(** Initialize a new SHA256 state *)
-val init : unit -> state
-
-(** Process a single 512-bit (64 byte) block. Buffer must be exactly 64 bytes. *)
-val process_block : state -> buffer -> unit
-
-(** Finalize the hash computation with padding and return digest *)
-val finalize : state -> buffer -> int64 -> digest
-
-(** {1 High-level interface} *)
-
-(** Compute SHA256 hash in one shot (fastest for single use) *)
-val oneshot : buffer -> int64 -> digest
-
-(** Compute SHA256 hash from bytes *)
-val hash_bytes : bytes -> digest
-
-(** Compute SHA256 hash from string *)
-val hash_string : string -> digest
-
-(** {1 Utilities} *)
-
-(** Convert digest to hexadecimal string *)
-val digest_to_hex : digest -> string
-
-(** Convert digest to bytes *)
-val digest_to_bytes : digest -> bytes
-
-(** Compare two digests for equality *)
-val digest_equal : digest -> digest -> bool
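Review note on the removed utilities: `digest_equal` disappears with `lib/sha256.mli`, and the new API returns plain `bytes`, so callers who relied on it would need their own comparison. The removed version returned at the first differing byte. A sketch over `bytes` that instead inspects every byte is shown below. It reuses the old name purely for illustration, is not part of this diff, and is constant-time in intent only, since the OCaml compiler makes no timing guarantees.

```ocaml
(* Compare two digests without exiting early on the first mismatch. *)
let digest_equal (a : bytes) (b : bytes) : bool =
  Bytes.length a = Bytes.length b
  && begin
    let diff = ref 0 in
    for i = 0 to Bytes.length a - 1 do
      (* Accumulate the XOR of every byte pair; any difference sets a bit. *)
      diff := !diff lor (Char.code (Bytes.get a i) lxor Char.code (Bytes.get b i))
    done;
    !diff = 0
  end
```

For non-secret digests such as file checksums, `Bytes.equal` from the standard library is sufficient.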