No LLM
Address autofill, smart paste, many spellcheck cases, and common formula edits often resolve locally without model inference.
Short answer: Dhamaka tries not to use an LLM unless it has to. Most demo fast paths are rules, regexes, fuzzy matching, or structural formula rewrites. When a model is needed, the runtime picks the best local backend available.
Address autofill, smart paste, many spellcheck cases, and common formula edits often resolve locally without model inference.
If rules are uncertain, Dhamaka can use a resident browser model, a Transformers.js model, the WASM runtime, or a test mock.
The runtime factory prefers browser-native local inference first, then cross-browser Transformers.js, then Dhamaka's Rust WASM runtime, then MockEngine for Node and tests.
window.ai.languageModel.dhamaka-runtime.wasm path is tested and wired in, but it is a v2 target until real weights, quantization, and SIMD are production-ready.HuggingFaceTB/SmolLM2-135M-Instruct is the Transformers.js default for generic completion and chat-like output.Xenova/LaMini-Flan-T5-248M is the default for instruction-following transform work.Xenova/distilbert-base-uncased is the default masked language model for contextual spellcheck.Xenova/all-MiniLM-L6-v2 is the default feature-extraction model for semantic search and fuzzy matching work.dhamaka-micro, based on HuggingFaceTB/SmolLM2-360M-Instruct, as the default packaged model target.Dhamaka's engine contract is intentionally small: completion, streaming, masked-token prediction, or embeddings. That means models can be swapped by task as long as they run locally in the browser runtime or are wrapped by a compatible adapter.
window.ai, or a Dhamaka engine adapter.Most users let backend: "auto" choose. Advanced users can force a task and Hugging Face model id through the runtime factory.
import { createEngine } from "@dhamaka/runtime";
const engine = createEngine({
backend: "transformers",
task: "text-generation",
model: "HuggingFaceTB/SmolLM2-135M-Instruct",
});
await engine.load();
const answer = await engine.complete("Rewrite this sentence.");
Dhamaka's thesis is local-first: no server call, no per-token cost, and no user data leaving the tab. A product can still write a custom Engine adapter for a cloud LLM, but that changes the privacy and cost model and should be a conscious product choice.