llama integration with exception safety by reeshabh90 · Pull Request #2 · as-ascii/docwire

reeshabh90 · 2026-02-18T10:25:15Z

Features worked:

- Llama.cpp integration as one of the engines in the docwire SDK.
- Ensured exception safety for the llama_runner class.
- Changes made in docwire.cpp for default installation purposes.
- Added a configurable model load and unload feature, which lets the SDK decide whether to unload the model after pipeline usage or keep it loaded for the next use.
- Added files for local summarize and translate.
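The load/unload option and the exception-safety goal can be illustrated together. The following is only a minimal sketch, not the code from this PR: `model`, `unload_policy` and `model_holder` are hypothetical names, and the "model" is a stand-in struct rather than a llama.cpp handle. It shows one way a scope guard can make sure the model is released on every exit path when the policy asks for it.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded model; a real runner would wrap an engine handle.
struct model
{
    explicit model(std::string model_path) : path(std::move(model_path))
    {
        if (path.empty())
            throw std::invalid_argument("model path must not be empty");
    }
    std::string path;
};

// Hypothetical policy: what to do with the model once a pipeline run has finished.
enum class unload_policy { keep_loaded, unload_after_use };

class model_holder
{
public:
    model_holder(std::string path, unload_policy policy)
        : m_path(std::move(path)), m_policy(policy) {}

    // Runs `task` with a loaded model. The guard's destructor runs on normal
    // return and during stack unwinding, so the policy is honoured either way.
    template <typename Task>
    auto run(Task&& task)
    {
        if (!m_model)
            m_model = std::make_unique<model>(m_path); // holder is unchanged if this throws
        struct guard
        {
            model_holder& self;
            ~guard()
            {
                if (self.m_policy == unload_policy::unload_after_use)
                    self.m_model.reset();
            }
        } g{*this};
        return std::forward<Task>(task)(*m_model);
    }

    bool loaded() const noexcept { return m_model != nullptr; }

private:
    std::string m_path;
    unload_policy m_policy;
    std::unique_ptr<model> m_model;
};
```

With `unload_policy::keep_loaded` the second call to `run` reuses the already loaded model. With `unload_policy::unload_after_use` the model is released even if the task throws.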


as-ascii · 2026-04-16T11:53:33Z

+	if (vm.count("local-ai-prompt") || vm.count("local-ai-embed"))
+	{
+		std::cerr << "Error: Local AI features requested, but this build does not include "
+		             "DOCWIRE_LOCAL_CT2 support.\n"


It's a bit counter-intuitive that we need ct2 to use --local-ai-prompt when the llama.cpp engine is enabled. Maybe this option should work with llama.cpp as well?
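One way to read this suggestion is that the check quoted above should fail only when no local engine is compiled in. The sketch below assumes a second macro, `DOCWIRE_LOCAL_LLAMA`, as the llama.cpp counterpart of `DOCWIRE_LOCAL_CT2`; that name and the `check_local_ai` helper are hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <optional>
#include <string>

// DOCWIRE_LOCAL_CT2 appears in the diff above; DOCWIRE_LOCAL_LLAMA is a
// hypothetical name for the llama.cpp equivalent.
constexpr bool has_ct2 =
#ifdef DOCWIRE_LOCAL_CT2
    true;
#else
    false;
#endif

constexpr bool has_llama =
#ifdef DOCWIRE_LOCAL_LLAMA
    true;
#else
    false;
#endif

// Returns an error message when a local AI option was requested but no engine
// is available, and std::nullopt otherwise. The engine flags default to the
// build configuration and can be overridden for testing.
std::optional<std::string> check_local_ai(bool option_requested,
                                          bool ct2 = has_ct2,
                                          bool llama = has_llama)
{
    if (!option_requested || ct2 || llama)
        return std::nullopt;
    return "Error: Local AI features requested, but this build includes neither "
           "CT2 nor llama.cpp support.";
}
```

In the CLI this could be called as `check_local_ai(vm.count("local-ai-prompt") || vm.count("local-ai-embed"))`, printing the message to `std::cerr` when one is returned.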

as-ascii

Good work

as-ascii · 2026-04-24T13:01:00Z

+
+    add_library(docwire_ai_ct2 SHARED ct2_runner.cpp tokenizer.cpp)
+
+    target_compile_definitions(docwire_ai_ct2 PUBLIC DOCWIRE_LOCAL_CT2)


Do we need this? It will always be true, I think.

as-ascii

The changes are good in general, but not complete yet, I think.

as-ascii · 2026-05-22T12:11:16Z

+ * The appropriate prefix for the underlying model (e.g. "passage: " for multilingual-e5-small)
+ * is applied automatically. No model-specific knowledge required at the call site.
+ */
+class DOCWIRE_LOCAL_AI_EXPORT embed_passage : public docwire::ai::embed


We need to think about the best name. "passage" is not well known in the developer community, only among AI experts. I propose "document", "content", or "index". And maybe embed::query rather than embed_query? But the concept of two classes instead of an enum argument is interesting. It is probably shorter too: embed::query{} instead of embed{embed::query}, so nicer in examples.

The current namespace is docwire::ai::local; maybe that needs to be changed to docwire::ai::local::embed, with the two classes inside it. But is an additional level of namespace nesting required?

You are doing it now, but in the old C style: embed_X, embed_Y ;-) It's a kind of namespace in my opinion, no? Do you see any cons of converting it to embed::X and embed::Y? I have other doubts about naming: "embed" suggests a function rather than a class/object, so maybe it should be named "embedder" or "vectorizer"? Maybe it should be query::embedder and index::vectorizer? Naming is hard for creators but very important for users.

Understood.

I think that we can follow the content_type namespace a little bit, for example index::embedder and query::embedder, and then there is a possibility to add, for example, index::embed and query::embed functions for a user who is not building a parsing chain and just wants to call a function. I will try to add this kind of thing to our "design rules" that are currently being created.

So these index::embed and query::embed functions will be separate from the usual constructors?
So the proposed namespaces can be docwire::ai::embedder::passage and docwire::ai::embedder::query.

Even the content_type namespace, as I see it, has the main classification name first, followed by various utilities such as docwire::content_type::by_signature, docwire::content_type::asp, docwire::content_type::by_file_extension.

Yes, but for example by_signature is a namespace, and it contains a class "detector" and a function "detect". If you stay with the class/object API for now, then embedder::index and embedder::query make sense of course, but "index" and "query" classes do not look like something that generates an index or a query, but like something that is an index or a query. Naming is hard... "embedder" looks like an object that generates embeddings.
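To make the layout under discussion concrete, here is a rough sketch of the index::embedder / query::embedder variant with a matching free function in each namespace. Everything in it is an assumption for illustration: the `sketch::ai` namespace, the `prepare` name, and the default prefixes ("passage: " and "query: ", as used by E5-style models). It only applies the prefix; a real embedder would pass the prepared text on to the model.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

namespace sketch::ai
{
// Shared helper: prepends the model-specific prefix to the text.
inline std::string with_prefix(std::string_view prefix, std::string_view text)
{
    std::string out{prefix};
    out += text;
    return out;
}

namespace index
{
// Prepares text that will be stored in an index (the "passage" side).
class embedder
{
public:
    explicit embedder(std::string prefix = "passage: ") : m_prefix(std::move(prefix)) {}
    std::string prepare(std::string_view text) const { return with_prefix(m_prefix, text); }
private:
    std::string m_prefix;
};

// Free function for callers that are not building a parsing chain.
inline std::string prepare(std::string_view text) { return embedder{}.prepare(text); }
} // namespace index

namespace query
{
// Prepares text that will be used to search the index.
class embedder
{
public:
    explicit embedder(std::string prefix = "query: ") : m_prefix(std::move(prefix)) {}
    std::string prepare(std::string_view text) const { return with_prefix(m_prefix, text); }
private:
    std::string m_prefix;
};

inline std::string prepare(std::string_view text) { return embedder{}.prepare(text); }
} // namespace query
} // namespace sketch::ai
```

At the call site this reads as `sketch::ai::index::embedder{}` in a chain or `sketch::ai::query::prepare(text)` as a one-off call, so the namespace names the role and the class name describes an object that produces something.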

llama integration with exception safety

295bf64

as-ascii requested changes Feb 19, 2026

reeshabh90 added 3 commits February 20, 2026 09:44

PR review suggested changes

430c854

renaming model-runner to c2t_runner

63546cc

- llama-embedding integration with local-ai::embed.
- changes made in docwire.cpp for default installation purpose.

04443d2

as-ascii requested changes Feb 25, 2026

reeshabh90 added 3 commits February 27, 2026 04:51

PR Suggested changes <minor fixes>

ce6c8ba

This commit includes the following major changes:

1. Added a configurable model load and unload feature, which lets the SDK decide whether to unload the model after pipeline usage or keep it loaded for the next use.
2. Added files for local summarize and translate

4048bb7

minor - added documentation

f591e3f

as-ascii requested changes Mar 9, 2026

reeshabh90 added 5 commits March 12, 2026 05:32

PR review suggested changes and necessary documentations.

8d75dd4

Removing destructor as it is not doing anything user provided

f1269f9

Noticed one bug, hence fixed. Variable was getting shadowed.

12c195a

Renaming c2t_runner to ct2_runner

b9e36fe

VCPKG Features introduction for custom installation of AI specific features

fd4e8a7

as-ascii requested changes Mar 17, 2026

Comment thread ports/docwire/vcpkg.json Outdated

Comment thread ports/qwen2-7b-instruct-q4-k-m/portfile.cmake

Comment thread src/cosine_similarity.cpp

Comment thread ports/docwire/vcpkg.json

reeshabh90 added 7 commits March 21, 2026 04:54

Feature based AI capability installation via VCPKG.

51412ae

Merge tag '2026.03.26' into llama-integration

535b884

Qwen port package renaming

5e7068f

changes relate to test execution for build

6bcba54

Docwire CLI feature customization code for correct build process

937b6d3

Attempt to resolve tokenizer usage based on Docwire Local CT2 flag in api_tests.cpp for GitHub Build

61b6b0f

Inclusion of headers in docwire.h based on Local CT2 and LLama flags

Adjustment in local_ai.cmake for the same flags. tests/CMakeLists.txt gets a minimal conditional added.

f55e5ab

as-ascii requested changes Apr 1, 2026

reeshabh90 added 4 commits April 2, 2026 09:46

re-design of build architecture for local ai usage, and for efficient build processing.

eed0fda

making API changes in integration test

ea10198

a small fix for installation of docwire_local_ai INTERFACE

19b79a6

llama.cpp integration of chat template for inference

2c42539

as-ascii requested changes Apr 16, 2026

Adding/restoring integration example for ct2

2e07529

reeshabh90 added 2 commits April 21, 2026 08:10

Build Architecture revamp.

17f5457

newly added files

53760c7

as-ascii requested changes Apr 24, 2026

Updated code changes based on feedback

b38aa2d

as-ascii reviewed May 12, 2026

Comment thread src/local_ai_embed.h Outdated

Comment thread tests/local_ai_translate.cpp Outdated

Comment thread tests/local_embedding_similarity.cpp Outdated

Comment thread README.md Outdated

reeshabh90 added 2 commits May 21, 2026 05:14

Changes related to embed implementation with default prefixes

183b577

minor update related to removing e5 specific default.

89da06d

as-ascii reviewed May 22, 2026

reeshabh90 added 2 commits May 26, 2026 06:47

Minor code changes based on review suggestions

b80e807

embed namespace naming convention changes

e8a0078

