Running on Ubuntu with 32 GB of RAM.
I get a segmentation fault when running the following code:
import sys

import llamacpp


def progress_callback(progress):
    print("Progress: {:.2f}%".format(progress * 100))
    sys.stdout.flush()


params = llamacpp.InferenceParams.default_with_callback(progress_callback)
params.path_model = '/home/captdishwasher/horenbergerb/llama/llama.cpp/models/30Bnew/ggml-model-q4_0-ggjt.bin'

model = llamacpp.LlamaInference(params)

prompt = "1" * 500
prompt_tokens = model.tokenize(prompt, True)
print('Prompt tokens: {}'.format(len(prompt_tokens)))

model.add_bos()
model.update_input(prompt_tokens)
model.ingest_all_pending_input()

print(model.system_info())

for i in range(20):
    model.eval()
    token = model.sample()
    text = model.token_to_str(token)
    print(text, end="", flush=True)

# Flush stdout
sys.stdout.flush()

model.print_timings()
Output:
...
Prompt tokens: 501
AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
111111101111Segmentation fault (core dumped)
Possibly related to the context size? The crash hits after 12 generated tokens, i.e. around 501 prompt tokens + 12 = 513 total, which would be just past the default n_ctx of 512, but raising n_ctx didn't fix the problem... This has been coming up for users of text-generation-webui, which uses this package: oobabooga/text-generation-webui#690
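For reference, here is a minimal sketch of the bounds check I would expect to avoid a plain context overflow, in case that helps narrow things down. The n_ctx attribute name on InferenceParams and the model path are assumptions/placeholders on my part; the model calls are the same ones used in the repro above.

import llamacpp

# Sketch of a guard, not a fix: cap generation so that
# prompt tokens + generated tokens never exceed the context window.
params = llamacpp.InferenceParams.default_with_callback(lambda p: None)
params.path_model = '/path/to/ggml-model-q4_0-ggjt.bin'  # placeholder path
n_ctx = getattr(params, 'n_ctx', 512)  # assumed attribute name; fall back to the default

model = llamacpp.LlamaInference(params)
prompt_tokens = model.tokenize("1" * 500, True)
model.add_bos()
model.update_input(prompt_tokens)
model.ingest_all_pending_input()

# Leave one slot for the BOS token and never generate past n_ctx.
budget = n_ctx - len(prompt_tokens) - 1
for _ in range(min(20, max(0, budget))):
    model.eval()
    token = model.sample()
    print(model.token_to_str(token), end="", flush=True)

With the prompt above this caps generation at 10 tokens instead of 20, so if the segfault really is an overflow past n_ctx, this loop should complete cleanly.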