please adapt download_embeddings to support 'file://' url scheme

Hello,

Can you please adapt `download_embeddings` to also support the `file://....` URL scheme?

This would allow us to provide centralized embeddings files on our cluster, avoiding the download time.

Something in this spirit:
```
    log.info("Downloading embeddings from %s → %s", url, dest)

    if url.startswith("file://"):
        # Local file: copy it to dest instead of going through requests
        from urllib.request import urlopen
        with urlopen(url) as resp:
            content = resp.read()
            with open(dest, "wb") as f:
                f.write(content)
        return dest
    else:
        # Otherwise download from the given URL
        with requests.get(url, stream=True, timeout=timeout) as resp:
            resp.raise_for_status()
```
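
A self-contained sketch of the `file://` branch, in case it helps the discussion. It streams the copy in chunks instead of reading the whole file into memory, which may matter for large embeddings archives. The function name `copy_local_embeddings` is hypothetical and not part of the project.

```python
import shutil
from urllib.parse import urlparse
from urllib.request import url2pathname


def copy_local_embeddings(url: str, dest: str) -> str:
    """Copy the file named by a file:// URL to dest (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URL, got {url!r}")
    # Convert the URL path (possibly percent-encoded) to a local path
    src = url2pathname(parsed.path)
    # Stream the copy chunk by chunk rather than loading it all at once
    with open(src, "rb") as fsrc, open(dest, "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst)
    return dest
```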

Problem: this will **duplicate** the file for each user / base_directory.


I would prefer that main.py handle the embeddings data file directly.

Something like:
```
    # New: get the file name from the URL
    url = conf["embeddings_url"]
    if url.startswith("file://"):
        file_path = urllib.parse.urlparse(url).path
        if not os.path.exists(file_path):
            raise FileNotFoundError(file_path)
        # Use the shared file in place, without copying it
        tar_path = file_path
    else:
        filename = os.path.basename(urllib.parse.urlparse(url).path)
        tar_path = os.path.join(embeddings_dir, filename)

        logger.info(f"Downloading reference embeddings to {tar_path}...")
        download_embeddings(conf["embeddings_url"], tar_path)

    logger.info("Loading embeddings into the database...")
    load_dump_to_db(tar_path, conf)
```
Maybe there is something I missed regarding the embeddings file being needed in base_directory.
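
To make the second option easier to review, here is a minimal sketch of the path resolution as a standalone function. The name `resolve_embeddings_path` and the `download` parameter (a stand-in for `download_embeddings`) are assumptions for illustration, not existing project code.

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname


def resolve_embeddings_path(url, embeddings_dir, download):
    """Return the path of the embeddings archive, downloading only if needed.

    `download` is a callable taking (url, dest); hypothetical stand-in
    for download_embeddings.
    """
    parsed = urlparse(url)
    if parsed.scheme == "file":
        # Shared local file: use it in place, no copy per user
        path = url2pathname(parsed.path)
        if not os.path.exists(path):
            raise FileNotFoundError(path)
        return path
    # Remote URL: download into embeddings_dir under the URL's file name
    dest = os.path.join(embeddings_dir, os.path.basename(parsed.path))
    download(url, dest)
    return dest
```

In main.py this would presumably be called as `tar_path = resolve_embeddings_path(conf["embeddings_url"], embeddings_dir, download_embeddings)` before `load_dump_to_db(tar_path, conf)`.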


Let me know if this sounds acceptable and which method you prefer. I will then propose a PR.

See you later!

Eric

please adapt download_embeddings to support 'file://' url scheme #54
