
vergleich

Excel comparison tool for the workbook in data/Auftreten_Ausprägung_Vergleich.xlsx.

The script reads the sheets Kristallin, Salz, and Christa and writes the result into the sheet Vergleich.

Current behavior:

  • Column B (Auftreten) is compared directly across the three source sheets.
  • Column D (Epoche) is compared directly across the three source sheets.
  • Columns C and E are compared with an LLM.
  • Salz is treated as the baseline for the LLM comparison.
  • The first LLM pass writes short migration-oriented comparison notes into Vergleich columns C and E in the format Diff Kristallin: ... and Diff Christa: ....
  • A second LLM pass uses the comparison notes plus the original texts from Salz, Kristallin, and Christa to create a minimal migrated target text for Wirtsgestein Kristallin.
  • The migrated target texts are written to Vergleich columns F and G.
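The matching-and-diff step described above can be sketched as follows. This is a minimal illustration, assuming each sheet has been loaded (e.g. with openpyxl) into a dict keyed by the column-A Prozess value; all function names here are hypothetical, not the script's actual API.

```python
# Sketch of the direct comparison for columns B and D, and of the note
# format written into Vergleich columns C and E. Sheet data is assumed
# to be preloaded into dicts keyed by the column-A "Prozess" value.

def compare_direct(salz, kristallin, christa, prozess):
    """Return a multiline sheet-by-sheet report for one row of B or D."""
    values = {
        "Salz": salz.get(prozess),
        "Kristallin": kristallin.get(prozess),
        "Christa": christa.get(prozess),
    }
    if len(set(values.values())) == 1:
        return "identical"  # all three sheets agree for this row
    return "\n".join(f"{sheet}: {value}" for sheet, value in values.items())

def diff_note(diff_kristallin, diff_christa):
    """Format first-pass LLM output in the documented Diff ... format."""
    return f"Diff Kristallin: {diff_kristallin}\nDiff Christa: {diff_christa}"

salz = {"P1": "ja"}
kristallin = {"P1": "ja"}
christa = {"P1": "nein"}
print(compare_direct(salz, kristallin, christa, "P1"))
```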

Setup (Dev Container)

  1. Open this repository in VS Code.
  2. Run Dev Containers: Reopen in Container.
  3. On a GPU host, the devcontainer starts two services:
    • the workspace development container
    • a llama.cpp server on http://localhost:8000/v1
  4. Wait for postCreateCommand to complete:
    • uv python install
    • uv sync --extra dev

This creates .venv in the workspace and installs the project and dev dependencies from pyproject.toml.

GPU devcontainer target

The devcontainer is configured for the future NVIDIA GPU server and starts a dedicated llama.cpp service with:

  • image: ghcr.io/ggml-org/llama.cpp:server-cuda
  • Hugging Face repo: mistralai/Ministral-3-14B-Instruct-2512-GGUF
  • default quant: Ministral-3-14B-Instruct-2512-Q5_K_M.gguf
  • API endpoint: http://localhost:8000/v1/chat/completions
  • persistent model volume: llama-models

Host prerequisites:

  • nvidia-smi must work on the host
  • Docker must have NVIDIA GPU support available
  • the NVIDIA Container Toolkit must be installed on the host

Important:

  • This GPU devcontainer setup is meant for the future graphics server.
  • On the current no-GPU machine, the llama-server service is not expected to start successfully with CUDA.
  • The model choice is centralized in docker-compose.yml via the top-level x-model-repo and x-model-file values.
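The centralized model choice could look like the following docker-compose.yml fragment. Only the x-model-repo and x-model-file keys and the values listed above come from this README; the service wiring, flags, and cache path are assumptions for illustration.

```yaml
# Illustrative sketch -- only x-model-repo and x-model-file are documented;
# the llama-server flags and cache path are assumed.
x-model-repo: &model-repo mistralai/Ministral-3-14B-Instruct-2512-GGUF
x-model-file: &model-file Ministral-3-14B-Instruct-2512-Q5_K_M.gguf

services:
  llama-server:
    image: ghcr.io/ggml-org/llama.cpp:server-cuda
    command: ["--hf-repo", *model-repo, "--hf-file", *model-file, "--port", "8000"]
    ports:
      - "8000:8000"
    volumes:
      - llama-models:/root/.cache/llama.cpp  # assumed cache location

volumes:
  llama-models:
```

Changing the model then only requires editing the two top-level x- values.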

Optional local setup (without Dev Container)

bash ./.scripts/bootstrap.sh

LLM configuration

For columns C and E, the script expects an OpenAI-compatible local chat completions endpoint.

Environment variables:

  • VERGLEICH_LLM_BASE_URL
    • Default: http://localhost:8000/v1
  • VERGLEICH_LLM_MODEL
    • Default: mistralai/Ministral-3-8B-Reasoning-2512
  • VERGLEICH_LLM_API_KEY
    • Default: dummy
  • VERGLEICH_LLM_TIMEOUT
    • Default: 300
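The defaults above suggest a configuration step along these lines. This is a sketch; the actual variable handling in src/vergleich/main.py may differ.

```python
import os

# Read the LLM endpoint configuration from the environment, falling back
# to the defaults documented above. Variable names mirror the README.
def llm_config() -> dict:
    return {
        "base_url": os.environ.get("VERGLEICH_LLM_BASE_URL", "http://localhost:8000/v1"),
        "model": os.environ.get("VERGLEICH_LLM_MODEL", "mistralai/Ministral-3-8B-Reasoning-2512"),
        "api_key": os.environ.get("VERGLEICH_LLM_API_KEY", "dummy"),
        "timeout": float(os.environ.get("VERGLEICH_LLM_TIMEOUT", "300")),
    }

print(llm_config()["base_url"])
```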

Example:

export VERGLEICH_LLM_BASE_URL="http://localhost:8000/v1"
export VERGLEICH_LLM_MODEL="mistralai/Ministral-3-8B-Reasoning-2512"
export VERGLEICH_LLM_API_KEY="dummy"
uv run vergleich

If no compatible server is running, the script will fail when it reaches the LLM comparison for columns C and E.

When you open the GPU devcontainer, VERGLEICH_LLM_BASE_URL is set automatically to http://llama-server:8000/v1 inside the workspace container.
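The OpenAI-compatible call the script makes for columns C and E has roughly this shape. The request construction below is illustrative only; the real prompts and client code live in src/vergleich/main.py.

```python
import json
import urllib.request

# Build an OpenAI-compatible chat completions request against the
# configured base URL. Prompt wording and parameters are illustrative.
def build_request(base_url: str, model: str, api_key: str, prompt: str):
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_request(
    "http://localhost:8000/v1",
    "mistralai/Ministral-3-8B-Reasoning-2512",
    "dummy",
    "example prompt",
)
print(req.full_url)
```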

Run

uv run vergleich

The script updates the workbook in place:

  • input and output file: data/Auftreten_Ausprägung_Vergleich.xlsx
  • output sheet: Vergleich
  • output columns F and G: migrated target texts for Wirtsgestein Kristallin derived from source columns C and E

Notes

  • The script currently assumes the source sheets are named exactly Kristallin, Salz, and Christa.
  • It matches rows by the value in column A (Prozess).
  • Exact text differences in B and D are written as multiline sheet-by-sheet output.
  • The first LLM prompt identifies only the technically necessary adaptations (fachlich notwendige Anpassungen) for migrating the Steinsalz text to Kristallin.
  • The second LLM prompt applies those instructions with minimal edits to the Salz text.
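Put together, the two LLM passes described in these notes form a pipeline roughly like the following. The helper names and prompt wording are hypothetical; the real prompts are in src/vergleich/main.py.

```python
# Sketch of the two-pass flow for one row and one text column (C or E).
# `llm` stands for any callable that sends a prompt to the chat endpoint
# and returns the completion text.

def first_pass_prompt(salz_text, kristallin_text, christa_text):
    """Ask only for the necessary adaptations, with Salz as the baseline."""
    return (
        "Baseline (Salz): " + salz_text + "\n"
        "Kristallin: " + kristallin_text + "\n"
        "Christa: " + christa_text + "\n"
        "List only the technically necessary adaptations for Kristallin."
    )

def second_pass_prompt(salz_text, notes):
    """Apply the noted adaptations with minimal edits to the Salz text."""
    return (
        "Original (Salz): " + salz_text + "\n"
        "Adaptations: " + notes + "\n"
        "Return the Salz text with only these minimal edits applied."
    )

def migrate_row(llm, salz_text, kristallin_text, christa_text):
    notes = llm(first_pass_prompt(salz_text, kristallin_text, christa_text))
    return llm(second_pass_prompt(salz_text, notes))
```

The result of migrate_row corresponds to the migrated target text written into Vergleich columns F and G.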

Project structure

src/
  vergleich/
    main.py