# vergleich

Excel comparison tool for the workbook in `data/Auftreten_Ausprägung_Vergleich.xlsx`.

The script reads the sheets `Kristallin`, `Salz`, and `Christa` and writes the result into the sheet `Vergleich`.
Current behavior:

- Column `B` (Auftreten) is compared directly across the three source sheets.
- Column `D` (Epoche) is compared directly across the three source sheets.
- Columns `C` and `E` are compared with an LLM. `Salz` is treated as the baseline for the LLM comparison.
- The first LLM pass writes short migration-oriented comparison notes into `Vergleich` columns `C` and `E` in the format `Diff Kristallin: ...` and `Diff Christa: ...`.
- A second LLM pass uses the comparison notes plus the original texts from `Salz`, `Kristallin`, and `Christa` to create a minimal migrated target text for Wirtsgestein Kristallin.
- The migrated target texts are written to `Vergleich` columns `F` and `G`.
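The direct comparison for columns `B` and `D` can be sketched as follows. This is an illustrative outline, not the actual implementation: the function name and the dict-based row representation are assumptions, and only the sheet names and the sheet-by-sheet output format come from this README.

```python
def compare_direct(rows_by_sheet: dict[str, dict[str, str]]) -> dict[str, str]:
    """Return a multiline, sheet-by-sheet diff per row key, or '' when all sheets agree.

    rows_by_sheet maps a sheet name (e.g. Kristallin, Salz, Christa) to a
    mapping from the row key to the cell text of the compared column.
    """
    result = {}
    all_keys = set().union(*(rows.keys() for rows in rows_by_sheet.values()))
    for key in sorted(all_keys):
        values = {sheet: rows.get(key, "") for sheet, rows in rows_by_sheet.items()}
        if len(set(values.values())) == 1:
            result[key] = ""  # identical across all three source sheets
        else:
            # one line per sheet, so differing values are easy to scan
            result[key] = "\n".join(f"{sheet}: {value}" for sheet, value in values.items())
    return result
```

Columns `C` and `E` skip this exact-match path and go through the LLM instead, since their free-text cells rarely match verbatim.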
## Setup (Dev Container)

- Open this repository in VS Code.
- Run `Dev Containers: Reopen in Container`.
- On a GPU host, the devcontainer starts two services:
  - the `workspace` development container
  - a `llama.cpp` server on `http://localhost:8000/v1`
- Wait for `postCreateCommand` to complete:
  - `uv python install`
  - `uv sync --extra dev`

This creates `.venv` in the workspace and installs the project and dev dependencies from `pyproject.toml`.
## GPU devcontainer target

The devcontainer is configured for the future NVIDIA GPU server and starts a dedicated llama.cpp service with:

- image: `ghcr.io/ggml-org/llama.cpp:server-cuda`
- Hugging Face repo: `mistralai/Ministral-3-14B-Instruct-2512-GGUF`
- default quant: `Ministral-3-14B-Instruct-2512-Q5_K_M.gguf`
- API endpoint: `http://localhost:8000/v1/chat/completions`
- persistent model volume: `llama-models`
Host prerequisites:

- `nvidia-smi` must work on the host
- Docker must have NVIDIA GPU support available
- the NVIDIA Container Toolkit must be installed on the host
Important:

- This GPU devcontainer setup is meant for the future graphics server.
- On the current machine without a GPU, the `llama-server` service is not expected to start successfully with CUDA.
- The model choice is centralized in `docker-compose.yml` via the top-level `x-model-repo` and `x-model-file` values.
## Optional local setup (without Dev Container)

```bash
bash ./.scripts/bootstrap.sh
```
## LLM configuration

For columns C and E, the script expects an OpenAI-compatible local chat completions endpoint.

Environment variables:

- `VERGLEICH_LLM_BASE_URL` (default: `http://localhost:8000/v1`)
- `VERGLEICH_LLM_MODEL` (default: `mistralai/Ministral-3-8B-Reasoning-2512`)
- `VERGLEICH_LLM_API_KEY` (default: `dummy`)
- `VERGLEICH_LLM_TIMEOUT` (default: `300`)
Example:

```bash
export VERGLEICH_LLM_BASE_URL="http://localhost:8000/v1"
export VERGLEICH_LLM_MODEL="mistralai/Ministral-3-8B-Reasoning-2512"
export VERGLEICH_LLM_API_KEY="dummy"
uv run vergleich
```
If no compatible server is running, the script fails when it reaches the LLM comparison for columns C and E.

When you open the GPU devcontainer, `VERGLEICH_LLM_BASE_URL` is set automatically to `http://llama-server:8000/v1` inside the workspace container.
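The variables above could be read like this. This is a minimal sketch with the defaults the README documents; the function name is a hypothetical illustration, not the project's actual code.

```python
import os


def load_llm_config() -> dict:
    """Return the LLM settings, falling back to the documented defaults."""
    return {
        "base_url": os.environ.get("VERGLEICH_LLM_BASE_URL", "http://localhost:8000/v1"),
        "model": os.environ.get("VERGLEICH_LLM_MODEL", "mistralai/Ministral-3-8B-Reasoning-2512"),
        "api_key": os.environ.get("VERGLEICH_LLM_API_KEY", "dummy"),
        # timeout is passed through as a number so it can feed an HTTP client directly
        "timeout": float(os.environ.get("VERGLEICH_LLM_TIMEOUT", "300")),
    }
```

Reading everything through `os.environ.get` keeps the devcontainer override (`VERGLEICH_LLM_BASE_URL` pointing at `http://llama-server:8000/v1`) working without any code changes.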
## Run

```bash
uv run vergleich
```

The script updates the workbook in place:

- input and output file: `data/Auftreten_Ausprägung_Vergleich.xlsx`
- output sheet: `Vergleich`
- output columns `F` and `G`: migrated target texts for Wirtsgestein Kristallin derived from source columns `C` and `E`
## Notes

- The script currently assumes the source sheets are named exactly `Kristallin`, `Salz`, and `Christa`.
- It matches rows by the value in column `A` (Prozess).
- Exact text differences in `B` and `D` are written as multiline sheet-by-sheet output.
- The first LLM prompt identifies only the technically necessary adjustments for migrating Steinsalz text to Kristallin.
- The second LLM prompt applies those instructions with minimal edits to the `Salz` text.
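The row-matching step can be sketched as below. This is a hypothetical illustration under the assumptions this README states: each sheet is reduced to a mapping from the Prozess value (column A) to its cell text, and `Salz` serves as the baseline; the function name and return shape are inventions for the sketch.

```python
def align_sheets(
    salz: dict[str, str], kristallin: dict[str, str], christa: dict[str, str]
) -> list[tuple[str, str, str, str]]:
    """Return (prozess, salz_text, kristallin_text, christa_text) rows.

    Rows are aligned by the Prozess key rather than by position, so the
    three sheets may list their rows in different orders.
    """
    return [
        (key, salz[key], kristallin.get(key, ""), christa.get(key, ""))
        for key in salz  # Salz is the baseline for the LLM comparison
    ]
```

Keying on Prozess instead of the row index is what makes the documented assumption explicit: a row that exists only in `Kristallin` or `Christa` but not in `Salz` would be skipped by a baseline-driven pass like this.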
## Project structure

```
src/
  vergleich/
    main.py
```