# marker-example

**Repository Path**: felix-hua/marker-example

## Basic Information

- **Project Name**: marker-example
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-04-12
- **Last Updated**: 2026-04-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# file-convert-service

Standalone FastAPI service for PDF-to-Markdown conversion.

## What This Repo Contains

- A self-contained HTTP service for converting uploaded PDF files.
- Local development support with `uvicorn`.
- Optional containerized startup with `docker compose`.
- Persistent model caches for marker-related downloads when running with Docker Compose.

## Local Development

Install runtime dependencies:

```powershell
python -m pip install -r requirements.txt
```

Install test dependencies:

```powershell
python -m pip install -r requirements-dev.txt
```

Start the API locally:

```powershell
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Run the test suite:

```powershell
python -m pytest -q
```

For `docker compose`, create a local `.env` from the example file if you want to override the default port, log settings, or cache paths. For local `uvicorn` runs, set environment variables in your shell as needed.

## Docker

Build the image from this repository root:

```powershell
docker build -t file-convert-service .
```

The Dockerfile uses BuildKit pip cache mounts, so repeated builds can reuse downloaded Python packages instead of pulling them every time.

Start the service with Docker Compose:

```powershell
docker compose up --build
```

The API will be available on `http://localhost:8000`.

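
After starting the service with `uvicorn` or `docker compose`, it can be handy to wait until it answers before sending work to it. The following is a minimal sketch, not part of this repository: it assumes the default address `http://localhost:8000` and the `GET /health` endpoint listed in the API section, and the helper names `is_healthy` and `wait_until_healthy` are hypothetical.

```python
import json
import time
import urllib.request


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers 200 with {"status": "ok"}."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and json.load(resp) == {"status": "ok"}
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts, and HTTP error statuses;
        # ValueError covers a body that is not valid JSON.
        return False


def wait_until_healthy(
    base_url: str = "http://localhost:8000",  # assumed default address
    attempts: int = 30,
    delay: float = 1.0,
) -> bool:
    """Poll the health endpoint until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if is_healthy(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` returns `True` as soon as the service responds, or `False` once all attempts are used up. The attempt count and delay are arbitrary illustrative values.
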
## Configuration

The service reads these environment variables:

- `APP_ENV` (default: `development`)
- `APP_LOG_LEVEL` (default: `INFO`)
- `APP_LOG_FORMAT` (default: `auto`)
- `APP_LOG_ACCESS_ENABLED` (default: `true`)
- `HF_HOME` (default: `/data/huggingface`)
- `MODEL_CACHE_DIR` (default: `/data/datalab/models`)

Example values are provided in `.env.example`.

## Model Cache Behavior

- `docker compose` mounts named volumes for Hugging Face and Datalab model caches.
- The first conversion request may still download marker-related models.
- After the first download, subsequent container restarts and rebuilds reuse the persisted cache and should not download the same model files again.

## Dependencies

- `marker-pdf==1.10.2` is declared explicitly in `requirements.txt`.

## API

- `GET /health`
  - Returns `{"status": "ok"}`
- `POST /internal/converters/pdf-to-markdown/file`
  - Converts an uploaded PDF and returns Markdown plus inline extracted images
  - Request type: `multipart/form-data`
  - Form fields: `file=`, optional `model=marker`

## Notes

- Request ID compatibility is preserved with both `X-Request-ID` and `X-Convert-Task-Id`.
- The service no longer depends on object storage and only supports direct file uploads.
- If Docker BuildKit is disabled locally, the pip cache mount in the Dockerfile will not be used.
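
The conversion endpoint in the API section can be exercised with a small standard-library client. This is a hedged sketch rather than a client shipped with the repository: the function names `build_multipart` and `convert_pdf`, the `application/pdf` part content type, the base URL, and the timeout are assumptions. The response schema is not documented here, so the sketch returns the raw response body instead of parsing it.

```python
import urllib.request
import uuid
from pathlib import Path
from typing import Optional, Tuple

ENDPOINT = "/internal/converters/pdf-to-markdown/file"


def build_multipart(
    filename: str, content: bytes, model: Optional[str] = None
) -> Tuple[bytes, str]:
    """Encode a `file` part (and optional `model` field) as multipart/form-data.

    Returns (body, content_type). Simplification: `filename` must not
    contain double quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    parts = []
    if model is not None:
        parts.append(
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="model"\r\n\r\n'
            f"{model}\r\n".encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/pdf\r\n\r\n"
        ).encode()
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def convert_pdf(
    pdf_path: str,
    base_url: str = "http://localhost:8000",  # assumed default address
    request_id: Optional[str] = None,
    timeout: float = 600.0,  # arbitrary; the first request may download models
) -> bytes:
    """POST the PDF to the conversion endpoint and return the raw response body."""
    path = Path(pdf_path)
    body, content_type = build_multipart(path.name, path.read_bytes(), model="marker")
    headers = {"Content-Type": content_type}
    if request_id:
        headers["X-Request-ID"] = request_id  # header name taken from the Notes section
    req = urllib.request.Request(
        base_url + ENDPOINT, data=body, headers=headers, method="POST"
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

With the service running, `convert_pdf("sample.pdf")` uploads a local file (the file name is illustrative) and returns whatever bytes the service sends back; inspect one response before deciding how to parse it.
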