diff --git a/docs/howto/pdf_manipulation.md b/docs/howto/pdf_manipulation.md
index 6f7c2f7..ba26d45 100644
--- a/docs/howto/pdf_manipulation.md
+++ b/docs/howto/pdf_manipulation.md
@@ -102,7 +102,7 @@ parxy pdf:merge file1.pdf file2.pdf -o /output/dir/merged.pdf
 
 ## Splitting PDFs
 
-The `pdf:split` command divides a single PDF into individual pages, with each page becoming a separate PDF file.
+The `pdf:split` command divides a single PDF into individual pages, with each page becoming a separate PDF file. You can optionally limit which pages are extracted and combine them into a single output PDF.
 
 ### Basic Splitting
 
@@ -139,6 +139,51 @@ Creates files named:
 - `chapter_page_2.pdf`
 - etc.
 
+### Extracting a Page Range
+
+Use `--pages` to limit which pages are extracted (1-based indexing):
+
+**Single page:**
+```bash
+parxy pdf:split document.pdf --pages 3
+```
+
+**Page range:**
+```bash
+parxy pdf:split document.pdf --pages 2:5
+```
+
+**From start to page N:**
+```bash
+parxy pdf:split document.pdf --pages :5
+```
+
+**From page N to end:**
+```bash
+parxy pdf:split document.pdf --pages 3:
+```
+
+### Combining Pages into a Single PDF
+
+Use `--combine` to extract a page range into a single output PDF instead of one file per page:
+
+```bash
+# Extract pages 2–5 as a single PDF (auto-named)
+parxy pdf:split document.pdf --pages 2:5 --combine
+# Output: document_pages_2-5.pdf (next to the input file)
+
+# Specify a custom output path
+parxy pdf:split document.pdf --pages 2:5 --combine -o extracted.pdf
+
+# Extract a single page as a PDF
+parxy pdf:split document.pdf --pages 3 --combine -o page3.pdf
+
+# Combine all pages (equivalent to a copy)
+parxy pdf:split document.pdf --combine -o copy.pdf
+```
+
+> **Tip:** `--combine` pairs well with `--pages` to replace the `pdf:merge file.pdf[2:5]` pattern when working with a single source file.
+
 ### Complete Examples
 
 **Split with custom output directory:**
@@ -161,14 +206,25 @@ Creates:
 parxy pdf:split document.pdf -o ./individual_pages -p page
 ```
 
+**Extract pages 10–20 as individual files:**
+```bash
+parxy pdf:split document.pdf --pages 10:20 -o ./extracted_pages
+```
+
 ## Combining Merge and Split
 
 You can chain operations together using the CLI:
 
 **Example: Extract specific pages and split them:**
 ```bash
-# First, extract pages 10-20
-parxy pdf:merge document.pdf[10:20] -o extracted.pdf
+# Extract pages 10-20 as individual files
+parxy pdf:split document.pdf --pages 10:20 -o ./individual_pages
+```
+
+**Example: Extract a range into a single PDF, then split:**
+```bash
+# First, extract pages 10-20 into one PDF
+parxy pdf:split document.pdf --pages 10:20 --combine -o extracted.pdf
 
 # Then split into individual pages
 parxy pdf:split extracted.pdf -o ./individual_pages
@@ -232,17 +288,21 @@ parxy pdf:split INPUT_FILE [OPTIONS]
 ```
 
 **Arguments:**
-- `INPUT_FILE`: PDF file to split into individual pages
+- `INPUT_FILE`: PDF file to split
 
 **Options:**
-- `--output, -o`: Output directory (default: `{filename}_split/`)
-- `--prefix, -p`: Output filename prefix (default: input filename)
+- `--output, -o`: Without `--combine`: output directory (default: `{filename}_split/`). With `--combine`: output file path (default: `{filename}_pages_{from}-{to}.pdf` next to the input).
+- `--prefix, -p`: Output filename prefix for individual split files (default: input filename)
+- `--pages`: Page range to extract, 1-based. Formats: `3` (single page), `2:5` (range), `:5` (up to page 5), `3:` (from page 3 to end)
+- `--combine`: Combine extracted pages into a single PDF instead of one file per page
 
 **Examples:**
 ```bash
 parxy pdf:split document.pdf
 parxy pdf:split document.pdf -o ./pages
 parxy pdf:split document.pdf -o ./pages -p page
+parxy pdf:split document.pdf --pages 2:5
+parxy pdf:split document.pdf --pages 2:5 --combine -o extracted.pdf
 ```
 
 ## Getting Help
diff --git a/docs/tutorials/pdf_manipulation.md b/docs/tutorials/pdf_manipulation.md
index d101746..afa8604 100644
--- a/docs/tutorials/pdf_manipulation.md
+++ b/docs/tutorials/pdf_manipulation.md
@@ -81,6 +81,44 @@ for page_path in pages:
 # ...
 ```
 
+You can limit splitting to a page range using 0-based `from_page` / `to_page` indices:
+
+```python
+# Split only pages 2–5 (0-based: indices 1–4)
+pages = Parxy.pdf.split(
+    input_path=Path("document.pdf"),
+    output_dir=Path("./pages"),
+    prefix="doc",
+    from_page=1,
+    to_page=4,
+)
+# Creates: doc_page_2.pdf, doc_page_3.pdf, doc_page_4.pdf, doc_page_5.pdf
+```
+
+### Extracting Pages into a Single PDF
+
+Use `extract_pages` to pull a page range from a PDF into a new single-file PDF without splitting each page individually:
+
+```python
+from pathlib import Path
+from parxy_core.services.pdf_service import PdfService
+
+# Extract pages 3–7 (0-based: indices 2–6)
+PdfService.extract_pages(
+    input_path=Path("report.pdf"),
+    output_path=Path("summary.pdf"),
+    from_page=2,
+    to_page=6,
+)
+```
+
+Omit `from_page` / `to_page` to copy all pages:
+
+```python
+# Equivalent to a copy
+PdfService.extract_pages(Path("original.pdf"), Path("copy.pdf"))
+```
+
 ### Optimizing PDFs
 
 Reduce PDF file size using compression techniques:
@@ -302,6 +340,12 @@ try:
 except FileNotFoundError as e:
     print(f"File not found: {e}")
 
+# ValueError for invalid page ranges
+try:
+    Parxy.pdf.split(Path("doc.pdf"), Path("./out"), "doc", from_page=100)
+except ValueError as e:
+    print(f"Invalid page range: {e}")
+
 # ValueError for invalid parameters
 try:
     Parxy.pdf.optimize(
@@ -332,7 +376,8 @@ except RuntimeError as e:
 In this tutorial you learned:
 
 - **`Parxy.pdf.merge()`** - Combine multiple PDFs with optional page ranges
-- **`Parxy.pdf.split()`** - Split a PDF into individual page files
+- **`Parxy.pdf.split()`** - Split a PDF into individual page files, optionally limited to a page range
+- **`PdfService.extract_pages()`** - Extract a page range into a single output PDF
 - **`Parxy.pdf.optimize()`** - Reduce file size with compression options
 - **`PdfService` context manager** - Work with attachments (add, list, extract, remove)
 
@@ -344,6 +389,7 @@ In this tutorial you learned:
 | Splitting into pages | Extracting attachment content |
 | Optimizing file size | Multiple operations on one file |
 | One-shot operations | Need fine-grained control |
+| Splitting a page range | Extracting a page range into one PDF (`extract_pages`) |
 
 ## Next Steps
 
diff --git a/docs/tutorials/using_cli.md b/docs/tutorials/using_cli.md
index a350ed1..523c28e 100644
--- a/docs/tutorials/using_cli.md
+++ b/docs/tutorials/using_cli.md
@@ -14,7 +14,7 @@ The Parxy CLI lets you:
 | `parxy preview`  | Interactive document viewer with metadata, table of contents, and scrollable content preview                |
 | `parxy markdown` | Convert documents to Markdown files, with support for multiple drivers and folder processing                |
 | `parxy pdf:merge`| Merge multiple PDF files into one, with support for page ranges                                            |
-| `parxy pdf:split`| Split a PDF file into individual pages                                                                      |
+| `parxy pdf:split`| Split a PDF into individual pages, with optional page range and single-file extraction                      |
 | `parxy drivers`  | List available document processing drivers                                                                  |
 | `parxy env`      | Generate a default `.env` configuration file                                                                |
 | `parxy docker`   | Create a Docker Compose configuration for running Parxy-related services                                    |
@@ -218,6 +218,42 @@ parxy markdown document.pdf -d pymupdf -d llamaparse
 
 This produces `pymupdf-document.md` and `llamaparse-document.md`.
 
+### Converting Pre-parsed JSON Results
+
+If you have a JSON file produced by `parxy parse -m json`, you can convert it to Markdown directly without re-parsing:
+
+```bash
+parxy markdown result.json
+```
+
+This loads the `Document` model from the JSON and converts it immediately — no driver or API call required. You can mix JSON files and PDF files in the same invocation:
+
+```bash
+parxy markdown result.json document.pdf -d pymupdf -o output/
+```
+
+### Page Separator Comments
+
+Use `--page-separators` to insert HTML comments before each page's content:
+
+```bash
+parxy markdown document.pdf --page-separators
+```
+
+Output will contain markers like:
+
+```markdown
+<!-- page: 1 -->
+
+First page content...
+
+<!-- page: 2 -->
+
+Second page content...
+```
+
+This is useful for post-processing scripts that need to identify page boundaries.
+
 ### Inline Output
 
 Use `--inline` with a single file to print markdown directly to stdout with a YAML frontmatter header — useful for shell pipelines:
@@ -276,7 +312,7 @@ parxy pdf:merge cover.pdf /chapters doc.pdf[10:20] appendix.pdf -o book.pdf
 
 ### Splitting PDFs
 
-The `pdf:split` command divides a PDF file into individual pages, with each page becoming a separate PDF file.
+The `pdf:split` command divides a PDF file into individual pages, with optional page range extraction and single-file output.
 
 **Split into individual pages:**
 ```bash
@@ -290,7 +326,21 @@ This creates a `document_split/` folder containing `document_page_1.pdf`, `docum
 parxy pdf:split report.pdf -o ./pages -p page
 ```
 
-Creates `page_1.pdf`, `page_2.pdf`, etc. in the `./pages` directory.
+**Extract a page range as individual files:**
+```bash
+parxy pdf:split document.pdf --pages 2:5 -o ./pages
+```
+
+**Combine a page range into a single PDF:**
+```bash
+# Auto-named output next to the input file
+parxy pdf:split document.pdf --pages 2:5 --combine
+
+# Custom output path
+parxy pdf:split document.pdf --pages 2:5 --combine -o extracted.pdf
+```
+
+Page range formats (1-based): `3` · `2:5` · `:5` · `3:`
 
 For more detailed examples and use cases, see the [PDF Manipulation How-to Guide](../howto/pdf_manipulation.md).
 
@@ -358,9 +408,9 @@ With the CLI, you can use Parxy as a **standalone document parsing tool** — id
 |------------------|--------------------------------------------------------------|
 | `parxy parse`    | Extract text from documents with multiple formats & drivers  |
 | `parxy preview`  | Interactive document viewer with metadata and TOC            |
-| `parxy markdown` | Generate Markdown files with driver prefix naming            |
+| `parxy markdown` | Generate Markdown files; accepts JSON results and supports `--page-separators` |
 | `parxy pdf:merge`| Merge multiple PDF files with page range support             |
-| `parxy pdf:split`| Split PDF files into individual pages                        |
+| `parxy pdf:split`| Split PDF into individual pages; supports `--pages` and `--combine` |
 | `parxy drivers`  | List supported drivers                                       |
 | `parxy env`      | Create default configuration file                            |
 | `parxy docker`   | Generate Docker Compose setup                                |
diff --git a/src/parxy_cli/commands/markdown.py b/src/parxy_cli/commands/markdown.py
index dfba9f6..2ccd009 100644
--- a/src/parxy_cli/commands/markdown.py
+++ b/src/parxy_cli/commands/markdown.py
@@ -2,11 +2,13 @@
 
 from datetime import timedelta
 from pathlib import Path
-from typing import Optional, List, Annotated
+from typing import Optional, List, Annotated, Tuple
 
 import typer
+from pydantic import ValidationError
 
 from parxy_core.facade import Parxy
+from parxy_core.models import Document
 
 from parxy_cli.models import Level
 from parxy_cli.console.console import Console
@@ -91,14 +93,27 @@ def markdown(
             min=1,
         ),
     ] = None,
+    page_separators: Annotated[
+        bool,
+        typer.Option(
+            '--page-separators',
+            help="Insert <!-- page: N --> HTML comments before each page's content.",
+        ),
+    ] = False,
 ):
     """Parse documents to Markdown.
 
+    Accepts PDF files (parsed on-the-fly) or pre-parsed JSON result files
+    (loaded directly from the Document model without re-parsing).
+
     Examples:
 
         # Parse a single file
         parxy markdown document.pdf
 
+        # Convert a pre-parsed JSON result directly to markdown
+        parxy markdown result.json
+
         # Parse with a specific driver and output to a folder
         parxy markdown document.pdf -d pymupdf -o output/
 
@@ -110,6 +125,9 @@ def markdown(
 
         # Output to stdout as YAML-frontmattered markdown (single file only)
         parxy markdown document.pdf --inline
+
+        # Include page separator comments in the output
+        parxy markdown document.pdf --page-separators
     """
     console.action('Markdown export', space_after=False)
 
@@ -120,85 +138,118 @@ def markdown(
         console.warning('No suitable files found to process.', panel=True)
         raise typer.Exit(1)
 
-    if inline and len(files) > 1:
+    # Partition into pre-parsed JSON files and files to parse
+    json_files = [f for f in files if f.suffix.lower() == '.json']
+    parse_files = [f for f in files if f.suffix.lower() != '.json']
+
+    if inline and len(json_files) + len(parse_files) > 1:
         console.error('--inline can only be used with a single file')
         raise typer.Exit(1)
 
-    # Use default driver if none specified
+    # Use default driver if none specified (only needed for parse_files)
     if not drivers:
         drivers = [Parxy.default_driver()]
 
     output_path = Path(output_dir) if output_dir else None
 
-    total_tasks = len(files) * len(drivers)
+    total_tasks = len(json_files) + len(parse_files) * len(drivers)
     error_count = 0
+    elapsed_time = '0 sec'
+
+    def _write_markdown(
+        doc: Document, file_path: Path, driver_label: Optional[str]
+    ) -> None:
+        """Write markdown content to file or stdout."""
+        content = doc.markdown(page_separators=page_separators)
+        if inline:
+            frontmatter = f'---\nfile: "{file_path}"\npages: {len(doc.pages)}\n---\n\n'
+            console.print(frontmatter + content)
+        else:
+            if output_path:
+                output_path.mkdir(parents=True, exist_ok=True)
+                save_dir = output_path
+            else:
+                save_dir = file_path.parent
+
+            base_name = file_path.stem
+            if driver_label:
+                base_name = f'{driver_label}-{base_name}'
+
+            out_file = save_dir / f'{base_name}.md'
+            out_file.write_text(content, encoding='utf-8')
+
+            via = f'via {driver_label} ' if driver_label else ''
+            console.print(
+                f'[faint]⎿ [/faint] {file_path.name} {via}to [success]{out_file}[/success] [faint]({len(doc.pages)} pages)[/faint]'
+            )
 
     try:
         with console.shimmer(
-            f'Processing {len(files)} file{"s" if len(files) > 1 else ""} with {len(drivers)} driver{"s" if len(drivers) > 1 else ""}...'
+            f'Processing {len(files)} file{"s" if len(files) > 1 else ""}...'
         ):
             with console.progress('Processing documents') as progress:
                 task = progress.add_task('', total=total_tasks)
 
-                batch_tasks = [str(f) for f in files]
-
-                for result in Parxy.batch_iter(
-                    tasks=batch_tasks,
-                    drivers=drivers,
-                    level=level.value,
-                    workers=workers,
-                ):
-                    file_name = (
-                        Path(result.file).name
-                        if isinstance(result.file, str)
-                        else 'document'
-                    )
-
-                    if result.success:
-                        doc = result.document
-                        file_path = (
-                            Path(result.file)
-                            if isinstance(result.file, str)
-                            else Path('document')
+                # Process pre-parsed JSON files directly
+                for json_file in json_files:
+                    try:
+                        doc = Document.model_validate_json(
+                            json_file.read_text(encoding='utf-8')
                         )
-
-                        content = doc.markdown()
-
-                        if inline:
-                            frontmatter = f'---\nfile: "{result.file}"\npages: {len(doc.pages)}\n---\n\n'
-                            console.print(frontmatter + content)
-                        else:
-                            if output_path:
-                                output_path.mkdir(parents=True, exist_ok=True)
-                                save_dir = output_path
-                            else:
-                                save_dir = file_path.parent
-
-                            base_name = file_path.stem
-                            if result.driver:
-                                base_name = f'{result.driver}-{base_name}'
-
-                            out_file = save_dir / f'{base_name}.md'
-                            out_file.write_text(content, encoding='utf-8')
-
-                            console.print(
-                                f'[faint]⎿ [/faint] {file_name} via {result.driver} to [success]{out_file}[/success] [faint]({len(doc.pages)} pages)[/faint]'
-                            )
-                    else:
+                        _write_markdown(
+                            doc, json_file.with_suffix(''), driver_label=None
+                        )
+                    except (ValidationError, ValueError) as e:
                         console.print(
-                            f'[faint]⎿ [/faint] {file_name} via {result.driver} error. [error]{result.error}[/error]'
+                            f'[faint]⎿ [/faint] {json_file.name} error. [error]{e}[/error]'
                         )
                         error_count += 1
-
                         if stop_on_failure:
                             console.newline()
                             console.info(
                                 'Stopping due to error (--stop-on-failure flag is set)'
                             )
                             raise typer.Exit(1)
-
                     progress.update(task, advance=1)
 
+                # Process files that need parsing
+                if parse_files:
+                    for result in Parxy.batch_iter(
+                        tasks=[str(f) for f in parse_files],
+                        drivers=drivers,
+                        level=level.value,
+                        workers=workers,
+                    ):
+                        file_name = (
+                            Path(result.file).name
+                            if isinstance(result.file, str)
+                            else 'document'
+                        )
+
+                        if result.success:
+                            file_path = (
+                                Path(result.file)
+                                if isinstance(result.file, str)
+                                else Path('document')
+                            )
+                            _write_markdown(
+                                result.document, file_path, driver_label=result.driver
+                            )
+                        else:
+                            console.print(
+                                f'[faint]⎿ [/faint] {file_name} via {result.driver} error. [error]{result.error}[/error]'
+                            )
+                            error_count += 1
+
+                            if stop_on_failure:
+                                console.newline()
+                                console.info(
+                                    'Stopping due to error (--stop-on-failure flag is set)'
+                                )
+                                raise typer.Exit(1)
+
+                        progress.update(task, advance=1)
+
                 elapsed_time = format_timedelta(
                     timedelta(seconds=max(0, progress.tasks[0].elapsed))
                 )
@@ -210,13 +261,13 @@ def markdown(
     if not inline:
         console.newline()
 
-    if error_count == len(files) * len(drivers):
+    if error_count == total_tasks:
         console.error('All files were not processed due to errors')
         return
 
     if error_count > 0:
         console.warning(
-            f'Processed {len(files)} file{"s" if len(files) > 1 else ""} with warnings using {len(drivers)} driver{"s" if len(drivers) > 1 else ""}'
+            f'Processed {len(files)} file{"s" if len(files) > 1 else ""} with warnings'
         )
         console.print(
             f'[faint]⎿ [/faint] [highlight]{error_count} files errored[/highlight]'
@@ -225,5 +276,5 @@ def markdown(
 
     if not inline:
         console.success(
-            f'Processed {len(files)} file{"s" if len(files) > 1 else ""} using {len(drivers)} driver{"s" if len(drivers) > 1 else ""} (took {elapsed_time})'
+            f'Processed {len(files)} file{"s" if len(files) > 1 else ""} (took {elapsed_time})'
         )
diff --git a/src/parxy_core/models/models.py b/src/parxy_core/models/models.py
index 258b847..b965c56 100644
--- a/src/parxy_core/models/models.py
+++ b/src/parxy_core/models/models.py
@@ -155,7 +155,52 @@ def text(self, page_separator: str = '---') -> str:
 
         return '\n'.join(texts)
 
-    def markdown(self) -> str:
+    def contentmd(
+        self,
+        title: Optional[str] = None,
+        description: Optional[str] = None,
+        date: Optional[str] = None,
+        license: Optional[str] = None,
+        author: Optional[str] = None,
+        page_separators: bool = False,
+    ) -> str:
+        """Get the document content formatted as content-md.
+
+        Delegates to :class:`~parxy_core.services.ContentMdService`.
+
+        Parameters
+        ----------
+        title : str, optional
+            Document title. Falls back to metadata.title, a heading inferred
+            from the first page, filename, then 'Untitled'.
+        description : str, optional
+            Short summary (~200 characters). Falls back to a doc-abstract block,
+            then the first five body TextBlocks across the first two pages.
+        date : str, optional
+            Creation/publication date in ISO 8601. Falls back to metadata dates.
+        license : str, optional
+            License name or SPDX identifier.
+        author : str, optional
+            Author name. Falls back to metadata.author.
+
+        Returns
+        -------
+        str
+            The document content formatted as content-md.
+        """
+        from parxy_core.services.contentmd_service import ContentMdService
+
+        return ContentMdService.render(
+            self,
+            title=title,
+            description=description,
+            date=date,
+            license=license,
+            author=author,
+            page_separators=page_separators,
+        )
+
+    def markdown(self, page_separators: bool = False) -> str:
         """Get the document content formatted as Markdown.
 
         The method attempts to preserve the document structure by:
@@ -163,6 +208,12 @@ def markdown(self) -> str:
         2. Preserving line breaks where meaningful
         3. Adding section headers based on block levels
 
+        Parameters
+        ----------
+        page_separators : bool, optional
+            When True, inserts an HTML comment ``<!-- page: N -->`` before
+            each page's content, by default False
+
         Returns
         -------
         str
@@ -174,48 +225,50 @@ def markdown(self) -> str:
         markdown_parts = []
 
         for page in self.pages:
-            if not page.blocks:
-                if page.text.strip():
-                    markdown_parts.append(page.text.strip())
-                continue
-
             page_parts = []
 
-            for block in page.blocks:
-                if isinstance(block, TextBlock):
-                    # Handle different block categories
-                    if block.category and block.category.lower() in [
-                        'heading',
-                        'title',
-                        'header',
-                    ]:
-                        # Determine heading level (h1-h6) based on block level or default to h2
-                        level = min(block.level or 2, 6)
-                        page_parts.append(f'{"#" * level} {block.text.strip()}')
-                    elif block.category and block.category.lower() == 'list':
-                        # Convert to bullet points
-                        for line in block.text.splitlines():
-                            if line.strip():
-                                page_parts.append(f'- {line.strip()}')
-                    else:
-                        # Regular paragraph
+            if page_separators:
+                page_parts.append(f'<!-- page: {page.number} -->')
+
+            if not page.blocks:
+                if page.text.strip():
+                    page_parts.append(page.text.strip())
+            else:
+                for block in page.blocks:
+                    if isinstance(block, TextBlock):
+                        # Handle different block categories
+                        if block.category and block.category.lower() in [
+                            'heading',
+                            'title',
+                            'header',
+                        ]:
+                            # Determine heading level (h1-h6) based on block level or default to h2
+                            level = min(block.level or 2, 6)
+                            page_parts.append(f'{"#" * level} {block.text.strip()}')
+                        elif block.category and block.category.lower() == 'list':
+                            # Convert to bullet points
+                            for line in block.text.splitlines():
+                                if line.strip():
+                                    page_parts.append(f'- {line.strip()}')
+                        else:
+                            # Regular paragraph
+                            if block.text.strip():
+                                page_parts.append(block.text.strip())
+
+                    elif isinstance(block, ImageBlock):
+                        ext = (
+                            block.name.rsplit('.', 1)[-1]
+                            if block.name and '.' in block.name
+                            else ''
+                        )
+                        lang = f'image:{ext}' if ext else 'image'
+                        alt = block.alt_text or ''
+                        page_parts.append(f'```{lang}\n{alt}\n```')
+
+                    elif isinstance(block, TableBlock):
                         if block.text.strip():
                             page_parts.append(block.text.strip())
 
-                elif isinstance(block, ImageBlock):
-                    ext = (
-                        block.name.rsplit('.', 1)[-1]
-                        if block.name and '.' in block.name
-                        else ''
-                    )
-                    lang = f'image:{ext}' if ext else 'image'
-                    alt = block.alt_text or ''
-                    page_parts.append(f'```{lang}\n{alt}\n```')
-
-                elif isinstance(block, TableBlock):
-                    if block.text.strip():
-                        page_parts.append(block.text.strip())
-
             if page_parts:
                 markdown_parts.append('\n\n'.join(page_parts))
 
diff --git a/src/parxy_core/services/__init__.py b/src/parxy_core/services/__init__.py
index 5071d08..5342a63 100644
--- a/src/parxy_core/services/__init__.py
+++ b/src/parxy_core/services/__init__.py
@@ -1,5 +1,6 @@
 """Services module for parxy_core."""
 
+from parxy_core.services.contentmd_service import ContentMdService
 from parxy_core.services.pdf_service import PdfService
 
-__all__ = ['PdfService']
+__all__ = ['ContentMdService', 'PdfService']
diff --git a/src/parxy_core/services/contentmd_service.py b/src/parxy_core/services/contentmd_service.py
new file mode 100644
index 0000000..039ab38
--- /dev/null
+++ b/src/parxy_core/services/contentmd_service.py
@@ -0,0 +1,273 @@
+"""Service for rendering documents as content-md."""
+
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, Optional
+
+if TYPE_CHECKING:
+    from parxy_core.models.models import Document
+
+
+class ContentMdService:
+    """Render a :class:`Document` as a content-md string.
+
+    content-md is an open specification for optimised content exchange: a YAML
+    frontmatter section followed by CommonMark / GitHub-flavoured Markdown.
+    All methods are static; the class acts as a namespace.
+    """
+
+    # ------------------------------------------------------------------
+    # Private helpers
+    # ------------------------------------------------------------------
+
+    # Roles that provide structure or navigation rather than readable body text
+    _STRUCTURAL_ROLES: frozenset[str] = frozenset(
+        {
+            'heading',
+            'doc-title',
+            'doc-subtitle',
+            'doc-abstract',
+            'doc-toc',
+            'doc-pageheader',
+            'doc-pagefooter',
+            'caption',
+        }
+    )
+
+    @staticmethod
+    def _normalize(text: str) -> str:
+        """Collapse any run of whitespace to a single space and strip."""
+        return ' '.join(text.split())
+
+    @staticmethod
+    def _yaml_str(value: str) -> str:
+        """Wrap *value* in double quotes and escape internal quotes/backslashes."""
+        return '"' + value.replace('\\', '\\\\').replace('"', '\\"') + '"'
+
+    @staticmethod
+    def _guess_title(document: Document) -> Optional[str]:
+        """Infer a title from the first page blocks.
+
+        Prefers an explicit ``doc-title`` role; falls back to the
+        highest-ranking (lowest level number) ``heading`` block.
+        """
+        from parxy_core.models.models import TextBlock
+
+        if not document.pages:
+            return None
+        first_page = document.pages[0]
+        if not first_page.blocks:
+            return None
+
+        doc_title = next(
+            (
+                b
+                for b in first_page.blocks
+                if isinstance(b, TextBlock) and b.role == 'doc-title' and b.text.strip()
+            ),
+            None,
+        )
+        if doc_title:
+            return ContentMdService._normalize(doc_title.text)
+
+        headings = [
+            b
+            for b in first_page.blocks
+            if isinstance(b, TextBlock) and b.role == 'heading' and b.text.strip()
+        ]
+        if not headings:
+            return None
+        return ContentMdService._normalize(
+            min(headings, key=lambda b: b.level or 1).text
+        )
+
+    @staticmethod
+    def _infer_description(document: Document) -> Optional[str]:
+        """Infer a description from document content.
+
+        Uses the ``doc-abstract`` block when present. Otherwise concatenates
+        the first five body :class:`TextBlock` objects (non-structural, across
+        the first two pages), normalises whitespace, and returns at most 200
+        characters.
+        """
+        from parxy_core.models.models import TextBlock
+
+        blocks = [
+            b
+            for page in document.pages[:2]
+            if page.blocks
+            for b in page.blocks
+            if isinstance(b, TextBlock) and b.text.strip()
+        ]
+
+        abstract = next((b for b in blocks if b.role == 'doc-abstract'), None)
+        if abstract:
+            return ContentMdService._normalize(abstract.text)
+
+        body_blocks = [
+            b
+            for b in blocks
+            if (b.role or 'generic') not in ContentMdService._STRUCTURAL_ROLES
+        ]
+        if not body_blocks:
+            return None
+
+        combined = ' '.join(b.text for b in body_blocks[:5])
+        return ContentMdService._normalize(combined)[:200]
+
+    @staticmethod
+    def _build_frontmatter(
+        title: str,
+        description: Optional[str],
+        date: Optional[str],
+        license: Optional[str],
+        author: Optional[str],
+    ) -> str:
+        ys = ContentMdService._yaml_str
+        lines = ['---', f'title: {ys(title)}']
+        if description:
+            lines.append(f'description: {ys(description)}')
+        if date:
+            lines.append(f'date: {ys(date)}')
+        if license:
+            lines.append(f'license: {ys(license)}')
+        if author:
+            lines.append(f'author: {ys(author)}')
+        lines.append('---')
+        return '\n'.join(lines)
+
+    @staticmethod
+    def _build_body(
+        document: Document, title: str, page_separators: bool = False
+    ) -> str:
+        from parxy_core.models.models import ImageBlock, TableBlock, TextBlock
+
+        normalize = ContentMdService._normalize
+        parts = [f'# {title}']
+
+        for page in document.pages:
+            if page_separators:
+                parts.append(f'<!-- page: {page.number} -->')
+
+            if not page.blocks:
+                if page.text.strip():
+                    parts.append(normalize(page.text))
+                continue
+
+            for block in page.blocks:
+                role = (block.role or 'generic').lower()
+
+                if isinstance(block, TextBlock):
+                    if role == 'doc-title':
+                        # Already the top-level h1 — skip to avoid duplication
+                        pass
+                    elif role == 'heading':
+                        # Shift levels +1: h1 content → h2, per content-md spec
+                        shifted = min((block.level or 1) + 1, 6)
+                        parts.append(f'{"#" * shifted} {normalize(block.text)}')
+                    elif role in ('list', 'listitem'):
+                        for line in block.text.splitlines():
+                            if line.strip():
+                                parts.append(f'- {normalize(line)}')
+                    elif role == 'doc-abstract':
+                        lang_attr = (
+                            f' lang="{document.language}"' if document.language else ''
+                        )
+                        parts.append(
+                            f'<abstract{lang_attr}>\n{normalize(block.text)}\n</abstract>'
+                        )
+                    else:
+                        normalized = normalize(block.text)
+                        if normalized:
+                            parts.append(normalized)
+
+                elif isinstance(block, ImageBlock):
+                    parts.append(f'<figure>\n{block.alt_text or ""}\n</figure>')
+
+                elif isinstance(block, TableBlock):
+                    # Keep inner table whitespace (alignment, padding); trim ends only
+                    if block.text.strip():
+                        parts.append(block.text.strip())
+
+        return '\n\n'.join(parts)
+
+    # ------------------------------------------------------------------
+    # Public API
+    # ------------------------------------------------------------------
+
+    @staticmethod
+    def render(
+        document: Document,
+        title: Optional[str] = None,
+        description: Optional[str] = None,
+        date: Optional[str] = None,
+        license: Optional[str] = None,
+        author: Optional[str] = None,
+        page_separators: bool = False,
+    ) -> str:
+        """Render *document* as a content-md string.
+
+        Parameters
+        ----------
+        document:
+            The document to render.
+        title:
+            Document title. Falls back to ``metadata.title``, a ``doc-title``
+            or heading block on the first page, then ``filename``. Raises
+            ``ValueError`` if no title can be resolved.
+        description:
+            Short summary (~200 characters). Falls back to a ``doc-abstract``
+            block, then the first five body blocks in the first two pages.
+        date:
+            Creation/publication date in ISO 8601. Falls back to
+            ``metadata.created_at`` / ``metadata.updated_at``.
+        license:
+            License name or SPDX identifier.
+        author:
+            Author name. Falls back to ``metadata.author``.
+        page_separators:
+            When True, inserts ``<!-- page: N -->`` before each page's
+            content in the body.
+
+        Returns
+        -------
+        str
+            The document formatted as content-md.
+        """
+        resolved_title = (
+            title
+            or (document.metadata.title if document.metadata else None)
+            or ContentMdService._guess_title(document)
+            or document.filename
+        )
+        if not resolved_title:
+            raise ValueError(
+                'Cannot render content-md: no title could be resolved. '
+                'Provide a title via metadata, a doc-title/heading block, '
+                'a filename, or pass title= explicitly.'
+            )
+        resolved_description = description or ContentMdService._infer_description(
+            document
+        )
+        resolved_date = date or (
+            (document.metadata.created_at or document.metadata.updated_at)
+            if document.metadata
+            else None
+        )
+        resolved_author = author or (
+            document.metadata.author if document.metadata else None
+        )
+
+        frontmatter = ContentMdService._build_frontmatter(
+            title=resolved_title,
+            description=resolved_description,
+            date=resolved_date,
+            license=license,
+            author=resolved_author,
+        )
+
+        if not document.pages:
+            return f'{frontmatter}\n\n# {resolved_title}\n'
+
+        body = ContentMdService._build_body(document, resolved_title, page_separators)
+        return f'{frontmatter}\n\n{body}\n'
diff --git a/tests/commands/test_markdown.py b/tests/commands/test_markdown.py
index 88b4d74..b4e772f 100644
--- a/tests/commands/test_markdown.py
+++ b/tests/commands/test_markdown.py
@@ -278,3 +278,136 @@ def test_markdown_command_no_files_found(runner, tmp_path):
         result = runner.invoke(app, [str(empty_dir)])
 
         assert result.exit_code == 1
+
+
+def test_markdown_command_json_input_converts_directly(runner, mock_document, tmp_path):
+    """Test that a valid JSON parse result is loaded directly without re-parsing."""
+
+    json_file = tmp_path / 'result.json'
+    json_file.write_text(mock_document.model_dump_json(), encoding='utf-8')
+
+    with patch('parxy_cli.commands.markdown.Parxy') as mock_parxy:
+        result = runner.invoke(app, [str(json_file)])
+
+        assert result.exit_code == 0
+        # batch_iter should NOT be called — no PDF to parse
+        mock_parxy.batch_iter.assert_not_called()
+
+        # Output file should be saved next to the JSON file, without driver prefix
+        expected_output = tmp_path / 'result.md'
+        assert expected_output.exists()
+        assert '# Test heading' in expected_output.read_text()
+
+
+def test_markdown_command_json_input_with_output_dir(runner, mock_document, tmp_path):
+    """Test that JSON input respects the --output directory."""
+
+    json_file = tmp_path / 'result.json'
+    json_file.write_text(mock_document.model_dump_json(), encoding='utf-8')
+    output_dir = tmp_path / 'out'
+
+    with patch('parxy_cli.commands.markdown.Parxy'):
+        result = runner.invoke(app, [str(json_file), '--output', str(output_dir)])
+
+        assert result.exit_code == 0
+        assert (output_dir / 'result.md').exists()
+
+
+def test_markdown_command_json_input_inline(runner, mock_document, tmp_path):
+    """Test that JSON input with --inline prints to stdout."""
+
+    json_file = tmp_path / 'result.json'
+    json_file.write_text(mock_document.model_dump_json(), encoding='utf-8')
+
+    with patch('parxy_cli.commands.markdown.Parxy'):
+        result = runner.invoke(app, [str(json_file), '--inline'])
+
+        assert result.exit_code == 0
+        cleaned = strip_ansi(result.stdout)
+        assert '---' in cleaned
+        assert 'pages:' in cleaned
+        assert '# Test heading' in cleaned
+        assert not (tmp_path / 'result.md').exists()
+
+
+def test_markdown_command_invalid_json_reports_error(runner, tmp_path):
+    """Test that a JSON file with invalid Document content reports an error."""
+
+    json_file = tmp_path / 'bad.json'
+    json_file.write_text('{"not": "a document"}', encoding='utf-8')
+
+    with patch('parxy_cli.commands.markdown.Parxy'):
+        result = runner.invoke(app, [str(json_file)])
+
+        cleaned = strip_ansi(result.stdout)
+        assert 'error' in cleaned.lower()
+
+
+def test_markdown_command_page_separators(runner, mock_document, pdf_file):
+    """Test that --page-separators injects HTML page comments into output."""
+
+    with patch('parxy_cli.commands.markdown.Parxy') as mock_parxy:
+        mock_parxy.default_driver.return_value = 'pymupdf'
+        mock_parxy.batch_iter.return_value = iter(
+            [
+                BatchResult(
+                    file=str(pdf_file),
+                    driver='pymupdf',
+                    document=mock_document,
+                    error=None,
+                )
+            ]
+        )
+
+        result = runner.invoke(app, [str(pdf_file), '--page-separators'])
+
+        assert result.exit_code == 0
+        expected_output = pdf_file.parent / 'pymupdf-test.md'
+        assert expected_output.exists()
+        assert '<!-- page:' in expected_output.read_text()
+
+
+def test_markdown_command_page_separators_json_input(runner, tmp_path):
+    """Test that --page-separators works for JSON inputs."""
+
+    doc = Document(pages=[Page(number=1, text='Hello')])
+    json_file = tmp_path / 'result.json'
+    json_file.write_text(doc.model_dump_json(), encoding='utf-8')
+
+    with patch('parxy_cli.commands.markdown.Parxy'):
+        result = runner.invoke(app, [str(json_file), '--page-separators'])
+
+        assert result.exit_code == 0
+        output = (tmp_path / 'result.md').read_text()
+        assert '<!-- page: 1 -->' in output
+
+
+def test_markdown_command_mixed_json_and_pdf(runner, mock_document, tmp_path):
+    """Test that JSON files and PDF files can be processed together."""
+
+    json_file = tmp_path / 'result.json'
+    json_file.write_text(mock_document.model_dump_json(), encoding='utf-8')
+
+    pdf_file = tmp_path / 'doc.pdf'
+    pdf_file.write_bytes(b'%PDF fake')
+
+    with patch('parxy_cli.commands.markdown.Parxy') as mock_parxy:
+        mock_parxy.default_driver.return_value = 'pymupdf'
+        mock_parxy.batch_iter.return_value = iter(
+            [
+                BatchResult(
+                    file=str(pdf_file),
+                    driver='pymupdf',
+                    document=mock_document,
+                    error=None,
+                )
+            ]
+        )
+
+        result = runner.invoke(app, [str(json_file), str(pdf_file)])
+
+        assert result.exit_code == 0
+        # JSON converted directly
+        assert (tmp_path / 'result.md').exists()
+        # PDF parsed via driver
+        assert (tmp_path / 'pymupdf-doc.md').exists()
diff --git a/tests/services/test_contentmd_service.py b/tests/services/test_contentmd_service.py
new file mode 100644
index 0000000..d0bb1a9
--- /dev/null
+++ b/tests/services/test_contentmd_service.py
@@ -0,0 +1,571 @@
+"""Test suite for ContentMdService."""
+
+import pytest
+
+from parxy_core.models.models import (
+    Document,
+    ImageBlock,
+    Metadata,
+    Page,
+    TableBlock,
+    TextBlock,
+)
+from parxy_core.services.contentmd_service import ContentMdService
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def make_page(
+    number: int = 1,
+    text: str = '',
+    blocks: list | None = None,
+) -> Page:
+    return Page(number=number, text=text, blocks=blocks)
+
+
+def make_text_block(
+    text: str,
+    role: str = 'generic',
+    level: int | None = None,
+) -> TextBlock:
+    return TextBlock(type='text', text=text, role=role, level=level)
+
+
+def make_image_block(
+    alt_text: str | None = None, name: str | None = None
+) -> ImageBlock:
+    return ImageBlock(type='image', alt_text=alt_text, name=name)
+
+
+def make_table_block(text: str) -> TableBlock:
+    return TableBlock(type='table', text=text)
+
+
+def make_doc(
+    pages: list[Page],
+    metadata: Metadata | None = None,
+    filename: str | None = None,
+    language: str | None = None,
+) -> Document:
+    return Document(
+        pages=pages,
+        metadata=metadata,
+        filename=filename,
+        language=language,
+    )
+
+
+# ---------------------------------------------------------------------------
+# Fixtures
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture
+def minimal_doc():
+    """Document with a single page, no blocks, no metadata."""
+    return make_doc(pages=[make_page(text='Hello world')])
+
+
+@pytest.fixture
+def metadata_doc():
+    """Document with full metadata and one plain paragraph block."""
+    meta = Metadata(
+        title='Metadata Title',
+        author='Jane Doe',
+        created_at='2025-01-15',
+    )
+    page = make_page(
+        text='Paragraph text.',
+        blocks=[make_text_block('Paragraph text.')],
+    )
+    return make_doc(pages=[page], metadata=meta, filename='report.pdf')
+
+
+@pytest.fixture
+def all_blocks_doc():
+    """Document whose first page contains every supported block type."""
+    blocks = [
+        make_text_block('My Document', role='doc-title'),
+        make_text_block('Introduction', role='heading', level=1),
+        make_text_block('Background', role='heading', level=2),
+        make_text_block('First item\nSecond item', role='list'),
+        make_text_block('A plain paragraph.', role='paragraph'),
+        make_text_block('A brief overview.', role='doc-abstract'),
+        make_image_block(alt_text='A sunset over mountains', name='sunset.jpg'),
+        make_table_block('| Col A | Col B |\n| ----- | ----- |\n| 1     | 2     |'),
+    ]
+    page = make_page(text='My Document', blocks=blocks)
+    return make_doc(pages=[page], language='en')
+
+
+# ---------------------------------------------------------------------------
+# Frontmatter
+# ---------------------------------------------------------------------------
+
+
+class TestFrontmatter:
+    def test_frontmatter_delimiters_present(self, minimal_doc):
+        result = ContentMdService.render(minimal_doc, title='T', description='D')
+        lines = result.splitlines()
+        assert lines[0] == '---'
+        closing = lines.index('---', 1)
+        assert closing > 0
+
+    def test_explicit_title_in_frontmatter(self, minimal_doc):
+        result = ContentMdService.render(minimal_doc, title='Explicit Title')
+        assert 'title: "Explicit Title"' in result
+
+    def test_title_from_metadata(self, metadata_doc):
+        result = ContentMdService.render(metadata_doc)
+        assert 'title: "Metadata Title"' in result
+
+    def test_title_from_doc_title_role_preferred_over_heading(self):
+        blocks = [
+            make_text_block('Real Title', role='doc-title'),
+            make_text_block('Section One', role='heading', level=1),
+        ]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc)
+        assert 'title: "Real Title"' in result
+
+    def test_title_from_heading_when_no_doc_title(self):
+        blocks = [
+            make_text_block('Section One', role='heading', level=2),
+            make_text_block('Section Two', role='heading', level=1),
+        ]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc)
+        # Level 1 heading wins (lowest level = highest rank)
+        assert 'title: "Section Two"' in result
+
+    def test_title_from_filename_when_no_headings(self):
+        doc = make_doc(
+            pages=[make_page(text='body text')],
+            filename='my-report.pdf',
+        )
+        result = ContentMdService.render(doc)
+        assert 'title: "my-report.pdf"' in result
+
+    def test_title_raises_when_unresolvable(self):
+        doc = make_doc(pages=[make_page(text='body text')])
+        with pytest.raises(ValueError, match='no title could be resolved'):
+            ContentMdService.render(doc)
+
+    def test_description_from_explicit_param(self, minimal_doc):
+        result = ContentMdService.render(
+            minimal_doc, title='T', description='My summary.'
+        )
+        assert 'description: "My summary."' in result
+
+    def test_description_from_doc_abstract_block(self):
+        blocks = [
+            make_text_block('Abstract content here.', role='doc-abstract'),
+            make_text_block(
+                'A much longer paragraph that should not be picked.', role='paragraph'
+            ),
+        ]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert 'description: "Abstract content here."' in result
+
+    def test_description_from_first_five_body_blocks(self):
+        blocks = [make_text_block(f'Sentence {i}.', role='paragraph') for i in range(7)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        # Only the first five contribute; the sixth and seventh are ignored
+        desc = next(l for l in result.splitlines() if l.startswith('description:'))
+        assert 'Sentence 0' in desc and 'Sentence 5' not in desc
+
+    def test_description_excludes_structural_roles(self):
+        blocks = [
+            make_text_block('Table of contents text.', role='doc-toc'),
+            make_text_block('Page header text.', role='doc-pageheader'),
+            make_text_block('A heading block.', role='heading'),
+            make_text_block('Body content.', role='paragraph'),
+        ]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc)
+        assert 'description: "Body content."' in result
+
+    def test_description_truncated_to_200_chars(self):
+        long_text = 'word ' * 60  # well over 200 chars
+        blocks = [make_text_block(long_text, role='paragraph')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        fm_end = result.index('---\n', 4)
+        frontmatter = result[:fm_end]
+        desc_line = next(
+            l for l in frontmatter.splitlines() if l.startswith('description:')
+        )
+        # Strip the YAML quoting to measure the actual value length
+        value = desc_line[len('description: "') : -1]
+        assert len(value) <= 200
+
+    def test_description_contains_no_newlines(self):
+        blocks = [
+            make_text_block('Line one.\nLine two.\nLine three.', role='paragraph')
+        ]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        fm_end = result.index('---\n', 4)
+        frontmatter = result[:fm_end]
+        desc_line = next(
+            l for l in frontmatter.splitlines() if l.startswith('description:')
+        )
+        assert '\n' not in desc_line
+
+    def test_description_searches_first_two_pages(self):
+        page1 = make_page(number=1, text='', blocks=[make_text_block('Page 1 text.')])
+        page2 = make_page(
+            number=2,
+            text='',
+            blocks=[make_text_block('Page 2 has a longer text block.')],
+        )
+        page3 = make_page(
+            number=3,
+            text='',
+            blocks=[make_text_block('Page 3 has the longest block of all by far.')],
+        )
+        doc = make_doc(pages=[page1, page2, page3])
+        result = ContentMdService.render(doc, title='T')
+        # Page 3 is out of the two-page window
+        assert 'Page 3' not in result.split('---')[1]  # not in frontmatter
+
+    def test_date_from_metadata_created_at(self, metadata_doc):
+        result = ContentMdService.render(metadata_doc)
+        assert 'date: "2025-01-15"' in result
+
+    def test_date_from_metadata_updated_at_when_no_created_at(self):
+        meta = Metadata(updated_at='2025-06-01')
+        doc = make_doc(pages=[make_page(text='')], metadata=meta)
+        result = ContentMdService.render(doc, title='T')
+        assert 'date: "2025-06-01"' in result
+
+    def test_explicit_date_overrides_metadata(self, metadata_doc):
+        result = ContentMdService.render(metadata_doc, date='2026-01-01')
+        assert 'date: "2026-01-01"' in result
+        assert '2025-01-15' not in result
+
+    def test_author_from_metadata(self, metadata_doc):
+        result = ContentMdService.render(metadata_doc)
+        assert 'author: "Jane Doe"' in result
+
+    def test_optional_fields_omitted_when_absent(self, minimal_doc):
+        result = ContentMdService.render(minimal_doc, title='T')
+        assert 'description:' not in result
+        assert 'date:' not in result
+        assert 'license:' not in result
+        assert 'author:' not in result
+
+    def test_license_included_when_provided(self, minimal_doc):
+        result = ContentMdService.render(minimal_doc, title='T', license='CC-BY-4.0')
+        assert 'license: "CC-BY-4.0"' in result
+
+    def test_yaml_values_escaped(self, minimal_doc):
+        result = ContentMdService.render(
+            minimal_doc,
+            title='Title with "quotes"',
+            description='Back\\slash',
+        )
+        assert r'title: "Title with \"quotes\""' in result
+        assert r'description: "Back\\slash"' in result
+
+
+# ---------------------------------------------------------------------------
+# Body – block rendering
+# ---------------------------------------------------------------------------
+
+
+class TestBodyBlocks:
+    def test_body_starts_with_h1_title(self, metadata_doc):
+        result = ContentMdService.render(metadata_doc)
+        body = result.split('---\n', 2)[-1]
+        assert body.lstrip().startswith('# Metadata Title')
+
+    def test_doc_title_block_skipped_in_body(self, all_blocks_doc):
+        result = ContentMdService.render(all_blocks_doc)
+        body = result.split('---\n', 2)[-1]
+        # Should appear exactly once (as the h1), not twice
+        assert body.count('My Document') == 1
+
+    def test_heading_level_shifted_by_one(self):
+        blocks = [make_text_block('Section', role='heading', level=1)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '## Section' in result
+
+    def test_heading_level_2_becomes_3(self):
+        blocks = [make_text_block('Subsection', role='heading', level=2)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '### Subsection' in result
+
+    def test_heading_without_level_defaults_to_h2(self):
+        blocks = [make_text_block('Heading', role='heading')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '## Heading' in result
+
+    def test_heading_level_capped_at_6(self):
+        blocks = [make_text_block('Deep', role='heading', level=6)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '###### Deep' in result
+        assert '####### Deep' not in result
+
+    def test_list_role_rendered_as_bullets(self):
+        blocks = [make_text_block('Alpha\nBeta\nGamma', role='list')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '- Alpha' in result
+        assert '- Beta' in result
+        assert '- Gamma' in result
+
+    def test_listitem_role_rendered_as_bullet(self):
+        blocks = [make_text_block('Single item', role='listitem')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '- Single item' in result
+
+    def test_doc_abstract_rendered_as_abstract_tag(self, all_blocks_doc):
+        result = ContentMdService.render(all_blocks_doc)
+        assert '<abstract lang="en">' in result
+        assert 'A brief overview.' in result
+        assert '</abstract>' in result
+
+    def test_doc_abstract_without_language_omits_lang_attr(self):
+        blocks = [make_text_block('Summary.', role='doc-abstract')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '<abstract>\nSummary.\n</abstract>' in result
+
+    def test_generic_textblock_rendered_as_paragraph(self):
+        blocks = [make_text_block('Plain paragraph text.', role='generic')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert 'Plain paragraph text.' in result
+
+    def test_empty_textblock_not_rendered(self):
+        blocks = [make_text_block('   ', role='paragraph')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        # Body should only contain the h1 line
+        body = result.split('---\n', 2)[-1].strip()
+        assert body == '# T'
+
+    def test_image_block_rendered_as_figure(self):
+        blocks = [make_image_block(alt_text='A sunset over mountains')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '<figure>\nA sunset over mountains\n</figure>' in result
+
+    def test_image_block_without_alt_text(self):
+        blocks = [make_image_block()]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '<figure>\n\n</figure>' in result
+
+    def test_table_block_rendered_as_is(self):
+        table_text = '| Col A | Col B |\n| ----- | ----- |\n| 1     | 2     |'
+        blocks = [make_table_block(table_text)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert table_text in result
+
+    def test_page_without_blocks_uses_page_text(self):
+        page = make_page(text='Fallback page text', blocks=None)
+        doc = make_doc(pages=[page])
+        result = ContentMdService.render(doc, title='T')
+        assert 'Fallback page text' in result
+
+    def test_empty_page_text_not_rendered(self):
+        page = make_page(text='   ', blocks=None)
+        doc = make_doc(pages=[page])
+        result = ContentMdService.render(doc, title='T')
+        body = result.split('---\n', 2)[-1].strip()
+        assert body == '# T'
+
+
+# ---------------------------------------------------------------------------
+# Whitespace normalisation
+# ---------------------------------------------------------------------------
+
+
+class TestWhitespaceNormalisation:
+    def test_multiple_spaces_in_paragraph_collapsed(self):
+        blocks = [make_text_block('Word1   Word2     Word3')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert 'Word1 Word2 Word3' in result
+
+    def test_tabs_in_paragraph_collapsed(self):
+        blocks = [make_text_block('Word1\t\tWord2')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert 'Word1 Word2' in result
+
+    def test_whitespace_in_heading_collapsed(self):
+        blocks = [make_text_block('My   Section', role='heading', level=1)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '## My Section' in result
+
+    def test_whitespace_in_title_collapsed(self):
+        blocks = [make_text_block('  My   Title  ', role='doc-title')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc)
+        assert 'title: "My Title"' in result
+
+    def test_whitespace_in_description_collapsed(self):
+        blocks = [make_text_block('Summary   with   gaps.', role='doc-abstract')]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert 'description: "Summary with gaps."' in result
+
+    def test_table_whitespace_preserved(self):
+        table_text = '| Col A | Col B |\n| ----- | ----- |'
+        blocks = [make_table_block(table_text)]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert '| Col A | Col B |' in result
+
+
+# ---------------------------------------------------------------------------
+# Output structure
+# ---------------------------------------------------------------------------
+
+
+class TestOutputStructure:
+    def test_result_ends_with_newline(self, minimal_doc):
+        result = ContentMdService.render(minimal_doc, title='T')
+        assert result.endswith('\n')
+
+    def test_empty_pages_list_returns_frontmatter_and_title(self):
+        doc = Document(pages=[])
+        result = ContentMdService.render(doc, title='Empty')
+        assert 'title: "Empty"' in result
+        assert '# Empty' in result
+
+    def test_blocks_separated_by_blank_line(self):
+        blocks = [
+            make_text_block('First paragraph.'),
+            make_text_block('Second paragraph.'),
+        ]
+        doc = make_doc(pages=[make_page(text='', blocks=blocks)])
+        result = ContentMdService.render(doc, title='T')
+        assert 'First paragraph.\n\nSecond paragraph.' in result
+
+    def test_multipage_document_renders_all_pages(self):
+        page1 = make_page(
+            number=1,
+            text='',
+            blocks=[make_text_block('Page one content.')],
+        )
+        page2 = make_page(
+            number=2,
+            text='',
+            blocks=[make_text_block('Page two content.')],
+        )
+        doc = make_doc(pages=[page1, page2])
+        result = ContentMdService.render(doc, title='T')
+        assert 'Page one content.' in result
+        assert 'Page two content.' in result
+
+    def test_render_delegates_from_document_method(self, metadata_doc):
+        via_service = ContentMdService.render(metadata_doc)
+        via_method = metadata_doc.contentmd()
+        assert via_service == via_method
+
+    def test_empty_document_without_args_raises(self):
+        """A document with no metadata, no blocks, no filename, and no user
+        arguments cannot satisfy the required title constraint."""
+        doc = Document(pages=[])
+        with pytest.raises(ValueError, match='no title could be resolved'):
+            ContentMdService.render(doc)
+
+    def test_empty_document_with_title_arg_returns_contentmd(self):
+        """Passing title= explicitly must succeed even when the document is
+        completely empty."""
+        doc = Document(pages=[])
+        result = ContentMdService.render(doc, title='Provided Title')
+        assert 'title: "Provided Title"' in result
+        assert '# Provided Title' in result
+
+    def test_empty_document_with_title_and_description_returns_contentmd(self):
+        """Both title= and description= passed explicitly on an empty document."""
+        doc = Document(pages=[])
+        result = ContentMdService.render(
+            doc, title='My Title', description='My description.'
+        )
+        assert 'title: "My Title"' in result
+        assert 'description: "My description."' in result
+        assert result.endswith('\n')
+
+
+class TestPageSeparators:
+    """Tests for page_separators support in ContentMdService and Document.markdown."""
+
+    def test_contentmd_page_separators_off_by_default(self):
+        page = make_page(number=1, text='', blocks=[make_text_block('Content.')])
+        doc = make_doc(pages=[page])
+        result = ContentMdService.render(doc, title='T')
+        assert '<!-- page:' not in result
+
+    def test_contentmd_page_separators_inserted(self):
+        page = make_page(number=1, text='', blocks=[make_text_block('Content.')])
+        doc = make_doc(pages=[page])
+        result = ContentMdService.render(doc, title='T', page_separators=True)
+        assert '<!-- page: 1 -->' in result
+
+    def test_contentmd_page_separators_multipage(self):
+        page1 = make_page(number=1, text='', blocks=[make_text_block('Page one.')])
+        page2 = make_page(number=2, text='', blocks=[make_text_block('Page two.')])
+        doc = make_doc(pages=[page1, page2])
+        result = ContentMdService.render(doc, title='T', page_separators=True)
+        assert '<!-- page: 1 -->' in result
+        assert '<!-- page: 2 -->' in result
+        # The page 1 separator must come before the page 2 separator
+        assert result.index('<!-- page: 1 -->') < result.index('<!-- page: 2 -->')
+
+    def test_contentmd_page_separators_via_document_method(self):
+        page = make_page(number=3, text='', blocks=[make_text_block('Content.')])
+        doc = make_doc(pages=[page])
+        result = doc.contentmd(title='T', page_separators=True)
+        assert '<!-- page: 3 -->' in result
+
+    def test_markdown_page_separators_off_by_default(self):
+        doc = Document(pages=[Page(number=1, text='Hello world')])
+        result = doc.markdown()
+        assert '<!-- page:' not in result
+
+    def test_markdown_page_separators_inserted(self):
+        doc = Document(pages=[Page(number=1, text='Hello world')])
+        result = doc.markdown(page_separators=True)
+        assert '<!-- page: 1 -->' in result
+
+    def test_markdown_page_separators_multipage(self):
+        doc = Document(
+            pages=[
+                Page(number=1, text='First page'),
+                Page(number=2, text='Second page'),
+            ]
+        )
+        result = doc.markdown(page_separators=True)
+        assert '<!-- page: 1 -->' in result
+        assert '<!-- page: 2 -->' in result
+        assert result.index('<!-- page: 1 -->') < result.index('First page')
+        assert result.index('<!-- page: 2 -->') < result.index('Second page')
+
+    def test_markdown_page_separators_empty_page_still_emits_comment(self):
+        doc = Document(
+            pages=[
+                Page(number=1, text='Content'),
+                Page(number=2, text=''),  # empty page
+            ]
+        )
+        result = doc.markdown(page_separators=True)
+        assert '<!-- page: 1 -->' in result
+        assert '<!-- page: 2 -->' in result