Case Sensitivity Issue: Windows Path Casing Causes Separate Collections in Milvus #233

@kovyfive

Description

When using the claude-context MCP server on Windows with different MCP clients (e.g., Visual Studio Code, Claude Code, Visual Studio 2022), each client creates a separate collection in Milvus for the same codebase directory. This occurs because the collection name generation is case-sensitive and does not normalize Windows path casing.

For example:

  • Visual Studio indexes with path: C:\git\myproject
  • Claude Code searches with path: c:\git\myproject

These two paths refer to the same directory on Windows, whose filesystem is case-insensitive by default, but the MCP server hashes the path string as given. The two casings therefore produce different MD5 hashes and two separate collections:

  • hybrid_code_chunks_abc12345 (for C:\git\myproject)
  • hybrid_code_chunks_def67890 (for c:\git\myproject)

This causes:

  1. Duplicated indexing work and storage
  2. Search results not being found when the path casing differs
  3. Confusion about which collection contains the actual indexed data

Troubleshooting Guide

For MCP Use Cases

Get your MCP logs first

Started Claude Code in debug mode (claude --debug) and observed:

  • Indexing process creating collection hybrid_code_chunks_abc12345
  • Search attempts looking for collection hybrid_code_chunks_def67890
  • Different collection names for what should be the same codebase

What's your MCP Client Setting

{
  "servers": {
    "claude-context": {
      "type": "stdio",
      "command": "npx",
      "args": [
        "@zilliz/claude-context-mcp@latest"
      ],
      "env": {
        "MILVUS_ADDRESS": "127.0.0.1:19530",
        "EMBEDDING_PROVIDER": "Ollama",
        "OLLAMA_HOST": "http://127.0.0.1:11434",
        "OLLAMA_MODEL": "nomic-embed-text"
      }
    }
  }
}

Root Cause

The issue is in packages/core/src/context.ts, in the getCollectionName() method (lines 232-240):

public getCollectionName(codebasePath: string): string {
    const isHybrid = this.getIsHybrid();
    const normalizedPath = path.resolve(codebasePath);  // ← Preserves original casing!
    const hash = crypto.createHash('md5').update(normalizedPath).digest('hex');
    const prefix = isHybrid === true ? 'hybrid_code_chunks' : 'code_chunks';
    return `${prefix}_${hash.substring(0, 8)}`;
}

The path.resolve() function preserves the original casing of the drive letter and path, so:

  • C:\git\myproject produces one MD5 hash
  • c:\git\myproject produces a different MD5 hash
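
The mismatch can be reproduced outside the MCP server with a small script. This is a minimal sketch, not the project's code: collectionNameFor is a hypothetical standalone copy of the hashing logic above, and it uses path.win32 so that the Windows behaviour can be observed on any host OS.

```typescript
import * as crypto from "node:crypto";
import * as path from "node:path";

// Hypothetical standalone copy of the hashing logic, for illustration only.
// path.win32 applies Windows path rules regardless of the host OS.
function collectionNameFor(codebasePath: string, isHybrid = true): string {
    const resolved = path.win32.resolve(codebasePath); // casing is preserved
    const hash = crypto.createHash("md5").update(resolved).digest("hex");
    const prefix = isHybrid ? "hybrid_code_chunks" : "code_chunks";
    return `${prefix}_${hash.substring(0, 8)}`;
}

const upper = collectionNameFor("C:\\git\\myproject");
const lower = collectionNameFor("c:\\git\\myproject");

// The inputs differ only in drive-letter casing, yet the names do not match.
console.log(upper);
console.log(lower);
console.log("same collection:", upper === lower);
```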

Suggested Solution

Normalize the path casing on Windows before generating the hash:

public getCollectionName(codebasePath: string): string {
    const isHybrid = this.getIsHybrid();
    let normalizedPath = path.resolve(codebasePath);
    
    // Normalize case on Windows for consistent hashing across different clients
    if (process.platform === 'win32') {
        normalizedPath = normalizedPath.toLowerCase();
    }
    
    const hash = crypto.createHash('md5').update(normalizedPath).digest('hex');
    const prefix = isHybrid === true ? 'hybrid_code_chunks' : 'code_chunks';
    return `${prefix}_${hash.substring(0, 8)}`;
}

This ensures that regardless of which MCP client is used or how the path is specified, the same collection name is generated for the same Windows directory.
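
To check the proposed behaviour in isolation, the fix can be written as a free function. The sketch below is an assumption-laden illustration: normalizedCollectionName and its platform parameter are hypothetical, and the parameter exists only so that both branches can be exercised on any host OS.

```typescript
import * as crypto from "node:crypto";
import * as path from "node:path";

// Hypothetical standalone variant of the proposed fix. In the real method the
// platform would come from process.platform rather than from a parameter.
function normalizedCollectionName(
    codebasePath: string,
    platform: string = process.platform,
    isHybrid = true,
): string {
    const pathApi = platform === "win32" ? path.win32 : path.posix;
    let normalizedPath = pathApi.resolve(codebasePath);

    // Lowercase only on Windows, where casing does not distinguish directories.
    if (platform === "win32") {
        normalizedPath = normalizedPath.toLowerCase();
    }

    const hash = crypto.createHash("md5").update(normalizedPath).digest("hex");
    const prefix = isHybrid ? "hybrid_code_chunks" : "code_chunks";
    return `${prefix}_${hash.substring(0, 8)}`;
}

// Both casings now map to one collection name on Windows.
console.log(
    normalizedCollectionName("C:\\git\\myproject", "win32") ===
        normalizedCollectionName("c:\\git\\myproject", "win32"),
);
```

Paths on other platforms are left untouched, so two directories that differ only by case on a case-sensitive filesystem still get distinct collections.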

Additional Considerations

The same normalization should be applied in:

  1. packages/core/src/sync/synchronizer.ts - getSnapshotPath() method (lines 23-29) for Merkle tree snapshot consistency
  2. Any other places where paths are hashed for unique identification
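
One way to keep these call sites consistent is to route every path hash through a single helper. The following is only a sketch: hashPathForId, collectionSuffix, and snapshotFileName are hypothetical names, and the snapshot file naming shown here is an assumption rather than what synchronizer.ts actually does.

```typescript
import * as crypto from "node:crypto";
import * as path from "node:path";

// Hypothetical helper (not in the current codebase): the single place that
// decides how a path is canonicalised before it is hashed into an identifier.
function hashPathForId(targetPath: string, platform: string = process.platform): string {
    const isWindows = platform === "win32";
    const resolved = isWindows
        ? path.win32.resolve(targetPath)
        : path.posix.resolve(targetPath);
    const canonical = isWindows ? resolved.toLowerCase() : resolved;
    return crypto.createHash("md5").update(canonical).digest("hex");
}

// Hypothetical callers; the real naming schemes may differ.
function collectionSuffix(codebasePath: string, platform?: string): string {
    return hashPathForId(codebasePath, platform).substring(0, 8);
}

function snapshotFileName(codebasePath: string, platform?: string): string {
    return `${hashPathForId(codebasePath, platform)}.json`;
}
```

With one helper, the collection name and the snapshot path cannot drift apart if the normalization rule changes later.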

Other Information

Whether you can reproduce the error

Yes, this is consistently reproducible:

  1. Index a codebase using one MCP client with uppercase drive letter (e.g., C:\git\myproject)
  2. Try to search the same codebase using another MCP client with lowercase drive letter (e.g., c:\git\myproject)
  3. Search returns no results because it's looking in a different collection

Screenshots

N/A - Issue identified through log analysis and code inspection

Software version:

  • OS: Windows 11
  • Node.js: v20.x
  • Package: @zilliz/claude-context-mcp@latest

Additional context

This is a Windows-specific issue due to:

  • The Windows filesystem being case-insensitive by default (both C:\ and c:\ work)
  • Different applications/terminals using different default casing for drive letters
  • The MCP server treating these as different paths for hashing purposes

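Linux is generally not affected, since its filesystems are case-sensitive and paths that differ in casing refer to different directories. macOS has no drive letters, so this particular mismatch does not arise there, although its default filesystem is also case-insensitive.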
