6 changes: 4 additions & 2 deletions docs/weaviate/manage-collections/inverted-index.mdx
@@ -201,15 +201,17 @@ Tokenization determines how text content is broken down into individual terms th

**`word`** - The default tokenization that splits text on whitespace and punctuation, converting to lowercase. Best for general text search where you want to match individual words.

- **`lowercase`** - Converts the entire property value to lowercase but treats it as a single token. Useful for exact matching of short strings like categories or tags while being case-insensitive.
+ **`lowercase`** - Splits text on whitespace only, then lowercases each token. Preserves symbols (like `&`, `@`, `_`) that `word` tokenization would strip. Good for case-insensitive matching where punctuation is meaningful, e.g. code snippets or email addresses.

**`whitespace`** - Splits text only on whitespace characters, preserving punctuation and case. Good when punctuation is meaningful for search.

**`field`** - Treats the entire property value as a single token without any processing. Use for exact matching of complete field values like IDs, email addresses, or URLs.

**`trigram`** - Breaks text into overlapping 3-character sequences. Enables fuzzy matching and is useful for handling typos or partial matches.

- **`gse`** - Google Search Engine tokenization, optimized for Chinese, Japanese, and Korean text. Provides language-aware tokenization for CJK languages.
+ **`gse`** - Language-aware tokenization for Chinese and Japanese text. Disabled by default. Enable with the `ENABLE_TOKENIZER_GSE` environment variable. For Korean text, see the `kagome_kr` option.

For the full list of supported tokenizers — including `kagome_ja`, `kagome_kr`, and the per-property text-analyzer options — see the [tokenization reference](../config-refs/collections.mdx#tokenization).
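
The behavioral differences between these options can be approximated with a short sketch. This is illustrative only — a rough Python model of the splitting rules described above, not Weaviate's actual tokenizer implementation:

```python
import re

def tokenize(text: str, method: str) -> list[str]:
    """Rough approximation of Weaviate tokenization options (illustrative only)."""
    if method == "word":
        # split on any non-alphanumeric character, lowercase each token
        return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]
    if method == "lowercase":
        # split on whitespace only, then lowercase; symbols like _ and @ survive
        return [t.lower() for t in text.split()]
    if method == "whitespace":
        # split on whitespace only, preserving case and punctuation
        return text.split()
    if method == "field":
        # the entire property value is a single, untouched token
        return [text]
    if method == "trigram":
        # overlapping 3-character windows over the lowercased, de-spaced text
        compact = re.sub(r"\s+", "", text.lower())
        return [compact[i:i + 3] for i in range(len(compact) - 2)]
    raise ValueError(f"unknown tokenization: {method}")

s = "Foo_Bar baz@example.com"
print(tokenize(s, "word"))        # ['foo', 'bar', 'baz', 'example', 'com']
print(tokenize(s, "lowercase"))   # ['foo_bar', 'baz@example.com']
print(tokenize(s, "whitespace"))  # ['Foo_Bar', 'baz@example.com']
print(tokenize(s, "field"))       # ['Foo_Bar baz@example.com']
```

Note how only `lowercase` keeps `baz@example.com` intact while still folding case — the property that makes it suitable for case-insensitive matching of symbol-bearing strings.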

</details>
