Skip to content

fix(xml): reject invalid XML element and attribute names#536

Open
lawrence3699 wants to merge 1 commit intoTomWright:masterfrom
lawrence3699:fix/xml-invalid-element-names
Open

fix(xml): reject invalid XML element and attribute names#536
lawrence3699 wants to merge 1 commit intoTomWright:masterfrom
lawrence3699:fix/xml-invalid-element-names

Conversation

@lawrence3699
Copy link
Copy Markdown

Summary

Problem

When converting from other formats (e.g., JSON) to XML, keys containing characters that are not valid in XML names (such as <, >, &, spaces) are passed directly to Go's encoding/xml encoder. The encoder does not validate element names, so it produces invalid XML that cannot be parsed back.

$ echo '{"<": "<", ">": ">", "&": "&"}' | dasel -i json -o xml
<<>&lt;</<>
<>>&gt;</>>
<&>&amp;</&>

This output is not valid XML. Even dasel itself cannot parse it:

$ echo '{"<": "<"}' | dasel -i json -o xml | dasel -i xml
Error: could not unmarshal data: XML syntax error ...

Solution

Add XML Name validation per the XML 1.0 (Fifth Edition) specification §2.3. The toElement function now checks element names, and extractAttrsAndText checks attribute names, before they are used. Invalid names produce a clear error:

$ echo '{"<": "<"}' | dasel -i json -o xml
dasel: error: ... key "<" is not a valid XML element name

Test plan

  • 3 integration tests: <, &, and space in element names all return errors
  • Unit test for isValidXMLName covering valid names (foo, _bar, ns:local, über, etc.) and invalid names (<, >, &, foo bar, 1abc, etc.)
  • Existing XML round-trip tests all pass (no regression)
  • Full go test ./... passes
  • go vet clean

Previously, keys containing characters not valid in XML names (e.g. <, >, &,
spaces) were passed directly to Go's encoding/xml encoder, producing invalid
XML that could not be parsed back.

Validate element and attribute names against the XML 1.0 (Fifth Edition)
specification before writing. Return a clear error for names containing
characters not permitted by the spec.

Fixes TomWright#425
Copilot AI review requested due to automatic review settings April 12, 2026 18:13
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds XML 1.0 (Fifth Edition) name validation to prevent emitting malformed XML when map keys contain invalid characters, returning an explicit error instead.

Changes:

  • Introduces XML Name/NameChar validation helpers and validates element names before encoding.
  • Validates attribute names (after stripping the - prefix) before encoding attributes.
  • Adds tests covering invalid element names and unit tests for isValidXMLName.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
parsing/xml/writer.go Adds XML name validation for elements/attributes and returns errors for invalid names.
parsing/xml/writer_test.go Adds integration tests asserting invalid element names cause write errors.
parsing/xml/writer_internal_test.go Adds unit tests for the XML name validator helper.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread parsing/xml/writer.go
if strings.HasPrefix(kv.Key, "-") {
attrName := kv.Key[1:]
if !isValidXMLName(attrName) {
return fmt.Errorf("key %q is not a valid XML attribute name", attrName)
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attribute-name error message reports the stripped attribute name as a “key” (e.g. input key "-lang" would produce an error about key "lang"). That can be confusing for users debugging map keys. Consider wording the message as an attribute name error and/or include the original map key (including the leading "-") in the error details.

Suggested change
return fmt.Errorf("key %q is not a valid XML attribute name", attrName)
return fmt.Errorf("invalid XML attribute name %q from map key %q", attrName, kv.Key)

Copilot uses AI. Check for mistakes.
Comment thread parsing/xml/writer.go
Comment on lines 200 to +204
if strings.HasPrefix(kv.Key, "-") {
attrName := kv.Key[1:]
if !isValidXMLName(attrName) {
return fmt.Errorf("key %q is not a valid XML attribute name", attrName)
}
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior rejects invalid attribute names, but there isn’t a test that exercises this error path (e.g. a map containing an attribute key like "-<" or "-foo bar"). Adding an integration test for invalid attribute names would help prevent regressions and ensure the writer fails with a clear error when encountering invalid attribute keys.

Copilot uses AI. Check for mistakes.
Comment on lines +154 to +166
t.Run("invalid element name", func(t *testing.T) {
w, err := xml.XML.NewWriter(parsing.DefaultWriterOptions())
if err != nil {
t.Fatalf("Unexpected error: %s", err)
}

toEncode := model.NewMapValue()
_ = toEncode.SetMapKey("<", model.NewStringValue("value"))
_, err = w.Write(toEncode)
if err == nil {
t.Fatal("Expected error for invalid XML element name, got nil")
}
})
Copy link

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests only assert that an error occurs, but they don’t verify the error message includes the offending key and indicates it’s an element-name validation failure. Since the PR’s goal is to return a clear error for invalid XML names, asserting on key/message content would help keep the error user-friendly over time.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect escaping of XML special characters

2 participants