Skip to content

[3.13] gh-128110: Fix rfc2047 whitespace handling in email parser address headers (GH-130749)#149789

Open
miss-islington wants to merge 1 commit into
python:3.13from
miss-islington:backport-7a4c6df-3.13
Open

[3.13] gh-128110: Fix rfc2047 whitespace handling in email parser address headers (GH-130749)#149789
miss-islington wants to merge 1 commit into
python:3.13from
miss-islington:backport-7a4c6df-3.13

Conversation

@miss-islington
Copy link
Copy Markdown
Contributor

@miss-islington miss-islington commented May 13, 2026

RFC 2047 Section 6.2 requires that "any 'linear-white-space' that
separates a pair of adjacent 'encoded-word's is ignored." The modern
header value parser correctly implements that for unstructured headers,
but had missed a case in structured headers. This could cause a parsed
address header to include extraneous spaces in a display-name.

Switch to @bitdancer's fix from review feedback. Recharacterize space
between ews as fws after parsing in get_phrase.

RDM: This fix is dependent on the fact that "subsequent" atoms will never have
leading whitespace because that's been consumed already. I don't think
it's worth adding extra code for the possibility of leading whitespace
because the parser won't produce it. It's a bit of parser fragility in the
face of code changes, but I think that's a minor concern given the
parser design (which is that it consumes whitespace greedily)
(cherry picked from commit 7a4c6df)

Co-authored-by: Mike Edmunds medmunds@gmail.com
Co-authored-by: R David Murray rdmurray@bitdance.com

…ess headers (pythonGH-130749)

RFC 2047 Section 6.2 requires that "any 'linear-white-space' that
separates a pair of adjacent 'encoded-word's is ignored." The modern
header value parser correctly implements that for unstructured headers,
but had missed a case in structured headers. This could cause a parsed
address header to include extraneous spaces in a display-name.

Switch to @bitdancer's fix from review feedback. Recharacterize space
between ews as fws after parsing in get_phrase.

RDM: This fix is dependent on the fact that "subsequent" atoms will never have
leading whitespace because that's been consumed already. I don't think
it's worth adding extra code for the possibility of leading whitespace
because the parser won't produce it. It's a bit of parser fragility in the
face of code changes, but I think that's a minor concern given the
parser design (which is that it consumes whitespace greedily)
(cherry picked from commit 7a4c6df)

Co-authored-by: Mike Edmunds <medmunds@gmail.com>
Co-authored-by: R David Murray <rdmurray@bitdance.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants