SerenityOS build: Wednesday, January 03
Images 💾
- serenity-x86_64-20240103-9495f64.img.gz (Raw image, 206.78 MiB)
Last commit ⭐
commit 9495f64f913107ef746588986edb8de5e4c39db7
Author: Nico Weber <thakis@chromium.org>
AuthorDate: Mon Jan 1 19:31:27 2024 -0500
Commit: Andreas Kling <kling@serenityos.org>
CommitDate: Tue Jan 2 22:13:21 2024 +0100
LibPDF: Improve hex string parsing
A local (non-public) PDF I have lying around contains this in
a page's operator stream:
```
[<00b4003e> 3 <002600480051> 3 <005700550044004f0003> -29
<00330044> 3 <0055> -3 <004e0040> 4 <0003> -29 <004c00560003> -31
<0057004b> 4 <00480003> -37 <0050
>] TJ
```
That is, there's a newline in a hexstring after a character.
This led to `Parser error at offset 5184: Unexpected character`.
The spec says in 3.2.3 String Objects, Hexadecimal Strings:
"""Each pair of hexadecimal digits defines one byte of the string.
White-space characters (such as space, tab, carriage return, line feed,
and form feed) are ignored."""
But we didn't ignore whitespace before or after a character, only
in between the bytes.
The spec also says:
"""If the final digit of a hexadecimal string is missing—that is, if
there is an odd number of digits—the final digit is assumed to be 0."""
In that case, we were skipping the closing `>` twice -- or, more
accurately, we ignored the character after it too. This has been
wrong ever since #6974.
Add a test that fails if either of the two changes isn't present.
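The two spec rules quoted above can be sketched as follows. This is a minimal Python illustration, not LibPDF's actual C++ code: the name `parse_hex_string`, its signature, and its return shape are hypothetical, and the whitespace set is limited to the five characters the quoted passage lists.

```python
from typing import Tuple

# Whitespace characters named in the quoted spec passage:
# tab, line feed, form feed, carriage return, space.
PDF_WHITESPACE = b"\t\n\x0c\r "
HEX_DIGITS = b"0123456789abcdefABCDEF"


def parse_hex_string(data: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Parse a hex string whose '<' is at data[offset].

    Returns the decoded bytes and the offset just past the closing '>'.
    (Hypothetical helper for illustration only.)
    """
    if data[offset:offset + 1] != b"<":
        raise ValueError(f"expected '<' at offset {offset}")
    pos = offset + 1
    digits = []
    while True:
        if pos >= len(data):
            raise ValueError("unterminated hex string")
        ch = data[pos]
        pos += 1  # consume exactly one character per iteration
        if ch == ord(">"):
            break  # '>' is consumed once, whether the digit count is odd or even
        if ch in PDF_WHITESPACE:
            continue  # whitespace is ignored anywhere, not only between bytes
        if ch not in HEX_DIGITS:
            raise ValueError(f"unexpected character at offset {pos - 1}")
        digits.append(chr(ch))
    if len(digits) % 2 == 1:
        digits.append("0")  # a missing final digit is assumed to be 0
    return bytes.fromhex("".join(digits)), pos
```

With this sketch, `parse_hex_string(b"<0050\n>")` accepts the newline before `>`, and `parse_hex_string(b"<901FA>X")` pads the odd digit and leaves the returned offset pointing at `X` rather than past it.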
Other builds