path: root/Userland/Libraries/LibPDF
Age | Commit message | Author
2023-05-19LibPDF: Avoid unnecessary HashMap copy, mark other copiesBen Wiederhake
2023-05-12LibGfx+Fuzz: Convert ImageDecoder::initialize to ErrorOrBen Wiederhake
This prevents callers from accidentally discarding the result of initialize(), which was the root cause of this OSS Fuzz bug: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=55896&q=label%3AProj-serenity&sort=summary
2023-04-12Everywhere: Fix a few typosNico Weber
Some even user-visible!
2023-04-09Everywhere: Remove unused DeprecatedString includesBen Wiederhake
2023-03-25LibPDF: Load replacements for TrueTypeFonts without an embedded fontJulian Offenhäuser
This previously only happened for Type 1 fonts.
2023-03-25LibPDF: Actually return an error when failing to load replacement fontsJulian Offenhäuser
2023-03-25LibPDF: Ask OpenType font programs for glyph widths if neededJulian Offenhäuser
If the font dictionary didn't specify custom glyph widths, we would fall back to the specified "missing width" (or 0 in most cases!), which meant that we would draw glyphs on top of each other in a lot of cases, namely for TrueTypeFonts or standard Type1Fonts with an OpenType fallback. What we actually want to do in this case is ask the OpenType font for the correct width.
2023-03-25LibPDF: Remove the subroutine length limit for PS1 font programsJulian Offenhäuser
A limit of 1024 subroutines seemed like a sensible choice, but some fonts actually do exceed it. We will now only assert that the specified amount is positive.
2023-03-25LibPDF: Scale vector paths with the viewJulian Offenhäuser
This ensures that lines have the correct size at every scale factor.
2023-03-25LibPDF: Accept floats as line dash pattern phasesJulian Offenhäuser
2023-03-25LibPDF: Allow the page rotation to be inheritedJulian Offenhäuser
2023-03-25LibPDF: Allow pages with no specified contentsJulian Offenhäuser
The contents object may be omitted as per spec, which will just leave the page blank.
2023-03-25LibPDF: Allow optional inheritable page attributesJulian Offenhäuser
Previously, get_inheritable_object would always try to find the object and throw an error if it couldn't. The spec tells us that some page attributes, like CropBox, are optional but also inheritable. Others, like the media box and resources, are technically required by the spec, but omitted by some documents. In both cases, we are now able to search for inheritable objects and find a suitable replacement if there wasn't one.
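The lookup described above can be sketched as a walk up the page tree that returns an empty optional instead of an error. This is a standalone illustration, not LibPDF's code: `PageTreeNode` and `find_inheritable` are hypothetical names, and attributes are simplified to strings.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a page tree node; real PDF objects are richer.
struct PageTreeNode {
    std::map<std::string, std::string> attributes;
    PageTreeNode const* parent = nullptr;
};

// Search the node and then its ancestors for `key`.
// A missing attribute yields an empty optional rather than an error.
std::optional<std::string> find_inheritable(PageTreeNode const& node, std::string const& key)
{
    for (auto const* current = &node; current; current = current->parent) {
        auto it = current->attributes.find(key);
        if (it != current->attributes.end())
            return it->second;
    }
    return std::nullopt;
}
```

A caller can then pick its own fallback for a missing attribute, for example with `value_or()`.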
2023-03-25LibPDF: Ignore whitespace in the ASCII hex filterJulian Offenhäuser
The spec tells us that any amount of whitespace may appear between the hex digits and that it should just be ignored.
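A minimal sketch of whitespace-tolerant ASCII hex decoding, written against the standard library rather than LibPDF's types. The name `decode_ascii_hex` is hypothetical. Treating `>` as the end-of-data marker and padding an odd trailing digit with 0 follows the PDF specification's description of ASCIIHexDecode.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// The six PDF whitespace characters: NUL, TAB, LF, FF, CR, SPACE.
static bool is_pdf_whitespace(char c)
{
    return c == '\0' || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: returns the decoded bytes, or nullopt on an invalid character.
std::optional<std::string> decode_ascii_hex(std::string_view input)
{
    std::string output;
    int pending = -1; // high nibble waiting for its partner, or -1 if none
    for (char c : input) {
        if (c == '>')
            break; // end-of-data marker
        if (is_pdf_whitespace(c))
            continue; // whitespace may appear anywhere and is ignored
        int value = hex_value(c);
        if (value < 0)
            return std::nullopt;
        if (pending < 0) {
            pending = value;
        } else {
            output.push_back(static_cast<char>((pending << 4) | value));
            pending = -1;
        }
    }
    if (pending >= 0) // odd digit count: behave as if a 0 followed
        output.push_back(static_cast<char>(pending << 4));
    return output;
}
```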
2023-03-22LibPDF: Pass the right point width to the font loader in TrueTypeFontJulian Offenhäuser
2023-03-22LibPDF: Fix navigate_to_before_eof_marker() for PDFs not ending in EOLJulian Offenhäuser
The way this was factored before, we would miss the %%EOF marker if it didn't have a valid end-of-line sequence after it.
2023-03-22LibPDF: Don't consume anything other than EOL in Reader::consume_eol()Julian Offenhäuser
This was previously a slightly confusing API. Even when there was no EOL marker at the current location, we would still consume one byte. It will now consume either EOL or nothing at all.
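The "EOL or nothing" contract can be sketched as follows. `MiniReader` is a hypothetical stand-in, not LibPDF's `Reader`. It accepts CR LF, a lone CR, or a lone LF, and returns whether anything was consumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical reader over a byte buffer.
struct MiniReader {
    std::string_view bytes;
    std::size_t offset = 0;

    // Consumes exactly one end-of-line sequence if one starts at `offset`.
    // Leaves `offset` untouched otherwise.
    bool consume_eol()
    {
        if (offset >= bytes.size())
            return false;
        if (bytes[offset] == '\r') {
            ++offset;
            if (offset < bytes.size() && bytes[offset] == '\n')
                ++offset;
            return true;
        }
        if (bytes[offset] == '\n') {
            ++offset;
            return true;
        }
        return false;
    }
};
```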
2023-03-22LibPDF: Be more cautious of errors when looking for linearization dictJulian Offenhäuser
We would previously assume that, following the header, there must be a valid PDF object that could be a linearization dict. However, if the file is not linearized, this is not necessarily true. We now try to detect if there even is an object, and don't treat parsing errors as fatal.
2023-03-22LibPDF: Don't treat a broken document header as a fatal errorJulian Offenhäuser
As the current goal is to make our best effort loading documents, we might as well ignore a broken header and power through, giving the user a warning.
2023-03-21LibGfx: Move all image loaders and writers to a subdirectoryLucas CHOLLET
2023-03-06Everywhere: Remove NonnullRefPtr.h includesAndreas Kling
2023-03-06Everywhere: Stop using NonnullRefPtrVectorAndreas Kling
This class had slightly confusing semantics and the added weirdness doesn't seem worth it just so we can say "." instead of "->" when iterating over a vector of NNRPs. This patch replaces NonnullRefPtrVector<T> with Vector<NNRP<T>>.
2023-03-02LibPDF: Detect CFF encodings with supplementsRodrigo Tobar
These are not yet actually parsed, but detecting them means we at least don't fail to understand the *actual* format value, which was causing some CFF fonts to fail to load.
2023-03-02LibPDF: Increase argument stack for Type1FontProgramsRodrigo Tobar
Type1 imposes a stack limit of 24 elements, but Type2 has a limit of 48. We are better off relaxing the limit of the former in favour of properly supporting the latter.
2023-03-02LibPDF: Improve Type2 hint countingRodrigo Tobar
There were two issues with how we counted hints with Type2 CharString commands: the first was that we assumed a single hint per command, even though there are commands that accept multiple hints thanks to taking a variable number of operands; and secondly, the hintmask/ctrlmask commands can also take operands (i.e., hints) themselves in certain situations. This commit fixes these two issues by correctly counting hints in both cases. This in turn fixes cases when there were more than 8 hints in total, therefore a hintmask/ctrlmask command needed to read more than one byte past the operator itself.
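The arithmetic behind this fix can be sketched briefly. In Type2 charstrings each stem hint is an operand pair (an odd leading operand is the glyph width), and a hintmask carries one bit per hint, rounded up to whole bytes. The function names below are hypothetical and do not come from LibPDF.

```cpp
#include <cassert>
#include <cstddef>

// A stem operator with N operands declares N / 2 hints;
// an odd leftover operand is the optional width, not a hint.
constexpr std::size_t hints_from_operands(std::size_t operand_count)
{
    return operand_count / 2;
}

// hintmask/cntrmask data: one bit per declared hint, rounded up to full bytes.
constexpr std::size_t hintmask_byte_count(std::size_t total_hints)
{
    return (total_hints + 7) / 8;
}
```

With 9 hints the mask needs 2 bytes, which is the case the commit message says was previously misread.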
2023-03-02LibPDF: Don't crash when a font hasn't been loaded yetRodrigo Tobar
This could happen because there was a problem while loading the first font in the document.
2023-03-02LibPDF: Prevent crashes when loading XObject streamsRodrigo Tobar
These streams might need a Filter that isn't implemented yet, and thus cannot be blindly MUST()-ed.
2023-03-02LibPDF: Improve error support for Filter classRodrigo Tobar
The Filter class had a few TODO()s that resulted in crashes at runtime. Since we now have a better way to report errors back to the user, let's use that instead.
2023-02-26LibGfx: Return bool not ErrorOr<bool> from ImageDecoderPlugin::sniff()MacDue
Nobody made use of the ErrorOr return value, and it just added more chance of confusion, since it was not clear whether failing to sniff an image should return an error or false. The answer was false; returning an Error would crash the ImageDecoder.
2023-02-24LibPDF: Refactor *Font classesRodrigo Tobar
The PDFFont class hierarchy was very simple (a top-level PDFFont class, followed by all the children classes that derived directly from it). While this design was good enough for some things, it didn't correctly model the actual organization of font types:
* PDF fonts are first divided between "simple" and "composite" fonts. The latter is the Type0 font, while the rest are all simple.
* PDF fonts yield a glyph per "character code". Simple fonts' char codes are always 1 byte long, while Type0 char codes are of variable size.
To this effect, this commit changes the hierarchy of Font classes, introducing a new SimpleFont class, deriving from PDFFont, and acting as the parent of Type1Font and TrueTypeFont, while Type0 still derives from PDFFont directly. This distinction allows us now to:
* Model string rendering differently for simple and composite fonts: PDFFont now offers a generic draw_string method that takes a whole string to be rendered instead of a single char code. SimpleFont implements this as a loop over individual bytes of the string, with T1 and TT implementing draw_glyph for drawing a single char code.
* Some common fields between T1 and TT fonts now live under SimpleFont instead of under PDFFont, where they previously resided.
* Some other interfaces specific to SimpleFont have been cleaned up, with u16/u32 not appearing on these classes (or in PDFFont) anymore.
* Type0Font's rendering still remains unimplemented.
As part of this exercise I also took the chance to perform the following cleanups and restructurings:
* Refactored the creation and initialisation of fonts. They are all centrally created at PDFFont::create, with a virtual "initialize" method that allows them to initialise their inner members in the correct order (parent first, child later) after creation.
* Removed duplicated code.
* Cleaned up some public interfaces: receive const refs, removed unnecessary ctors/dtors, etc.
* Slightly changed how Type1 and TrueType fonts are implemented: if there's an embedded font, that takes priority; otherwise we always look for a replacement.
* This means we don't do anything special for the standard fonts. The only behavior previously associated with standard fonts was choosing an encoding, and even that was in question.
2023-02-24LibPDF: Add new error construction functionsRodrigo Tobar
These should make it easier to create specific errors, especially when creating a formatted message.
2023-02-24LibPDF: Allow show_text to return errorsRodrigo Tobar
Errors can (and do) occur when trying to render text, and so far we've silently ignored them, making us think that all is well when it isn't. Letting show_text return errors will allow us to inform the user about these errors instead of hiding them.
2023-02-21LibPDF: Make Object::cast<T>() non-constAndreas Kling
This was only ever used to cast non-const objects to other non-const object types.
2023-02-19LibTextCodec+Everywhere: Port Decoders to new StringsSam Atkins
2023-02-18LibGfx: Rename `JPGLoader` to `JPEGLoader`Lucas CHOLLET
The patch also renames several classes, functions, and files related to `JPGLoader`. Renames include:
- JPGLoader{.h, .cpp}
- JPGImageDecoderPlugin
- JPGLoadingContext
- JPG_DEBUG
- decode_jpg
- FuzzJPGLoader.cpp
- A few string literals and texts
2023-02-15LibTextCodec+Everywhere: Make TextCodec::decoder_for() take a StringViewSam Atkins
We don't need a full String/DeprecatedString inside this function, so we might as well not force users to create one.
2023-02-13LibPDF: Add more built-in SIDsRodrigo Tobar
The first iteration has enough SIDs to display simple documents, but when trying more and more documents we started to need more of these SIDs to be properly defined. This is a copy/paste exercise from the CFF document, which is tedious, so it will continue in small drops. This commit fills all the gaps until SID 228, which covers all the ISOAdobe space, and should be enough for most use cases. Since this is a continuous space starting at 0, we now use an Array instead of a Map to store these names, which should be more performant. Also to simplify things I've moved the Array out of the CFF class, making it a simpler static variable, which allows us to use template type deduction.
2023-02-12LibPDF: Check for end of stream in Reader::matches_regular_character()Julian Offenhäuser
The way this was set up before, this function would return "true" if the underlying stream had ended, which would cause us to try to read past the end in some edge cases.
2023-02-12LibPDF: Return an error if we fail to load a replacement fontJulian Offenhäuser
2023-02-12LibPDF: Allow filter DecodeParms array entries to be nullJulian Offenhäuser
Filters will use the default values in this case.
2023-02-12LibPDF: Allow reading documents with incremental updatesJulian Offenhäuser
The PDF spec allows incremental changes of a document by appending a new XRef table and file trailer to it. These will only contain the changed objects and will point back to the previous change, forming an arbitrarily long chain of XRef sections and file trailers. Every one of those XRef sections may be encoded as an XRef stream as well, in which case the trailer is part of the stream dictionary as usual. To make this easier, I made it so every XRef table may "own" a trailer. This means that the main file trailer is now part of the main XRef table.
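The chain of XRef sections can be sketched as a walk from the newest section back through each "previous" pointer, where an entry from a newer section wins over an older one. Everything here is a hypothetical simplification: `XRefSection`, `merge_xref_chain`, and the `sections_by_offset` map, which stands in for parsing a section at a given file offset.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <set>

// Hypothetical, simplified XRef section.
struct XRefSection {
    std::map<unsigned, std::size_t> object_offsets; // object number -> byte offset
    std::optional<std::size_t> previous_section;    // file offset of the older section, if any
};

// Walks the chain starting at the newest section and merges all entries.
std::map<unsigned, std::size_t> merge_xref_chain(
    std::map<std::size_t, XRefSection> const& sections_by_offset,
    std::size_t newest_offset)
{
    std::map<unsigned, std::size_t> merged;
    std::set<std::size_t> visited; // guards against a cyclic chain in a broken file
    std::optional<std::size_t> current = newest_offset;
    while (current && visited.insert(*current).second) {
        auto section = sections_by_offset.find(*current);
        if (section == sections_by_offset.end())
            break;
        for (auto const& [object_number, offset] : section->second.object_offsets)
            merged.emplace(object_number, offset); // emplace keeps the newer entry
        current = section->second.previous_section;
    }
    return merged;
}
```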
2023-02-10LibPDF: Fix glyph sizing bug that caused incorrect spacingJulian Offenhäuser
When loading OpenType fonts, either as a replacement for the standard 14 fonts or an embedded one, we previously passed the font size as the _point_ size to the loader class. The difference is subtle: Gfx::ScaledFont uses the optional dpi parameter to convert a point size to pixels. This meant that our glyphs were scaled up by a factor of about 1.333, causing them to overlap in places.
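The usual point-to-pixel conversion is pixels = points * dpi / 72. The sketch below assumes a default of 96 dpi, which the commit message does not state. Under that assumption, a size wrongly treated as points grows by 96/72, roughly 1.333. `points_to_pixels` is a hypothetical name, not a LibGfx function.

```cpp
#include <cassert>

// Converts a point size to pixels. The 96 dpi default is an assumption.
constexpr float points_to_pixels(float point_size, float dpi = 96.0f)
{
    return point_size * dpi / 72.0f; // 72 points per inch
}
```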
2023-02-10LibPDF: Use more appropriate standard 14 replacement fontsJulian Offenhäuser
The mapping of standard font to replacement now looks like this:
Times New Roman -> Liberation Serif
Courier -> Liberation Mono
Helvetica, Arial -> Liberation Sans
2023-02-08Everywhere: Use ReadonlySpan<T> instead of Span<T const>MacDue
2023-02-08LibPDF: Construct accented characters with Type1 seac commandRodrigo Tobar
The seac command provides the base and accent characters that are needed to create an accented character glyph. Storing these values is all that was left to properly support these composed glyphs.
2023-02-08LibPDF: Add infrastructure for accented character glyphsRodrigo Tobar
Type1 accented character glyphs are composed of two other glyphs in the same font: a base glyph and an accent glyph, given as char codes in the standard encoding. These two glyphs are then composed together to form the accented character. This commit adds the data structures to hold the information for accented characters, and also the routine that composes the final glyph path out of the two individual components. All glyphs must have been loaded by the time this composition takes place, and thus a new protected consolidate_glyphs() routine has been added to perform this calculation.
2023-02-08LibPDF: Turn Glyph into a classRodrigo Tobar
Glyph was a simple structure, but even now it has become more complex than it was initially. Turning it into a class hides some of that complexity and makes it easier for external eyes to understand. While doing this I also decided to remove the float + bool combo for keeping track of the glyph's width, and replaced it with an Optional instead.
2023-02-08LibPDF: Index Type1 glyphs by name, not char codeRodrigo Tobar
Storing glyphs indexed by char code in a Type1 Font Program binds a Font Program instance to the particular Encoding that was used at Font Program construction time. This makes it difficult to reuse Font Program instances against different Encodings, which would be otherwise possible. This commit changes how we store the glyphs on Type1 Font Programs. Instead of storing them on a map indexed by char code, the map is now indexed by glyph name. In turn, when rendering a glyph we use the Encoding object to turn the char code into a glyph name, which in turn is used to index into the map of glyphs. This is the first step towards reusability of Type1 Font Programs. It also unlocks the ability to render glyphs that are described via the "seac" command (standard encoding accented character), which requires accessing the base and accent glyphs by name.
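The two-step lookup can be sketched with plain maps: the encoding turns a char code into a glyph name, and the font program is indexed by that name. All types and names here are hypothetical simplifications of the classes the commit describes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

using Encoding = std::map<std::uint8_t, std::string>; // char code -> glyph name
struct Glyph { float width; };                        // hypothetical, heavily simplified
using FontProgram = std::map<std::string, Glyph>;     // glyph name -> glyph

// Resolve a char code through the encoding, then look the name up in the program.
std::optional<Glyph> lookup_glyph(FontProgram const& program, Encoding const& encoding, std::uint8_t char_code)
{
    auto name = encoding.find(char_code);
    if (name == encoding.end())
        return std::nullopt;
    auto glyph = program.find(name->second);
    if (glyph == program.end())
        return std::nullopt;
    return glyph->second;
}
```

Because the program no longer knows about char codes, the same `FontProgram` value can be paired with different `Encoding` values, which is the reusability the commit message is aiming for.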
2023-02-08LibPDF: Add placeholders for *flex Type2 commandsRodrigo Tobar
These should be implemented properly in the future, but for now we are adding them as placeholders to avoid crashes.
2023-02-08LibPDF: Add char_code -> name mapping functionRodrigo Tobar
We already keep both mappings internally; now it's time to actually use them.