Age | Commit message (Collapse) | Author |
|
|
|
|
|
Strings can be encoded in either UTF16-BE or UTF8. In either case,
there are a few initial bytes which specify the encoding that must
be checked and also removed from the final string.
|
|
|
|
IndirectValueRef is so simple that it can be stored directly in the
Value class instead of being heap allocated.
As the comment in Value says, however, in theory the max bits needed to
store is 48 (16 for the generation index and 32(?) for the object
index), but 32 should be good enough for now. We can increase it to u64
later if necessary.
|
|
This commit only supports the three most basic color spaces:
DeviceGray, DeviceRGB, and DeviceCMYK
|
|
This completely ignores the actual path and just uses its bounding box,
since our painter doesn't support clipping to paths.
|
|
|
|
This commit also splits up StreamObject into PlainTextStreamObject and
EncodedStreamObject, which is essentially just a stream object which
does not own its bytes vs one which does.
|
|
|
|
|
|
|
|
|
|
We can't call resolve_to with IndirectValue{,Ref}, since the values
will obviously be resolved, and will not give us the object of the
correct type.
|
|
This commit adds the Renderer class, which is responsible for rendering
a page into a Gfx::Bitmap. There are many improvements to make here,
but this is a great start!
|
|
|
|
|
|
Some PDFs omit this key apparently, but Firefox opens them fine. Let's
emulate that behavior.
|
|
The Parser will need to call resolve_to on certain values.
|
|
We now follow nested page tree nodes to find all of the actual
page dicts, whereas previously we just assumed the root level
page tree node contained all of the page children directly.
|
|
This commit introduces the ability to parse the document catalog dict,
as well as the page tree and individual pages. Pages obviously aren't
fully parsed, as we won't care about most of the fields until we
start actually rendering PDFs.
One of the primary benefits of the PDF format is laziness. PDFs are
not meant to be parsed all at once, and the same is true for pages.
When a Document is constructed, it builds a map of page number to
object index, but it does not fetch and parse any of the pages. A page
is only parsed when a caller requests that particular page (and is
cached going forwards).
Additionally, this commit also adds an object_cast function which
logs bad casts if DEBUG_PDF is set. Additionally, utility functions
were added to ArrayObject and DictObject to get all types of objects
from the collections to avoid having to manually cast.
|
|
This commit adds a parser as well as the Reader class, which serves
as a utility to aid in reading the PDF both forwards and in reverse.
The parser currently is capable of reading xref tables, as well as
all values. We don't really do anything with any of this information,
however.
|
|
This commit is the start of LibPDF, and introduces some basic structure
objects. This emulates LibJS's Value structure, where Value is a simple
class that can contain a pointer to a more complex Object class with
more data. All of the basic PDF objects have a representation.
|