Age | Commit message | Author |
|
This basically never tells us anything actionable anyway, and it's a
real annoyance when doing something validation-heavy like profiling.
|
|
This is definitely a bug, but it seems to happen randomly every now
and then and we need more info to track it down, so let's log for now.
|
|
InodeVMObjects can have nulled-out physical page slots. That just means
we haven't cached that page from disk right now.
|
|
The kernel sampling profiler will walk thread stacks during the timer
tick handler. Since it's not safe to trigger page faults during IRQs,
we now avoid this by checking the page tables manually before accessing
each stack location.
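A minimal sketch of the idea, not the actual SerenityOS code: walk saved frame pointers, but consult the page tables before every dereference so an unmapped stack page never triggers a fault inside the IRQ handler. The page_is_present() helper is an assumption standing in for whatever the kernel uses to query a PTE.

    #include <cstddef>
    #include <cstdint>

    bool page_is_present(uintptr_t vaddr); // assumed PTE-presence query

    size_t walk_stack(uintptr_t ebp, uintptr_t* out, size_t max_frames)
    {
        size_t count = 0;
        while (ebp && count < max_frames) {
            // The saved EBP and the return address live in this frame; stop the
            // walk unless those words are actually mapped right now.
            if (!page_is_present(ebp) || !page_is_present(ebp + sizeof(uintptr_t)))
                break;
            auto* frame = reinterpret_cast<uintptr_t*>(ebp);
            out[count++] = frame[1]; // return address
            ebp = frame[0];          // caller's saved frame pointer
        }
        return count;
    }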
|
|
We're not equipped to deal with page faults during an IRQ handler,
so add an assertion so we can immediately tell what's wrong.
This is why profiling sometimes hangs the system -- walking the stack
of the profiled thread causes a page fault and things fall apart.
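A hedged sketch of what such a guard could look like; the flag name and the ASSERT stand-in are assumptions, not the kernel's actual identifiers.

    #include <cassert>
    #define ASSERT assert // stand-in for the kernel's ASSERT macro

    volatile int g_irq_nesting_level = 0; // assumed counter bumped on IRQ entry/exit

    void page_fault_handler(/* RegisterState& regs */)
    {
        // Fail fast: a page fault while servicing an interrupt is a kernel bug.
        ASSERT(g_irq_nesting_level == 0);
        // ...normal fault handling continues here...
    }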
|
|
|
|
This makes Region 4 bytes smaller and we can use bitfield initializers
since they are allowed in C++20. :^)
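For illustration, C++20 allows default member initializers on bit-fields, so boolean flags can be packed into single bits and still be initialized in-class; the field names here are made up.

    // Requires C++20: default member initializers on bit-fields.
    struct RegionFlags {
        bool readable : 1 { false };
        bool writable : 1 { false };
        bool executable : 1 { false };
        bool shared : 1 { false };
    };
    static_assert(sizeof(RegionFlags) == 1, "four 1-bit flags pack into one byte");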
|
|
Anonymous VM objects should never have null entries in their physical
page list. Instead, "empty" or untouched pages should refer to the
shared zero page.
Fixes #1237.
|
|
Suggested by Sergey. The currently running Thread and Process are now
Thread::current and Process::current respectively. :^)
|
|
This is exposed via the non-standard serenity_mmap() call in userspace.
|
|
|
|
|
|
|
|
|
|
|
|
A 16-bit refcount is just begging for trouble right now.
A 32-bit refcount will be begging for trouble later down the line,
so we'll have to revisit this eventually. :^)
|
|
This patch adds a globally shared zero-filled PhysicalPage that will
be mapped into every slot of every zero-filled AnonymousVMObject until
that page is written to, achieving CoW-like zero-filled pages.
Initial testing shows that this doesn't actually achieve any sharing yet
but it seems like a good design regardless, since it may reduce the
number of page faults taken by programs.
If you look at the refcount of MM.shared_zero_page(), it will be quite
high, but that's just because everything maps it everywhere.
If you want to see the "real" refcount, you can build with the
MAP_SHARED_ZERO_PAGE_LAZILY flag, and we'll defer mapping of the shared
zero page until the first non-present (NP) read fault.
I've left this behavior behind a flag for future testing of this code.
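A hypothetical sketch of the scheme, using std::shared_ptr and std::vector as stand-ins for the kernel's RefPtr/Vector: every slot starts out pointing at one shared zero page, and a write fault swaps just that slot for a freshly allocated private page.

    #include <cstddef>
    #include <memory>
    #include <vector>

    struct PhysicalPage { /* ... */ };

    // Assumed global: one physical page full of zeroes, shared by everyone.
    std::shared_ptr<PhysicalPage> g_shared_zero_page = std::make_shared<PhysicalPage>();

    struct AnonymousVMObject {
        std::vector<std::shared_ptr<PhysicalPage>> pages;

        explicit AnonymousVMObject(size_t page_count)
            : pages(page_count, g_shared_zero_page) // every slot shares the zero page
        {
        }

        // Write-fault path: give this slot its own private, zero-filled page.
        void handle_write_fault(size_t page_index)
        {
            if (pages[page_index] == g_shared_zero_page)
                pages[page_index] = std::make_shared<PhysicalPage>();
            // ...then remap the slot read/write and resume the faulting thread.
        }
    };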
|
|
This gets rid of a bunch of inline assembly.
|
|
|
|
This was only used by HashTable::dump() which I used when doing the
first HashTable implementation. Removing this allows us to also remove
most includes of <AK/kstdio.h>.
|
|
|
|
|
|
It doesn't look healthy to create raw references into an array before
a temporary unlock. In fact, that temporary unlock looks generally
unhealthy, but it's a different problem.
|
|
Add PageTableEntry::clear() to zero out a whole PTE, and use that for
unmapping instead of clearing individual fields.
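A minimal sketch, assuming a PTE is backed by a single raw 32-bit word on this target: clearing the whole word at once cannot leave stale flag bits behind the way clearing fields one by one can.

    #include <cstdint>

    class PageTableEntry {
    public:
        void clear() { m_raw = 0; } // wipe the whole PTE in one store
    private:
        uint32_t m_raw { 0 }; // assumed 32-bit PTE layout on this target
    };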
|
|
|
|
We should never end up deallocating an empty range, or a range that
ends before it begins.
|
|
|
|
Previously we were only checking that each of the virtual pages in the
specified range was valid.
This made it possible to pass in negative buffer sizes to some syscalls
as long as (address) and (address+size) were on the same page.
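A hedged sketch of the kind of check this implies: validate the whole [address, address + size) range and reject sizes that wrap around, rather than only asking whether the first and last pages are mapped. The helper name is illustrative.

    #include <cstddef>
    #include <cstdint>

    bool validate_user_range(uintptr_t address, size_t size)
    {
        uintptr_t end = 0;
        if (__builtin_add_overflow(address, size, &end))
            return false; // "negative"/huge sizes wrap past the end of the address space
        // ...then check that every page in [address, end) is mapped and user-accessible...
        return true;
    }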
|
|
Previously it was not possible for this function to fail. You could
exploit this by triggering the creation of a VMObject whose physical
memory range would wrap around the 32-bit limit.
It was quite easy to map kernel memory into userspace and read/write
whatever you wanted in it.
Test: Kernel/bxvga-mmap-kernel-into-userspace.cpp
|
|
Regions *do* zero-fill on demand now. :^)
|
|
It always bothered me that we're using the overloaded "dereference"
term for this. Let's call it "unreference" instead. :^)
|
|
When using dbg() in the kernel, the output is automatically prefixed
with [Process(PID:TID)]. This makes it a lot easier to understand which
thread is generating the output.
This patch also cleans up some common logging messages and removes the
now-unnecessary "dbg() << *current << ..." pattern.
|
|
We don't need this method anymore. It was a hack used in many
components of the system, but we now have better ways to create virtual
memory mappings. To prevent any further use of this method, it's best
to just remove it completely.
Also, the APIC code is disabled for now since it doesn't help booting
the system, and is broken since it relies on identity mapping to exist
in the first 1MB. Any call to the APIC code will result in an
assertion failure.
In addition to that, the method responsible for creating an identity
mapping between 1MB and 2MB was renamed to be more precise about its
purpose.
|
|
This can be used to create a VMObject for a single PhysicalPage.
|
|
There is no real "read protection" on x86, so we have no choice but to
map write-only pages simply as "present & read/write".
If we get a read page fault in a non-readable region, that's still a
correctness issue, so we crash the process. It's by no means a complete
protection against invalid reads, since it's trivial to fool the kernel
by first causing a write fault in the same region.
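A hypothetical sketch of that policy in a fault handler; the type and field names are illustrative rather than the kernel's actual API.

    enum class PageFaultResponse { Continue, ShouldCrash };

    struct Region {
        bool readable { false };
        // ...permissions and backing VMObject elided...
    };

    PageFaultResponse handle_read_fault(Region& region)
    {
        if (!region.readable)
            return PageFaultResponse::ShouldCrash; // invalid read: treat as a program bug
        // ...otherwise service the fault (zero-fill, CoW copy, or page in from disk)...
        return PageFaultResponse::Continue;
    }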
|
|
VirtualAddress is constructible from uintptr_t and const void*.
PhysicalAddress is constructible from uintptr_t but not const void*.
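A small illustration of the asymmetry: virtual addresses are routinely made from pointers, but a pointer value is never a physical address, so PhysicalAddress deliberately lacks a const void* constructor.

    #include <cstdint>

    class VirtualAddress {
    public:
        explicit VirtualAddress(uintptr_t address) : m_address(address) {}
        explicit VirtualAddress(const void* address)
            : m_address(reinterpret_cast<uintptr_t>(address)) {}
    private:
        uintptr_t m_address { 0 };
    };

    class PhysicalAddress {
    public:
        explicit PhysicalAddress(uintptr_t address) : m_address(address) {}
        // No 'const void*' constructor: a pointer value is always a virtual address.
    private:
        uintptr_t m_address { 0 };
    };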
|
|
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
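A tiny illustration, assuming a 4 KiB page size: arithmetic written against uintptr_t keeps working when pointers grow to 64 bits.

    #include <cstdint>

    static_assert(sizeof(uintptr_t) == sizeof(void*), "uintptr_t holds any pointer");

    // Page-aligning an address works the same whether pointers are 32 or 64 bits.
    uintptr_t page_base(const void* ptr, uintptr_t page_size = 4096)
    {
        return reinterpret_cast<uintptr_t>(ptr) & ~(page_size - 1);
    }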
|
|
This made the allocator perform worse, so here's another second off of
the Kernel/Process.cpp compile time from a simple bugfix! (31s to 30s)
|
|
Instead of restoring CR3 to the current process's paging scope when a
ProcessPagingScope goes out of scope, we now restore exactly whatever
the CR3 value was when we created the ProcessPagingScope.
This fixes breakage in situations where a process ends up with nested
ProcessPagingScopes. This was making profiling very fragile, and with
this change it's now possible to profile g++! :^)
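A hypothetical sketch of the RAII shape this describes (the real class takes a Process&; the inline-assembly helpers here are just for completeness): remember the exact CR3 value on entry and write that same value back on exit, which is what makes nesting safe.

    #include <cstdint>

    static inline uint32_t read_cr3()
    {
        uint32_t value;
        asm volatile("mov %%cr3, %0" : "=r"(value)); // i386
        return value;
    }

    static inline void write_cr3(uint32_t value)
    {
        asm volatile("mov %0, %%cr3" ::"r"(value) : "memory");
    }

    class ProcessPagingScope {
    public:
        explicit ProcessPagingScope(uint32_t target_cr3)
            : m_previous_cr3(read_cr3())
        {
            write_cr3(target_cr3);
        }
        ~ProcessPagingScope()
        {
            // Restore whatever was active at construction time, not whatever
            // the "current process" happens to be now.
            write_cr3(m_previous_cr3);
        }
    private:
        uint32_t m_previous_cr3 { 0 };
    };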
|
|
Previously, when deallocating a range of VM, we would sort and merge
the range list. This was quite slow for large processes.
This patch optimizes VM deallocation in the following ways:
- Use binary search instead of linear scan to find the place to insert
the deallocated range.
- Insert at the right place immediately, removing the need to sort.
- Merge the inserted range with any adjacent range(s) in-line instead
of doing a separate merge pass into a list copy.
- Add Traits<Range> to inform Vector that Range objects are trivial
and can be moved using memmove().
I've also added an assertion that deallocated ranges are actually part
of the RangeAllocator's initial address range.
I've benchmarked this using g++ to compile Kernel/Process.cpp.
With these changes, compilation goes from ~41 sec to ~35 sec.
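A hypothetical sketch of the insertion strategy, using std::vector and std::lower_bound as stand-ins for the kernel's Vector and Traits machinery: binary-search for the insertion point, splice the freed range in there, and merge it with its neighbours in place.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Range {
        uintptr_t base { 0 };
        size_t size { 0 };
        uintptr_t end() const { return base + size; }
    };

    void deallocate_range(std::vector<Range>& free_ranges, Range freed)
    {
        // Binary search for the first free range starting at or after 'freed'.
        auto it = std::lower_bound(free_ranges.begin(), free_ranges.end(), freed,
            [](const Range& a, const Range& b) { return a.base < b.base; });

        it = free_ranges.insert(it, freed);

        // Merge with the following range if the two are adjacent.
        if (std::next(it) != free_ranges.end() && it->end() == std::next(it)->base) {
            it->size += std::next(it)->size;
            free_ranges.erase(std::next(it)); // 'it' stays valid: it precedes the erased slot
        }
        // Merge with the preceding range if the two are adjacent.
        if (it != free_ranges.begin() && std::prev(it)->end() == it->base) {
            std::prev(it)->size += it->size;
            free_ranges.erase(it);
        }
    }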
|
|
This will panic the kernel immediately if these functions are misused
so we can catch it and fix the misuse.
This patch fixes a couple of misuses:
- create_signal_trampolines() writes to a user-accessible page
above the 3GB address mark. We should really get rid of this
page but that's a whole other thing.
- CoW faults need to use copy_from_user rather than copy_to_user
since it's the *source* pointer that points to user memory.
- Inode faults need to use memcpy rather than copy_to_user since
we're copying a kernel stack buffer into a quickmapped page.
This should make the copy_to/from_user() functions slightly less useful
for exploitation. Before this, they were essentially just glorified
memcpy() with SMAP disabled. :^)
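A hedged sketch of the kind of guard being described, with an assumed 0xc0000000 user/kernel split and assert() standing in for the kernel's ASSERT; the real functions also toggle SMAP around the copy.

    #include <cassert>
    #include <cstdint>
    #include <cstring>

    constexpr uintptr_t kernel_base = 0xc0000000; // assumed user/kernel split

    bool is_user_range(const void* ptr, size_t size)
    {
        auto address = reinterpret_cast<uintptr_t>(ptr);
        uintptr_t end = 0;
        if (__builtin_add_overflow(address, size, &end))
            return false;
        return end <= kernel_base;
    }

    void copy_to_user(void* dest, const void* src, size_t size)
    {
        assert(is_user_range(dest, size)); // the *destination* must be user memory
        memcpy(dest, src, size);           // real kernel toggles SMAP (STAC/CLAC) here
    }

    void copy_from_user(void* dest, const void* src, size_t size)
    {
        assert(is_user_range(src, size)); // the *source* must be user memory
        memcpy(dest, src, size);
    }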
|
|
Technically the bottom 2MB is still identity-mapped for the kernel and
not made available to userspace at all, but for simplicity's sake we
can just ignore that and make "address < 0xc0000000" the canonical
check for user/kernel.
|
|
It's now an error to sys$mmap() a file as writable if it's currently
mapped executable by anyone else.
It's also an error to sys$execve() a file that's currently mapped
writable by anyone else.
This fixes a race condition vulnerability where one program could make
modifications to an executable while another process was in the kernel,
in the middle of exec'ing the same executable.
Test: Kernel/elf-execve-mmap-race.cpp
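A hypothetical sketch of the exclusion rule at the inode level; the counters, helper names and errno choices are illustrative, not the kernel's actual API.

    #include <cerrno>

    struct Inode {
        int writable_mappings { 0 };
        int executable_mappings { 0 };
    };

    int try_map_writable(Inode& inode)
    {
        if (inode.executable_mappings > 0)
            return -EACCES; // someone is executing this file right now
        ++inode.writable_mappings;
        return 0;
    }

    int try_execute(Inode& inode)
    {
        if (inode.writable_mappings > 0)
            return -ETXTBSY; // someone has this file mapped writable
        ++inode.executable_mappings;
        return 0;
    }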
|
|
..and do it very very early in boot.
|
|
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
|
|
|
|
|
|
Move the CPU feature enabling to functions in Arch/i386/CPU.cpp.
|
|
This is not ASLR, but it does de-trivialize exploiting the ELF loader
which would previously always parse executables at 0x01001000 in every
single exec(). I've taken advantage of this multiple times in my own
toy exploits and it's starting to feel cheesy. :^)
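A hypothetical sketch of picking a page-aligned random load offset instead of a fixed address; the window size and the userspace RNG are assumptions standing in for the kernel's randomness source.

    #include <cstdint>
    #include <random>

    constexpr uintptr_t base_load_address = 0x01001000;
    constexpr uintptr_t page_size = 4096;
    constexpr uintptr_t window_pages = 256; // assumed randomization window

    uintptr_t pick_elf_load_address()
    {
        static std::mt19937 rng { std::random_device {}() }; // stand-in for the kernel RNG
        std::uniform_int_distribution<uintptr_t> dist(0, window_pages - 1);
        return base_load_address + dist(rng) * page_size;
    }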
|
|
|