path: root/Kernel/VM/MemoryManager.cpp
2019-12-25  Kernel: Make kernel memory regions be non-executable by default (Andreas Kling)
From now on, you'll have to request executable memory specifically if you want some.
2019-12-25  Kernel: Set NX bit for virtual addresses 0-1MB and 2-8MB (Andreas Kling)
This removes the ability to jump into kmalloc memory, etc. Only the kernel image itself, located between 1-2MB, is allowed to execute.
2019-12-25  Kernel: Use the CPU's NX bit to enforce PROT_EXEC on memory mappings (Andreas Kling)
Now that we have PAE support, we can ask the CPU to crash processes for trying to execute non-executable memory. This is pretty cool! :^)
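A minimal sketch of what this looks like at the page-table-entry level, assuming PAE-format 64-bit entries and EFER.NXE already enabled; the helper and flag names are illustrative, not the kernel's actual API:

```cpp
#include <cstdint>

// With PAE enabled, page table entries are 64 bits wide and bit 63 is the
// XD/NX ("execute disable") bit. Setting it for every mapping that was not
// requested with PROT_EXEC makes the CPU fault on instruction fetches there.
static constexpr uint64_t PTE_PRESENT = 1ull << 0;
static constexpr uint64_t PTE_WRITABLE = 1ull << 1;
static constexpr uint64_t PTE_USER = 1ull << 2;
static constexpr uint64_t PTE_NX = 1ull << 63;

constexpr uint64_t make_pte(uint64_t physical_page_base, bool writable, bool user, bool executable)
{
    uint64_t pte = (physical_page_base & ~0xfffull) | PTE_PRESENT;
    if (writable)
        pte |= PTE_WRITABLE;
    if (user)
        pte |= PTE_USER;
    if (!executable)
        pte |= PTE_NX; // not PROT_EXEC: instruction fetches from this page will fault
    return pte;
}
```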
2019-12-25  Kernel: Enable PAE (Physical Address Extension) (Andreas Kling)
Introduce one more (CPU) indirection layer in the paging code: the page directory pointer table (PDPT). Each PageDirectory now has 4 separate PageDirectoryEntry arrays, governing 1 GB of VM each.

A really neat side-effect of this is that we can now share the physical page containing the >=3GB kernel-only address space metadata between all processes, instead of lazily cloning it on page faults.

This will give us access to the NX (No eXecute) bit, allowing us to prevent execution of memory that's not supposed to be executed.
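A rough structural sketch of that extra indirection layer (illustrative member layout only; in the real hardware format the PDPT and each PDE array live in their own physical pages):

```cpp
#include <array>
#include <cstdint>

// Illustrative sketch: under PAE, CR3 points at a 4-entry page directory
// pointer table (PDPT); each entry points at a page directory that governs
// 1 GB of virtual address space. Sharing the 4th (>= 3 GB) directory between
// all processes is what lets the kernel-only mappings be shared rather than
// cloned on demand.
struct PageDirectoryEntry { uint64_t raw; };

struct PageDirectory {
    // The PDPT itself: 4 entries, one per gigabyte of virtual address space.
    std::array<uint64_t, 4> pdpt_entries {};
    // One 512-entry PDE array per gigabyte.
    std::array<std::array<PageDirectoryEntry, 512>, 4> directories {};
};
```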
2019-12-25  Kernel: Rename PageDirectory::find_by_pdb() => find_by_cr3() (Andreas Kling)
I caught myself wondering what "pdb" stood for, so let's rename this to something more obvious.
2019-12-25  Kernel: Uh, actually *actually* turn on CR4.PGE (Andreas Kling)
I'm not sure how I managed to misread the location of this bit twice. But I did! Here is finally the correct value, according to Intel: "Page Global Enable (bit 7 of CR4)" Jeez! :^)
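For reference, a minimal sketch of setting that bit (the helper name is illustrative; the bit position is the one quoted from Intel above):

```cpp
#include <cstdint>

// Page Global Enable is bit 7 of CR4; the earlier bug was simply using the
// wrong bit index, so the value is what matters here.
static constexpr uintptr_t CR4_PGE = 1u << 7;

inline void enable_global_pages()
{
    uintptr_t cr4;
    asm volatile("mov %%cr4, %0" : "=r"(cr4));
    cr4 |= CR4_PGE;
    asm volatile("mov %0, %%cr4" ::"r"(cr4) : "memory");
}
```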
2019-12-24  Kernel: Oops, actually enable CR4.PGE (page table global bit) (Andreas Kling)
Turns out we were setting the wrong bit here. Now we will actually keep kernel memory mappings in the TLB across context switches.
2019-12-21  Kernel: Enable the x86 WP bit to catch invalid memory writes in ring 0 (Andreas Kling)
Setting this bit will cause the CPU to generate a page fault when writing to read-only memory, even if we're executing in the kernel. Seemingly the only change needed to make this work was to have the inode-backed page fault handler use a temporary mapping for writing the read-from-disk data into the newly-allocated physical page.
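A minimal sketch of the control-register side of this change (bit position per the Intel SDM; the helper name is illustrative):

```cpp
#include <cstdint>

// CR0.WP is bit 16. With it set, supervisor-mode writes to read-only pages
// also raise a page fault, instead of being silently allowed.
static constexpr uintptr_t CR0_WP = 1u << 16;

inline void enable_write_protect()
{
    uintptr_t cr0;
    asm volatile("mov %%cr0, %0" : "=r"(cr0));
    cr0 |= CR0_WP;
    asm volatile("mov %0, %%cr0" ::"r"(cr0) : "memory");
}
```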
2019-12-20  Kernel: Fix some warnings about passing non-POD to kprintf (Andreas Kling)
2019-12-19  Kernel: Rename vmo => vmobject everywhere (Andreas Kling)
2019-12-15  Kernel: Make sure the kernel info page is read-only for userspace (Andreas Kling)
To enforce this, we create two separate mappings of the same underlying physical page. A writable mapping for the kernel, and a read-only one for userspace (the one returned by sys$get_kernel_info_page.)
2019-12-15  Kernel: Improve comment about the system virtual memory map a bit (Andreas Kling)
2019-12-01  Kernel: Put some debug spam behind PAGE_FAULT_DEBUG (Andreas Kling)
2019-11-29  Kernel: Disallow syscalls from writeable memory (Andreas Kling)
Processes will now crash with SIGSEGV if they attempt making a syscall from PROT_WRITE memory. This neat idea comes from OpenBSD. :^)
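A condensed sketch of the check, using made-up Region bookkeeping rather than the kernel's real classes: find the region containing the instruction pointer the syscall came from, and refuse to service the call if that region is writable.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative types only; the real kernel walks its own region list and
// delivers SIGSEGV instead of returning a bool.
struct Region {
    uintptr_t base { 0 };
    size_t size { 0 };
    bool writable { false };
    bool contains(uintptr_t address) const { return address >= base && address - base < size; }
};

bool syscall_origin_is_acceptable(const std::vector<Region>& regions, uintptr_t calling_eip)
{
    for (auto& region : regions) {
        if (!region.contains(calling_eip))
            continue;
        return !region.writable; // syscalls issued from PROT_WRITE memory are rejected
    }
    return false; // not inside any mapped region: also unacceptable
}
```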
2019-11-27  Kernel: Fix triple-fault when clicking on SystemServer in SystemMonitor (Andreas Kling)
The fault was happening when retrieving a current backtrace for the SystemServer process.

To generate a backtrace, we go into the paging scope of the process, meaning we temporarily switch to using its page directory as our own.

Because kernel VM is allocated on demand, it's possible for a process's mappings above the 3GB mark to be out-of-date. Normally this just gets fixed up transparently by the page fault handler (which simply copies the PDE from the canonical MM.kernel_page_directory() into the current process.)

However, if the current kernel *stack* is in a piece of memory that the backtraced process lacks up-to-date PDEs for, we still get a page fault, but we are unable to handle it, since the CPU wants to push to the stack as part of calling the page fault handler. So we're screwed and it's a triple-fault.

Fix this by always updating the kernel VM mappings before switching into a paging scope. In practical terms, this is a 1KB memcpy() that happens when generating a backtrace, or when doing exec().
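A minimal sketch of that synchronization step, assuming the classic 32-bit layout of 1024 four-byte PDEs with the kernel half starting at entry 768 (names are illustrative): the top quarter of the directory is 256 entries, i.e. the 1KB memcpy() mentioned above.

```cpp
#include <cstdint>
#include <cstring>

struct PageDirectoryEntry { uint32_t raw; };

// Before adopting another process's page directory, copy the kernel-range
// PDEs (>= 3 GB) from the canonical kernel page directory so the kernel
// stack and heap are guaranteed to be mapped there.
void synchronize_kernel_mappings(PageDirectoryEntry* process_pdes,
                                 const PageDirectoryEntry* kernel_pdes)
{
    constexpr size_t first_kernel_pde = 768; // 3 GB / 4 MB per PDE
    constexpr size_t pde_count = 1024;
    memcpy(&process_pdes[first_kernel_pde],
           &kernel_pdes[first_kernel_pde],
           (pde_count - first_kernel_pde) * sizeof(PageDirectoryEntry)); // 256 * 4 = 1 KB
}
```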
2019-11-23  Revert "Kernel: Move Kernel mapping to 0xc0000000" (Andreas Kling)
This reverts commit bd33c6627394b2166e1419965dd3b2d2dc0c401f. This broke the network card drivers, since they depended on kmalloc addresses being identity-mapped.
2019-11-22  Kernel: Move Kernel mapping to 0xc0000000 (Jesse Buhagiar)
The kernel is no longer identity mapped to the bottom 8MiB of memory; it is now mapped at the higher address of `0xc0000000`. The lower ~1MiB of memory (from GRUB's mmap), however, is still identity mapped to provide an easy way for the kernel to get physical pages for things such as DMA etc. These could later be mapped to the higher address too; I'm not too sure how to go about doing this elegantly without a lot of address subtractions.
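As a sketch of the address subtraction being referred to, assuming the kernel is linked at 0xc0000000 and loaded at the 1 MB physical mark (both constants are illustrative and depend on the boot setup):

```cpp
#include <cstdint>

// A higher-half kernel converts between its virtual addresses and the
// physical addresses it was loaded at by a fixed offset in each direction.
static constexpr uintptr_t KERNEL_VIRTUAL_BASE = 0xc0000000;
static constexpr uintptr_t KERNEL_PHYSICAL_BASE = 0x00100000; // assumed 1 MB load address

constexpr uintptr_t virtual_to_physical(uintptr_t vaddr)
{
    return vaddr - KERNEL_VIRTUAL_BASE + KERNEL_PHYSICAL_BASE;
}

constexpr uintptr_t physical_to_virtual(uintptr_t paddr)
{
    return paddr - KERNEL_PHYSICAL_BASE + KERNEL_VIRTUAL_BASE;
}
```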
2019-11-17  Kernel: Implement some basic stack pointer validation (Andreas Kling)
VM regions can now be marked as stack regions, which is then validated on syscall, and on page fault. If a thread is caught with its stack pointer pointing into anything that's *not* a Region with its stack bit set, we'll crash the whole process with SIGSTKFLT. Userspace must now allocate custom stacks by using mmap() with the new MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now we have it too, yay! :^)
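A short userspace usage sketch, assuming MAP_STACK is exposed by <sys/mman.h> as described (error handling mostly elided):

```cpp
#include <cstddef>
#include <sys/mman.h>

// Allocate a 64 KB custom stack. Without MAP_STACK, pointing a thread's
// stack pointer into this mapping would now get the process killed with
// SIGSTKFLT at the next syscall or page fault.
void* allocate_thread_stack(size_t size)
{
    void* stack = mmap(nullptr, size,
                       PROT_READ | PROT_WRITE,
                       MAP_ANONYMOUS | MAP_PRIVATE | MAP_STACK,
                       -1, 0);
    return stack == MAP_FAILED ? nullptr : stack;
}
```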
2019-11-08  Kernel: Fix the search method of free userspace physical pages (#742) (Liav A)
Now the userspace page allocator will search through the physical regions and stop the search as soon as it finds an available page. Also remove an "address of" sign, since we don't need it when counting the size of physical regions.
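A simplified sketch of the search loop, with stand-in types instead of the kernel's real PhysicalRegion:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Stand-in for a contiguous physical region that hands out 4 KB pages.
struct PhysicalRegion {
    uintptr_t base { 0 };
    size_t page_count { 0 };
    size_t used { 0 };

    std::optional<uintptr_t> take_free_page()
    {
        if (used == page_count)
            return std::nullopt;
        return base + (used++) * 4096;
    }
};

std::optional<uintptr_t> allocate_user_physical_page(std::vector<PhysicalRegion>& regions)
{
    for (auto& region : regions) {
        if (auto page = region.take_free_page(); page.has_value())
            return page; // stop searching at the first region with a free page
    }
    return std::nullopt; // out of physical memory
}
```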
2019-11-08  Kernel: Removing hardcoded offsets from Memory Manager (supercomputer7)
Now the kernel page directory and the page tables are located at a safe address, to prevent the paging data from colliding with garbage.
2019-11-04  Kernel: Reorganize memory layout a bit (Andreas Kling)
Move the kernel image to the 1 MB physical mark. This prevents it from colliding with stuff like the VGA memory. This was causing us to end up with the BIOS screen contents sneaking into kernel memory sometimes. This patch also bumps the kmalloc heap size from 1 MB to 3 MB. It's not the perfect permanent solution (obviously) but it should get the OOM monkey off our backs for a while.
2019-11-04  Kernel: Move page fault handling from MemoryManager to Region (Andreas Kling)
After the page fault handler has found the region in which the fault occurred, do the rest of the work in the region itself. This patch also makes all fault types consistently crash the process if a new page is needed but we're all out of pages.
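To make the consistent crash-on-OOM behavior concrete, here is an illustrative sketch with stand-in types and a stand-in allocator, not the kernel's real ones: every fault type first needs a physical page, and if none is available the response is to crash the process.

```cpp
#include <cstddef>
#include <optional>

enum class PageFaultResponse { ShouldCrash, Continue };

// Pretend physical allocator that can run dry.
std::optional<int> allocate_physical_page(size_t& pages_left)
{
    if (pages_left == 0)
        return std::nullopt;
    --pages_left;
    return 42; // stand-in for a PhysicalPage handle
}

struct Region {
    bool backed_by_inode { false };
    bool page_is_cow { false };

    PageFaultResponse handle_fault(size_t& pages_left)
    {
        auto page = allocate_physical_page(pages_left);
        if (!page.has_value())
            return PageFaultResponse::ShouldCrash; // same outcome for all fault types
        if (backed_by_inode) {
            // read the page contents from disk into the new page ...
        } else if (page_is_cow) {
            // copy the shared page into the new page and clear the CoW bit ...
        } else {
            // zero-fill the new page ...
        }
        return PageFaultResponse::Continue;
    }
};
```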
2019-11-04  Kernel: Don't expose a region's page directory to the outside world (Andreas Kling)
Now that region manages its own mapping/unmapping, there's no need for the outside world to be able to grab at its page directory.
2019-11-04  Kernel: Merge MemoryManager::map_region_at_address() into Region::map() (Andreas Kling)
2019-11-03  Kernel: Fix bad setup of CoW faults for offset regions (Andreas Kling)
Regions with an offset into their VMObject were incorrectly adding the page offset when indexing into the CoW bitmap.
2019-11-03  Kernel: Set the G (global) bit for kernel page tables (Andreas Kling)
Since the kernel page tables are shared between all processes, there's no need to (implicitly) flush the TLB for them on every context switch. Setting the G bit on kernel page tables allows the CPU to keep the translation caches around.
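A small sketch of the flag layout involved (flag names are illustrative; this relies on CR4.PGE being enabled, as in the commit above):

```cpp
#include <cstdint>

// x86 page table entry flags: the Global bit (bit 8) tells the CPU it may
// keep this translation in the TLB across CR3 reloads.
static constexpr uint32_t PTE_PRESENT = 1u << 0;
static constexpr uint32_t PTE_WRITABLE = 1u << 1;
static constexpr uint32_t PTE_GLOBAL = 1u << 8;

constexpr uint32_t make_kernel_pte(uint32_t physical_page_base)
{
    return (physical_page_base & 0xfffff000) | PTE_PRESENT | PTE_WRITABLE | PTE_GLOBAL;
}
```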
2019-11-03  Kernel: Teach Region how to remap itself (Andreas Kling)
Now remapping (i.e. flushing kernel metadata to the CPU page tables) is done by simply calling Region::remap().
2019-11-03  Kernel: Regions should be mapped into a PageDirectory, not a Process (Andreas Kling)
This patch changes the parameter to Region::map() to be a PageDirectory since that matches how we think about the memory model: Regions are views onto VMObjects, and are mapped into PageDirectories. Each Process has a PageDirectory. The kernel also has a PageDirectory.
2019-11-03  Kernel: Move region map/unmap operations into the Region class (Andreas Kling)
The more Region can take care of itself, the better.
2019-11-03  Kernel: Clean up a bunch of wrong-looking Region/VMObject code (Andreas Kling)
Since a Region is merely a "window" onto a VMObject, it can both begin and end at a distance from the VMObject's boundaries. Therefore, we should always be computing indices into a VMObject's physical page array by adding the Region's "first_page_index()". There was a whole bunch of code that forgot to do that. This fixes many wrong behaviors for Regions that start part-way into a VMObject.
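A tiny worked example of the indexing rule, with illustrative names: a Region that starts part-way into its VMObject must add first_page_index() when looking up physical pages.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t PAGE_SIZE = 4096;

struct Region {
    size_t offset_in_vmobject; // bytes from the start of the VMObject

    size_t first_page_index() const { return offset_in_vmobject / PAGE_SIZE; }

    // Index into the VMObject's physical page array for a page of this region.
    size_t vmobject_page_index(size_t page_index_in_region) const
    {
        return first_page_index() + page_index_in_region;
    }
};

int main()
{
    Region region { 8 * PAGE_SIZE };
    // Page 0 of the region is page 8 of the underlying VMObject.
    assert(region.vmobject_page_index(0) == 8);
}
```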
2019-11-03  Kernel: Move page remapping into Region::remap_page(index) (Andreas Kling)
Let Region deal with this, instead of everyone calling MemoryManager.
2019-11-01  Kernel: Zero-fill faults should not temporarily enable interrupts (Andreas Kling)
We were doing a temporary STI/CLI in MemoryManager::zero_page() to be able to acquire the VMObject's lock before zeroing out a page.

This logic was inherited from the inode fault handler, where we need to enable interrupts anyway, since we might need to interact with the underlying storage device.

Zero-fill faults don't actually need to lock the VMObject, since they are already guaranteed exclusivity by interrupts being disabled when entering the fault handler.

This is different from inode faults, where a second thread can often get an inode fault for the same exact page in the same VMObject before the first fault handler has received a response from the disk. This is why the lock exists in the first place, to prevent this race.

This fixes an intermittent crash in sys$execve() that was made much more visible after I made userspace stacks lazily allocated.
2019-10-16  APIC: Enable APIC and start APs (Tom)
2019-10-02  Kernel+SystemMonitor: Add fault counters (Andreas Kling)
This patch adds three separate per-process fault counters:

- Inode faults: an inode fault happens when we've memory-mapped a file from disk and we end up having to load 1 page (4KB) of the file into memory.
- Zero faults: memory returned by mmap() is lazily zeroed out. Every time we have to zero out 1 page, we count a zero fault.
- CoW faults: VM objects can be shared by multiple mappings that make their own unique copy iff they want to modify it. The typical reason here is memory shared between a parent and child process.
2019-10-01  Kernel: Defer creation of Region CoW bitmaps until they're needed (Andreas Kling)
Instead of allocating and populating a Copy-on-Write bitmap for each Region up front, wait until we actually clone the Region for sharing with another process. In most cases, we never need any CoW bits and we save ourselves a lot of kmalloc() memory and time.
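A minimal sketch of the deferred allocation, using std::vector<bool> as a stand-in for the kernel's bitmap type (names are illustrative):

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Region {
    size_t page_count { 0 };
    std::unique_ptr<std::vector<bool>> cow_map; // stays null until the region is cloned

    std::vector<bool>& ensure_cow_map()
    {
        if (!cow_map)
            cow_map = std::make_unique<std::vector<bool>>(page_count, false);
        return *cow_map;
    }

    // Called when cloning the region for sharing: only now does the bitmap exist.
    void mark_all_pages_cow()
    {
        ensure_cow_map().assign(page_count, true);
    }

    // No bitmap means no page is CoW, so nothing was allocated for the common case.
    bool should_cow(size_t page_index) const
    {
        return cow_map && (*cow_map)[page_index];
    }
};
```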
2019-09-28  Kernel: Repair unaligned regions supplied by the boot loader (Conrad Pankoff)
We were just blindly trusting that the bootloader would only give us page-aligned memory regions. This is apparently not always the case, so now we can try to repair those regions. Fixes #601
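A small sketch of such a repair, assuming 4 KB pages (names are illustrative): shrink the reported region to page boundaries by rounding its base up and its end down.

```cpp
#include <cstdint>

constexpr uintptr_t PAGE_SIZE = 4096;

constexpr uintptr_t page_round_up(uintptr_t value) { return (value + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); }
constexpr uintptr_t page_round_down(uintptr_t value) { return value & ~(PAGE_SIZE - 1); }

struct MemoryRange { uintptr_t base; uintptr_t length; };

// Trim an unaligned bootloader-provided region to whole pages.
constexpr MemoryRange repair_region(uintptr_t base, uintptr_t length)
{
    uintptr_t aligned_base = page_round_up(base);
    uintptr_t aligned_end = page_round_down(base + length);
    if (aligned_end <= aligned_base)
        return { 0, 0 }; // nothing usable survives the alignment
    return { aligned_base, aligned_end - aligned_base };
}
```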
2019-09-27  Kernel: Fix partial munmap() deallocating still-in-use VM (Andreas Kling)
We were always returning the full VM range of the partially-unmapped Region to the range allocator. This caused us to re-use those addresses for subsequent VM allocations. This patch also skips creating a new VMObject in partial munmap(). Instead we just make split regions that point into the same VMObject. This fixes the mysterious GCC ICE on large C++ programs.
2019-09-27  Kernel: Make Region single-owner instead of ref-counted (Andreas Kling)
This simplifies the ownership model and makes Region easier to reason about. Userspace Regions are now primarily kept by Process::m_regions. Kernel Regions are kept in various OwnPtr<Region>s. Regions now only ever get unmapped when they are destroyed.
2019-09-17  Kernel: Ignore memory the bootloader gives us above 2^32 (Conrad Pankoff)
2019-09-16  Kernel: Fix some bitrot in MemoryManager debug logging code (Andreas Kling)
2019-09-15  Kernel: Get rid of MemoryManager::allocate_page_table() (Andreas Kling)
We can just use the physical page allocator directly, there's no need for a dedicated function for page tables.
2019-09-04  Kernel: Rename "vmo" to "vmobject" everywhere (Andreas Kling)
2019-08-26  Revert "Kernel: Avoid a memcpy() of the whole block when paging in from inode" (Andreas Kling)
This reverts commit 11896d0e26555b8090540b04b627d43365aaec2e. This caused a race where other processes using the same InodeVMObject could end up accessing the newly-mapped physical page before we've actually filled it with bytes from disk. It would be nice to avoid these copies without breaking anything.
2019-08-26  Kernel: Display virtual addresses as V%p instead of L%x (Andreas Kling)
The L was a leftover from when these were called linear addresses.
2019-08-25  Kernel: Avoid a memcpy() of the whole block when paging in from inode (Andreas Kling)
2019-08-19  Kernel: Put debug spam about already-paged-in inode pages behind #ifdef (Andreas Kling)
2019-08-08  Kernel: Use range-for with InlineLinkedList (Andreas Kling)
2019-08-08  Kernel: Put all Regions on InlineLinkedLists (separated by user/kernel) (Andreas Kling)
Remove the global hash tables and replace them with InlineLinkedLists. This significantly reduces the kernel heap pressure from doing many small mmap()'s.
2019-08-08  Kernel: Put all VMObjects in an InlineLinkedList instead of a HashTable (Andreas Kling)
Using a HashTable to track "all instances of Foo" is only useful if we actually need to look up entries by some kind of index. And since they are HashTable (not HashMap), the pointer *is* the index. Since we have the pointer, we can just use it directly. Duh. This increases sizeof(VMObject) by two pointers, but removes a global table that had an entry for every VMObject, where the cost was higher. It also avoids all the general hash tabling business when creating or destroying VMObjects. Generally we should do more of this. :^)
2019-08-07  Kernel: Remove unused MemoryManager::remove_identity_mapping() (Andreas Kling)
This was not actually used and just sitting there being confusing.