Age | Commit message | Author |
|
These now scrub allocated and freed memory, as kmalloc()/kfree() were
already doing.
|
|
Since we scrub both kmalloc() and kfree() with predictable values, we
can log a helpful message when hitting a crash that looks like it might
be a dereference of such scrubbed data.
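A minimal sketch of such a check. The scrub byte values (0xbb for fresh allocations, 0xaa for freed memory) and all helper names here are illustrative assumptions, not taken from this commit:

```cpp
#include <cstdint>

// Assumed scrub bytes; the values used by the real allocator may differ.
constexpr uint8_t assumed_kmalloc_scrub_byte = 0xbb;
constexpr uint8_t assumed_kfree_scrub_byte = 0xaa;

// Build a 32-bit pattern made of four copies of `byte`.
constexpr uint32_t explode_byte(uint8_t byte)
{
    return static_cast<uint32_t>(byte) * 0x01010101u;
}

enum class ScrubKind { None, Uninitialized, Freed };

// Classify a faulting address by comparing its upper 16 bits against the
// scrub patterns. The low bits are ignored so that a small offset from a
// scrubbed pointer (e.g. a member access) still matches.
inline ScrubKind classify_fault_address(uint32_t address)
{
    const uint32_t mask = 0xffff0000u;
    if ((address & mask) == (explode_byte(assumed_kmalloc_scrub_byte) & mask))
        return ScrubKind::Uninitialized;
    if ((address & mask) == (explode_byte(assumed_kfree_scrub_byte) & mask))
        return ScrubKind::Freed;
    return ScrubKind::None;
}
```

A crash handler could call something like this on the faulting address and print a hint only when the result is not ScrubKind::None.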
|
|
This reverts commit 6c72736b26a81a8f03d8dd47989bfffe26bb1c95.
I am unable to boot on my home machine with this change in the tree.
|
|
System components that need IRQ handling now inherit from the
InterruptHandler class.
In addition, the initialization of PATAChannel was adjusted to fit
these changes.
PATAChannel, E1000NetworkAdapter and RTL8139NetworkAdapter now inherit
from PCI::Device instead of inheriting from InterruptHandler directly.
|
|
We don't need to have this method anymore. It was a hack that was used
in many components in the system but currently we use better methods to
create virtual memory mappings. To prevent any further use of this
method it's best to just remove it completely.
Also, the APIC code is disabled for now since it doesn't help booting
the system, and is broken since it relies on an identity mapping
existing in the first 1MB. Any call to the APIC code will result in a
failed assertion.
In addition, the method responsible for creating the identity mapping
between 1MB and 2MB was renamed to be more precise about its purpose.
|
|
VirtualAddress is constructible from uintptr_t and const void*.
PhysicalAddress is constructible from uintptr_t but not const void*.
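A rough sketch of what that constructor set could look like. Marking the constructors `explicit`, and the member and accessor names, are assumptions made for this illustration:

```cpp
#include <cstdint>
#include <type_traits>

class VirtualAddress {
public:
    explicit VirtualAddress(uintptr_t address)
        : m_address(address)
    {
    }
    explicit VirtualAddress(const void* address)
        : m_address(reinterpret_cast<uintptr_t>(address))
    {
    }
    uintptr_t get() const { return m_address; }

private:
    uintptr_t m_address { 0 };
};

class PhysicalAddress {
public:
    // No pointer constructor: a C++ pointer is always a virtual address,
    // so building a PhysicalAddress from one would almost certainly be a bug.
    explicit PhysicalAddress(uintptr_t address)
        : m_address(address)
    {
    }
    uintptr_t get() const { return m_address; }

private:
    uintptr_t m_address { 0 };
};

static_assert(std::is_constructible<VirtualAddress, uintptr_t>::value, "VirtualAddress from integer");
static_assert(std::is_constructible<VirtualAddress, const void*>::value, "VirtualAddress from pointer");
static_assert(std::is_constructible<PhysicalAddress, uintptr_t>::value, "PhysicalAddress from integer");
static_assert(!std::is_constructible<PhysicalAddress, const void*>::value, "PhysicalAddress rejects pointers");
```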
|
|
uintptr_t is 32-bit or 64-bit depending on the target platform.
This will help us write pointer size agnostic code so that when the day
comes that we want to do a 64-bit port, we'll be in better shape.
|
|
I noticed this while debugging a crash in backtrace generation.
If a process crashed while temporarily inspecting another process's
address space, the crashing thread would still use the other process's
page tables while handling the crash, causing all kinds of confusion
when trying to walk the stack of the crashing thread.
|
|
|
|
..and do it very very early in boot.
|
|
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.
For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.
Going forward, all new source files should include a license header.
|
|
Move the CPU feature enabling to functions in Arch/i386/CPU.cpp.
|
|
The kernel and its static data structures are no longer identity-mapped
in the bottom 8MB of the address space, but instead move above 3GB.
The first 8MB above 3GB are pseudo-identity-mapped to the bottom 8MB of
the physical address space. But things don't have to stay this way!
Thanks to Jesse who made an earlier attempt at this, it was really easy
to get device drivers working once the page tables were in place! :^)
Fixes #734.
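Under the layout described above, translating between a kernel virtual address and its backing physical address is a fixed offset. A small sketch, with hypothetical helper names and assuming "3GB" means 0xc0000000:

```cpp
#include <cstdint>

constexpr uint32_t kernel_base = 0xc0000000u;              // 3GB
constexpr uint32_t kernel_window_size = 8u * 1024u * 1024u; // 8MB

// True if `vaddr` lies in the first 8MB above 3GB.
constexpr bool is_in_kernel_window(uint32_t vaddr)
{
    return vaddr >= kernel_base && vaddr - kernel_base < kernel_window_size;
}

// Only meaningful for addresses inside the window checked above.
constexpr uint32_t kernel_virtual_to_physical(uint32_t vaddr)
{
    return vaddr - kernel_base;
}

constexpr uint32_t physical_to_kernel_virtual(uint32_t paddr)
{
    return paddr + kernel_base;
}
```

For example, a physical frame at 0x000b8000 would be reachable at 0xc00b8000 through this window.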
|
|
mmap() & mmap_region() methods are removed from ACPI & DMI components,
and we replace them with the new MM.allocate_kernel_region() helper.
Instead of doing a raw calculation for each VM address, from now on we
can use helper functions to perform those calculations in a neat,
reusable and readable way.
|
|
These were always so awkwardly named.
|
|
It would be nice to do this in the assembly code, but we have to check
if the feature is available before doing a CLAC, so I've put this in
the C++ code for now.
|
|
|
|
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that
prevents the kernel from accessing userspace memory. With SMAP enabled,
trying to read/write a userspace memory address while in the kernel
will now generate a page fault.
Since it's sometimes necessary to read/write userspace memory, there
are two new instructions that quickly switch the protection on/off:
STAC (disables protection) and CLAC (enables protection.)
These are exposed in kernel code via the stac() and clac() helpers.
There's also a SmapDisabler RAII object that can be used to ensure
that you don't forget to re-enable protection before returning to
userspace code.
This patch also adds copy_to_user(), copy_from_user() and memset_user()
which are the "correct" way of doing things. These functions allow us
to briefly disable protection for a specific purpose, and then turn it
back on immediately after it's done. Going forward, all kernel code
should be moved to using these, and all uses of SmapDisabler are to be
considered FIXMEs.
Note that we're not realizing the full potential of this feature since
I've used SmapDisabler quite liberally in this initial bring-up patch.
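To illustrate the pattern, here is a userspace simulation of the two approaches. A plain boolean stands in for the CPU state that STAC and CLAC toggle, and the function signatures are assumptions rather than the kernel's actual declarations:

```cpp
#include <cstddef>
#include <cstring>

// Stand-in for the CPU flag: true means the kernel may touch user memory.
static bool g_user_access_allowed = false;

// In a real kernel these would execute the STAC and CLAC instructions.
inline void stac() { g_user_access_allowed = true; }
inline void clac() { g_user_access_allowed = false; }

// RAII guard: protection is off for the guard's lifetime, back on afterwards.
class SmapDisabler {
public:
    SmapDisabler() { stac(); }
    ~SmapDisabler() { clac(); }
    SmapDisabler(const SmapDisabler&) = delete;
    SmapDisabler& operator=(const SmapDisabler&) = delete;
};

// Preferred style: the unprotected window covers only the copy itself.
inline void copy_to_user(void* dest_user, const void* src_kernel, size_t size)
{
    stac();
    std::memcpy(dest_user, src_kernel, size);
    clac();
}
```

The difference is the size of the unprotected window: a SmapDisabler keeps protection off for an entire scope, while copy_to_user() keeps it off only for the duration of one copy.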
|
|
We now have these APIs in <Kernel/Random.h>:
- get_fast_random_bytes(u8* buffer, size_t buffer_size)
- get_good_random_bytes(u8* buffer, size_t buffer_size)
- get_fast_random<T>()
- get_good_random<T>()
Internally they both use x86 RDRAND if available, otherwise they fall
back to the same LCG we had in RandomDevice all along.
The main purpose of this patch is to give kernel code a way to better
express its needs for random data.
Randomness is something that will require a lot more work, but this is
hopefully a step in the right direction.
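A sketch of how the typed helper can be layered on top of the byte-filling one. Only the LCG fallback path is shown; the RDRAND path is left out, and the LCG constants and seed are illustrative assumptions, not the ones from RandomDevice:

```cpp
#include <cstddef>
#include <cstdint>

using u8 = uint8_t;

// Fallback generator state; the seed is arbitrary for this sketch.
static uint32_t g_lcg_state = 0x12345678u;

// One step of a simple linear congruential generator (assumed constants).
inline uint32_t next_lcg_value()
{
    g_lcg_state = g_lcg_state * 1664525u + 1013904223u;
    return g_lcg_state;
}

// Fill `buffer` using the top byte of each LCG step.
inline void get_fast_random_bytes(u8* buffer, size_t buffer_size)
{
    for (size_t i = 0; i < buffer_size; ++i)
        buffer[i] = static_cast<u8>(next_lcg_value() >> 24);
}

// Typed convenience wrapper: fill a T with random bytes and return it.
template<typename T>
inline T get_fast_random()
{
    T value;
    get_fast_random_bytes(reinterpret_cast<u8*>(&value), sizeof(T));
    return value;
}
```

A caller that just needs a number can then write `auto value = get_fast_random<uint32_t>();` without managing a buffer.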
|
|
When entering the kernel from a syscall, we now insert a small bit of
stack padding after the RegisterDump. This makes kernel stacks less
deterministic across syscalls and may make some bugs harder to exploit.
Inspired by Elena Reshetova's talk on kernel stack exploitation.
|
|
It's still possible to read the TSC via the read_tsc() syscall, but we
will now clear some of the bottom bits for unprivileged users.
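Clearing low bits amounts to a mask. In the sketch below, the number of cleared bits (11) and the function name are assumptions; the commit message does not say how many bits are cleared:

```cpp
#include <cstdint>

// Return the TSC value as seen by the caller. Unprivileged callers get a
// coarsened value with the lowest `cleared_bits` bits zeroed.
// The default of 11 bits is an assumption for illustration only.
constexpr uint64_t coarsen_tsc(uint64_t tsc, bool is_privileged, unsigned cleared_bits = 11)
{
    return is_privileged ? tsc : (tsc & ~((uint64_t(1) << cleared_bits) - 1));
}
```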
|
|
x86 descriptor limits are 20 bits, not 24 bits. This was already
a 4-bit wide bitfield, so no damage done, but let's be correct.
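For reference, an x86 segment descriptor stores its limit as 20 bits split into a 16-bit low part and a 4-bit high part. A small sketch of that split, using hypothetical type and function names:

```cpp
#include <cstdint>

// The two pieces of a descriptor limit as they are stored in the descriptor.
struct SplitLimit {
    uint16_t low;     // limit bits 0..15
    uint8_t high : 4; // limit bits 16..19
};

// Split a limit; anything above bit 19 cannot be represented and is dropped.
inline SplitLimit split_limit(uint32_t limit)
{
    SplitLimit split;
    split.low = static_cast<uint16_t>(limit & 0xffffu);
    split.high = static_cast<uint8_t>((limit >> 16) & 0xfu);
    return split;
}

// Reassemble the 20-bit limit from its two pieces.
inline uint32_t join_limit(const SplitLimit& split)
{
    return (static_cast<uint32_t>(split.high) << 16) | split.low;
}
```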
|
|
Lazy FPU restore is well known to be vulnerable to timing attacks,
and eager restore is a lot simpler anyway, so let's just do it eagerly.
|
|
This prevents code running outside of kernel mode from using the
following instructions:
* SGDT - Store Global Descriptor Table
* SIDT - Store Interrupt Descriptor Table
* SLDT - Store Local Descriptor Table
* SMSW - Store Machine Status Word
* STR - Store Task Register
There's no need for userspace to be able to use these instructions so
let's just disable them to prevent information leakage.
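This list of instructions matches the x86 User-Mode Instruction Prevention (UMIP) feature, which is advertised in CPUID leaf 7 (ECX bit 2) and enabled through CR4 bit 11. Assuming that is the mechanism used here, the decision can be sketched as a pure function (the names are hypothetical, and actually reading CPUID or writing CR4 is left out):

```cpp
#include <cstdint>

constexpr uint32_t cr4_umip_bit = 1u << 11;            // CR4.UMIP
constexpr uint32_t cpuid_leaf7_ecx_umip_bit = 1u << 2; // CPUID.(EAX=7,ECX=0):ECX.UMIP

// Given the current CR4 value and ECX from CPUID leaf 7, return the CR4
// value to write back: UMIP is turned on only if the CPU advertises it.
constexpr uint32_t cr4_with_umip_if_supported(uint32_t cr4, uint32_t cpuid_leaf7_ecx)
{
    return (cpuid_leaf7_ecx & cpuid_leaf7_ecx_umip_bit) ? (cr4 | cr4_umip_bit) : cr4;
}
```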
|
|
We now refuse to boot on machines that don't support PAE since all
of our paging code depends on it.
Also let's only enable SSE and PGE support if the CPU advertises it.
|
|
We don't actually react to these in any meaningful way other than
crashing, but let's at least print the correct information. :^)
|
|
Introduce one more (CPU) indirection layer in the paging code: the page
directory pointer table (PDPT). Each PageDirectory now has 4 separate
PageDirectoryEntry arrays, governing 1 GB of VM each.
A really neat side-effect of this is that we can now share the physical
page containing the >=3GB kernel-only address space metadata between
all processes, instead of lazily cloning it on page faults.
This will give us access to the NX (No eXecute) bit, allowing us to
prevent execution of memory that's not supposed to be executed.
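With the extra level in place, a 32-bit virtual address splits into four fields: 2 bits of PDPT index, 9 bits of page directory index, 9 bits of page table index and a 12-bit page offset. A sketch of that split for 4KB pages (the struct and function names are hypothetical):

```cpp
#include <cstdint>

struct PaeIndices {
    uint32_t pdpt_index;           // bits 31..30: which of the 4 page directories
    uint32_t page_directory_index; // bits 29..21: entry within that directory
    uint32_t page_table_index;     // bits 20..12: entry within the page table
    uint32_t offset;               // bits 11..0: byte offset within the 4KB page
};

// Decompose a 32-bit virtual address the way PAE paging with 4KB pages does.
inline PaeIndices split_virtual_address(uint32_t vaddr)
{
    PaeIndices indices;
    indices.pdpt_index = (vaddr >> 30) & 0x3u;
    indices.page_directory_index = (vaddr >> 21) & 0x1ffu;
    indices.page_table_index = (vaddr >> 12) & 0x1ffu;
    indices.offset = vaddr & 0xfffu;
    return indices;
}
```

Every address at or above 3GB has a PDPT index of 3, which is why a single page directory can hold the kernel-only mappings and be shared between all processes.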
|
|
These were looking a bit messy after we started using 32-bit fields
to store segment registers in RegisterDumps.
|
|
This avoids -Wclass-memaccess warnings exposed by the new Makefiles.
|
|
Instead of having a common entry point and looking at the PIC ISR to
figure out which IRQ we're servicing, just make a separate entryway
for each IRQ that pushes the IRQ number and jumps to a common routine.
This fixes a weird issue where incoming network packets would sometimes
cause the mouse to stop working. I didn't track it down further than
realizing we were sometimes EOI'ing the wrong IRQ.
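The real entry points are assembly stubs, but the idea can be modelled in C++ with one tiny function per IRQ that bakes in its number and forwards to a shared routine. Everything below is an illustrative model with hypothetical names, not the kernel's code:

```cpp
#include <cstdint>

// Recorded by the common routine so the behaviour can be observed.
static uint8_t g_last_handled_irq = 0xff;
static uint8_t g_last_eoi_irq = 0xff;

// Common routine: the IRQ number arrives as an argument instead of being
// guessed from the interrupt controller's in-service register.
inline void handle_irq_common(uint8_t irq)
{
    g_last_handled_irq = irq;
    g_last_eoi_irq = irq; // EOI exactly the IRQ whose entry point fired.
}

// One entry point per IRQ, each "pushing" its own number.
template<uint8_t irq>
void irq_entry()
{
    handle_irq_common(irq);
}

using IrqEntry = void (*)();

// Table of entry points, analogous to installing one IDT entry per IRQ.
static const IrqEntry g_irq_entries[16] = {
    irq_entry<0>, irq_entry<1>, irq_entry<2>, irq_entry<3>,
    irq_entry<4>, irq_entry<5>, irq_entry<6>, irq_entry<7>,
    irq_entry<8>, irq_entry<9>, irq_entry<10>, irq_entry<11>,
    irq_entry<12>, irq_entry<13>, irq_entry<14>, irq_entry<15>
};
```

Because the number is fixed per entry point, the handler never has to infer which IRQ is being serviced, so it cannot acknowledge the wrong one.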
|
|
Now that we can see the kernel entry points all the time in profiles,
let's tweak the names a little bit and switch to named exceptions.
|
|
|
|
Sometimes QEMU hits us with an IRQ 15 and I don't know what it is.
Just ignore it for now instead of crashing the system.
|
|
Now that we have proper wait queues to drive waiter wakeup, we can use
the wake actions to break out of the scheduler's idle loop when we've
got a thread to run.
|
|
There was a race window between instantiating a WaitQueueBlocker and
setting the thread state to Blocked. If a thread was preempted between
those steps, someone else might try to wake the wait queue and find an
unblocked thread in a wait queue, which is not sane.
|
|
This reverts commit bd33c6627394b2166e1419965dd3b2d2dc0c401f.
This broke the network card drivers, since they depended on kmalloc
addresses being identity-mapped.
|
|
The kernel is now no longer identity mapped to the bottom 8MiB of
memory, and is now mapped at the higher address of `0xc0000000`.
The lower ~1MiB of memory (from GRUB's mmap), however, is still
identity mapped to provide an easy way for the kernel to get
physical pages for things such as DMA etc. These could later be
mapped to the higher address too, as I'm not too sure how to
go about doing this elegantly without a lot of address subtractions.
|
|
VM regions can now be marked as stack regions, which is then validated
on syscall, and on page fault.
If a thread is caught with its stack pointer pointing into anything
that's *not* a Region with its stack bit set, we'll crash the whole
process with SIGSTKFLT.
Userspace must now allocate custom stacks by using mmap() with the new
MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now
we have it too, yay! :^)
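The validation itself boils down to a lookup: find the region containing the stack pointer and check its stack bit. A simplified sketch, where the Region layout and function name are assumptions and a plain vector stands in for the process's region list:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal stand-in for a VM region: a half-open range [base, base + size).
struct Region {
    uintptr_t base;
    size_t size;
    bool is_stack; // set for regions created with MAP_STACK
};

// True only if `stack_pointer` lies inside a region marked as a stack.
// Simplification: a pointer exactly at the top of a region (base + size)
// is treated as outside it; a real check may need to handle that case.
inline bool stack_pointer_is_valid(const std::vector<Region>& regions, uintptr_t stack_pointer)
{
    for (const Region& region : regions) {
        if (stack_pointer >= region.base && stack_pointer - region.base < region.size)
            return region.is_stack;
    }
    return false; // Not inside any region at all.
}
```

In the scheme described above, a false result on syscall entry or page fault is what would lead to the process being crashed.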
|
|
The SysV ABI says that the DF flag should be clear on function entry.
That means we have to clear it when jumping into the kernel from some
random userspace context.
|
|
The kernel page directory and page tables are now located at a safe
address, to prevent paging data from colliding with garbage.
|
|
It was silly to push the address of the stack pointer when we can also
just change the callee argument to be a value type.
|
|
It was really confusing to have different calling conventions in kernel
and userspace. Also this has prevented us from linking with libgcc.
|
|
Since the kernel page tables are shared between all processes, there's
no need to (implicitly) flush the TLB for them on every context switch.
Setting the G bit on kernel page tables allows the CPU to keep the
translation caches around.
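On x86, the G (global) flag is bit 8 of a page table entry. A sketch of building an entry with that flag set only for kernel mappings (the helper name and the reduced set of flags are simplifications for illustration):

```cpp
#include <cstdint>

constexpr uint64_t pte_present = 1ull << 0; // P bit
constexpr uint64_t pte_global = 1ull << 8;  // G bit

// Build a minimal page table entry for a 4KB page. Kernel mappings get the
// G bit so their TLB entries can survive an address space switch.
constexpr uint64_t make_pte(uint64_t physical_page_base, bool is_kernel_mapping)
{
    return (physical_page_base & ~0xfffull) | pte_present | (is_kernel_mapping ? pte_global : 0ull);
}
```

The G bit only takes effect when global pages are enabled in CR4 (the PGE bit).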
|
|
This code was not doing anything important. Since we're building the
kernel with -mregparm=3, the first function argument goes in %eax.
|
|
|
|
|
|
After we clear the FPU state in a thread when it uses the FPU for the
first time, we also save the clean slate in the thread's FPU state
buffer. When we're doing that, let's write through current->fpu_state()
just to make it clear what's going on.
It was actually safe, since we'd just overwritten the g_last_fpu_thread
pointer anyway, but this patch improves the communication of intent.
Spotted by Bryan Steele, thanks!
|
|
Cloned threads (basically, forked processes) inherit the complete FPU
state of their origin thread. There was a bug in the lazy FPU state
save/restore mechanism where a cloned thread would believe it had a
buffer full of valid FPU state (because the inherited flag said so)
but the origin thread had never actually copied any FPU state into it.
This patch fixes that by forcing out an FPU state save after doing
the initial FPU initialization (FNINIT) in a thread. :^)
|
|
Now programs can catch the SIGSEGV signal when they segfault.
This commit also introduced the send_urgent_signal_to_self method,
which is needed to send signals to a thread when handling exceptions
caused by the same thread.
|
|
Added the exception_code field to RegisterDump, removing the need
for RegisterDumpWithExceptionCode. To accomplish this, I had to
push a dummy exception code during some interrupt entries to properly
pad out the RegisterDump. Note that we also needed to change some code
in sys$sigreturn to deal with the new RegisterDump layout.
|