summaryrefslogtreecommitdiff
path: root/Kernel/Syscall.cpp
AgeCommit message (Collapse)Author
2021-03-15Kernel: Don't return -EFOO when return type is KResultOr<...>Andreas Kling
2021-03-12Kernel: Convert klog() => AK::Format in a handful of placesAndreas Kling
2021-03-04Kernel: Make the kernel compile & link for x86_64Andreas Kling
It's now possible to build the whole kernel with an x86_64 toolchain. There's no bootstrap code so it doesn't work yet (obviously.)
2021-03-02Kernel: Use RDTSC instead of get_fast_random() for syscall stack noiseAndreas Kling
This was the original approach before we switched to get_fast_random() which wasn't fast enough, so we added a buffer. Unfortunately that buffer is racy and we can actually skid past the end of it and continue fetching "random" offsets from the adjacent memory for a while, until we run out of kernel data segment and trip a fault. Instead of making this even more convoluted, let's just go back to the pleasantly simple (RDTSC & 0xff) approach. :^) Fixes #4912.
2021-03-01Kernel: Oops, SC_abort was actually calling sys$exit_thread()Andreas Kling
2021-03-01Kernel: Detach any attached thread tracer on sys$abort()Andreas Kling
2021-03-01Kernel: Make all syscall functions return KResultOr<T>Andreas Kling
This makes it a lot easier to return errors since we no longer have to worry about negating EFOO errors and can just return them flat.
2021-02-25Kernel: Don't disable interrupts while exiting a thread or processAndreas Kling
This was another vestige from a long time ago, when exiting a thread would mutate global data structures that were only protected by the interrupt flag.
2021-02-23Everywhere: Rename ASSERT => VERIFYAndreas Kling
(...and ASSERT_NOT_REACHED => VERIFY_NOT_REACHED) Since all of these checks are done in release builds as well, let's rename them to VERIFY to prevent confusion, as everyone is used to assertions being compiled out in release. We can introduce a new ASSERT macro that is specifically for debug checks, but I'm doing this wholesale conversion first since we've accumulated thousands of these already, and it's not immediately obvious which ones are suitable for ASSERT.
2021-02-20Kernel: Slap a handful more things with UNMAP_AFTER_INITAndreas Kling
2021-02-14Kernel: Fix TOCTOU in syscall entry region validationAndreas Kling
We were doing stack and syscall-origin region validations before taking the big process lock. There was a window of time where those regions could then be unmapped/remapped by another thread before we proceed with our syscall. This patch closes that window, and makes sys$get_stack_bounds() rely on the fact that we now know the userspace stack pointer to be valid. Thanks to @BenWiederhake for spotting this! :^)
2021-02-14Kernel: Mark handle_crash() as [[noreturn]]Andreas Kling
2021-02-14Kernel: Panic on syscall from process with IOPL != 0Andreas Kling
If this happens then the kernel is in an undefined state, so we should rather panic than attempt to limp along.
2021-02-08Kernel: Factor address space management out of the Process classAndreas Kling
This patch adds Space, a class representing a process's address space. - Each Process has a Space. - The Space owns the PageDirectory and all Regions in the Process. This allows us to reorganize sys$execve() so that it constructs and populates a new Space fully before committing to it. Previously, we would construct the new address space while still running in the old one, and encountering an error meant we had to do tedious and error-prone rollback. Those problems are now gone, replaced by what's hopefully a set of much smaller problems and missing cleanups. :^)
2021-02-02Kernel: Add a way to specify which memory regions can make syscallsAndreas Kling
This patch adds sys$msyscall() which is loosely based on an OpenBSD mechanism for preventing syscalls from non-blessed memory regions. It works similarly to pledge and unveil, you can call it as many times as you like, and when you're finished, you call it with a null pointer and it will stop accepting new regions from then on. If a syscall later happens and doesn't originate from one of the previously blessed regions, the kernel will simply crash the process.
2021-01-27Kernel: Track previous mode when entering/exiting trapsTom
This allows us to determine what the previous mode (user or kernel) was, e.g. in the timer interrupt. This is used e.g. to determine whether a signal handler should be set up. Fixes #5096
2020-12-22Kernel: Don't allow modifying IOPL via sys$ptrace() or sys$sigreturn()Andreas Kling
It was possible to overwrite the entire EFLAGS register since we didn't do any masking in the ptrace and sigreturn syscalls. This made it trivial to gain IO privileges by raising IOPL to 3 and then you could talk to hardware to do all kinds of nasty things. Thanks to @allesctf for finding these issues! :^) Their exploit/write-up: https://github.com/allesctf/writeups/blob/master/2020/hxpctf/wisdom2/writeup.md
2020-12-12Kernel: Change wait blocking to Process-only blockingTom
This prevents zombies created by multi-threaded applications and brings our model back to closer to what other OSs do. This also means that SIGSTOP needs to halt all threads, and SIGCONT needs to resume those threads.
2020-12-12Kernel: Fix some issues related to fixes and block conditionsTom
Fix some problems with join blocks where the joining thread block condition was added twice, which lead to a crash when trying to unblock that condition a second time. Deferred block condition evaluation by File objects were also not properly keeping the File object alive, which lead to some random crashes and corruption problems. Other problems were caused by the fact that the Queued state didn't handle signals/interruptions consistently. To solve these issues we remove this state entirely, along with Thread::wait_on and change the WaitQueue into a BlockCondition instead. Also, deliver signals even if there isn't going to be a context switch to another thread. Fixes #4336 and #4330
2020-11-30Kernel: Move block condition evaluation out of the SchedulerTom
This makes the Scheduler a lot leaner by not having to evaluate block conditions every time it is invoked. Instead evaluate them as the states change, and unblock threads at that point. This also implements some more waitid/waitpid/wait features and behavior. For example, WUNTRACED and WNOWAIT are now supported. And wait will now not return EINTR when SIGCHLD is delivered at the same time.
2020-11-23Kernel: Convert dbg() to dbgln() in Syscall.cppAndreas Kling
2020-09-25Meta+Kernel: Make clang-format-10 cleanBen Wiederhake
2020-08-13Kernel: Request random numbers for syscall stack noise in larger chunks (#3125)Nico Weber
Cuts time needed for `disasm /bin/id` from 2.5s to 1s -- identical to the time it needs when not doing the random adjustment at all. The downside is that it's now very easy to get the random offsets with out-of-bounds reads, so it does make this mitigation less effective.
2020-08-10Kernel: Use Userspace<T> for the exit_thread syscallBrian Gianforcaro
Userspace<void*> is a bit strange here, as it would appear to the user that we intend to de-refrence the pointer in kernel mode. However I think it does a good join of illustrating that we are treating the void* as a value type, instead of a pointer type.
2020-08-04Kernel: Tidy up the syscalls list by reorganizing the enumerator macroAndreas Kling
2020-08-03Kernel: Consolidate timeout logicTom
Allow passing in an optional timeout to Thread::block and move the timeout check out of Thread::Blocker. This way all Blockers implicitly support timeouts and don't need to implement it themselves. Do however allow them to override timeouts (e.g. for sockets).
2020-07-30Kernel: Rename region_from_foo() => find_region_from_foo()Andreas Kling
Let's emphasize that these functions actually go out and find regions.
2020-07-18Kernel: Remove special-casing of sys$gettid() in syscall entryAndreas Kling
We had a fast-path for the gettid syscall that was useful before we started caching the thread ID in LibC. Just get rid of it. :^)
2020-07-04Kernel: Move headers intended for userspace use into Kernel/API/Andreas Kling
2020-07-01Kernel: Turn Thread::current and Process::current into functionsTom
This allows us to query the current thread and process on a per processor basis
2020-07-01Kernel: Implement software context switching and Processor structureTom
Moving certain globals into a new Processor structure for each CPU allows us to eventually run an instance of the scheduler on each CPU.
2020-03-28Kernel: Add 'ptrace' syscallItamar
This commit adds a basic implementation of the ptrace syscall, which allows one process (the tracer) to control another process (the tracee). While a process is being traced, it is stopped whenever a signal is received (other than SIGCONT). The tracer can start tracing another thread with PT_ATTACH, which causes the tracee to stop. From there, the tracer can use PT_CONTINUE to continue the execution of the tracee, or use other request codes (which haven't been implemented yet) to modify the state of the tracee. Additional request codes are PT_SYSCALL, which causes the tracee to continue exection but stop at the next entry or exit from a syscall, and PT_GETREGS which fethces the last saved register set of the tracee (can be used to inspect syscall arguments and return value). A special request code is PT_TRACE_ME, which is issued by the tracee and causes it to stop when it calls execve and wait for the tracer to attach.
2020-03-24Interrupts: Assert if trying to install an handler on syscall vectorLiav A
Installing an interrupt handler on the syscall IDT vector can lead to fatal results, so we must assert if that happens.
2020-03-22Kernel: Run clang-format on filesShannon Booth
Let's rip off the band-aid
2020-03-02Kernel: Run clang-format on various filesLiav A
2020-03-02Kernel: Use klog() instead of kprintf()Liav A
Also, duplicate data in dbg() and klog() calls were removed. In addition, leakage of virtual address to kernel log is prevented. This is done by replacing kprintf() calls to dbg() calls with the leaked data instead. Also, other kprintf() calls were replaced with klog().
2020-02-27Syscall: Use dbg() instead of dbgprintf()Liav A
2020-02-27Kernel: Fix the gettid syscallCristian-Bogdan SIRB
syscall_handler was not actually updating the value in regs->eax, so the gettid() was always returning 85: the value of regs->eax was not actually updated, and it remained the one from Userland (the value of SC_gettid). The syscall_handler was modified to actually get a pointer to RegisterState, so any changes to it will actually be saved. NOTE: This was actually more of a compiler optimization: On the SC_gettid flow, we saved in regs.eax the return value of sys$gettid(), but the compiler discarded it, since it followed a return. On a normal flow, the value of regs.eax was reused in tracer->did_syscall, so the compiler actually updated the value.
2020-02-17Kernel: Replace "current" with Thread::current and Process::currentAndreas Kling
Suggested by Sergey. The currently running Thread and Process are now Thread::current and Process::current respectively. :^)
2020-02-16Kernel: Move all code into the Kernel namespaceAndreas Kling
2020-02-16Kernel: Rename RegisterDump => RegisterStateAndreas Kling
2020-01-25Kernel: Clear ESI and EDI on syscall entryAndreas Kling
Since these are not part of the system call convention, we don't care what userspace had in there. Might as well scrub it before entering the kernel. I would scrub EBP too, but that breaks the comfy kernel-thru-userspace stack traces we currently get. It can be done with some effort.
2020-01-19Kernel: Add fast-path for sys$gettid()Andreas Kling
The userspace locks are very aggressively calling sys$gettid() to find out which thread ID they have. Since syscalls are quite heavy, this can get very expensive for some programs. This patch adds a fast-path for sys$gettid(), which makes it skip all of the usual syscall validation and just return the thread ID right away. This cuts Kernel/Process.cpp compile time by ~18%, from ~29 to ~24 sec.
2020-01-18Meta: Add license header to source filesAndreas Kling
As suggested by Joshua, this commit adds the 2-clause BSD license as a comment block to the top of every source file. For the first pass, I've just added myself for simplicity. I encourage everyone to add themselves as copyright holders of any file they've added or modified in some significant way. If I've added myself in error somewhere, feel free to replace it with the appropriate copyright holder instead. Going forward, all new source files should include a license header.
2020-01-12Kernel: Dispatch pending signals when returning from a syscallAndreas Kling
It was quite easy to put the system into a heavy churn state by doing e.g "cat /dev/zero". It was then basically impossible to kill the "cat" process, even with "kill -9", since signals are only delivered in two conditions: a) The target thread is blocked in the kernel b) The target thread is running in userspace However, since "cat /dev/zero" command spends most of its time actively running in the kernel, not blocked, the signal dispatch code just kept postponing actually handling the signal indefinitely. To fix this, we now check before returning from a syscall if there are any pending unmasked signals, and if so, we take a dramatic pause by blocking the current thread, knowing it will immediately be unblocked by signal dispatch anyway. :^)
2020-01-09Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp}Andreas Kling
These were always so awkwardly named.
2020-01-05Kernel: Start implementing x86 SMAP supportAndreas Kling
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that prevents the kernel from accessing userspace memory. With SMAP enabled, trying to read/write a userspace memory address while in the kernel will now generate a page fault. Since it's sometimes necessary to read/write userspace memory, there are two new instructions that quickly switch the protection on/off: STAC (disables protection) and CLAC (enables protection.) These are exposed in kernel code via the stac() and clac() helpers. There's also a SmapDisabler RAII object that can be used to ensure that you don't forget to re-enable protection before returning to userspace code. THis patch also adds copy_to_user(), copy_from_user() and memset_user() which are the "correct" way of doing things. These functions allow us to briefly disable protection for a specific purpose, and then turn it back on immediately after it's done. Going forward all kernel code should be moved to using these and all uses of SmapDisabler are to be considered FIXME's. Note that we're not realizing the full potential of this feature since I've used SmapDisabler quite liberally in this initial bring-up patch.
2020-01-03Kernel: Use get_fast_random() for the random syscall stack offsetAndreas Kling
2020-01-01Kernel: Add a random offset to kernel stacks upon syscall entryAndreas Kling
When entering the kernel from a syscall, we now insert a small bit of stack padding after the RegisterDump. This makes kernel stacks less deterministic across syscalls and may make some bugs harder to exploit. Inspired by Elena Reshetova's talk on kernel stack exploitation.
2019-12-15Kernel: Make separate kernel entry points for each PIC IRQAndreas Kling
Instead of having a common entry point and looking at the PIC ISR to figure out which IRQ we're servicing, just make a separate entryway for each IRQ that pushes the IRQ number and jumps to a common routine. This fixes a weird issue where incoming network packets would sometimes cause the mouse to stop working. I didn't track it down further than realizing we were sometimes EOI'ing the wrong IRQ.