path: root/Kernel/Syscall.cpp
Age | Commit message | Author
2020-01-25 | Kernel: Clear ESI and EDI on syscall entry | Andreas Kling
Since these are not part of the system call convention, we don't care what userspace had in there. Might as well scrub it before entering the kernel. I would scrub EBP too, but that breaks the comfy kernel-thru-userspace stack traces we currently get. It can be done with some effort.
2020-01-19 | Kernel: Add fast-path for sys$gettid() | Andreas Kling
The userspace locks are very aggressively calling sys$gettid() to find out which thread ID they have. Since syscalls are quite heavy, this can get very expensive for some programs. This patch adds a fast-path for sys$gettid(), which makes it skip all of the usual syscall validation and just return the thread ID right away. This cuts Kernel/Process.cpp compile time by ~18%, from ~29 to ~24 sec.
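A minimal sketch of the fast-path idea, assuming a heavily simplified dispatcher. The names `Thread`, `handle_syscall` and `validate_syscall_preconditions` are hypothetical and not taken from Kernel/Syscall.cpp; the only point illustrated is that the thread ID is returned before any of the usual validation runs.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical, simplified model of a syscall dispatcher.
enum Function : uint32_t { SC_gettid = 0, SC_getpid = 1, SC_count };

struct Thread {
    int tid = 0;
    int pid = 0;
    unsigned validations_run = 0; // counts how often the slow path ran
};

// Stand-in for the usual per-syscall checks (stack pointer, caller region, ...).
inline void validate_syscall_preconditions(Thread& thread) { ++thread.validations_run; }

inline int handle_syscall(Thread& thread, uint32_t function)
{
    // Fast path: the thread ID is already known, so skip validation entirely.
    if (function == SC_gettid)
        return thread.tid;

    validate_syscall_preconditions(thread);
    switch (function) {
    case SC_getpid:
        return thread.pid;
    default:
        return -ENOSYS;
    }
}
```

In this model, a `SC_gettid` request never touches `validations_run`, while every other request does.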
2020-01-18 | Meta: Add license header to source files | Andreas Kling
As suggested by Joshua, this commit adds the 2-clause BSD license as a comment block to the top of every source file. For the first pass, I've just added myself for simplicity. I encourage everyone to add themselves as copyright holders of any file they've added or modified in some significant way. If I've added myself in error somewhere, feel free to replace it with the appropriate copyright holder instead. Going forward, all new source files should include a license header.
2020-01-12 | Kernel: Dispatch pending signals when returning from a syscall | Andreas Kling
It was quite easy to put the system into a heavy churn state by doing e.g. "cat /dev/zero". It was then basically impossible to kill the "cat" process, even with "kill -9", since signals are only delivered in two conditions:
a) The target thread is blocked in the kernel
b) The target thread is running in userspace
However, since the "cat /dev/zero" command spends most of its time actively running in the kernel, not blocked, the signal dispatch code just kept postponing actually handling the signal indefinitely. To fix this, we now check before returning from a syscall if there are any pending unmasked signals, and if so, we take a dramatic pause by blocking the current thread, knowing it will immediately be unblocked by signal dispatch anyway. :^)
2020-01-09 | Kernel: Rename {ss,esp}_if_crossRing to userspace_{ss,esp} | Andreas Kling
These were always so awkwardly named.
2020-01-05 | Kernel: Start implementing x86 SMAP support | Andreas Kling
Supervisor Mode Access Prevention (SMAP) is an x86 CPU feature that prevents the kernel from accessing userspace memory. With SMAP enabled, trying to read/write a userspace memory address while in the kernel will now generate a page fault. Since it's sometimes necessary to read/write userspace memory, there are two new instructions that quickly switch the protection on/off: STAC (disables protection) and CLAC (enables protection). These are exposed in kernel code via the stac() and clac() helpers. There's also a SmapDisabler RAII object that can be used to ensure that you don't forget to re-enable protection before returning to userspace code. This patch also adds copy_to_user(), copy_from_user() and memset_user() which are the "correct" way of doing things. These functions allow us to briefly disable protection for a specific purpose, and then turn it back on immediately after it's done. Going forward all kernel code should be moved to using these and all uses of SmapDisabler are to be considered FIXME's. Note that we're not realizing the full potential of this feature since I've used SmapDisabler quite liberally in this initial bring-up patch.
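The relationship between stac()/clac(), SmapDisabler and the copy helpers can be sketched as a userspace simulation. This is only an illustration: a plain boolean stands in for the CPU state that the real STAC and CLAC instructions toggle, and the helper signatures are assumptions rather than the kernel's actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simulation only: a real kernel executes the STAC/CLAC instructions here.
// This flag stands in for the CPU's "userspace access allowed" state.
static bool g_user_access_allowed = false;
inline void stac() { g_user_access_allowed = true; }  // protection off
inline void clac() { g_user_access_allowed = false; } // protection on

// RAII guard: protection is re-enabled when the object goes out of scope.
class SmapDisabler {
public:
    SmapDisabler() { stac(); }
    ~SmapDisabler() { clac(); }
    SmapDisabler(const SmapDisabler&) = delete;
    SmapDisabler& operator=(const SmapDisabler&) = delete;
};

// Narrow helpers: protection is off only for the duration of one operation.
inline void copy_to_user(void* user_dest, const void* kernel_src, size_t n)
{
    stac();
    std::memcpy(user_dest, kernel_src, n);
    clac();
}

inline void copy_from_user(void* kernel_dest, const void* user_src, size_t n)
{
    stac();
    std::memcpy(kernel_dest, user_src, n);
    clac();
}

inline void memset_user(void* user_dest, int value, size_t n)
{
    stac();
    std::memset(user_dest, value, n);
    clac();
}
```

The helpers keep the unprotected window as small as a single copy, whereas a SmapDisabler leaves protection off for its whole enclosing scope, which is why the commit treats its uses as FIXMEs.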
2020-01-03 | Kernel: Use get_fast_random() for the random syscall stack offset | Andreas Kling
2020-01-01 | Kernel: Add a random offset to kernel stacks upon syscall entry | Andreas Kling
When entering the kernel from a syscall, we now insert a small bit of stack padding after the RegisterDump. This makes kernel stacks less deterministic across syscalls and may make some bugs harder to exploit. Inspired by Elena Reshetova's talk on kernel stack exploitation.
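One way to picture the padding is as a small, random adjustment of the stack pointer. The sketch below is an assumption-laden model, not the kernel's code: `apply_random_stack_padding` is a hypothetical helper, and both the 0..240 byte range and the 16-byte granularity are choices made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: given the stack pointer right after the register dump
// and one random byte, return a lowered stack pointer (stacks grow downward).
inline uintptr_t apply_random_stack_padding(uintptr_t stack_pointer, uint8_t random_byte)
{
    // Keep the padding small and a multiple of 16 so alignment is preserved.
    // Both properties are assumptions made for this sketch.
    uintptr_t padding = random_byte & ~uintptr_t(0xf);
    return stack_pointer - padding;
}
```

Because the padding differs from one syscall to the next, the addresses of kernel stack frames are no longer identical across calls.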
2019-12-15 | Kernel: Make separate kernel entry points for each PIC IRQ | Andreas Kling
Instead of having a common entry point and looking at the PIC ISR to figure out which IRQ we're servicing, just make a separate entryway for each IRQ that pushes the IRQ number and jumps to a common routine. This fixes a weird issue where incoming network packets would sometimes cause the mouse to stop working. I didn't track it down further than realizing we were sometimes EOI'ing the wrong IRQ.
2019-12-14 | Kernel: Tidy up kernel entry points a little bit | Andreas Kling
Now that we can see the kernel entry points all the time in profiles, let's tweak the names a little bit and switch to named exceptions.
2019-11-29 | Kernel: Disallow syscalls from writeable memory | Andreas Kling
Processes will now crash with SIGSEGV if they attempt making a syscall from PROT_WRITE memory. This neat idea comes from OpenBSD. :^)
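The check can be sketched as a lookup of the region containing the calling instruction pointer. The `Region` struct and both function names below are hypothetical; the real kernel crashes the process with SIGSEGV where this sketch merely returns `false`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a process's memory regions.
struct Region {
    uintptr_t base;
    size_t size;
    bool writable;
};

inline const Region* find_region(const std::vector<Region>& regions, uintptr_t address)
{
    for (const auto& region : regions) {
        if (address >= region.base && address - region.base < region.size)
            return &region;
    }
    return nullptr;
}

// Returns true if a syscall issued from instruction pointer `eip` is acceptable,
// i.e. the code lives in a mapped region that is not writable.
inline bool syscall_origin_is_acceptable(const std::vector<Region>& regions, uintptr_t eip)
{
    const Region* region = find_region(regions, eip);
    return region && !region->writable;
}
```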
2019-11-26 | Kernel: Make syscall counters and page fault counters per-thread | Andreas Kling
Now that we show individual threads in SystemMonitor and "top", it's also very nice to have individual counters for the threads. :^)
2019-11-18 | Kernel: When userspace calls a removed syscall, fail with ENOSYS | Andreas Kling
This is a bit gentler than jumping to 0x0, which always crashes the whole process. Also log a debug message about what happened, and let the user know that it's probably time to rebuild the program.
2019-11-17 | Kernel+LibC: Remove the isatty() syscall | Andreas Kling
This can be implemented entirely in userspace by calling tcgetattr(). To avoid screwing up the syscall indexes, this patch also adds a mechanism for removing a syscall without shifting the index of other syscalls. Note that ports will still have to be rebuilt after this change, as their LibC code will try to make the isatty() syscall on startup.
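A userspace isatty() built on tcgetattr() might look like the sketch below. It uses only the POSIX `tcgetattr()` call; the function name `isatty_via_tcgetattr` is made up for the example and is not LibC's actual implementation.

```cpp
#include <cassert>
#include <termios.h>
#include <unistd.h> // pipe() and isatty(), used only in the usage check

// A descriptor refers to a terminal if and only if tcgetattr() succeeds on it.
inline int isatty_via_tcgetattr(int fd)
{
    struct termios attributes;
    return tcgetattr(fd, &attributes) == 0;
}
```

For a pipe end or an invalid descriptor, tcgetattr() fails, so the function returns 0, matching what the C library's isatty() reports for the same descriptor.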
2019-11-17 | Kernel: Implement some basic stack pointer validation | Andreas Kling
VM regions can now be marked as stack regions, which is then validated on syscall, and on page fault. If a thread is caught with its stack pointer pointing into anything that's *not* a Region with its stack bit set, we'll crash the whole process with SIGSTKFLT. Userspace must now allocate custom stacks by using mmap() with the new MAP_STACK flag. This mechanism was first introduced in OpenBSD, and now we have it too, yay! :^)
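From the userspace side, allocating a custom stack then means passing MAP_STACK to mmap(). The sketch below assumes a Linux- or OpenBSD-like host where `MAP_ANONYMOUS` and `MAP_STACK` are both defined; `allocate_custom_stack` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: map an anonymous, private region flagged as a stack.
// Returns nullptr on failure.
inline void* allocate_custom_stack(size_t size)
{
    void* base = mmap(nullptr, size, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    return base == MAP_FAILED ? nullptr : base;
}
```

A thread whose stack pointer ends up in memory obtained without MAP_STACK would, per the commit above, get its process killed at the next syscall or page fault.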
2019-11-14 | Kernel: Unwind kernel stacks before dying | Sergey Bugaev
While executing in the kernel, a thread can acquire various resources that need cleanup, such as locks and references to RefCounted objects. This cleanup normally happens on the exit path, such as in destructors for various RAII guards. But we weren't calling those exit paths when killing threads that have been executing in the kernel, such as threads blocked on reading or sleeping, thus causing leaks. This commit changes how killing threads works. Now, instead of killing a thread directly, one is supposed to call thread->set_should_die(), which will unblock it and make it unwind the stack if it is blocked in the kernel. Then, just before returning to the userspace, the thread will automatically die.
2019-11-13 | LibPthread: Start working on a POSIX threading library | Andreas Kling
This patch adds pthread_create() and pthread_exit(), which currently simply wrap our existing create_thread() and exit_thread() syscalls. LibThread is also ported to using LibPthread.
2019-11-09 | Kernel: Clear the x86 DF flag when entering the kernel | Andreas Kling
The SysV ABI says that the DF flag should be clear on function entry. That means we have to clear it when jumping into the kernel from some random userspace context.
2019-11-09 | Kernel: Use a lookup table for syscalls | Andreas Kling
Instead of the big ugly switch statement, build a lookup table using the syscall enumeration macro. This greatly simplifies the syscall implementation. :^)
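The table-from-macro technique can be sketched as follows. Everything here is illustrative: the macro name, the three example syscalls and the argument-less handler signature are assumptions, not the kernel's real list or calling convention.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>

// Hypothetical enumeration macro: one line per syscall, expanded several times.
#define ENUMERATE_SYSCALLS(S) \
    S(getpid)                 \
    S(gettid)                 \
    S(yield)

// First expansion: the syscall numbers.
enum Function : uint32_t {
#define SYSCALL_ENUM(name) SC_##name,
    ENUMERATE_SYSCALLS(SYSCALL_ENUM)
#undef SYSCALL_ENUM
    SC_count
};

struct Process {
    int pid = 0;
    int tid = 0;
    int sys_getpid() { return pid; }
    int sys_gettid() { return tid; }
    int sys_yield() { return 0; }
};

using Handler = int (Process::*)();

// Second expansion: the lookup table, in the same order as the enum.
static const Handler s_syscall_table[] = {
#define SYSCALL_ENTRY(name) &Process::sys_##name,
    ENUMERATE_SYSCALLS(SYSCALL_ENTRY)
#undef SYSCALL_ENTRY
};

static_assert(sizeof(s_syscall_table) / sizeof(s_syscall_table[0]) == SC_count,
    "table and enum must stay in sync");

inline int dispatch(Process& process, uint32_t function)
{
    if (function >= SC_count)
        return -ENOSYS; // unknown syscall number
    return (process.*s_syscall_table[function])();
}
```

Adding a syscall then means adding one line to the macro; the enum and the table both pick it up and cannot drift apart.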
2019-11-06 | Kernel: Simplify kernel entry points slightly | Andreas Kling
It was silly to push the address of the stack pointer when we can also just change the callee argument to be a value type.
2019-11-06 | Kernel: Don't build with -mregparm=3 | Andreas Kling
It was really confusing to have different calling conventions in kernel and userspace. Also this has prevented us from linking with libgcc.
2019-11-02 | Kernel+LibC: Implement clock_gettime() and clock_nanosleep() | Andreas Kling
Only the CLOCK_MONOTONIC clock is supported at the moment, and it only has millisecond precision. :^)
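From a caller's point of view, reading the monotonic clock at millisecond granularity could look like this sketch. It uses the standard POSIX `clock_gettime()` call; `monotonic_milliseconds` is a hypothetical convenience wrapper.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// Hypothetical wrapper: monotonic time in whole milliseconds, or -1 on error.
inline int64_t monotonic_milliseconds()
{
    struct timespec ts;
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return int64_t(ts.tv_sec) * 1000 + ts.tv_nsec / 1000000;
}
```

Two successive readings never go backwards, which is the property that makes this clock suitable for measuring intervals.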
2019-10-13 | Kernel: Add a Linux-style getrandom syscall | Calvin Buckley
The way it gets the entropy and blasts it to the buffer is pretty ugly IMHO, but it does work for now. (It should be replaced with something that doesn't truncate a u32.) It implements an (unused for now) flags argument, like Linux's version and unlike OpenBSD's. This is in case we want to distinguish between entropy sources or any other reason and have to implement a new syscall later. Of course, learn from Linux's struggles with entropy sourcing too.
2019-10-07 | Kernel: Add exception_code to RegisterDump. | Drew Stratford
Added the exception_code field to RegisterDump, removing the need for RegisterDumpWithExceptionCode. To accomplish this, I had to push a dummy exception code during some interrupt entries to properly pad out the RegisterDump. Note that we also needed to change some code in sys$sigreturn to deal with the new RegisterDump layout.
2019-09-13 | Kernel: Implement fchdir syscall | Mauri de Souza Nunes
The fchdir() function is equivalent to chdir() except that the directory that is to be the new current working directory is specified by a file descriptor.
2019-09-05 | Kernel: Use user stack for signal handlers. | Drew Stratford
This commit drastically changes how signals are handled. In the case that an unblocked thread is signaled it works much in the same way as previously. However, when a blocking syscall is interrupted, we set up the signal trampoline on the user stack, complete the blocking syscall, return down the kernel stack and then jump to the handler. This means that from the kernel stack's perspective, we only ever get one system call deep. The signal trampoline has also been changed in order to properly store the return value from system calls. This is necessary due to the new way we exit from signaled system calls.
2019-08-25 | Kernel: Add realpath syscall | Rok Povsic
2019-08-17 | Kernel+LibC+Userland: Support mounting other kinds of filesystems | Sergey Bugaev
2019-08-17 | Kernel: Added unmount ability to VFS | Jesse Buhagiar
It is now possible to unmount file systems from the VFS via `umount`. It works by looking up the `fsid` of the filesystem from the `Inode`'s metadata, so I'm not sure how fragile it is. It seems to work for now though as something to get us going.
2019-08-15 | Kernel+LibC: Add get_process_name() syscall | Andreas Kling
It does exactly what it sounds like: int get_process_name(char* buffer, int buffer_size);
2019-08-12 | Kernel+LibC+crash: Add mprotect() syscall | Andreas Kling
This patch adds the mprotect() syscall to allow changing the protection flags for memory regions. We don't do any region splitting/merging yet, so this only works on whole mmap() regions. Added a "crash -r" flag to verify that we crash when you attempt to write to read-only memory. :^)
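A whole-region use of mprotect(), the only case the commit supports, can be sketched with the POSIX calls below. `make_read_only_region` is a hypothetical helper, and `MAP_ANONYMOUS` is assumed to be available on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: map one anonymous region, store a byte in it, then make
// the entire region read-only. Returns nullptr on failure.
inline void* make_read_only_region(size_t size, char first_byte)
{
    void* base = mmap(nullptr, size, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return nullptr;
    static_cast<char*>(base)[0] = first_byte;
    // Change protection on the whole mapping, not a sub-range.
    if (mprotect(base, size, PROT_READ) != 0) {
        munmap(base, size);
        return nullptr;
    }
    return base;
}
```

Reading from the region still works after the call; a write would fault, which is the behavior the "crash -r" flag mentioned above is meant to verify.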
2019-08-05 | Kernel+LibC: Support passing O_CLOEXEC to pipe() | Sergey Bugaev
In the userspace, this mimics the Linux pipe2() syscall; in the kernel, the Process::sys$pipe() now always accepts a flags argument, and the no-argument pipe() syscall is now a userspace wrapper over pipe2().
2019-08-02 | Kernel: mount system call (#396) | Jesse
It is now possible to mount ext2 `DiskDevice` devices under Serenity on any folder in the root filesystem. Currently any user can do this with any permissions. There's a fair amount of assumptions made here too, that might not be too good, but can be worked on in the future. This is a good start to allow more dynamic operation under the OS itself. It is also currently impossible to unmount and such, and devices will fail to mount in Linux as the FS 'needs to be cleaned'. I'll work on getting `umount` done ASAP to rectify this (as well as working on less assumption-making in the mount syscall. We don't want to just be able to mount DiskDevices!). This could probably be fixed with some `-t` flag or something similar.
2019-07-29 | Kernel+ProcessManager: Let processes have an icon and show it in the table. | Andreas Kling
Processes can now have an icon assigned, which is essentially a 16x16 RGBA32 bitmap exposed as a shared buffer ID. You set the icon ID by calling set_process_icon(int) and the icon ID will be exposed through /proc/all. To make this work, I added a mechanism for making shared buffers globally accessible. For safety reasons, each app seals the icon buffer before making it global. Right now the first call to GWindow::set_icon() is what determines the process icon. We'll probably change this in the future. :^)
2019-07-22 | Kernel: Add a mechanism for listening for changes to an inode. | Andreas Kling
The syscall is quite simple: int watch_file(const char* path, int path_length); It returns a file descriptor referring to an "InodeWatcher" object in the kernel. It becomes readable whenever something changes about the inode. Currently this is implemented by hooking the "metadata dirty bit" in Inode, which isn't perfect, but it's a start. :^)
2019-07-21 | Kernel+LibC: Add a dbgputstr() syscall for sending strings to debug output. | Andreas Kling
This is very handy for the DebugLogStream implementation, among others. :^)
2019-07-21 | Kernel+LibC: Add a dbgputch() syscall and use it for userspace dbgprintf(). | Andreas Kling
The "stddbg" stream was a cute idea but we never ended up using it in practice, so let's simplify this and implement userspace dbgprintf() on top of a simple dbgputch() syscall instead. This makes debugging LibC startup a little bit easier. :^)
2019-07-21 | Kernel+LibC: Add a dump_backtrace() syscall. | Andreas Kling
This is very simple but already very useful. Now you're able to call dump_backtrace() from anywhere in userspace to get a nice symbolicated backtrace in the debugger output. :^)
2019-07-19 | Kernel: Only allow superuser to halt() the system (#342) | Jesse
Following the discussion in #334, shutdown must also have root-only run permissions.
2019-07-19 | Kernel+Userland: Add reboot syscall (#334) | Jesse
Rolling with the theme of adding a dialog to shut down the machine, it is probably nice to have a way to reboot the machine without performing a full system powerdown. A reboot program has been added to `/bin/` as well as a corresponding `syscall` (SC_reboot). This syscall works by attempting to pulse the 8042 keyboard controller. Note that this is NOT supported on new machines, and should only be a fallback until we have proper ACPI support. The implementation causes a triple fault in QEMU, which then restarts the system. The filesystems are locked and synchronized before this occurs, so there shouldn't be any corruption etc.
2019-07-18 | SharedBuffer: Split the creation and share steps | Robin Burchell
This allows us to seal a buffer *before* anyone else has access to it (well, ok, the creating process still does, but you can't win them all). It also means that a SharedBuffer can be shared with multiple clients: all you need is to have access to it to share it on again.
2019-07-08 | Kernel: Have the open() syscall take an explicit path length parameter. | Andreas Kling
Instead of computing the path length inside the syscall handler, let the caller do that work. This allows us to implement two new variants of open() and creat(), called open_with_path_length() and creat_with_path_length(). These are suitable for use with e.g. StringView.
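The benefit for string views can be sketched on a POSIX host. This is an emulation under stated assumptions: the parameter list of `open_with_path_length` is a guess, `open_view` is a hypothetical convenience overload, and the temporary copy exists only because the host's open() needs a NUL-terminated string (a kernel that receives the length directly would not need it). `std::string_view` stands in for StringView and requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <string_view>
#include <unistd.h>

// Host-side emulation with an assumed signature: the caller supplies the length,
// so the path does not have to be NUL-terminated.
inline int open_with_path_length(const char* path, size_t path_length, int options, mode_t mode)
{
    std::string terminated(path, path_length); // needed only for the host's open()
    return open(terminated.c_str(), options, mode);
}

// Hypothetical convenience wrapper for view types.
inline int open_view(std::string_view path, int options)
{
    return open_with_path_length(path.data(), path.size(), options, 0);
}
```

A view into the middle of a larger buffer can then be opened directly, without the caller first building a terminated copy.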
2019-07-03 | AK: Rename the common integer typedefs to make it obvious what they are. | Andreas Kling
These types can be picked up by including <AK/Types.h>:
* u8, u16, u32, u64 (unsigned)
* i8, i16, i32, i64 (signed)
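Outside the project, equivalent aliases can be sketched on top of the standard fixed-width types. Building them from `<cstdint>` is an assumption made for this example; it is not a claim about how AK/Types.h defines them.

```cpp
#include <cassert>
#include <cstdint>

// Short aliases in the style described above, built on <cstdint> for this sketch.
typedef uint8_t u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;
typedef int8_t i8;
typedef int16_t i16;
typedef int32_t i32;
typedef int64_t i64;

static_assert(sizeof(u8) == 1 && sizeof(i8) == 1, "8-bit types");
static_assert(sizeof(u16) == 2 && sizeof(i16) == 2, "16-bit types");
static_assert(sizeof(u32) == 4 && sizeof(i32) == 4, "32-bit types");
static_assert(sizeof(u64) == 8 && sizeof(i64) == 8, "64-bit types");
```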
2019-06-16 | Kernel/Userland: Add a halt syscall, and a shutdown binary to invoke it | Robin Burchell
2019-06-07 | Kernel: Move i386.{cpp,h} => Arch/i386/CPU.{cpp,h} | Andreas Kling
There's a ton of work that would need to be done before we could spin up on another architecture, but let's at least try to separate things out a bit.
2019-06-07 | Kernel: Run clang-format on everything. | Andreas Kling
2019-06-01 | Kernel: Add fchown() syscall. | Andreas Kling
2019-05-30 | Kernel/LibC: Implement sched_* functionality to set/get process priority | Robin Burchell
Right now, we allow anything running as a user to raise or lower the priority of any of that user's other processes. This feels simple enough to me. Linux disallows raising, but that's annoying in practice.
2019-05-23 | Kernel: Return ENOSYS if an invalid syscall number is requested. | Andreas Kling
2019-05-20 | Kernel: Add getpeername() syscall, and fix getsockname() behavior. | Andreas Kling
We were copying the raw IPv4 addresses into the wrong part of sockaddr_in, and we didn't set sa_family or sa_port.
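For reference, a correctly populated IPv4 socket address can be sketched as below. In `sockaddr_in` the relevant fields are named `sin_family`, `sin_port` and `sin_addr`; the helper `make_ipv4_sockaddr` is hypothetical and only illustrates which fields a getsockname()-style result needs to fill in.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical helper: build a sockaddr_in from host-order address and port.
inline sockaddr_in make_ipv4_sockaddr(uint32_t address_host_order, uint16_t port_host_order)
{
    sockaddr_in sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;                        // address family
    sa.sin_port = htons(port_host_order);           // port, network byte order
    sa.sin_addr.s_addr = htonl(address_host_order); // IPv4 address, network byte order
    return sa;
}
```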