Age | Commit message (Collapse) | Author |
|
This patch modifies InodeWatcher to switch to a one watcher, multiple
watches architecture. The following changes have been made:
- The watch_file syscall is removed, and in its place the
create_iwatcher, iwatcher_add_watch and iwatcher_remove_watch calls
have been added.
- InodeWatcher now holds multiple WatchDescriptions for each file that
is being watched.
- The InodeWatcher file descriptor can be read from to receive events on
all watched files.
Co-authored-by: Gunnar Beutner <gunnar@beutner.name>
|
|
If the HPET main counter does not support full 64 bits, we should
not expect the upper 32 bit to work. This is a problem when writing
to the upper 32 bit of the comparator value, which requires the
TimerConfiguration::ValueSet bit to be set, but if it's not 64 bit
capable then the bit will not be cleared and leave it in a bad state.
Fixes #6990
|
|
Without this patch we'd send packets through the physical adapter
and they'd incorrectly end up on the network.
|
|
Previously we'd use the adapter's address as the source address
when sending packets. Instead we should use the socket's bound local
address.
|
|
This matches what other operating systems like Linux do:
$ ip route get 0.0.0.0
local 0.0.0.0 dev lo src 127.0.0.1 uid 1000
cache <local>
$ ssh 0.0.0.0
gunnar@0.0.0.0's password:
$ ss -na | grep :22 | grep ESTAB
tcp ESTAB 0 0 127.0.0.1:43118 127.0.0.1:22
tcp ESTAB 0 0 127.0.0.1:22 127.0.0.1:43118
|
|
Previously we'd send a TCP ACK for each TCP packet we received. This
changes NetworkTask so that we send fewer TCP ACKs.
|
|
When we receive a TCP packet with a sequence number that is not what
we expected we have lost one or more packets. We can signal this to
the sender by sending a TCP ACK with the previous ack number so that
they can resend the missing TCP fragments.
|
|
Previously we'd process TCP packets in whatever order we received
them in. In the case where packets arrived out of order we'd end
up passing garbage to the userspace process.
This was most evident for TLS connections:
courage:~ $ git clone https://github.com/SerenityOS/serenity
Cloning into 'serenity'...
remote: Enumerating objects: 178826, done.
remote: Counting objects: 100% (1880/1880), done.
remote: Compressing objects: 100% (907/907), done.
error: RPC failed; curl 56 OpenSSL SSL_read: error:1408F119:SSL
routines:SSL3_GET_RECORD:decryption failed or bad record mac, errno 0
error: 1918 bytes of body are still expected
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
|
|
When the MSS option header is missing the default maximum segment
size is 536 which results in lots of very small TCP packets that
NetworkTask has to handle.
This adds the MSS option header to outbound TCP SYN packets and
sets it to an appropriate value depending on the interface's MTU.
Note that we do not currently do path MTU discovery so this could
cause problems when hops don't fragment packets properly.
|
|
This avoids allocating a KBuffer for each incoming TCP packet.
|
|
This increases the default TCP window size to a more reasonable
value of 64k. This allows TCP peers to send us more packets before
waiting for corresponding ACKs.
|
|
This increases the buffer size for connection-oriented sockets
to 256kB. In combination with the other patches in this series
I was able to receive TCP packets at a rate of about 120Mbps.
|
|
The get_dir_entries syscall failed if the serialized form of all the
directory entries together was too large to fit in its temporary buffer.
Now the kernel uses a fixed size buffer, that is flushed to an output
buffer when it is full. If this flushing operation fails because there
is not enough space available, the syscall will return -EINVAL. That
error code is then used in userspace as a signal to allocate a larger
buffer and retry the syscall.
|
|
Previously we'd try to load ELF images which did not have
an interpreter set with an incorrect load offset of 0, i.e. way
outside of the part of the address space where we'd expect either
the dynamic loader or the user's executable to reside.
This fixes the problem by using get_load_offset for both executables
which have an interpreter set and those which don't. Notably this
allows us to actually successfully execute the Loader.so binary:
courage:~ $ /usr/lib/Loader.so
You have invoked `Loader.so'. This is the helper program for programs
that use shared libraries. Special directives embedded in executables
tell the kernel to load this program.
This helper program loads the shared libraries needed by the program,
prepares the program to run, and runs it. You do not need to invoke
this helper program directly.
courage:~ $
|
|
Previously we'd incorrectly use the default gateway's MAC address.
Instead we must use destination MAC addresses that are derived from
the multicast IPv4 address.
With this patch applied I can query mDNS on a real network.
|
|
Modify the Custody::create(..) API so it has the ability to propagate
OOM back to the caller.
|
|
The Kernel/.gitignore file is a remnant of the prior build system,
where the kernel.map was written directly to to the Kernel folder.
The run.sh was also under Kernel so pcap files and others would get
dropped there when running the system under qemu.
None of these situations are possible now, so lets get rid of it.
|
|
Instead of reading in the entire contents of a directory into a large
buffer, we can iterate block by block. This only requires a small
buffer.
Because directory entries are guaranteed to never span multiple blocks
we do not have to handle any edge cases related to that.
|
|
On some cases, the FADT could be on the end of a page, so if we don't
have two pages being mapped, we could easily read from a non-mapped
virtual address, which will trigger the UB sanitizer.
Also, we need to treat the FADT structure as volatile and const, as it
may change at any time, but we should not touch (write) it anyhow.
|
|
|
|
Ext2 dir entries spanning multiple blocks are not allowed.
If they do occur they are flagged as corrupt by e2fsck for example.
|
|
In VFS::rename, if new_path is equal to '/', then, parent custody is
set to null.
VFS::rename would then use parent custody without checking it first.
Fixed VFS::rename to check both old and new path parent custody
before actually using them.
|
|
write_bytes is called with a count of 0 bytes if a directory is being
deleted, because in that case even the . and .. pseudo directories are
getting removed. In this case write_bytes is now a no-op.
Before write_bytes would fail because it would check to see if there
were any blocks available to write in (even though it wasn't going to
write in them anyway).
This behaviour was uncovered because of a recent change where
directories are correctly reduced in size. Which in this case results in
all the blocks being removed from the inode, whereas previously there
would be some stale blocks around to pass the check.
|
|
e2fsck considers all blocks reachable through any of the pointers in
m_raw_inode.i_block as part of this inode regardless of the value in
m_raw_inode.i_size. When it finds more blocks than the amount that
is indicated by i_size or i_blocks it offers to repair the filesystem
by changing those values. That will actually cause further corruption.
So we must zero all pointers to blocks that are now unused.
|
|
This fixes an OOM when hitting the VM with lots of UDP packets.
fixes #6907
|
|
|
|
|
|
|
|
The current method of emitting performance events requires a bit of
boiler plate at every invocation, as well as having to ignore the
return code which isn't used outside of the perf event syscall. This
change attempts to clean that up by exposing high level API's that
can be used around the code base.
|
|
Ext2 directory contents are stored in a linked list of ext2_dir_entry
structs. There is no sentinel value to determine where the list ends.
Instead the list fills the entirety of the allocated space for the
inode.
Previously the inode was not correctly resized when it became smaller.
This resulted in stale data being interpreted as part of the linked list
of directory entries.
|
|
When reading UDP packets from userspace with recvmsg()/recv() we
would hit a VERIFY() if the supplied buffer is smaller than the
received UDP packet. Instead we should just return truncated data
to the caller.
This can be reproduced with:
$ dd if=/dev/zero bs=1k count=1 | nc -u 192.168.3.190 68
|
|
We use a global setting to determine if Caps Lock should be remapped to
Control because we don't care how keyboard events come in, just that they
should be massaged into different scan codes.
The `proc` filesystem is able to manipulate this global variable using
the `sysctl` utility like so:
```
# sysctl caps_lock_to_ctrl=1
```
|
|
An IP socket can now join a multicast group by using the
IP_ADD_MEMBERSHIP sockopt, which will cause it to start receiving
packets sent to the multicast address, even though this address does
not belong to this host.
|
|
When using `sysctl` you can enable/disable values by writing to the
ProcFS. Some drift must have occured where writing was failing due to
a missing `set_mtime` call. Whenever one `write`'s a file the modified
time (mtime) will be updated so we need to implement this interface in
ProcFS.
|
|
The fact that current_time can "fail" makes its use a bit awkward.
All callers in the Kernel are trusted besides syscalls, so assert
that they never get there, and make sure all current callers perform
validation of the clock_id with TimeManagement::is_valid_clock_id().
I have fuzzed this change locally for a bit to make sure I didn't
miss any obvious regression.
|
|
The variety of checks for Processor::id() == 0 could use some assistance
in the readability department. This change adds a new function to
represent this check, and replaces the comparison everywhere it's used.
|
|
FileDescriptionBlocker::m_should_block was shadowing the parent's
FileBlocker::m_should_block variable, which would cause should_block()
to return the wrong value.
Found by @gunnarbeutner
|
|
This solves a problem where checking whether a thread is an idle
thread may require iterating all processors if it is not the idle
thread of the current processor.
|
|
Previously we would return a 0xdeadc0de frame for every kernel frame
in the real kernel stack when an non super-user issued the request.
This isn't useful, and just produces visual clutter in tools which
attempt to symbolize stacks.
|
|
|
|
|
|
Found while browsing code with CLion.
|
|
Mark final to aid in de-virtualization since they are not currently
derived from.
|
|
|
|
While hacking on `sysctl` an issue in ProcFS was making me unable to
read/write from `/proc/sys/XXX`. Some directories in the ProcFS are not
actually backed by a process and need to return `nullptr` so callbacks
get properly set. We now do an explicit check for the parent to ensure
it's one that is PID-based.
|
|
The triple-fault issue has long been fixed
|
|
The error handling in all these cases was still using the old style
negative values to indicate errors. We have a nicer solution for this
now with KResultOr<T>. This change switches the interface and then all
implementers to use the new style.
|
|
With the recent fixes to how close() gets called this is not
necessary anymore.
|
|
|
|
Our current implementation does not work in the special case in which
both shift keys are pressed, and then only one of the keys is released,
as this would result in writing lower case letters, instead of the
expected upper case letters.
This commit fixes that by keeping track of the amount of shift keys
that are pressed (instead of if any are at all), and only switching to
the unshifted keymap once all of them are released.
|