The SHFS kernel module, like the other sources in this package, is split across a few files in two directories, shfs and shfsmount. An overview of the files follows.
Source tree overview:
|- TODO                  future plans :-)
|- docs
|   |- install.html      installation instructions
|   |- internals.html    this file
|   +- manpages
|       +- shfsmount.8   nice man page
|- misc                  utility and data used while debugging
|
|- shfs                  SHFS kernel module sources
|   |- dcache.c          directory cache
|   |- dir.c             directory handling, calls dir cache for reading dirs
|   |- fcache.c          rw file cache
|   |- file.c            file handling, calls rw file cache for operations
|   |- inode.c           super block stuff, (un)registering fs
|   |- proc.c            reading/writing, low-level stuff, misc. functions
|   |- proto.c           shell commands for read, write, mkdir, ls, etc.
|   |- shfs.h
|   |- shfs_dcache.h
|   |- shfs_debug.h
|   |- shfs_fcache.h
|   |- shfs_proc.h
|   |- shfs_proto.h
|   +- symlink.c         symlink support
|
+- shfsmount
    +- shfsmount.c       mount utility, invokes ssh/rsh and mounts shfs filesystem
Here you can find many interesting features and tricks used in the SHFS kernel module, and get an overview of how it was written.
Since writing a new filesystem module from scratch is not much fun, shfs is partially based on Florin Malita's nice ftpfs, with some bugs fixed (locking, memory leaks, date handling).
These pieces of code were examined, partially rewritten and used:
Sending a shell command to the remote host on every request from the kernel VFS layer is not a good idea, because of the high load it generates on both sides of the channel. A much better way is to use caches for some operations, such as reading directories and reading and writing files.
This gives a great performance improvement, since calling dd (= storing data on the remote side) for each page generates quite a high system load. With the read-write cache, dd is called only on every 16th request. The cache size can be adjusted in the file shfs_fcache.h. You can switch this cache off with the "nocache" option when mounting the filesystem.
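The effect of the write cache can be illustrated with a minimal sketch. It is not the code from fcache.c: the structure, the function names, the 4096-byte page size and the flush counter are all assumptions made for illustration, and the remote dd call is replaced by a counter increment.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES   4096   /* assumed page size */
#define CACHE_PAGES  16     /* flush after this many page writes */

/* Hypothetical write-behind buffer; not the structure used in fcache.c. */
struct wcache {
    char   data[PAGE_BYTES * CACHE_PAGES];
    size_t used;      /* bytes currently buffered */
    int    flushes;   /* how many times the remote command would have run */
};

/* Stand-in for the expensive remote dd invocation. */
static void wcache_flush(struct wcache *c)
{
    if (c->used == 0)
        return;
    c->flushes++;     /* a real implementation would send c->data here */
    c->used = 0;
}

/* Buffer one sequential page-sized write; flush only when the buffer is full. */
static void wcache_write_page(struct wcache *c, const char *page)
{
    memcpy(c->data + c->used, page, PAGE_BYTES);
    c->used += PAGE_BYTES;
    if (c->used == sizeof(c->data))
        wcache_flush(c);
}
```

In this sketch, fifteen consecutive page writes only copy memory, and the sixteenth triggers the single flush. A real cache would also have to flush on close and handle non-sequential writes, which are left out here.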
Lines are read all at once instead of char-by-char. This speeds up directory lookup.
Symlinks are stored just like other files/dirs, but with a special two-part name, link-name\0link-target\0. This makes symlinks easy to implement with minimal effort.
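The two-part name can be sketched as follows. The helper names symlink_pack and symlink_target are hypothetical and do not come from symlink.c; the sketch only shows the link-name\0link-target\0 layout described above.

```c
#include <stddef.h>
#include <string.h>

/* Pack "name\0target\0" into buf.
   Returns the total number of bytes used, or 0 if it does not fit. */
static size_t symlink_pack(char *buf, size_t cap,
                           const char *name, const char *target)
{
    size_t nlen = strlen(name) + 1;     /* include the terminator */
    size_t tlen = strlen(target) + 1;

    if (nlen + tlen > cap)
        return 0;
    memcpy(buf, name, nlen);
    memcpy(buf + nlen, target, tlen);
    return nlen + tlen;
}

/* The target starts right after the first terminator. */
static const char *symlink_target(const char *packed)
{
    return packed + strlen(packed) + 1;
}
```

Because the first part is an ordinary NUL-terminated string, code that only needs the link name can treat the packed entry like any other name and never look past the first terminator.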
Since shfs is a SHell FileSystem, shell commands are used for all operations. The following table lists the commands used by the generic file operations and by some platform-dependent optimisations.
generic | uname, ls, dd, printf, ln, chmod, chown, chgrp, cut |
Linux | generic + head |
When connecting to the remote site, the uname command is invoked and simple system-recognition code is executed to determine which system the remote host is running. If the system is unknown, generic shell file operations are used.
Generic file operations should be safe at least on Linux, Solaris 2.x and IRIX 5.x, and probably on other UN*X systems too. New OS types can easily be added (see proto.c).
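The recognition step with its generic fallback might look like the sketch below. The enum and the function detect_os are assumptions made for illustration, not the code in proto.c, and only the Linux case from the table above is handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical set of recognised systems. */
enum remote_os { OS_GENERIC, OS_LINUX };

/* Map the first token of the uname output to a known system.
   Anything unrecognised falls back to the generic operations. */
static enum remote_os detect_os(const char *uname_out)
{
    /* Length of the first token, up to whitespace or end of string. */
    size_t len = strcspn(uname_out, " \t\r\n");

    if (len == 5 && strncmp(uname_out, "Linux", 5) == 0)
        return OS_LINUX;
    return OS_GENERIC;
}
```

Supporting another system in such a scheme means adding one enum value and one comparison, while unknown systems keep working through the generic path.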
Note: The generic write operation is very slow. On non-Linux systems it uses the following command:
dd of='file' bs=1 seek=X count=Y conv=notrunc
If Y is a large number (say 10^6), dd calls the write() system call 10^6 times, writing 1 byte each time. This generates high system load and slows down the transfer significantly. All suggestions on this topic are welcome (see the Linux write optimisation in proto.c first).
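How the offset and length of a write map onto the dd arguments can be sketched as below. The function generic_write_cmd is hypothetical and is not taken from proto.c; it also assumes the file name contains no single quote, since no escaping is done.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the generic write command for `count` bytes at byte offset `offset`.
   With bs=1, seek and count are both expressed in bytes.
   Returns the snprintf result: the length needed, excluding the terminator.
   Assumption: `file` contains no single quote (no escaping is done). */
static int generic_write_cmd(char *buf, size_t cap, const char *file,
                             unsigned long offset, unsigned long count)
{
    return snprintf(buf, cap,
                    "dd of='%s' bs=1 seek=%lu count=%lu conv=notrunc",
                    file, offset, count);
}
```

With bs=1 every block is a single byte, so the count argument is also the number of one-byte blocks dd has to copy, which is where the cost of the generic write comes from.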
SHFS module (client side) has been successfully tested on
Note: The code is believed to be SMP safe, but it has NOT been tested on an SMP machine.
Server side was tested on: