Filesystems

Ben Clifford

benc@hawaga.org.uk

www.hawaga.org.uk/ben/tech/raspberry-pint-filesystems

Files

Storage devices

microsd

  1. Read 512-byte block from storage device
  2. Modify bytes
  3. Write 512-byte block to storage device
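The three steps above can be sketched in Python. This is only an illustration: the function name write_bytes is made up here, and an ordinary image file stands in for the real storage device, which would normally need root access to open.

```python
BLOCK_SIZE = 512  # block size from the steps above


def write_bytes(device_path, offset, data):
    """Change a few bytes by rewriting every whole block they touch."""
    first = offset // BLOCK_SIZE
    last = (offset + len(data) - 1) // BLOCK_SIZE
    with open(device_path, "r+b") as dev:
        for block_no in range(first, last + 1):
            start = block_no * BLOCK_SIZE
            # 1. Read the whole block from the storage device.
            dev.seek(start)
            block = bytearray(dev.read(BLOCK_SIZE).ljust(BLOCK_SIZE, b"\0"))
            # 2. Modify only the bytes that fall inside this block.
            lo = max(offset, start)
            hi = min(offset + len(data), start + BLOCK_SIZE)
            block[lo - start:hi - start] = data[lo - offset:hi - offset]
            # 3. Write the whole block back to the storage device.
            dev.seek(start)
            dev.write(block)
```

Changing four bytes that straddle a block boundary means reading and writing two full blocks, i.e. 1024 bytes of traffic for a 4-byte change.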

filesystems

$ df -hT / /boot
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/root      ext4   29G  5.7G   23G  21% /
/dev/mmcblk0p1 vfat  253M   40M  214M  16% /boot
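The Size, Used and Avail columns can also be read from a program. A minimal sketch using Python's shutil.disk_usage; the helper names human and usage_line are made up for this example, and the rounding only roughly matches what df -h prints.

```python
import shutil


def human(n):
    """Format a byte count roughly the way `df -h` does (powers of 1024)."""
    for unit in ("", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.1f}{unit}" if unit else str(n)
        n /= 1024


def usage_line(path):
    """Summarise the filesystem that holds `path`."""
    total, used, free = shutil.disk_usage(path)
    return f"{path}: size {human(total)}, used {human(used)}, avail {human(free)}"
```

Calling usage_line("/") and usage_line("/boot") on the Pi should report on the same two filesystems as the df output above.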

mountpoints

$ df -hT / /boot
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/root      ext4   29G  5.7G   23G  21% /
/dev/mmcblk0p1 vfat  253M   40M  214M  16% /boot

$ ls /etc/passwd
$ ls /boot/config.txt
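To find out which mount point a given file lives under, one approach is to walk up the directory tree until a mount point is reached. A small sketch; the function name mount_point_of is made up here.

```python
import os


def mount_point_of(path):
    """Walk up from path until reaching a directory that is a mount point."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path
```

With the layout shown above, mount_point_of("/boot/config.txt") would be expected to give /boot, and mount_point_of("/etc/passwd") to give /.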

(many) filesystems

pi@tyne:~ $ df -hTa
Filesystem     Type         Size  Used Avail Use% Mounted on
/dev/root      ext4          29G  5.7G   23G  21% /
devtmpfs       devtmpfs     459M     0  459M   0% /dev
sysfs          sysfs           0     0     0    - /sys
proc           proc            0     0     0    - /proc
tmpfs          tmpfs        464M     0  464M   0% /dev/shm
devpts         devpts          0     0     0    - /dev/pts
tmpfs          tmpfs        464M   47M  417M  11% /run
tmpfs          tmpfs        5.0M  4.0K  5.0M   1% /run/lock
tmpfs          tmpfs        464M     0  464M   0% /sys/fs/cgroup
cgroup2        cgroup2         0     0     0    - /sys/fs/cgroup/unified
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/systemd
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/pids
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/net_cls
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/cpu,cpuacct
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/memory
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/blkio
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/freezer
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/cpuset
cgroup         cgroup          0     0     0    - /sys/fs/cgroup/devices
sunrpc         rpc_pipefs      0     0     0    - /run/rpc_pipefs
systemd-1      -               -     -     -    - /proc/sys/fs/binfmt_misc
mqueue         mqueue          0     0     0    - /dev/mqueue
debugfs        debugfs         0     0     0    - /sys/kernel/debug
configfs       configfs        0     0     0    - /sys/kernel/config
/dev/mmcblk0p1 vfat         253M   40M  214M  16% /boot
/dev/sdb1      ext4         1.8T  1.7T   96G  95% /mnt
tmpfs          tmpfs         93M     0   93M   0% /run/user/1000
binfmt_misc    binfmt_misc     0     0     0    - /proc/sys/fs/binfmt_misc
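The same list is available to programs through /proc/mounts, one mount per line with device, mount point and type as the first three fields. A sketch that counts mounts by type; count_types is a name invented for this example, and it takes the text as an argument so it can be tried on any sample.

```python
from collections import Counter


def count_types(mounts_text):
    """Count mounted filesystems by type from /proc/mounts-style text."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            counts[fields[2]] += 1  # fields: device, mount point, type, ...
    return counts
```

On a Linux system, count_types(open("/proc/mounts").read()).most_common() gives a quick view of which filesystem types dominate; on the Pi above, cgroup would likely come out on top.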

default filesystems

vfat

vs

ext4

FAT on small devices

other disk file systems

  • zfs
  • btrfs

multiple block devices. snapshots.

network file systems

sshfs

$ mkdir ~/mnt/tyne
$ sshfs pi@tyne.cqx.ltd.uk:/home/pi ~/mnt/tyne
$ df -hT /home/benc/mnt/tyne/
Filesystem                  Type        Size  Used Avail Use% Mounted on
pi@tyne.cqx.ltd.uk:/home/pi fuse.sshfs   29G  5.7G   23G  21% /home/benc/mnt/tyne

other network filesystems

  • NFS - unix tradition
  • samba - windows tradition

Network Filesystem downsides

Software expects file access to be:

    fast
    reliable

The network is often not those things.

media devices

~ $ jmtpfs ./phone
Device 0 (VID=2717 and PID=ff40) is a Xiaomi Mi-2s (id2) (MTP).
Android device detected, assigning default bug flags

~ $ df -h ./phone
Filesystem      Size  Used Avail Use% Mounted on
jmtpfs          708M -2.7G  3.4G    - /home/pi/phone

~/phone/Internal shared storage/DCIM/Camera $ cp IMG_20210223_17* ~/tmp/p/

LEDs-as-filesystem

$ df -h /sys
Filesystem      Size  Used Avail Use% Mounted on
sysfs              0     0     0    - /sys

$ cat /sys/class/leds/led0/trigger
none rc-feedback kbd-scrolllock kbd-numlock kbd-capslock
kbd-kanalock kbd-shiftlock kbd-altgrlock kbd-ctrllock
kbd-altlock kbd-shiftllock kbd-shiftrlock kbd-ctrlllock
kbd-ctrlrlock timer oneshot heartbeat backlight gpio cpu
cpu0 cpu1 cpu2 cpu3 default-on input panic mmc1 [mmc0]
rfkill-any rfkill-none rfkill0 rfkill1
# echo none > trigger
# while true; do
    echo 255 > brightness ;
    sleep 0.2 ;
    echo 0 > brightness ;
    sleep 0.8 ;
  done
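Because the LED is controlled by ordinary file reads and writes, any language can do the same as the shell loop above. A Python sketch; the function names are made up for this example, the LED directory is passed in as a parameter, and writing to the real /sys/class/leds/led0 needs root.

```python
import os
import time


def active_trigger(trigger_text):
    """Return the trigger shown in [brackets], i.e. the one currently in use."""
    for name in trigger_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def blink(led_dir, cycles, on_s=0.2, off_s=0.8):
    """Take manual control of the LED, then blink it `cycles` times."""
    def write(name, value):
        with open(os.path.join(led_dir, name), "w") as f:
            f.write(value)

    write("trigger", "none")  # same as: echo none > trigger
    for _ in range(cycles):
        write("brightness", "255")
        time.sleep(on_s)
        write("brightness", "0")
        time.sleep(off_s)
```

For the trigger listing above, active_trigger would pick out mmc0, the entry in square brackets.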

let a thousand filesystems bloom

  • Write code in the Linux kernel
  • Filesystem in User Space (FUSE)
  • overlayfs: make multiple directories appear as one
  • encfs: transparently encrypts files on disk
  • tmpfs: stores files in memory (what in the 1980s would be called a RAM disk)
  • proc: Exposes lots of linux internals under /proc
  • ntfs: mount Windows NTFS filesystems
  • iso9660: access files on CD-ROMs
  • cernvmfs: for distributing software installations globally
  • exFAT: like FAT but with more modern features, e.g. files larger than 4 GB

- end -

more notes



So I make files and they get stored on my Pi's internal SD card.
Except if I plug in a USB stick, it appears in this folder here:
 and somehow that... doesn't get stored on the Pi's internal SD card?

(at this point, gentle intro to mount points, and perhaps `df` command line utility
with -h parameter (but no -T?) - just show mountpoints, size and block device)

now can introduce a diagram perhaps of tools / VFS / filesystems / block device
  - deliberately simplify to exclude the non-block-based filesystems at this stage

discuss briefly what a block device is.

filesystems:
Two that you'll usually see in use on a bog-standard Pi installed "normally":
ext4 (3,2...)
FAT (variants: vfat, exfat) - bit of history there. Especially note that on a Pi, /boot is a FAT filesystem. Get some old original PC picture? FAT stands for "file allocation table". The Windows line moved on to NTFS, but FAT has stuck around as a fairly simple filesystem to implement that is commonly used on removable media - e.g. digital cameras, phones, even programming a BBC micro:bit or a Pi Pico (Pi Pico especially relevant to a Raspberry Pi meetup, using the somewhat weird https://github.com/Microsoft/uf2 format).
 (show df -h with a microbit plugged in and a USB stick plugged in, and a pi pico plugged in (or two of them!), on a Pi? along with a photo of that physically - **even** MY SOLDERING IRON!)

   Discuss unix permissions/ownership, and point out that regular FAT doesn't have these - its history is as a single user filesystem.
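The permission and ownership information can be inspected with os.stat. A small sketch; the function name describe is made up for this example.

```python
import os
import stat


def describe(path):
    """Return an ls-style mode string plus the numeric owner and group."""
    st = os.stat(path)
    return stat.filemode(st.st_mode), st.st_uid, st.st_gid
```

On ext4 each file carries its own values. On a vfat mount the values reported are not stored per file on disk, so every file would be expected to show the same owner and mode.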

other filesystems ... that don't use block devices
network filesystems: doesn't need to use a block device - the files could be coming from some other computer, not from a block device.
two examples of that I use a lot are NFS and SSHFS - they look very different.
SSHFS especially interesting to me as a very low end way of getting from my linux laptop
to edit files on remote systems.
* request from richard to know about samba (although not sure if I can easily demo that?
perhaps I can set up a samba server on my laptop?)

and ... they don't even need to have any backing store that looks like storage.
The classic example of that is /proc: df -hT /proc
but on the pi, can see things like GPIO pins: (that mode where you read from a file and
see GPIO pin state is a nice Pi specific example of that)

/proc doesn't have a backing store (df just says proc) and it doesn't have a size. This is a
filesystem that makes every process running on your Pi appear as a directory, as well as
various other bits of operating system specific stuff
(give a couple of examples: eg a process one, and /proc/loadavg - one process based and one system-wide)
(or a process one, and a GPIO /sys one?) 
* specific example: can I toggle the LED on a Pi Zero by saving in a text editor?
(assuming people can see my camera in speaker view)
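A sketch of the two kinds of /proc example mentioned above, one system-wide and one per-process. The function names are made up for this example, and both take their input as arguments so they can be tried without a Linux machine.

```python
def parse_loadavg(text):
    """Decode the contents of /proc/loadavg (system-wide)."""
    fields = text.split()
    running, total = (int(x) for x in fields[3].split("/"))
    return {
        "load": tuple(float(x) for x in fields[:3]),  # 1, 5, 15 minute averages
        "running": running,   # currently runnable scheduling entities
        "total": total,       # scheduling entities that exist
        "last_pid": int(fields[4]),
    }


def process_ids(proc_entries):
    """Pick out the per-process directories: the entries whose names are numbers."""
    return sorted(int(name) for name in proc_entries if name.isdigit())
```

On a Pi this would be used as parse_loadavg(open("/proc/loadavg").read()) and process_ids(os.listdir("/proc")).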


Also interesting:
* overlayfs - used to create a writeable overlay on top of a read-only filesystem. This is a technique you might have used 20 years ago: mount a CD-ROM of a Linux distro and still be able to edit things locally. More recently, that's how container images work.
* andrew mentioned ZFS - properties of this? (is it available on Pi?)
*  also 'btrfs' - see bodil stokke's build a pi thing that came up on my feed - eg start here: https://twitter.com/bodil/status/1349091913274679301
* FUSE is one way to build your own filesystem implementation (can I get a 1 screen filesystem?)
   - especially note that I'm a big fan of plugin architectures in general: build a core, build a clean-ish interface, plug stuff in across that interface.
    - SSHFS is actually implemented on top of fuse.
    - I also use encfs on top of fuse
    - plenty of other examples here: https://en.wikipedia.org/wiki/Filesystem_in_Userspace
* another way to write a filesystem is to implement it in the kernel - lower level
* mention in passing iso9660 - CD-rom filesystem
* CERN VM FS
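The overlayfs idea from the list above (a writeable layer on top of a read-only one) can be modelled with a toy in Python. This is not how overlayfs is implemented; it uses collections.ChainMap with two dicts standing in for directories, and the file names are invented for the example.

```python
from collections import ChainMap

# Read-only lower layer, e.g. the contents of a CD-ROM or a container image.
lower = {"etc/hostname": "raspberrypi\n", "etc/motd": "hello\n"}
# Writeable upper layer, initially empty.
upper = {}

# Lookups try upper first, then lower; writes only ever go to upper.
merged = ChainMap(upper, lower)

merged["etc/hostname"] = "tyne\n"
```

After the write, merged shows the new hostname and the untouched motd, while lower is unchanged. Deleting a file that exists only in the lower layer is not modelled here.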

example of what FAT looks like on disk, briefly?

discuss partitions somewhere: so that a block device like an SD card is treated as several block devices. show /proc/partitions. /dev/sda /dev/sda1  (or mmcblk etc)
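A sketch of reading /proc/partitions from a program. The function name parse_partitions is made up for this example; the file has a header line, a blank line, then one row per block device with major, minor, size in 1 KiB blocks, and name.

```python
def parse_partitions(text):
    """Return {name: size_in_bytes} from /proc/partitions-style text."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields: major, minor, #blocks, name.
        if len(fields) == 4 and fields[0].isdigit():
            sizes[fields[3]] = int(fields[2]) * 1024  # blocks are 1 KiB
    return sizes
```

The whole device (e.g. mmcblk0) and each partition on it (e.g. mmcblk0p1) show up as separate entries.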

mention "special" nodes like symlinks, device nodes, etc - without going into detail too much.
but also mention that these are traditionally unixy so non-unixy filesystems like vfat don't
deal with them. example of device nodes: the partitions (which i probably will have shown by now)
and serial ports - for example, /dev/ttyAMA0 which if you try to use the Pi UART pins, you 
might have encountered.
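A sketch for telling these "special" nodes apart from a program, using os.lstat so that symlinks are reported as themselves rather than followed. The function name node_type is made up for this example.

```python
import os
import stat


def node_type(path):
    """Classify a path as a symlink, directory, device node or regular file."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
    checks = [
        (stat.S_ISLNK, "symlink"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISREG, "regular file"),
    ]
    for test, name in checks:
        if test(mode):
            return name
    return "other"
```

On a Pi, a partition such as /dev/mmcblk0p1 would be expected to come out as a block device and a serial port such as /dev/ttyAMA0 as a character device.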

mention extended attributes - merely in passing. "I don't encounter a lot of use for them". the main example is rsync's --fake-super option.

* "why use different filesystems on block devices?" - because of different properties - ZFS "interesting large scale stuff". FAT - broad compatibility across devices. In a Pi context, if I wanted to be able to put a config file on my SD card in a form that I could edit on almost any computer, making a FAT partition and putting it on that partition would be the way to do it.

* downsides of network file systems: they try to make something remote, subject to things like connection failure, remote server reboots, etc, work like something that is directly connected: so occasional weird problems: eg if I write to a remote filesystem (i.e. press save in my text editor), and that server has crashed, should the local system: a) wait (for days?) until that remote system is restarted or b) return an error right away? (neither is the right answer in all cases...). There are problems with distributed systems that need to be addressed, and using the filesystem interface isn't always the right way to address them.
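One way a program can choose between "wait" and "error right away" for itself is to put its own time limit on a read. A sketch only: read_with_timeout is a name invented here, and the approach has a real limitation, noted in the comments.

```python
import concurrent.futures


def read_with_timeout(path, timeout_s):
    """Read a file, but stop waiting after timeout_s seconds."""
    def read():
        with open(path, "rb") as f:
            return f.read()

    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        # Raises a timeout error if the read has not finished in time.
        return pool.submit(read).result(timeout=timeout_s)
    finally:
        # Do not wait for a stuck read. The worker thread itself cannot be
        # cancelled and may linger until the remote server answers.
        pool.shutdown(wait=False)
```

The caller gets control back, but the stuck read is still there in the background and can delay the program's exit, which is a small illustration of why neither answer is right in all cases.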

* when demoing fuse, show that this is an example of what the kernel might ask of any filesystem
 - so show a few (not many) filesystem-like API calls.
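A sketch of what a few of those calls might look like, as a plain Python class. It is not wired up to any FUSE library. The method names getattr, readdir and read are borrowed from the FUSE operations of the same names, but the signatures and return values here are simplified for illustration, and HelloFS is a made-up name.

```python
import errno
import stat


class HelloFS:
    """Toy read-only filesystem holding one file, /hello.txt."""

    FILES = {"/hello.txt": b"hello from a toy filesystem\n"}

    def getattr(self, path):
        """'What is at this path?' Asked before almost everything else."""
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_size": 0}
        if path in self.FILES:
            return {"st_mode": stat.S_IFREG | 0o444,
                    "st_size": len(self.FILES[path])}
        raise OSError(errno.ENOENT, "No such file or directory", path)

    def readdir(self, path):
        """'List this directory.' Only the root directory exists here."""
        if path != "/":
            raise OSError(errno.ENOTDIR, "Not a directory", path)
        return [".", ".."] + [name[1:] for name in self.FILES]

    def read(self, path, size, offset):
        """'Give me up to size bytes starting at offset.'"""
        self.getattr(path)  # raises ENOENT for unknown paths
        return self.FILES.get(path, b"")[offset:offset + size]
```

Running ls and cat against a mounted filesystem like this would turn into roughly these three kinds of call.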

* THINGS FOR ME TO PRACTICALLY LEARN FOR THIS TALK AND THEN TALK ABOUT MY ADVENTURES:

 * install ZFS on a pi (?)
 * write a basic fuse filesystem (can I make it do something interesting?) -- perhaps something to talk to the micropython filesystem on a Pi Pico (separate from the bootloader FAT filesystem)...
 * cernvmfs

* "mount" command - talk about how lots of desktop environments will do this mount automatically so a lot of the time if you plug it in it will appear somewhere automatically - put that alongside "df" as "some commands we can use to do stuff" (and umount too).


* TODO: can I format this as a blog post/article (series?) primarily and present that way? so that there is a better lasting artefact than a slide set / video? (or really make the slide set "online viewing first, presentation second"?)

* make a "christmas tree" slide which is my pi with as many different filesystems mounted on it as I have talked about - or many, at least - df -hT output


* show NFS between VMs on my linux VMs (not Pi, though, but still an example)



* should give some example of layout on disk: FAT might be an interesting one to learn about because it's used on SD cards a lot and not complicated - so relevant to the microcontroller side of things. - point is "here's a layout of things on disk, but what you see is 'files' - what hides this layout from you is the filesystem code"
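As a taste of that on-disk layout, here is a sketch that decodes a few fields from the first 512-byte sector of a FAT filesystem. The function name is made up for this example; the offsets are the standard FAT boot sector ones (OEM name at byte 3, bytes per sector at byte 11, and so on, with a 0x55 0xAA signature in the last two bytes).

```python
import struct


def parse_fat_boot_sector(sector):
    """Decode a few header fields from a 512-byte FAT boot sector."""
    if len(sector) < 512 or bytes(sector[510:512]) != b"\x55\xaa":
        raise ValueError("not a boot sector: missing 0x55 0xAA signature")
    # Little-endian: uint16, uint8, uint16, uint8 starting at byte 11.
    bytes_per_sector, sectors_per_cluster, reserved_sectors, fat_count = (
        struct.unpack_from("<HBHB", sector, 11)
    )
    return {
        "oem_name": bytes(sector[3:11]).decode("ascii", "replace").rstrip(),
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "reserved_sectors": reserved_sectors,
        "fat_count": fat_count,  # number of copies of the file allocation table
    }
```

The input would be the first 512 bytes of a FAT partition or image file. Everything a user sees as files and directories is found by following numbers like these.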

* need to be clear when I say filesystem, do I mean "the driver code (in kernel or FUSE)" or "the conceptual layout of things on disk" or "an instance of that conceptual layout on a particular block device instance/loopback"

* mention loopbacks as block devices


* mention "mount" command, and mention that if you have a desktop environment installed
then often stuff will be mounted automatically for you.