As a data engineer and solutions architect, I'm often playing with data at home. Lately, I've been gathering image data for training some machine learning algorithms on no-reference automatic image quality assessment (no-reference IQA). This problem seems to have been worked on by the group behind the LAION-Aesthetics dataset.

This dataset is massive (at least, to me). The full set is 1.2 billion images. And! I want every one of them. 😂

As a data engineer, I'm more familiar with ETL/ELT pipelines than with managing raw filesystems, let alone such a huge file-based dataset. After scoping out S3 bucket costs, I decided I really needed a cheap home network-attached storage (NAS) device. This led me to repurpose "Three" on my Raspberry Pi server rack.

Prerequisites

I'm using:

  • Raspberry Pi 5 with 16 GB RAM
  • 128 GB SD Card
  • Official Raspberry Pi Power Supply
  • SAMSUNG T7 Portable SSD, 2TB External Solid State Drive

Here's an ASCII diagram of the planned setup:

                          +--------------------+
                          |    Internet / ISP  |
                          +---------+----------+
                                    |
                          +---------v----------+
                          |   Home Router      |
                          |  (192.168.1.1)     ──────────────────────┐
                          +---│-------------│--+                     │
                              │             │                        │
          ┌───────────────────┘             └───┐                    │
          │                                     │                    │
+---------▼---------+                 +---------▼---------+          │
| Raspberry Pi NAS  |                 |   Windows Laptop  |          │
| Host: three       |                 | \\192.168.1.103\  |          │
| IP: 192.168.1.103 |                 +-------------------+          │
| /mnt/storage via  |                                                │
|   Samba (SMB)     |                                                │
+---------│---------+                                                │
          │                                                          │
          │                                                          │
    +-----▼------+                    +--------------------+         │
    | 2TB SSD    |                    |    MacBook Pro     |         │
    | /dev/sda1  |                    | /Volumes/Shared    |◄────────┘
    +------------+                    +--------------------+

Setup Ubuntu Server

Download and install the Raspberry Pi Imager application:

After installation, open the application and insert your SD card.

  1. Select Choose Device --> Raspberry Pi 5 (or your Pi type)
  2. Click on Choose OS --> Other general-purpose OS --> Ubuntu --> Ubuntu Server xx.xx (64-bit)

  3. Then select Choose Storage and select your SD card.

  4. Select Next

  5. You should be prompted to set OS customisation. Click Edit Settings.

  6. Under the General settings of customization fill in the following:

    • Set hostname -- personally, I set this to a nickname for the host machine. It shows up in my router's device listing. I'm naming mine three, as it will sit at 192.168.1.103 on my network. Later, I'll make sure this matches the ~/.ssh/config entry on my Mac and Linux machines
    • Set the Username and Password -- this will allow you to log in as soon as you boot up

  7. On the "Services" tab:

    • Tick on "Enable SSH" -- this is required to finish the setup
    • Set radio selector to "Allow public-key authentication only"
    • Paste in your public key

    If you need information on creating an SSH key, see:

  8. After saving your customization settings, click "Yes" to continue the installation.

Requirements

The following can be used to prepare your disk drive.

  1. Before plugging in the disk you plan to use for your NAS, from the Raspberry Pi terminal run:
lsblk

It should provide output similar to:

ladvien@three:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0         7:0    0  38.7M  1 loop /snap/snapd/23772
loop1         7:1    0  44.3M  1 loop /snap/snapd/24509
mmcblk0     179:0    0 119.1G  0 disk 
├─mmcblk0p1 179:1    0   512M  0 part /boot/firmware
└─mmcblk0p2 179:2    0 118.6G  0 part /
nvme0n1     259:0    0 476.9G  0 disk 
└─nvme0n1p1 259:1    0 476.9G  0 part

Take note of these results, as we will compare them against the output of the same command after connecting the new disk drive.

  2. Connect your new disk to the Raspberry Pi.
  3. At the Raspberry Pi terminal, run lsblk again:
ladvien@three:~$ lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0         7:0    0  38.7M  1 loop /snap/snapd/23772
loop1         7:1    0  44.3M  1 loop /snap/snapd/24509
sda           8:0    0   1.8T  0 disk 
└─sda1        8:1    0   1.8T  0 part
mmcblk0     179:0    0 119.1G  0 disk 
├─mmcblk0p1 179:1    0   512M  0 part /boot/firmware
└─mmcblk0p2 179:2    0 118.6G  0 part /
nvme0n1     259:0    0 476.9G  0 disk 
└─nvme0n1p1 259:1    0 476.9G  0 part

Note the new sda disk and its sda1 partition. The sda1 partition is what we will target.
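
If you'd rather not eyeball the two listings, the comparison can be scripted. This is only a minimal sketch: it assumes you saved each listing with lsblk -ln -o NAME, and the helper name new_block_devices and the file paths are my own placeholders.

```shell
#!/bin/sh
# Sketch: print device names that appear only in the second lsblk snapshot.
# new_block_devices is a hypothetical helper name, not part of lsblk.
new_block_devices() {
    before="$1"
    after="$2"
    # -v invert the match, -x match whole lines, -F fixed strings,
    # -f read the patterns (the "before" names) from a file
    grep -vxFf "$before" "$after"
}

# Usage on the Pi (file names are placeholders):
#   lsblk -ln -o NAME > /tmp/before.txt   # before plugging in the drive
#   lsblk -ln -o NAME > /tmp/after.txt    # after plugging in the drive
#   new_block_devices /tmp/before.txt /tmp/after.txt
```

With the listings above, this should print sda and sda1, the disk and its partition.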

  4. Format the storage device

Before continuing, we need to ensure the disk is set up with the correct filesystem. We are going to use ext4 . It is the native Linux filesystem and offers fast read and write access, but Windows and macOS cannot read it natively. To be clear, you will still be able to access the drive remotely from Windows or Mac over the network share, but if it is an external drive, you will not be able to disconnect it from your Pi and plug it directly into a Windows or Mac computer.

One last note before proceeding:

Warning! The following command will reformat the drive as ext4 . All existing data will be lost. This will also make the drive unreadable by Windows and macOS.

If you'd like to continue, run the following from the Raspberry Pi with the drive connected:

sudo mkfs.ext4 /dev/sda1 -L external_ssd

You can identify your external drive using lsblk before and after plugging it in to make sure /dev/sda1 is correct.
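
Since formatting is destructive, I like one more sanity check that the partition isn't currently mounted. Below is a rough sketch, not a required step. The helper name is_mounted is made up for illustration, and it assumes the Linux /proc/mounts layout, where the first field of each line is the device.

```shell
#!/bin/sh
# Sketch: succeed if a device appears as a mounted source in a mounts table.
# is_mounted is a hypothetical helper; the table defaults to /proc/mounts,
# whose first whitespace-separated field is the device.
is_mounted() {
    device="$1"
    table="${2:-/proc/mounts}"
    # Exact comparison, so /dev/sda does not match /dev/sda1
    awk -v dev="$device" '$1 == dev { found = 1 } END { exit !found }' "$table"
}

if is_mounted /dev/sda1; then
    echo "/dev/sda1 is mounted; unmount it before formatting"
else
    echo "/dev/sda1 is not mounted"
fi
```

If it reports the partition as mounted, sudo umount /dev/sda1 releases it.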

Prepare Raspberry Pi

  1. Set up the Raspberry Pi image using Raspberry Pi Imager
  2. On your desktop, add a host entry to ~/.ssh/config that points at your SSH key. This will allow you to log in with ssh three . Please note, these instructions are for Unix-like systems (Linux and macOS)
Host three
HostName 192.168.1.xxx
User ladvien
IdentityFile ~/.ssh/local_id
PreferredAuthentications publickey

Setup

  1. Update the server:
sudo apt update -y && sudo apt upgrade -y

  2. Create a mount point and set permissions:
sudo mkdir -p /mnt/storage
sudo chmod 0775 /mnt/storage

  3. Mount the drive manually to test it

sudo mount /dev/sda1 /mnt/storage

Persistent Mounting

  1. Find the disk UUID by running:
sudo blkid

Look for the UUID= value next to /dev/sda1 .

For example:

/dev/sda1: UUID=e2a1-1234-5678-90ab ...

You can write the UUID either quoted or unquoted in your /etc/fstab entry.

  2. At the terminal, type:
sudo nano /etc/fstab

And update the file to include a line like:

UUID=xxxx-xxxx  /mnt/storage  ext4  defaults,noatime 0 2

Here, xxxx-xxxx is the UUID that blkid reported for your drive.

Option    Meaning
defaults  Enables standard options: read/write, auto mount, exec, etc.
noatime   Prevents updating file access times, improving performance
0         Skip dump (used by legacy backup utilities)
2         Run fsck after the root filesystem (use 1 for root, 2 for others)
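
To make the field order concrete, here is a small sketch that assembles the same fstab line from a UUID and a mount point. The helper name fstab_line is hypothetical, and it only prints the line. Nothing is written to /etc/fstab.

```shell
#!/bin/sh
# Sketch: build an /etc/fstab line from a UUID and a mount point.
# fstab_line is a hypothetical helper. The six fields are:
# device, mount point, filesystem type, options, dump flag, fsck pass.
fstab_line() {
    uuid="$1"
    mount_point="$2"
    printf 'UUID=%s  %s  ext4  defaults,noatime  0  2\n' "$uuid" "$mount_point"
}

# Prints the entry only; xxxx-xxxx stands in for your real UUID.
fstab_line "xxxx-xxxx" /mnt/storage
```

On the Pi, sudo blkid -s UUID -o value /dev/sda1 prints just the UUID, and running sudo mount -a after saving /etc/fstab surfaces a bad entry right away instead of at the next reboot.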

Install Samba

  1. At the terminal, install Samba:
sudo apt install samba -y
  2. Add a Samba user:
sudo groupadd sambauser  # Only if the group doesn't already exist
sudo useradd -M -s /usr/sbin/nologin -g sambauser sambauser
sudo smbpasswd -a sambauser

This creates a system-level sambauser account (without shell access) and adds it to the Samba password database.

  3. Set ownership and permissions on the shared folder:
sudo chown -R sambauser:sambauser /mnt/storage
sudo chmod -R 2775 /mnt/storage

The 2 at the beginning of 2775 sets the setgid bit, ensuring that new files inherit the directory group ( sambauser ), which is essential for consistent access by Samba.
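
If the setgid behavior feels abstract, it can be tried safely on a scratch directory instead of /mnt/storage . This is a small sketch, assuming GNU coreutils stat (as shipped with Ubuntu). The variable name demo_dir is just a placeholder.

```shell
#!/bin/sh
# Sketch: show the setgid bit on a scratch directory (not on /mnt/storage).
demo_dir=$(mktemp -d)
chmod 2775 "$demo_dir"

# A directory created inside a setgid directory inherits its group.
mkdir "$demo_dir/child"

# stat -c is GNU coreutils syntax: %a is the octal mode, %G the group name.
stat -c '%a %G %n' "$demo_dir" "$demo_dir/child"
```

The first line of output should show mode 2775, and both lines should list the same group.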

  4. (Optional) Add your current user to the sambauser group to allow local access via CLI:
sudo usermod -aG sambauser $USER

This step lets you manipulate files in /mnt/storage directly as your Linux user.

Command                                                      Flags Summary
sudo groupadd sambauser                                      (none)
sudo useradd -M -s /usr/sbin/nologin -g sambauser sambauser  -M : no home; -s : disable shell; -g : group
sudo smbpasswd -a sambauser                                  -a : add new Samba user
sudo usermod -aG sambauser $USER                             -a : append; -G : group
sudo chown -R sambauser:sambauser /mnt/storage               -R : recursive; user:group : owner/group
sudo chmod -R 2775 /mnt/storage                              2 : setgid; 7 : rwx; 5 : r-x

Command Explanations:

  • groupadd sambauser -- Creates the sambauser group (skip if it already exists).
  • useradd -M -s /usr/sbin/nologin -g sambauser sambauser -- Adds a system user without shell access, belonging to the sambauser group.
  • smbpasswd -a sambauser -- Adds the user to Samba's internal password database.
  • usermod -aG sambauser $USER -- Adds your current user to the sambauser group (helpful for CLI access to shared files).
  • chown -R sambauser:sambauser /mnt/storage -- Assigns ownership of the shared directory to the Samba user and group.
  • chmod -R 2775 /mnt/storage -- Enables group inheritance (setgid) and sets read/write/execute permissions appropriately.

Note, the usermod command is optional. It adds your Linux user to the sambauser group, which should ensure your user can manipulate files when logged into the NAS Pi.

  5. Edit the Samba config file
sudo nano /etc/samba/smb.conf

And add the following to the end of the file:

[Shared]
    path = /mnt/storage
    browsable = yes
    writable = yes
    guest ok = no
    valid users = sambauser
    force user = sambauser
    force group = sambauser
    create mask = 0666
    directory mask = 0777

    # Performance tuning
    read raw = yes
    write raw = yes
    min receivefile size = 16384
    socket options = TCP_NODELAY SO_RCVBUF=131072 SO_SNDBUF=131072
    use sendfile = yes
    aio read size = 1
    aio write size = 1
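
A quick aside on create mask and directory mask in the config above: Samba bitwise-ANDs the permissions requested for a new file or directory with the mask, so the mask can only remove bits. The arithmetic can be sketched like this. The helper apply_mask is hypothetical and is not a Samba command.

```shell
#!/bin/sh
# Sketch: how a Samba mask limits permission bits (bitwise AND).
# apply_mask is a hypothetical helper, not a Samba command.
apply_mask() {
    # $1: octal mode requested for the new file, e.g. 0777
    # $2: octal mask from smb.conf, e.g. 0666
    printf '%04o\n' "$(( $1 & $2 ))"
}

apply_mask 0777 0666   # prints 0666: the execute bits are stripped
apply_mask 0755 0777   # prints 0755: a 0777 mask removes nothing
```

So with create mask = 0666 new files never get execute bits, while directory mask = 0777 leaves directory permissions untouched.
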
  6. Restart Samba
sudo systemctl restart smbd
  7. Make the NAS IP address static. Run sudo nano /etc/netplan/50-cloud-init.yaml and edit the file to look like the following. Change the address to whatever you prefer, e.g., 192.168.1.xxx/24 becomes 192.168.1.103/24 , and replace <YOUR_ROUTERS_IP> with the IP address of your router.
network:
  version: 2
  ethernets:
    eth0:
      dhcp4: no
      addresses:
        - 192.168.1.103/24
      routes:
        - to: default
          via: <YOUR_ROUTERS_IP>
      nameservers:
        addresses:
          - 1.1.1.1
          - 8.8.8.8

When ready, apply these changes with sudo netplan apply . This will disconnect your SSH session if the new address differs from the one the Pi was previously assigned, so reconnect using the new address.
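
One easy mistake with a static address is picking one that isn't on the same network as the router. Here is a rough sketch of that check. It assumes a /24 network like the one in this post, simply compares the first three octets, and uses a hypothetical helper name, same_slash24.

```shell
#!/bin/sh
# Sketch: check that a static address and the router share the same /24 network.
# same_slash24 is a hypothetical helper; it only compares the first three octets.
same_slash24() {
    addr="${1%/*}"      # drop a trailing /24 if present
    gateway="$2"
    [ "${addr%.*}" = "${gateway%.*}" ]
}

if same_slash24 192.168.1.103/24 192.168.1.1; then
    echo "same network"
else
    echo "different network"
fi
```

If you're nervous about locking yourself out, sudo netplan try applies the config and rolls it back automatically unless you confirm it.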

  8. To check the log, use sudo tail -f /var/log/samba/log.smbd

Test it Works

From the Pi terminal run:

smbclient -L localhost -U sambauser

On Mac

From your Mac:

mkdir /Volumes/Shared
mount_smbfs //sambauser@192.168.1.103/Shared /Volumes/Shared

Automatically Connect on Mac at Login

To auto-mount the NAS share when your Mac starts:

  1. Open Finder and press Cmd + K or choose Go > Connect to Server...
  2. Enter the address: smb://192.168.1.103/Shared
  3. Click the + button to add it to your list of favorite servers.
  4. Click Connect , then log in using sambauser credentials.
  5. Check "Remember this password" if prompted, and save it to your Keychain.
  6. Open System Settings > General > Login Items .
  7. Click the + button under Open at Login , navigate to /Volumes/Shared , and add it.

On Windows

Open File Explorer and enter:

\\192.168.1.103\Shared

Log in with the sambauser credentials when prompted.