Practical Regex Building: Emulating the ls Command to Separate Options from Files

The best way to learn regex (regular expressions) is to put it to practical use. This series of educational posts shows how to build regular expressions iteratively, starting small and then building up.

Goal

When writing your own shell command, you may want to emulate the format of common Linux/Unix commands such as ls. (I started this series to figure out how to emulate ls in my own bash shell script, ShellTree.) In this series we will look at how the ls command differentiates between its parameters and observe its behaviour. Then we will see how regex can emulate ls by separating the options from the files.

First let’s take a look at the format of the ls command.

Format of the ls Command

When you open the man page of a Linux/Unix command, you will usually see a synopsis that looks like the following:

commandName [-Options] [Files]

Let’s try man ls:

 ls [-ABCFGHLOPRSTUW@abcdefghiklmnopqrstuwx1] [file ...]

That means that to use the command we first type the command name, then optionally type any of those options, each preceded by a dash. We can combine the options or use them separately. For example:

ls -Al

is the same as the following:

ls -A -l

Afterwards we can type in as many files as we want, or no files at all.
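To make the equivalence between combined and separate options concrete, here is a minimal sketch in POSIX shell. The helper name expand_flags is hypothetical (it is not part of ls or ShellTree), and the sketch assumes its argument is a single dash followed by option letters.

```shell
#!/bin/sh
# expand_flags: hypothetical helper for illustration only.
# Prints each letter of a combined option group as its own option.
expand_flags() {
    group=${1#-}                    # strip the leading dash: "-Al" -> "Al"
    out=""
    while [ -n "$group" ]; do
        rest=${group#?}             # everything after the first character
        letter=${group%"$rest"}     # the first character itself
        out="$out${out:+ }-$letter" # append "-X", space-separated
        group=$rest
    done
    printf '%s\n' "$out"
}

expand_flags -Al    # prints: -A -l
```

This only illustrates why ls -Al and ls -A -l mean the same thing. It does not handle options that take a value.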

Adding Options Between the Files

An interesting case is when we add an option between the files. Will the command recognize it as an option?

Ahmeds-MacBook-Pro:shell-tree ahmedamayem$ ls README.md -l LICENSE 
ls: -l: No such file or directory
LICENSE     README.md

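The macOS ls shown here treats -l as a file name, not as an option, because it comes after the first file. We will want the regex to do the same. (GNU ls on Linux reorders its arguments by default and would treat -l as an option here.)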
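The behaviour above can be sketched as a small POSIX shell script. This is only one possible approach, not necessarily the one this series builds. The helper names is_option and split_args and the variables OPTIONS and FILES are hypothetical, and the results are stored as space-joined strings, so file names containing spaces are not handled.

```shell
#!/bin/sh
# is_option: hypothetical helper. Succeeds when the argument is a dash
# followed by one or more non-whitespace characters.
is_option() {
    printf '%s\n' "$1" | grep -Eq -e '^-[^[:space:]]+$'
}

# split_args: hypothetical helper. Sets OPTIONS and FILES.
split_args() {
    OPTIONS=""
    FILES=""
    # Leading arguments that look like options are collected as options.
    while [ "$#" -gt 0 ] && is_option "$1"; do
        OPTIONS="$OPTIONS${OPTIONS:+ }$1"
        shift
    done
    # From the first non-option onward, everything is a file,
    # even if it starts with a dash.
    while [ "$#" -gt 0 ]; do
        FILES="$FILES${FILES:+ }$1"
        shift
    done
}

split_args README.md -l LICENSE
printf 'options: [%s]\n' "$OPTIONS"   # prints: options: []
printf 'files: [%s]\n' "$FILES"       # prints: files: [README.md -l LICENSE]
```

Here -l ends up among the files, matching the "No such file or directory" message in the terminal output above. The sketch does not treat -- specially.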

Cases

To better understand what is meant by the options and files parts, let’s draw out some cases:

Case | Parameter Examples | Options | Files
No Arguments | (none) | (none) | (none)
One dash with one non-whitespace character | -i | -i | (none)
One dash with multiple non-whitespace characters | -ia | -ia | (none)
Multiple dashes with single and multiple non-whitespace characters | -ia -q -A | -ia -q -A | (none)
One file | i | (none) | i
Multiple Files | i . /usr | (none) | i . /usr
Multiple Files with Options Format | i . -i /usr -A | (none) | i . -i /usr -A
Multiple Options with Multiple Files with Options Format | -ia -q -A i . -i /usr -A | -ia -q -A | i . -i /usr -A

These cases are not comprehensive, but they give us an idea of what we want the regex to accomplish. Let’s start building the regex.

Building the Regex

  1. Building the regex for the Options part
  2. Building the regex for the Files part

Ahmed Amayem has written 90 articles

A Web Application Developer Entrepreneur.