This post is part of an educational series on building a shell script to graphically display the structure of a directory.
While working on the Linux command line, you sometimes want to see a directory structure laid out in your terminal. I personally don’t know of any native command that does that, but luckily we can make one. Let’s discover how:
Setup
First let’s make a directory structure that we want to display:
[ahmed@amayem ~]$ mkdir test; cd test
[ahmed@amayem test]$ git init
Initialized empty Git repository in /home/ahmed/test/.git/
If you don’t want to make a git repository then feel free to manually make some folders and add some test files randomly to them. The purpose of this is just to test our system.
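If you would rather skip git, a small tree can be built by hand. The following is only a sketch: the directory and file names (docs, src/lib and so on) are made up for illustration, and mktemp -d is used so that nothing existing gets touched.

```shell
# Build a throwaway directory tree to test against.
# All names below are hypothetical; pick whatever you like.
testdir=$(mktemp -d)                 # fresh, empty scratch directory

# -p creates missing parent directories along the way
mkdir -p "$testdir/docs" "$testdir/src/lib" "$testdir/src/bin"

# a couple of empty test files
touch "$testdir/docs/readme.txt" "$testdir/src/lib/util.sh"

# confirm the layout
ls -R "$testdir"
```

Any of the commands in the rest of this post can then be run from inside that directory.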
Dem Pilafian’s method
I found this method by Dem Pilafian on centerkey. Let’s give it a try:
[ahmed@amayem test]$ ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'
   .
Hmm, it’s telling me that nothing is in the repo. I think I know why: the git folder is hidden because its name starts with a dot (.git), and ls skips such entries by default. The -A option makes ls show them:
[ahmed@amayem test]$ ls -A
.git
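To make the effect of -A concrete, here is a small sketch comparing plain ls, ls -A and ls -a in a scratch directory. The names .hidden and visible are made up for the example.

```shell
# Scratch directory with one hidden and one visible entry (made-up names).
d=$(mktemp -d)
mkdir "$d/.hidden" "$d/visible"

plain=$(ls "$d")           # entries starting with a dot are skipped
almost_all=$(ls -A "$d")   # dot entries are shown, except . and ..
all=$(ls -a "$d")          # everything, including . and ..

printf '%s\n' "--- ls" "$plain" "--- ls -A" "$almost_all" "--- ls -a" "$all"
```

So -A is the option to reach for here: it reveals .git without also listing the . and .. entries.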
Let’s add the -A flag to our first command:
[ahmed@amayem test]$ ls -RA | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'
   .
   |-.git
   |---branches
   |---hooks
   |---info
   |---objects
   |-----info
   |-----pack
   |---refs
   |-----heads
   |-----tags
Great, it worked. To avoid the -A option I can just enter the directory:
[ahmed@amayem test]$ cd .git/
[ahmed@amayem .git]$ ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'
   .
   |-branches
   |-hooks
   |-info
   |-objects
   |---info
   |---pack
   |-refs
   |---heads
   |---tags
Let’s break it down to understand what is going on:
Breakdown
ls -R
ls is our usual command that lists what is in the current directory; if you give it a directory as an argument, it lists the contents of that directory instead. Let’s check the man page to see what the -R option means:
-R, --recursive
list subdirectories recursively
So let’s see what we get with that:
[ahmed@amayem .git]$ ls -R
.:
branches config description HEAD hooks info objects refs
./branches:
./hooks:
applypatch-msg.sample post-update.sample pre-commit.sample pre-rebase.sample
commit-msg.sample pre-applypatch.sample prepare-commit-msg.sample update.sample
./info:
exclude
./objects:
info pack
./objects/info:
./objects/pack:
./refs:
heads tags
./refs/heads:
./refs/tags:
The first line says .: where the dot stands for the current directory and the colon indicates that the following lines show the contents of that directory. After the contents there is an empty line to mark the end of that directory’s listing, then the next directory, ./branches:, is shown, and so on until all directories are listed. We are piping that output into grep ":$". Let’s see what that means:
grep ":$"
grep “prints lines matching a pattern”, as we learn from its man page. So it takes the output above and prints only the lines that match the pattern ":$". What does the $ mean? The man page sheds some light:
Anchoring
The caret ^ and the dollar sign $ are meta-characters that respectively match the empty string at the beginning and end of a line.
So the pattern ":$" matches lines ending with a colon, which means we are printing only the directory paths. Let’s give it a try:
[ahmed@amayem .git]$ ls -R | grep ":$"
.:
./branches:
./hooks:
./info:
./objects:
./objects/info:
./objects/pack:
./refs:
./refs/heads:
./refs/tags:
Looks like we have succeeded.
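To see why the $ anchor matters, here is a small sketch that runs grep with and without it on some simulated ls -R output. The file name notes:old.txt is made up to show a colon in the middle of a line.

```shell
# Simulated `ls -R` output; the names are made up for illustration.
sample=$(printf '%s\n' '.:' 'notes:old.txt' 'src' '' './src:' 'main.c')

# With the anchor: only lines that END in a colon (the directory headers).
with_anchor=$(printf '%s\n' "$sample" | grep ':$')

# Without the anchor: any line containing a colon anywhere.
without_anchor=$(printf '%s\n' "$sample" | grep ':')

printf '%s\n' "--- grep ':\$'" "$with_anchor" "--- grep ':'" "$without_anchor"
```

The anchored version keeps only the two directory headers, while the unanchored one also lets notes:old.txt through. One caveat follows from the same logic: a file whose name ends in a colon would still be mistaken for a directory header.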
sed
sed is a “stream editor for filtering and transforming text”, as mentioned in its man page. So it is editing the output that we gave it earlier. The man page gives us more details:
DESCRIPTION
Sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream (a file or input from a pipeline). While in some ways similar to an editor which permits scripted edits (such as ed), sed works by making only one pass over the input(s), and is consequently more efficient. But it is sed’s ability to filter text in a pipeline which particularly distinguishes it from other types of editors.
Our sed arguments above were as follows:
sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'
I see four -e options. The man page tells us the following:
-e script, --expression=script
add the script to the commands to be executed
So there are four scripts that we are executing. Let’s go through them.
-e 's/:$//'
I recognize the :$ part from earlier as the pattern that matches the colon at the end of each directory path. As for the rest, we have to check the man page again:
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
So we are, essentially, deleting the colon. Let’s see what happens with it:
[ahmed@amayem .git]$ ls -R | grep ":$" | sed -e 's/:$//'
.
./branches
./hooks
./info
./objects
./objects/info
./objects/pack
./refs
./refs/heads
./refs/tags
Looks like we were correct.
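The man page excerpt also mentions & and the numbered escapes (written \1 to \9 in sed), which our pipeline does not use. As a side illustration only, here is a sketch of all three kinds of replacement on one sample path; the square and angle brackets are arbitrary markers chosen to make the result visible.

```shell
line='./refs/heads:'

# Empty replacement: the matched colon is simply deleted (what our pipeline does).
del=$(echo "$line" | sed -e 's/:$//')

# & stands for whatever the regexp matched, here the trailing colon.
amp=$(echo "$line" | sed -e 's/:$/[&]/')

# \1 refers to the text captured by the first \( ... \) group.
grp=$(echo "$line" | sed -e 's/^\(.*\):$/<\1>/')

printf '%s\n' "$del" "$amp" "$grp"
```

This prints ./refs/heads, then ./refs/heads[:], then <./refs/heads>.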
-e 's/[^-][^\/]*\//--/g'
We may be unsure what this new pattern, [^-][^\/]*\/, means. Let’s check grep’s man page again to figure it out.
Character Classes and Bracket Expressions
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
So [^-] matches any single character that is not a dash, and [^\/] matches any single character that is not a slash. The backslash in \/ escapes the slash; otherwise sed would think that it was the end of the pattern. The * is explained here:
Repetition
A regular expression may be followed by one of several repetition operators:
? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
{n} The preceding item is matched exactly n times.
{n,} The preceding item is matched n or more times.
{,m} The preceding item is matched at most m times.
{n,m} The preceding item is matched at least n times, but not more than m times.
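A quick sketch of a few of these operators in action, using grep -E so that ?, + and {n} can be written without backslashes. The sample words are made up, and each pattern is anchored at both ends so that the whole line has to match.

```shell
# Four sample lines that differ only in the number of b characters.
words=$(printf '%s\n' 'ac' 'abc' 'abbc' 'abbbc')

star=$(printf '%s\n' "$words" | grep -E '^ab*c$')      # zero or more b
plus=$(printf '%s\n' "$words" | grep -E '^ab+c$')      # one or more b
opt=$(printf '%s\n' "$words" | grep -E '^ab?c$')       # at most one b
exact=$(printf '%s\n' "$words" | grep -E '^ab{2}c$')   # exactly two b

printf '%s\n' "--- b*" "$star" "--- b+" "$plus" "--- b?" "$opt" "--- b{2}" "$exact"
```

Only the * operator is needed for our sed pattern, and it works in sed without any special option.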
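So we are looking for a string that starts with any character other than a dash, continues with any number of non-slash characters, and ends with a slash: in other words, one path component together with its trailing slash. Each such match is replaced with --. What’s that g doing after the replacement? The closest entry in the man page is:
g G Copy/append hold space to pattern space.
That entry describes the standalone g and G commands, not a flag on the s command, so it doesn’t answer the question. Let’s try it without and with the g: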
[ahmed@amayem .git]$ ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/'
.
--branches
--hooks
--info
--objects
--objects/info
--objects/pack
--refs
--refs/heads
--refs/tags
[ahmed@amayem .git]$ ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g'
.
--branches
--hooks
--info
--objects
----info
----pack
--refs
----heads
----tags
So the g flag tells sed to replace every match on the line, not just the first one. Pretty cool.
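To watch the substitution work on a single path, here is a small sketch. The last line is an assumption about taste rather than part of the original command: sed accepts any character as the delimiter after s, so using | instead of / means the slash no longer needs a backslash.

```shell
path='./objects/info'

# No g flag: only the first component ("./") is replaced.
first_only=$(echo "$path" | sed -e 's/[^-][^\/]*\//--/')

# With g: every component that ends in a slash ("./" and "objects/") is replaced.
every=$(echo "$path" | sed -e 's/[^-][^\/]*\//--/g')

# Same substitution written with | as the delimiter, so / needs no escaping.
alt=$(echo "$path" | sed -e 's|[^-][^/]*/|--|g')

printf '%s\n' "$first_only" "$every" "$alt"
```

This prints --objects/info, then ----info twice. The last component, info, is left alone because it is not followed by a slash, and that is what leaves the directory name at the end of each line.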
-e 's/^/   /'
This time the caret, ^, is not inside square brackets, so it acts as an anchor to the beginning of the line (see the anchoring excerpt earlier). Nothing is removed; we are simply inserting three spaces at the beginning of each line:
[ahmed@amayem .git]$ ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /'
   .
   --branches
   --hooks
   --info
   --objects
   ----info
   ----pack
   --refs
   ----heads
   ----tags
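Since the ^ anchor matches an empty string rather than a character, substituting on it inserts text without removing anything. A minimal sketch on two made-up input lines, plus a check that adding a g flag makes no difference because each line has only one beginning:

```shell
# Insert three spaces at the start of every line.
indented=$(printf '%s\n' '.' '--hooks' | sed -e 's/^/   /')

# The g flag changes nothing here: ^ can match only once per line.
once=$(echo 'abc' | sed -e 's/^/>/g')

printf '%s\n' "$indented" "$once"
```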
-e 's/-/|/'
This one replaces the - with a |:
[ahmed@amayem .git]$ ls -R | grep ":$" | sed -e 's/:$//' -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|/'
   .
   |-branches
   |-hooks
   |-info
   |-objects
   |---info
   |---pack
   |-refs
   |---heads
   |---tags
Notice that because there was no g flag, sed only replaced the first dash of each line.
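Putting the pieces together, the one-liner can be wrapped in a shell function so it does not have to be retyped. This is just a sketch: the name dirtree and the optional directory argument are my own assumptions, not part of Dem Pilafian’s command.

```shell
# dirtree: hypothetical wrapper name, not from the original one-liner.
# Usage: dirtree [directory]   (defaults to the current directory)
dirtree() {
    # The parentheses start a subshell, so the cd does not affect the caller.
    (
        cd "${1:-.}" || exit 1
        ls -R |
            grep ':$' |                      # keep only the directory header lines
            sed -e 's/:$//' \
                -e 's/[^-][^\/]*\//--/g' \
                -e 's/^/   /' \
                -e 's/-/|/'
    )
}
```

Calling dirtree .git from the test directory should then give the same kind of output as entering the directory and running the pipeline by hand.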
Next steps
- Breaking down the script that uses the one line command
- Modifying the one line command to show files, as well as directories
- Building a new recursive script that looks better and has more functionality
References
- Dem Pilafian on centerkey