Nov 112018

If you use the Unix or Linux command-line, you may very well wonder about the origins of some of the “special” characters. One of those is tilde (~) which is expanded by the shell into “home” :-

✓ mike@Michelin» echo $HOME                        
✓ mike@Michelin» echo ~
✓ mike@Michelin» echo ~root

This doesn’t of course work in general; just in the shell.

But where did this usage originate?

As it turns out, it was the markings on the keyboard of the ADM3A terminal :-

If you used Unix in the late 1970s/1980s, you may very well have used the ADM3A terminal and it seems that those who added the tilde feature to the Unix shell were amongst the users.

Oct 192016

I have just been listening to a Microsoft fanboy on the you tube wittering on about something (not computer related), when he tried to read out a URL. According to him, there are “backslashes” in the URL.

Not in any normal URL. For those who do not know, URLs are web site addresses such as The character that appears after the network protocol (http) – the “/” is formally known as the solidus, and less formally as a slash. The slash that goes the other way is called the backslash (or more formally the reverse solidus).

And who decided that one was a slash (‘/”) and the other a backslash (‘\’)? Although it has been used since the Medieval era, it was probably first called as solidus in the 19th century because of it being used to signify the British shilling. Currently it is the Unicode Consortium who call it a solidus in the international standard for character encoding. If you disagree with them, by all means either convince them they’re wrong or set up a new international standard and get it more widely adopted than Unicode.

Until then, I’ll carry on calling someone who says a backslash looks like – ‘/’, wrong.

Does it matter? In the big scheme of things probably not, but it does make reading out instructions more difficult when either slashes or backslashes appear. After all computers rarely say “Ah! I see what you meant! You meant which is different (and makes sense) to http:\\\“. And as anyone who has ever encountered autocorrect “mistakes” will attest, letting computers decide what you meant is not always the best idea.

And how did the mistake originally occur? To some extent Microsoft is to blame, although I doubt Microsoft ever called the slashes the wrong name.

When Microsoft wrote their first operating system (DOS), they chose to make it semi-compatible with an earlier operating system (CP/M) which used the slash to indicate the use of an option to a command-line command which in turn was inherited from certain early DEC operating systems.

When they came to implementing directories (yes that long ago), they broke with the tradition of stealing ideas from DEC (or we would have ended up with paths like C:[WINDOWS.SYSTEM]FOO.SYS) and instead chose the Unix path separator. But the slash conflicted with option processing on the command-line, so they used the backslash instead – C:\WINDOWS\SYSTEM\FOO.SYS.

Of course people started calling the backslash, a slash, and I’m sure there are many out there who will continue despite being told that they are wrong. Of course when I say they’re wrong, I have the backing of an international group of grapheme experts behind me.


Dec 102015


You have a a column of numbers that you have produced in some manner such as :-

$ awk '/clean message/ {print $(NF-1)}'

And you want a quick and dirty way of finding the largest number. Well there is a way but it is perhaps the least efficient way to do it, and that is to sort the numbers into numerical order and use “head” to display the first one :-

$ awk '/clean message/ {print $(NF-1)}' | sort -rn | head -1

But frankly there must be a better method. And yes there is if you happen to be using zsh (or possibly others, but this has been tested with zsh). Simply iterate over the values assigning the current value to the “max” variable if the current variable is larger :-

$ max=0; for x in $(awk '/clean message/ {print $(NF-1)}'; [[ $x -gt $max ]] && max=$x; echo $max

You may be wondering why I don’t simply use the ability of awk to perform calculations. Well that is certainly possible, but I may not always be using awk to produce the numbers in the first place, and this is supposed to be a generic recipe.

Oct 232015

Have you ever wondered if you can tinker with the ps command to change how and what is displayed? No? Well give up reading this post then.

I've known about ps for ages and also the way that the output can be tinkered with, but have not had an excuse to dig into it properly until I was looking for a way for ps to show the Linux container name for each process (don't get excited: ps -o machine is documented but not implemented at the time of writing). 

If you read the manual page for ps you will be quickly distracted by all the different options available. These can be grossly simplified into three different groups of options: which processes to list, what to output, and how to sort the output.

Which Processes?

This can be simplified down to almost nothing; ps on it's own lists just the processes running from the current terminal (window) :-

% ps
  PID TTY          TIME CMD
 2591 pts/17   00:00:00 zsh
13325 pts/17   00:00:00 ps

If you want to display all processes, add the "-e" option :-

% ps -e
  PID TTY          TIME CMD
    1 ?        00:00:03 systemd
    2 ?        00:00:00 kthreadd
    3 ?        00:00:00 ksoftirqd/0
    5 ?        00:00:00 kworker/0:0H
    7 ?        00:00:54 rcu_sched
    8 ?        00:00:00 rcu_bh
    9 ?        00:00:35 rcuos/0
   10 ?        00:00:00 rcuob/0
   11 ?        00:00:00 migration/0
   12 ?        00:00:00 watchdog/0

And lastly (not literally – there are other options), add the "-p" option to list processes by process ID :-

% ps -p 1
  PID TTY          TIME CMD
    1 ?        00:00:03 systemd

Tuning The Output

By default the fields that ps outputs is somewhat peculiar until you realise that the output fields have been frozen in time. The default choice is somewhat minimal; and I'm not in favour of minimalism. And what use is the TTY and the TIME fields?

The TTY field shows you what terminal the process is running on – this was handy on a multi-user system where you could find out who was on what terminal and then write a message directly to their screen. A great way of winding people up, but not so much use these days. And TIME? We're no longer billed for the cpu time we consume, so the time spent running on the cpu is a rather pointless thing to list.

The "-f" option displays more information :-

% ps -f
mike     26486 31092  0 21:24 pts/24   00:00:00 ps -f
mike     31092 31091  0 20:11 pts/24   00:00:00 -zsh

But the output is still somewhat peculiar, and there are other more interesting fields to display.

There are various options for choosing the output format amongst a set of predefined choices, but the best bet is to ignore these and jump straight into selecting the individual fields that you want. These can be found in the manual page in the "STANDARD FORMAT SPECIFIERS" section. Simply list the fields you want after the "-o" option :-

% ps -o pid,comm,pcpu,pmem,nlwp,user,stat,sgi_p,wchan,class,pri,nice,flags
28061 ps               0.0  0.0    1 mike     R+   3 -      TS   19   0 0
31092 zsh              0.0  0.0    1 mike     Ss   * -      TS   19   0 0

Obviously typing this in every time is somewhat less than ideal, but fortunately the authors of ps have already thought of this. By listing the fields within the PS_FORMAT environment variable, there is no need to specify -o :-

% export PS_FORMAT="pid,comm,pcpu,pmem,nlwp,user,stat,sgi_p,wchan,class,pri,nice,flags"
% ps
29440 ps               0.0  0.0    1 mike     R+   5 -      TS   19   0 0
31092 zsh              0.0  0.0    1 mike     Ss   * -      TS   19   0 0

To make this pernament, add this to your shell startup rc file; whilst editing you may as well set PS_PERSONALITY to "linux".

Sorting The Output

According to the ps documentation, by default the output is not sorted. In that case either my kernel's process table is remarkably well organised, or the distributions I use "cheat" and sort the output in process ID order. In the distant past where computers were shared amongst too many people, and the machines themselves were quite slow, it made sense for the output of ps to be unsorted. But it certainly doesn't make sense now.

And the ps command allows processes to be sorted by any field that you can specify in the "STANDARD FORMAT SPECIFIERS" section which conveniently enough you are now intimately acquianted. Simply add the relevant field to the –sort option :-

% ps --sort pcpu
31092 zsh              0.0  0.0    1 mike     Ss   * -      TS   19   0 0
31743 ps               0.0  0.0    1 mike     R+   5 -      TS   19   0 0

With just a short list (and such a low percentage of the cpu in use) it doesn't make sense, but added to -e, it does.

Rather than change the default sort order, I personally prefer to configure aliases to do the job for me :-

% alias pscpu='ps --sort pcpu'
% alias psmem='ps --sort pmem'

Preferring to use an alias here is rather convenient as there doesn't seem to be a way to configure the default sort order – officially there isn't one!

Reading through the ps manual page (during which you will notice many different options referring to old Unix varients) is a reminder of just how long and bitter the fight over which ps varient was the best. And now for a completely irrelevant picture :-


Feb 082009

I was reading a comment about the df command (in relation to reserved filesystem space) and realised that the clueless newbie was right; it is odd that df does not mention reserved space. Of course it would also be wrong for df to lie about the matter too. I then realised that df is long overdue for a bit of refreshing. If you look at the typical output of the df command, you will find it inconveniently cluttered :-

Filesystem            Size  Used Avail Use% Mounted on
                       12G  7.8G  4.3G  65% /
tmpfs                 2.0G     0  2.0G   0% /lib/init/rw
varrun                2.0G  416K  2.0G   1% /var/run
varlock               2.0G     0  2.0G   0% /var/lock
udev                  2.0G  3.1M  2.0G   1% /dev
tmpfs                 2.0G  344K  2.0G   1% /dev/shm
lrm                   2.0G  2.4M  2.0G   1% /lib/modules/2.6.27-7-generic/volatile
/dev/sdb1             130M   36M   88M  29% /boot
                      2.0G  776M  1.3G  39% /opt
                      5.0G  1.4G  3.7G  28% /var
                      256G  116G  141G  46% /home
                       96G   62G   35G  64% /vmachines
                      256G  6.2G  250G   3% /var/spool/brag
                       16G  4.6G   12G  29% /var/herpes
/dev/sda1             463G  147G  293G  34% /mdata
/dev/scd0             2.4G  2.4G     0 100% /media/CIVCOMPLETEEU
                       32G  1.9G   31G   6% /cdimages
                       16G  498M   16G   4% /sim

Part of the problem is that df does not do quite what it claims to do … to report free space on the mounted filesystems. It also gives some (a very small amount) of additional information about the relevant filesystems … particularly the device the filesystem is mounted on. This “helps” to make the output more cluttered that it needs to be. It is possible that there are those who will argue that the device is the filesystem and not where it is mounted; they are arguably right, but when you use df you are either looking at where in the Unix file hierarchy there are places that have less space than is comfortable, or for places that have enough space to put that big file you are about to download.

Next the command itself has an obscure command to make it easier to type on a slow type-writter like terminal (those who are below a certain age will not realise that we used to comminicate with Unix machines using a terminal that was more like a printer than the screens we use today). It might be better named fsspace with an alias of diskspace for those who want to concentrate on what worries them rather than on what worries the machine.

Next why not take advantage of certain features that have crept almost silently into the command line over the last few decades ? Why not adjust the output to the width of the terminal window (look for the $COLUMNS evironment variable), spacing things out or even adding more information when you have enough space?

Finally if you were to dig around the df command a little bit you will encounter something peculiar called “inodes”. Now I know what an inode is, and I dare say quite a few people reading this will know, but if you do not, knowing how many inodes there are is not very useful information. It is relatively rare (these days) for a filesystem to run out of inodes so this information has a low priority, and why not use a term more understandable than “inodes” ?

Changing a term is something to be avoided in most circumstances which is why we still have “inode” where even the originator of the term has to guess that the “i” means “index”. I would suggest that something like “fileslots”or perhaps “fslots”

We now have the basic specification of something that should look like :-

% diskspace
Filesystem                            Size  %Used %fslots  Avail
/                                      12G    70%      3%   3.6G
/lib/init/rw                          2.0G     0%      0%   2.0G
/var/run                              2.0G     0%      0%   2.0G
/var/lock                             2.0G     0%      0%   2.0G
/dev                                  2.0G     0%      1%   2.0G
/dev/shm                              2.0G     0%      0%   2.0G
/lib/modules/2.6.27-7-generic/vola+   2.0G     0%      0%   2.0G
/boot                                 130M    29%      0%    88M
/opt                                  2.0G    40%      1%   1.2G
/var                                  5.0G    18%      0%   4.1G
/home                                 256G    52%      0%   124G
/cdimages                              32G    65%      0%    12G
/mdata                                463G    36%      1%   280G

This could be improved in some ways – for instance it would be helpful to skip over certain of the filesystems that are not strictly speaking backed by disk. However it is beginning to be useful.

Or would be if the code exists. Fortunately it does.

WP2Social Auto Publish Powered By :