Archive for the ‘shell’ tag
Unix: Summing the total time from a log file
As I mentioned in my last post we’ve been doing some profiling of a data ingestion job and as a result have been putting some logging into our code to try and work out where we need to focus our efforts.
We end up with a log file peppered with different statements which looks a bit like the following:
18:50:08.086 [akka:event-driven:dispatcher:global-5] DEBUG - Imported document. /Users/mneedham/foo.xml in: 1298
18:50:09.064 [akka:event-driven:dispatcher:global-1] DEBUG - Imported document. /Users/mneedham/foo2.xml in: 798
18:50:09.712 [akka:event-driven:dispatcher:global-4] DEBUG - Imported document. /Users/mneedham/foo3.xml in: 298
18:50:10.336 [akka:event-driven:dispatcher:global-3] DEBUG - Imported document. /Users/mneedham/foo4.xml in: 898
18:50:10.982 [akka:event-driven:dispatcher:global-1] DEBUG - Imported document. /Users/mneedham/foo5.xml in: 12298
I can never quite tell which column I need to get so end up doing some exploration with awk like this to find out:
$ cat foo.log | awk ' { print $9 }'
1298
798
298
898
12298
Once we’ve worked out the column then we can add them together like this:
$ cat foo.log | awk ' { total+=$9 } END { print total }'
15590
I think that’s much better than trying to determine the total run time in the application and printing it out to the log file.
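Since the timing is the last field on each line, one variation is to let awk pick it out with $NF rather than counting columns, and to restrict the sum to the ‘Imported document’ lines so that other log statements are ignored. This is only a sketch, assuming the log format shown above. The sample file, including the one unrelated statement, is made up here purely for illustration:

```shell
# Recreate a small sample log (hypothetical data mirroring the lines above,
# plus one unrelated statement that should not be counted).
log=$(mktemp)
cat > "$log" <<'EOF'
18:50:08.086 [akka:event-driven:dispatcher:global-5] DEBUG - Imported document. /Users/mneedham/foo.xml in: 1298
18:50:08.500 [akka:event-driven:dispatcher:global-2] DEBUG - Some other statement
18:50:09.064 [akka:event-driven:dispatcher:global-1] DEBUG - Imported document. /Users/mneedham/foo2.xml in: 798
18:50:09.712 [akka:event-driven:dispatcher:global-4] DEBUG - Imported document. /Users/mneedham/foo3.xml in: 298
18:50:10.336 [akka:event-driven:dispatcher:global-3] DEBUG - Imported document. /Users/mneedham/foo4.xml in: 898
18:50:10.982 [akka:event-driven:dispatcher:global-1] DEBUG - Imported document. /Users/mneedham/foo5.xml in: 12298
EOF

# $NF is the last field, so there is no need to work out the column number.
# The /Imported document/ pattern skips lines that carry no timing.
total=$(awk '/Imported document/ { total += $NF } END { print total }' "$log")
echo "$total"

rm -f "$log"
```

Passing the file name straight to awk also avoids the extra cat process, although the result is the same either way.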
We can also calculate other stats if we record a log entry for each record:
$ cat foo.log | awk ' { total+=$9; number+=1 } END { print total/number }'
3118
$ cat foo.log | awk 'min=="" || $9 < min {min=$9; minline=$0}; END{ print min}'
298
Browsing around the Unix shell more easily
Following on from my post about getting the pwd to display on the bash prompt all the time I have learnt a couple of other tricks to make the shell experience more productive.
Aliases are the first new concept I came across and several members of my current team and I now have these set up.
We are primarily using them to provide a shortcut command to get to various locations in the file system. For example I have the following ‘work’ alias in my ~/.bash_profile file:
alias work='cd ~/path/to/my/current/project'
I can then go to the bash prompt and type ‘work’ and it navigates straight there. You can put as many different aliases as you want in there, just don’t forget to execute the following command after adding new ones to get them reflected in the current shell:
. ~/.bash_profile
A very simple idea but one that helps save so many keystrokes for me every day.
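If you want a shortcut that can also take a sub-directory, a shell function is one option, since an alias simply expands to fixed text and any argument is appended after it. A minimal sketch follows. The name proj and the PROJECT_ROOT variable are made up for illustration and are not part of the setup described above:

```shell
# Hypothetical root directory; adjust to your own project location.
PROJECT_ROOT="${PROJECT_ROOT:-$HOME/path/to/my/current/project}"

# proj        -> cd to the project root
# proj src    -> cd to the 'src' directory under the project root
proj() {
  cd "$PROJECT_ROOT/${1:-}" || return 1
}
```

Like the alias, this would live in ~/.bash_profile and needs the file to be re-read before it is available in the current shell.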
Another couple of cool commands I recently discovered are pushd and popd.
They help provide a stack to store directories on, which I have found particularly useful when browsing between distant directories.
For example suppose I am in the directory ‘/Users/mneedham/Desktop/Blog/’ but I want to go to ‘/Users/mneedham/Projects/Ruby/path/to/some/code’ to take a look at some code.
Before changing to that directory I can execute:
pushd .
This will push the current directory (‘/Users/mneedham/Desktop/Blog/’) onto the stack. Then once I’m done I just need to run:
popd
I’m back to ‘/Users/mneedham/Desktop/Blog/’ with a lot less typing.
Running the following command shows a list of the directories currently on the stack:
dirs
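pushd also accepts a directory argument, in which case it changes directory and keeps the previous one on the stack in a single step. Here is a short bash sketch of the round trip, using throwaway directories created with mktemp in place of the real paths above:

```shell
#!/usr/bin/env bash
# Throwaway directories standing in for the 'Blog' and 'code' locations.
blog=$(mktemp -d)
code=$(mktemp -d)

cd "$blog"
pushd "$code" > /dev/null   # move to $code; $blog stays on the stack beneath it
dirs -v                     # list the stack, one numbered entry per line
popd > /dev/null            # drop the top entry and return to $blog
back=$(pwd)
echo "$back"
```

pushd and popd print the stack each time they run, which is why their output is sent to /dev/null here.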
I love navigating with the shell so if you’ve got any other useful tips please share them!
Calling shell script from ruby script
Damana and I previously posted about our experiences with different Ruby LDAP solutions.
Having settled on Ruby-LDAP (although having read Ola and Steven’s comments we will now look at ruby-net-ldap) we then needed to put together the setup, installation and teardown into a ruby script file.
A quick bit of Googling revealed that we could use the Kernel.exec method to do this.
For example, you could put the following in a ruby script file and it would execute and show you the current directory listing:
exec "ls"
The problem with using Kernel.exec, which we became aware of after reading Jay’s post, is that we lose control of the current process – i.e. the script will exit after running ‘exec’ and won’t process any other commands that follow it in the file.
Luckily for us there is another method called Kernel.system which allows us to execute a command in a sub shell, and therefore continue processing other commands that follow it.
We were able to use this method for making calls to the make script to install Ruby-LDAP:
@extconf = "ruby extconf.rb"
system @extconf
system "make"
system "make install"
There is one more option, %x[...], which we can use if we need to collect the output of the command. We didn’t need to collect the results so we have gone with ‘Kernel.system’ for the time being.
Jay covers the options in more detail on his post for those that need more information than I have presented.
Show pwd all the time
Finally back in the world of the shell last week, I was constantly typing ‘pwd’ to work out where exactly I was in the file system. Then my colleague pointed out that you can adjust your settings to get this to show up automatically on the left hand side of the prompt.
To do this you need to create or edit your .bash_profile file by entering the following command:
vi ~/.bash_profile
Then add the following line to this file:
export PS1='\u@\H \w\$ '
You should now see something like the following on your command prompt:
mneedham@Macintosh-5.local /users/mneedham/Erlang/playbox$
Another colleague pointed out that the information on the left side is completely configurable. The following entry from the manual pages of bash (type ‘man bash’ then search for ‘PROMPTING’) shows how to do this:
PROMPTING
When executing interactively, bash displays the primary prompt PS1 when it is ready to read a command, and the secondary prompt PS2 when it needs more input to complete a command. Bash allows these prompt strings to be customized by inserting a number of backslash-escaped special characters that are decoded as follows:
\a an ASCII bell character (07)
\d the date in "Weekday Month Date" format (e.g., "Tue May 26")
\D{format}
the format is passed to strftime(3) and the result is inserted into the prompt string; an empty format results in a locale-specific time representation. The braces are required
\e an ASCII escape character (033)
\h the hostname up to the first `.'
\H the hostname
\j the number of jobs currently managed by the shell
\l the basename of the shell's terminal device name
\n newline
\r carriage return
\s the name of the shell, the basename of $0 (the portion following the final slash)
\t the current time in 24-hour HH:MM:SS format
\T the current time in 12-hour HH:MM:SS format
\@ the current time in 12-hour am/pm format
\A the current time in 24-hour HH:MM format
\u the username of the current user
\v the version of bash (e.g., 2.00)
\V the release of bash, version + patch level (e.g., 2.00.0)
\w the current working directory
\W the basename of the current working directory
\! the history number of this command
\# the command number of this command
\$ if the effective UID is 0, a #, otherwise a $
\nnn the character corresponding to the octal number nnn
\\ a backslash
\[ begin a sequence of non-printing characters, which could be used to embed a terminal control sequence into the prompt
\] end a sequence of non-printing characters
This page has more information on some of the other files that come in useful when shell scripting.
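As a small illustration of combining the escapes listed above, the following line shows the 24-hour time, the user, the short hostname and only the basename of the working directory. The particular combination is an arbitrary choice for the example, not the prompt used earlier in this post:

```shell
# Example prompt layout: [HH:MM] user@host:dirname$
# \A = 24-hour HH:MM, \u = username, \h = hostname up to the first '.',
# \W = basename of the working directory, \$ = '#' for root, '$' otherwise.
export PS1='[\A] \u@\h:\W\$ '
```

The single quotes matter: they stop the shell from interpreting the backslashes when the variable is set, so that bash can decode them each time it draws the prompt.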