All posts by Peter Spangler

Setting up and using Cron

Cron jobs are a mainstay of Unix and Linux. Cron lets you run a shell script or other program on a schedule. The system has been around forever and it just plain works, even if it is a little on the cryptic side. It is ubiquitous and widely used; automating backups is a common example. When administering a server I would say it is one of the things to think about when setting up a system: what kinds of tasks will you need to run, and when? (This will, of course, evolve over time. But still….)

Pretty much every implementation of Linux/Unix does the same thing. There is a cron daemon that wakes up every minute, goes through all the cron files on the system, and checks what it needs to run. You can’t schedule cron jobs any more precisely than at the minute level: there is no way to say ‘run a script at 5:33:45 on Monday’, for instance (and why would you ever want to do that anyway?)

Basic Usage

Cron operates on crontab files. You don’t access these files directly to view or edit them. The system maintains these in some location. Whenever you want to do anything with a crontab file, you use the crontab command.

At the command line you can view your crontab with crontab -l. This simply prints out your crontab to stdout. If you haven’t set one up yet then you’ll most likely see a message saying something like ‘crontab: no crontab for you’ (where ‘you’ is your username on the system, of course.)

In general every user with an account has access to setting up cron jobs. To do this, run ‘crontab’ with the -e flag to edit your crontab file. That will kick you into your editor (whichever it is set to, or vi by default.) Initially this will be empty but here we’ll show you how to set that up.

[me@myserver ~]$ crontab -e

However, if you don’t want to edit the crontab file interactively, you can also edit the file in whatever editor you choose and set it up with the crontab command by giving the filename. Usually when doing that I check that I loaded in what I thought by running crontab -l right after.

[me@myserver ~]$ crontab my_crontab.txt
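The load-then-check habit described above can be wrapped in a small shell function. This is only a sketch: install_crontab is a made-up name, and it assumes your cron prints the crontab back exactly as it was loaded (some older implementations add header comment lines to the output of crontab -l, in which case the diff will show them).

```shell
#!/bin/sh
# Hypothetical helper (not part of cron): load a crontab file,
# then compare what cron now holds against that file.
install_crontab() {
    file=$1
    # Replace the current user's crontab with the contents of the file.
    crontab "$file" || return 1
    # diff prints nothing and returns 0 when the installed crontab matches.
    crontab -l | diff - "$file"
}

# Example call (this REPLACES your current crontab):
#   install_crontab my_crontab.txt
```

If the function prints nothing and returns 0, what is loaded is what was in the file.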

In a crontab file lines that start with a hash mark (‘#’) are comments. The cron program will ignore those lines.

One entry in a crontab file is set up per line. This is a space-delimited list that has the information about what program or script to run and how often. The format is:

mm hh dd tt ww command

Where each field is defined as such:

mm minute past the hour to run the cron job (0 to 59). Having the minutes field first is usually the first thing that throws you off, since we’re so used to putting the hour first whenever we talk about time.
hh hour of day to run the cron job (0 to 23)
dd day of month (1 to 31)
tt month (1 to 12)
ww day of week (0 to 6, with 0=Sunday, 1=Monday, etc.)
command the absolute path to the script/program to run, along with any redirection to send output wherever it should go.
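To make the field order concrete, here is a small sketch that splits one crontab entry into the five schedule fields and the command. The function name describe_entry is made up for illustration. It only splits on whitespace the way the format above describes and does no validation.

```shell
#!/bin/sh
# Hypothetical helper: split a crontab entry into its named fields.
describe_entry() {
    # read splits on whitespace; the last variable receives the rest of the
    # line, so the command keeps its own spaces and redirections.
    read -r mm hh dd tt ww command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$mm" "$hh" "$dd" "$tt" "$ww" "$command"
}

describe_entry '0 22 * * * /backup/bar.sh > /dev/null 2>&1'
# prints: minute=0 hour=22 day-of-month=* month=* day-of-week=* command=/backup/bar.sh > /dev/null 2>&1
```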

The basic usage is pretty straightforward. For example, we might want to run a script at 10pm every night…

0 22 * * * /backup/bar.sh > /dev/null 2>&1

Where cron can get a bit cryptic is when you want to run things multiple times a day or every few days. To run a command at two different times, say 10am and 10pm, you could put each hour value separated by a comma:

# Run foo.sh at 10:15am and 10:15pm every day and send the output to
# /dev/null (i.e. throw it away)
15 10,22 * * * /home/me/foo.sh > /dev/null 2>&1

Or to run a command every four hours you would use an asterisk, but then specify the multiple of ‘every four hours’ with ‘/4’.

# Run foo.sh every 4 hours
0 */4 * * * /home/me/foo.sh > /dev/null 2>&1

You can also specify a range of values with a hyphen. In practice this probably only ever makes sense in the column indicating which days of the week to run a script.

# Run bar.sh at 5:30pm every workday (mon-fri), and save the output to a logs directory…
30 17 * * 1-5 /home/me/bar.sh >> /home/me/logs/bar.log 2>&1
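If the comma, ‘/4’ and hyphen forms above still feel cryptic, it can help to expand a field into the actual values it matches. The function below is a rough sketch written for this post, not something cron provides: expand_field is a made-up name, and it handles only the forms shown above (lists, ‘*’ with a step, and ranges), with no input validation.

```shell
#!/bin/sh
# Hypothetical helper: list the values a single cron field matches.
# Usage: expand_field FIELD MIN MAX
# The body runs in a subshell ( ... ) so the IFS and globbing changes
# do not leak into the calling shell.
expand_field() (
    field=$1 min=$2 max=$3
    out=""
    IFS=,      # list items are separated by commas
    set -f     # keep '*' from being expanded as a filename pattern
    for part in $field; do
        range=${part%%/*}                          # text before any '/'
        step=1
        case $part in */*) step=${part##*/} ;; esac
        case $range in
            '*') lo=$min; hi=$max ;;               # every allowed value
            *-*) lo=${range%%-*}; hi=${range##*-} ;;
            *)   lo=$range; hi=$range ;;           # a single value
        esac
        v=$lo
        while [ "$v" -le "$hi" ]; do
            out="$out $v"
            v=$((v + step))
        done
    done
    echo "${out# }"
)

expand_field '10,22' 0 23   # prints: 10 22
expand_field '*/4' 0 23     # prints: 0 4 8 12 16 20
expand_field '1-5' 0 6      # prints: 1 2 3 4 5
```

The MIN and MAX arguments are the allowed range for the field in question, for example 0 and 23 for the hour field.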

Version control of your crontab files

Since you can both load a crontab from a text file and list it out easily, it is not too much work to keep your crontab files under a version control system (e.g. git or Subversion). Then as you add more things to your crontab you’ll have a log of these changes over time.
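As a minimal sketch of that workflow with git: the function below dumps the current crontab into a working copy and commits it only if something changed. The name snapshot_crontab, the file name crontab.txt and the commit message are all arbitrary choices for illustration, and the directory you pass in is assumed to already be a git repository.

```shell
#!/bin/sh
# Hypothetical helper: save the current crontab into an existing git
# working copy and commit it when it has changed.
snapshot_crontab() {
    repo_dir=$1
    # crontab -l fails when there is no crontab yet; stop in that case
    # rather than overwrite the tracked copy with an empty file.
    if ! crontab -l > "$repo_dir/crontab.txt.new"; then
        rm -f "$repo_dir/crontab.txt.new"
        return 1
    fi
    mv "$repo_dir/crontab.txt.new" "$repo_dir/crontab.txt"
    git -C "$repo_dir" add crontab.txt
    # diff --cached --quiet returns non-zero only when something is staged.
    git -C "$repo_dir" diff --cached --quiet ||
        git -C "$repo_dir" commit -q -m "Update crontab"
}

# Example call:
#   snapshot_crontab "$HOME/crontab-history"
```

Going the other way, loading the tracked file back in is just the crontab command with that file name, as shown earlier.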

Considerations for scripts run via cron

There’s nothing to say you can’t run a compiled program with cron — we do this often enough. But probably the main use of cron is to run a script of some kind — BASH, Python, Perl, PHP scripts can all be run via cron.

1. At the risk of stating something kinda obvious, your script/program should start up, do its thing, and exit. It shouldn’t hang around running forever. If you want a daemon, set up a daemon (that’s a future blog post.)

2. Your script should use absolute paths — you can’t assume the script is run from your home directory, for instance.

3. Some jobs might take a long time to run. You should take this into account so that one invocation of your script does not overlap the next invocation by cron. Putting in a check of some kind (a sentinel or lock file) to see if your script is still running can be effective. Alternatively, if invocations do overlap, make sure each instance of your script can coexist happily with the others.
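One way to sketch the lock idea from point 3 in plain shell is to use a lock directory rather than a lock file, since mkdir either creates the directory or fails in a single step. The function names and the lock path here are made up for illustration. Note that this simple version leaves a stale lock behind if the job is killed outright, so treat it as a starting point only.

```shell
#!/bin/sh
# Hypothetical helpers: a lock directory guards against overlapping runs.

# mkdir fails if the directory already exists, so only one caller can
# succeed at a time.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1"
}

# Usage: run_once LOCK_DIR COMMAND [ARGS...]
# Runs the command only if no other run currently holds the lock.
run_once() {
    lock_dir=$1
    shift
    if ! acquire_lock "$lock_dir"; then
        echo "previous run still in progress, skipping" >&2
        return 0
    fi
    "$@"
    status=$?
    release_lock "$lock_dir"
    return "$status"
}

# Example call from a wrapper script (paths are placeholders):
#   run_once /tmp/bar.sh.lock /home/me/bar.sh
```

Skipping quietly when the lock is held is a design choice; you may prefer to log it or exit non-zero so that a stuck job gets noticed.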

*** Update: (2013-03-15) Just came across a link to Chronos, which is a replacement for Cron developed by the folks at Airbnb. I haven’t looked at it much yet but wanted to throw in a link to that here.

Which programming language/tool/other technology to learn?

When I was an undergrad studying computer science I remember walking with a couple of friends and fellow students through one of the buildings on campus one night. As it happened it was the building where the computer science department was located. We were (probably) heading to one of the computer labs, and the chair of the department had just come out of the elevator and was on his way out of the building. We said hello and he then decided to take the moment to share a bit of advice, which was to learn Java. That was it, fairly simple. I think at the time we looked at each other and more or less shrugged our shoulders and got on with things.

Somehow I’ve remembered that bit of advice from our department chair since then (although certainly I didn’t act on it at the time in a meaningful way). That was about twenty years ago, perhaps 1993 or 1994, and at the time Java was still a pretty new language, and none of our regular computer science courses was using the language. With a few exceptions most of the programming I was doing in my classes as an undergrad was in C. My introductory programming class in college used a language called Modula-2, which probably very few remember now (and I believe that was the last semester they taught that class with Modula-2, switching to C afterwards.) There was also a course on programming language design that introduced us to Scheme (a close cousin of LISP). The point though was that the department chair recognized that this was going to be an important language and was worth investing the time to learn.

When getting started in this business there is an incentive to try to learn a lot of different languages, or development tools. After all, the more one knows, the more potentially valuable one is when hitting the job market. Or at least so we might be inclined to think, a point I’ll return to later. But there are any number of things one might spend time learning and only a subset of those will prove to actually be useful. This is a classic problem of course: how do you decide what to focus on and what to set aside? A couple of years ago when I was teaching I had a student more or less ask this question as well. I think since I was an undergrad the problem has become much more challenging. There are simply more languages and applications in active use now. We have more choices for databases, and mobile app development didn’t even exist ten years ago.

Trying to decompose this problem is worth doing I think. For any given language, tool, or other bit of technology we might pose a few questions:

How widely used is it now?

A widely used technology can be comforting to get into since you may assume there will be more resources to help get up to speed with it, as well as a broader community to tap into. If we are talking about programming languages, then you might look to see which ones there seems to be more demand for. The Jobs Tractor Language Trends – February 2013 report for instance shows Java and PHP being more popular now. But a narrow market segment isn’t necessarily bad either. In some niche areas a less widely used language or technology might be the dominant one in that area. Also, new technologies have to start out somewhere and can sometimes take a while to find their audience, which leads to the second point.

Potential for growth?

Besides current popularity another variable is what kind of growth might be expected for people that know Technology X.

Don’t confuse how widely used something is with demand — they are related but not the same thing. There’s still a market for COBOL and Fortran developers, for instance.

Open vs. Proprietary

Right now there is a division between open technologies and closed, proprietary ones. With proprietary technologies you can expect your investment cost to go up if only because of the need to acquire the necessary software (and license).

Getting into mobile app development is a good case study here. Becoming an iPhone developer, for example, requires a certain up-front investment: you need a) a Mac to run Xcode and the whole development environment, b) an iPhone, and c) a membership in Apple’s iPhone developer program. However, that’s a popular platform to develop for (at the risk of understatement) so yeah, it’s pretty compelling. It should be said, given this example, that becoming an Android developer isn’t free either, but you can get by with a less expensive computer running Linux and you can get Java for free.

Is it something you are interested in?

I think there’s something to be said for pursuing things you actually are interested in versus things that you think will be good to know, but are otherwise not that into. You’ll generally do better at things that you are intrinsically inclined towards. At the end of the day I think this is the one to weigh most.

Avoid spreading yourself too thin

Finally, to get back to the point about trying to ‘learn everything’, at a certain point I think it is important to recognize that you can’t possibly do that, and certainly not at the same time. There is likely an added bonus that comes from having experience with a variety of tools (seeing how different languages handle the same or similar problems can be insightful, for instance) but that kind of knowledge comes over time. Over the span of a career your primary tools are going to evolve anyway.

The list above is just an attempt to sketch out how to think about this problem in some systematic way; I certainly don’t think I have any definitive answers here. What I can say is that, at least for myself, I’ve decided that there are certain areas where I don’t see myself investing time, so that I can focus on others. I’m less inclined to get into .Net and Windows development at this point in my career, for instance, since that would be a pretty significant switch from where my current skill set is, and quite frankly I’m not as interested in it. This is certainly not to disparage .Net and Windows development; it’s just a choice of where to spend resources.

W. Edwards Deming’s view of systems

I’ve had a general interest in systems and systems theory and how different people have written about it. Depending on the context, different authors will view what a system is differently. Recently I’ve been reading W. Edwards Deming’s The New Economics for Industry, Government, Education. Deming, who passed away in 1993, had a long career as a management consultant. He is probably best known for his work in Japan, where he introduced concepts of quality control. The Deming Prize was created in his honor and continues to be awarded every year. (See here for more information on the Deming prize than is available on the JUSE site.)

Deming’s concept of a system is a key part of his overall thinking and is the basis for what he termed the ‘system of profound knowledge’. His view of systems differs a bit from how systems are described elsewhere. To start, Deming is primarily concerned with man-made systems, though he does not go on to elaborate much on how a man-made system differs from naturally occurring systems. His basic point that any system consists of several, interdependent components is not, in itself, a radical departure from other conceptualizations of systems. A component might be a person, and it does seem that often Deming uses the term ‘component’ interchangeably to refer to people in a system. There’s no reason to think that components can’t include other items besides people though — equipment of various kinds, raw materials, etc.

Where he does differ is in other statements about systems and their dynamics. To understand how Deming arrives at his conceptualization of a system it is worth focusing for a moment on his idea of the aim of a system. Any system must, according to Deming, have an aim, and if it doesn’t then by definition it isn’t a system. All the components in the system need to be focused on the aim and pursue that to the best of their ability. He further recommends that the aim of a system should be that all the people involved in the system benefit over the long term from their participation in it.

What is perhaps an unexpected point is that he argues that competition is ultimately a destructive force, and a system must be actively managed to prevent that. If left untended, the components of a system will naturally turn towards pursuing their own individual aims at the expense of the overall aim of the system. This, then, is the role of management: to regulate the components and keep them focused on the aim. What is interesting here is that there are two ways to read this. One is that Deming didn’t think that there could be any self-regulation in a system, at least among the totality of the components. The other interpretation is that the management in a system constitutes the self-regulatory mechanism. That, however, seems to be a different kind of self-regulation than the emergent phenomenon that arises from the interaction of all the components in a system.

Deming had nothing good to say about competition. He explicitly states that competition is ultimately a destructive force, and that we should instead strive towards cooperation. At the same time he argues that a system includes competitors, so it does seem that he acknowledges that competition exists. There is perhaps a bit of ambiguity here, but it seems that some form of competition could be beneficial; he describes a positive form of competition when competitors are seeking to expand a market or to provide better service in some way. It does seem that Deming has two views of competitive practice. The first, ‘win-lose’ form is ultimately destructive. When companies seek to increase their share of a market or to push other competing companies out, that is ultimately unhealthy. The second, ‘win-win’ version would look a bit different: instead of looking to take over more of a market, all the players in some space would look to expand that market so that all would come out ahead. Such a ‘competitor’ is ultimately a kind of cooperative competitor.

The description of systems in The New Economics is not as fully fleshed out as one might hope. There are a number of statements that Deming makes that are simply asserted without any further discussion. The text often refers to another book Deming wrote some years prior, Out of the Crisis, and it may be that Deming thought of these books as being companion pieces. Suffice it to say in concluding that Deming, given his profession and most of his experience, was primarily interested in systems as they are found in organizations we create. This is clear from the full title of the book, referring to industry, education, and government. While Deming no doubt had a broad view of how systems thinking could be applied, it probably did have limits.

The New Economics for Industry, Government, Education
Second Edition
Deming, W. Edwards

© 1994 The W. Edwards Deming Institute

ISBN-13 : 978-0-262-54116-9

An initial post…

Somehow when it comes to my own site it’s taken me forever to get something up. At least something more than a totally blank page with my LinkedIn and Twitter links; I never seem to be quite happy with any designs I come up with….

So, I’m taking a new tack now. Knowing I probably won’t really be happy with the look of the thing for a while, I’m trying a basic WordPress install with the thought that I’ll just evolve it over time as I use it. We’ll see how that goes.