Musings of an anonymous geek

September 17, 2007

UNIX mtime vs. ctime

Filed under: Linux,Sysadmin,Technology — m0j0 @ 8:27 am

Sometimes I get questions from people about stuff that I’ve long since taken for granted. One is “what’s the difference between mtime and ctime?”

The answer is simple, but I wanted to post it here in case it can help anyone. In UNIX:

mtime is “modification time”, and it is the time at which the last modification *TO THE CONTENT* of the file was made.

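ctime is “change time”, and it is the time at which the last change *TO THE INODE* of the file was made. That covers properties like ownership, permissions, and link count, so a “ch*” command like chown, chmod, or chgrp updates ctime without touching mtime. You can think of this as “chtime”, with two caveats: writing to the file updates ctime as well (the inode’s size and mtime change along with the content), and ctime is not the file’s creation time.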
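If you want to see the difference for yourself, here is a minimal sketch in Python. It assumes a POSIX system (on Windows, `st_ctime` means something else), and the `timestamps` and `demo` helper names are made up for this illustration. It creates a scratch file, does a chmod, then appends to the file, and reports which timestamps moved each time.

```python
import os
import tempfile
import time


def timestamps(path):
    """Return (mtime_ns, ctime_ns) for path."""
    st = os.stat(path)
    return st.st_mtime_ns, st.st_ctime_ns


def demo():
    """Report which timestamps move after a chmod and after a write."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        m0, c0 = timestamps(path)

        time.sleep(1.1)              # outlast coarse (1-second) filesystem timestamps
        os.chmod(path, 0o644)        # metadata-only change
        m1, c1 = timestamps(path)

        time.sleep(1.1)
        with open(path, "a") as f:   # content change
            f.write("hello\n")
        m2, c2 = timestamps(path)

        return {
            "chmod_moved_mtime": m1 != m0,
            "chmod_moved_ctime": c1 > c0,
            "write_moved_mtime": m2 > m1,
            "write_moved_ctime": c2 > c1,
        }
    finally:
        os.remove(path)
```

Calling `demo()` and printing the result should show that chmod moves only ctime, while a write moves both.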


September 7, 2007

New Job!

Filed under: Linux,Me stuff,Python,Scripting,Sysadmin,Technology — m0j0 @ 7:32 am

I started a new job about 6 weeks ago. I’m now doing infrastructure architecture at GFDL.

GFDL stands for Geophysical Fluid Dynamics Lab. It’s a NOAA site that supports atmospheric and climatology research. So in other words, the work I do supports research into things ranging from global warming to what the atmosphere on Mars is like to the weather here on Earth to simulations of the shape and movement of Katrina. I think of it as sort of an Institute for Advanced Study devoted to climatology research. Great minds in the field are here.

The research actually takes place at three different sites (DC, Boulder and Princeton), and affiliations with academic institutions flourish as well. In fact, I knew at least 4 people who worked here because of interactions between this site and my former employer.

My job, as it’s been described to me, is to provide a vision as to the design and direction of the infrastructure which supports the rather enormous high performance compute (HPC) cluster. This involves something of a learning curve to understand what’s here, how the systems are used, what the needs are, what people like and hate, where the redundancies and inefficiencies exist, etc. It also involves having meetings and coordinating with people who manage the network, the facilities (power & cooling, etc), the security policy, etc. I’ll be grilled on my ideas, and create prototypes and demos to get my ideas across. Lots of communication.

Part of my job will also involve getting my hands on the HPC clusters themselves, which are at each site. All of the clusters are on last time I looked. Just go through the pages and search for GFDL and/or NOAA.

The systems here are all Linux. Even the standard-issue workstations are running Linux. Scripting is done in Perl and shell, but Python is everywhere, so I’ll be doing either Perl or Python if I have the choice (because “shell” == “csh” here, which I never took well to, honestly). Some aspects of the environment are pretty fascinating. For example, how exactly do you store (*and* easily retrieve, on the fly) 9 PETABYTES of data? How do you back that up? How do you recover from hiccups? How do you instrument systems consisting of thousands of CPUs to pinpoint problems and get them fixed? And, by the way, what’s the best way to tune a system’s network stack to use a 50MBps pipe (that’s Mega *bytes*) efficiently enough to move multiple terabytes of data every day between collaborators at different sites? How, exactly, do you consolidate services and provide failover across geographically dispersed sites?

So that’s it for now 🙂  It’s too early to tell how things are going, really. It’s certainly not the cushy environment that Princeton U. was, but there are bigger challenges and problems to be solved here, and that’s the part I’m looking forward to.

March 30, 2007

Finding Needles With ‘sort’ and ‘uniq’

Filed under: Linux,Scripting,Sysadmin,Technology — m0j0 @ 10:34 am

I had to do this recently, and so I thought it would be useful to share this for two reasons:

  1. Someone else may need to do it and find this technique useful
  2. Someone else may know a better way of doing this

Quick ‘n’ dirty explanation: you have two lists. One list is a superset of the other list. You want to identify all of the items that exist *only* in the larger list. Here’s how you do that:

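sort small_list large_list | uniq -u

Note that ‘uniq -u’ is not the same as ‘sort -u’. The former will display only the lines that occur exactly once in its (sorted) input. The latter displays all lines, *once*, regardless of how many times they occur. Handing both files to ‘sort’ concatenates them for you, so neither list gets modified. This assumes neither list contains duplicates of its own: an item that appears twice in the large list would be dropped from the output even if it’s missing from the small one.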
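For anyone who would rather do this inside a script, the same idea can be sketched in Python. This is only an illustration: the helper names `uniq_u` and `sort_u` and the sample lists are made up, and it carries the same assumption that neither list has internal duplicates.

```python
from collections import Counter


def uniq_u(lines):
    """Like `sort | uniq -u`: sorted lines that occur exactly once."""
    counts = Counter(lines)
    return sorted(line for line, n in counts.items() if n == 1)


def sort_u(lines):
    """Like `sort -u`: every distinct line, once, sorted."""
    return sorted(set(lines))


small_list = ["alice", "bob"]                    # made-up sample data
large_list = ["alice", "bob", "carol", "dave"]   # superset of small_list

combined = small_list + large_list
only_in_large = uniq_u(combined)      # the needles: items missing from small_list
everything_once = sort_u(combined)    # what `sort -u` would have given instead
```

Since one list is a superset of the other, `sorted(set(large_list) - set(small_list))` would get the same needles; the `Counter` version is here because it mirrors what ‘uniq -u’ actually does.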

Longer example explanation: I have an LDAP server, and at some point we added an objectclass and associated attribute to every user account. However, new accounts weren’t being *created* with the objectclass and attribute. At some point, I figured out that there was some inconsistency between account objects, and figured I had better get a list of accounts that didn’t have the objectclass and attribute so I could correct the situation. Problem is, there’s no ‘!=’ operator in LDAP search filters, so I can’t ask for all objects where ‘objectclass != myobjectclass’ or something. (Filters do have a NOT operator, so ‘(!(objectclass=myobjectclass))’ is worth trying first, but the technique here works for any two lists.)

What I did was two ldapsearches. One for all of the objects in that part of the tree, and then another for all objects in that part of the tree with the objectclass in place. Of course, the former list is a superset of the latter, and then we do ‘sort subset superset | uniq -u’ – and that will be the list of people who do *not* have the objectclass associated with their account entry in the directory server.

March 20, 2007

Google Calendar Syncing

Filed under: Linux,Me stuff,Productivity,Python,Scripting,Technology — m0j0 @ 11:24 am

So, I’m kinda tired of trying to find a solution to this. What I want is a non-commercial, freely available application (NOT service) that will sync bidirectionally between Google Calendar and Apple iCal, Evolution, and whatever Mozilla calls its calendar today (Sunbird?).

I’ve used Spanning Sync, which worked well enough, but I never liked that my data traversed their servers. Then, they decided to charge me for the privilege of sending my data through their servers, which I didn’t want to do in the first place.

Then I looked at gsync, but I decided not to bother with it because they have commercial aspirations as well, and while they say it’ll only be $20, and it’s a standalone application, I don’t really want to get all set up with it and be let down when they decide it’s worth much more or something.

So I downloaded gcaldaemon, which is an open source, freely available, standalone application, but after about a day of fiddling with it, I couldn’t get it to do much of anything that was useful. What’s more, it’s written in java, so I’m not going to go mucking about with the source code (I don’t code java).

I’ve decided to figure out a solution on my own, as a pet project to help me get used to programming in Python and using the various Google APIs. I just started last night, and I’ve gotten as far as logging in and getting a list of my calendars. I have an extremely long way to go, but between this project and another one I’m doing at work, I should become pretty adept at using Python, and it’s been great fun!

It should be noted that there’s already a Python library for the Google Calendar API. I downloaded it, and I’m using some of that code as example code to get me going, but I’m not using that library as of right now, because I’m more interested in learning than I am in getting a working product immediately. Maybe I’ll do something useful and will be able to contribute code back to that project. Maybe after some time of mucking with this I’ll see the light and decide to use the API. Either way, whatever code I produce will be available to whoever wants it in some form.

March 19, 2007

Code Editor Goodness: Komodo Editor

Filed under: Apple,Linux,Python,Scripting,Sysadmin,Technology — m0j0 @ 7:55 am

Geez…. for a sysadmin I sure seem to write a lot of code. In the past year I’ve written an assignment type for Moodle in PHP, cobbled together an API in Perl to manage various LDAP resources, and I’ve just completed a prototype for an XML-RPC server that will be an interface to our data warehouse (which I designed, and wrote ETL scripts for in Perl, awk, and shell). Whew!

With all of this going on, a good editor is necessary. I’m a sysadmin *first*, and that is where my training is, so naturally I use vi. I understand that “real” programmers use Emacs, but it really doesn’t fit my brain, and if you use vi for any length of time, I believe it becomes dang near impossible to convert without a frontal lobotomy. 😮

Though I like vi a lot for day-to-day administration, I’m not always a fan of how it handles programming, and it seems to take a lot of work to get it to work the way you want it to. My biggest pet peeve about vi is when you enter insert mode and then paste in some code that was already indented. Vi likes to indent it again. Usually I can get around this by setting all of the *indent settings to “no”. For example “:set noautoindent”. However, I set everything I could find yesterday and it was still doing some goofy things with my pasted in code.

Once I got my code in there, and manually removed all of the stupid indentations, I realized another thing: I hate vi’s Python syntax highlighting, and it doesn’t handle Python indentation in a particularly “smart” way. For this reason, I generally use JEdit for programming, but there are no Vi key bindings for JEdit, so I went looking for a new editor to see what I could find. What the heck, it was Sunday. What else did I have to do?

I scored. Komodo Editor is a free-of-charge code editor that runs on Windows, Mac and Linux, is as customizable as you’re likely to ever need it to be, and….. wait for it…. it has a Vi mode!

Since my current project is working with Python, it’s also super, super nice to have the indentation guides, and since I’m a Python newbie, it’s also fantastic to have some of the gentle reminders that I’ve indented wrong, or forgotten the colon after my def. I’ve also made use of the basic-but-useful file comparison interface, which does a diff on files of your choosing. It also does some things that I liked about JEdit, like handling file changes on disk in an intelligent (and flexible, should your definition of ‘intelligent’ differ from Komodo’s) way.

The simple code folding is as one would expect in most graphical editors, and it’s a tad nicer than JEdit, though I wish that there was a keyboard shortcut to collapse/expand the current block of code. I also wish I could define language-specific syntax highlighting based on a regex or something like that. For example, I’d like to color the word “self” in Python differently from what Komodo Editor calls “identifiers”. I also wish that I could find a way to split the window vertically in Komodo. I’ll miss that feature of JEdit. It might warrant me keeping JEdit around for some things.

For the record, I also tried SubEthaEdit, Smultron, a couple of Vim plugins, XCode, and I downloaded and tried (and failed) to get SPE running, too. Komodo fit my brain best. I recommend it if you want some of the graphical goodness of an IDE but don’t want to lose your Vi key bindings. Enjoy!

March 16, 2007

Trying to make friends with Python… again

Filed under: Database,Linux,Scripting,Sysadmin,Technology — m0j0 @ 6:42 am

I like the idea of Python. I have diverse interests, technically, and I like to think that there’s a language out there that I can use to write a small script, a large website, a stored procedure, or a distributed system. The same language that is used to write a very large chunk of systems code on Red Hat systems can also be used to make pretty graphical interfaces. I like that it’s cross platform.

My trouble with Python has been twofold: time, and support. I actually *have* read the introductory tutorial, but it was in 2002. I’ve forgotten just about all of it. I have a copy of the printed Python Reference Library, but it’s from 2000 (if memory serves). I own *both* editions of “Learning Python”, because by the time I got around to reading the first edition, the second edition made it completely obsolete. The other side of the time issue was making time to actually do something useful with the language so as to cement the fundamentals into my brain. That’s sometimes difficult when you’re a sysadmin and don’t really program for a living.

On the support side, I’ve had a lot of problems. Every time I go to do something with Python, I have no idea which route to take. There are so many frameworks and modules that have overlapping problem scopes that it’s hard for me to make a decision. What’s worse, nobody seems to know which module or framework is the canonical way of doing things. I guess things are still young enough to be schizophrenic. With Perl, when they say “there’s more than one way to do it”, that’s speaking more about the syntax of the language than the modules you might use (though it speaks to that, too, somewhat). With Python, the syntax is the (relatively) stable part – it’s choosing modules that can be a challenge.

Right now I’m building an XML-RPC server and a small test client. The client calls functions on the server, and in response, the server queries a PostgreSQL database and returns the results. I got a simple prototype working with real data yesterday, but it took me a long time to figure out exactly which module should be used to talk to PostgreSQL from Python, and which module should be used for implementing the XML-RPC server. I’m comfortable with psycopg2 for the database calls, but I’m using SimpleXMLRPCServer for the server implementation, and I’m just waiting for one of its limitations to bite me. However, Twisted doesn’t seem like it’s quite soup yet in this particular area, and using xmlrpclib to implement a server seems silly with a ready made solution already built in (I know a project that does that, maybe because SimpleXMLRPCServer didn’t exist at the time they started?).
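To give a feel for the shape of such a setup, here is a minimal sketch, not the prototype described above. It uses Python 3 module names (SimpleXMLRPCServer now lives in `xmlrpc.server`, and xmlrpclib became `xmlrpc.client`). The PostgreSQL query is replaced by an in-memory dictionary so that the example runs without a database; `FAKE_ROWS`, `get_rows`, `start_server` and `query` are all hypothetical names.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in for the PostgreSQL query; a real server would call psycopg2 here.
FAKE_ROWS = {"cpu": [["node01", 4], ["node02", 8]]}   # hypothetical data


def get_rows(table):
    """Return rows for a table name, or an empty list if unknown."""
    return FAKE_ROWS.get(table, [])


def start_server():
    """Start the server on a free localhost port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_rows, "get_rows")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(table):
    """Spin up the server, make one client call, and shut down again."""
    server = start_server()
    try:
        port = server.server_address[1]
        with xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
            return proxy.get_rows(table)
    finally:
        server.shutdown()
        server.server_close()
```

Binding to port 0 lets the operating system pick a free port, which keeps the test client and the server from colliding with anything else on the machine. A long-running server would just call `serve_forever()` in the main thread instead.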

So, wish me luck. If you have any input on what you’ve done in this area with Python, fill me in! Also, if you’re an admin who uses Python and knows of a good reference site for simple day-to-day UNIX admin scripting in Python, let me know that too!

March 14, 2007

More news for Spanning Sync Refugees

Filed under: Big Ideas,Linux,Productivity,Scripting,Sysadmin,Technology — m0j0 @ 12:35 pm

First, there are lots of people who are pretty outraged by the new Spanning Sync pricing of $25/year for a subscription service or $65 for a one-time license. The people who are the most outraged are those who are intimately familiar with how buggy it is because they were beta testers. I’m in that camp myself. I no longer use Spanning Sync.

Second, I found this post talking about future pricing of gSync, which is currently in beta and plans to go commercial, but there are two important distinctions:

  1. There’s no central server involved. gSync connects directly to Google Calendar with no intermediary.
  2. They only plan to charge $20 for the download.

Finally, check out this quote from Charlie Wood of the Spanning Sync team:

“For example, another poster on this group (see
explained that he thinks, “Spanning Sync is a great product,” but that
he is, “unfortunately, a supporter of open source or free software,”
and therefore won’t be buying a subscription. My point is that
regardless of the price of the service (unless it was free), he
wouldn’t have ever been a customer of ours.”

This shows a complete lack of understanding about what open source and free software are about. To be clear: NEITHER THE OPEN SOURCE NOR THE FREE SOFTWARE COMMUNITIES SPECIFY THAT SOFTWARE SHOULD NOT BE A COMMERCIAL, MONEY-MAKING PRODUCT.

From the Free Software Foundation site:

You may have paid money to get copies of free software, or you may have obtained copies at no charge. But regardless of how you got your copies, you always have the freedom to copy and change the software, even to sell copies.

“Free software” does not mean “non-commercial”. A free program must be available for commercial use, commercial development, and commercial distribution. Commercial development of free software is no longer unusual; such free commercial software is very important.

And, from the Open Source Initiative website:

“How do I make money on software if I can’t sell my code?

You can sell your code. Red Hat does it all the time.”

Also, *I* am a supporter of free *and* open source software, and regularly pay for software, as do most people who have to get actual work done using tools for which there is no free/open alternative. “Free and Open” does NOT mean “no money changes hands”.

Please, if you’re a software developer, put some due diligence into this, and if you’re a free/open source software supporter, try to work with the community on better relaying the message, because after, like, 20 years, people should’ve started to get this by now.

Safety Precautions When Using the ‘rm’ Command

Filed under: Linux,Scripting,Sysadmin,Technology — m0j0 @ 11:47 am

Usually, if I have a bunch of files that need to go away, I’ll see what I can do to avoid using ‘rm’. Many times, I can move a directory containing the files out of the way, or I can make a backup directory and move the files there. However, at some point, those files are just taking up space, and need to be removed with ‘rm’. I treat this with a lot of caution.

The first thing I do before running the ‘rm’ command is run ‘which rm’. I’m in an environment where some utilities are in a mounted directory, and they duplicate what’s on the local system. I want to know what I’m using.

If I’m on an unfamiliar system, I run ‘man rm’, make sure that man page refers to the binary from ‘which rm’, and then check to see how it handles symlinks. I have yet to see an ‘rm’ that follows symlinks and removes things referenced by them, but I don’t make assumptions. I used to make assumptions like this until one day I ran ‘chown’ on a Solaris system without ‘-h’ and systems all around the department started having issues because they suddenly couldn’t access what they needed to get their work done. :-/

At this point, I used to *type* ‘rm -i’ using the full path to the directory I wanted to work on (which was confirmed using ‘pwd -P’ just to be safe).

Then I’d take my hands off the keyboard and just sit for a moment. I always do this, no matter how stressful the situation. It’s a weird meditative thing. Running the wrong command, or the right one incorrectly, will only make your day worse, no matter how bad it already is. Sit back, close your eyes, and think about what you’re about to do. Then open your eyes, take note of the directory you’re in, take note of the files in there, take note of what user you’re running as, take note of the command you’re running, inspect it character-by-character, and assuming everything is good, I’d hit enter.

The other day I thought of another safety precaution that, while it changes my ritual, might also save me some time. I had to delete all of the PHP files in a directory. In order to ensure that only the intended files got removed, I cd’d to the directory, ran ‘ls -l *.php’ and inspected the output. Carefully. Yep – those are all the files I wanted to delete, so then I did this (in a bash shell, but it works in csh, tcsh, and zsh as well):

^ls -l^rm -f

And that’s it. It removes the possibility of having a typo in the *argument* part of the command, which, when rm is involved, is often what gets you in trouble. If you’ve never seen this notation before, it’s a way to repeat the same command line you just ran, substituting what’s after the first caret with what’s after the second caret. So if I do this:

ls -l *.php

And get the proper output, running

^ls -l^rm -f

Will cause this to be run:

rm -f *.php
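The same “look first, then delete exactly what you looked at” idea carries over to scripts. Here is a rough sketch in Python, with made-up helper names (`matching_files`, `remove_listed`): the pattern is expanded once, the resulting list is what you inspect, and only that list is handed to the removal step, so there is no second chance to mistype the argument.

```python
import glob
import os


def matching_files(directory, pattern):
    """Expand the pattern once and return the sorted list of matches."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


def remove_listed(paths):
    """Remove exactly the given paths; return the ones actually removed."""
    removed = []
    for path in paths:
        # Skip directories and anything that vanished since the listing.
        if os.path.isfile(path) or os.path.islink(path):
            os.remove(path)      # removes a symlink itself, not its target
            removed.append(path)
    return removed
```

In use, you would print the result of `matching_files(".", "*.php")`, read it carefully, and only then pass that same list to `remove_listed`.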

Hope this helps!

March 10, 2007

Can Ubuntu Cut the Gordian Knot?

Filed under: Apple,Big Ideas,Linux,Technology — m0j0 @ 4:10 pm

When Windows was released, it united a vast but rather fragmented society around a single philosophy. On the one side, you had end users. They had to get work done, and they needed applications to do that. On the other side, you had application developers, who needed a platform conducive to making useful applications to sell to the end users. This is, of course, a gross oversimplification of historical events. The point is that geeks and end users alike wound up rallying around Windows.

These days, it would seem that Apple is looking to do the same thing. They’ve lured in end users by enabling them to work with media in new, fun and interesting ways while still allowing them to get work done. Meanwhile, they’ve also attracted the technical crowd because under the covers OS X is really UNIX. In short, Apple has done what Linux has needed to do for a decade now but couldn’t get organized enough to do: they made what amounts to a distribution of BSD that’s so easy to use that the end user never knows what’s under the hood unless they’re curious enough to go and learn about it.

The ability to organize around the idea that a system should be easy to use *first*, and gratifying to geeks *second* has been the Gordian knot of the Linux community. While Apple has pasted a bunch of slick “ooh aah” features onto the desktop, and provided a platform for developers to extend the environment (note I said “extend” and not “fork”), Linux is busy, for the most part, doing things with a mind toward “the community” instead of “the customer”.

The Linux community is making sure that the end user has at least 10 different mp3 players, 5 different desktop environments, 6 different photo management applications, and 20 different scripting languages built in and ready to go. They’re fighting over licensing, attribution, inclusion, exclusion, who’ll take over this project, who’s forking that project, what should this project be called today, what package format will be used, which package manager will be used and does it work with this format, and what’s the best way to support 32-bit programs on 64-bit hardware while still allowing the end user the freedom to build software from source in an environment that looks something like sanity?

I certainly understand that all of those arguments are in some way important. There needs to be this community of concerned technologists who provide so many important things to the technological landscape as it were. The community is a proving ground for ideas, a training ground to develop skill sets, a forum for the discussion on the directions different technologies might go in, and a united force against inane legislation.

However, as important as these arguments are, you have to admit two things:

  1. They don’t make getting things done any faster, and
  2. Apple has already proven that these things can get done quickly if you get organized.

Apple has hit a nice sweet spot in terms of what it delivers to the end user. It doesn’t come ready to do… um… well, nothing – like Windows. On the other hand, it also doesn’t come with 5 ways to perform a single task. Just this small amount of streamlining greatly reduces the amount of work involved in delivering the product, because energy that might otherwise be expended in testing all of the different ways you can set up your printer can now be redirected to solving a problem that doesn’t have a particularly great solution, or (gasp!) writing documentation!!!

What I’m hoping for Ubuntu is that they evolve into a project that can do two things, both of which I think the project is capable of:

  1. Accomplish on the Linux platform what Apple has accomplished on the BSD platform.
  2. Take the word “Linux” out of the larger desktop platform discussion.

Item 1 would involve making some difficult decisions about the applications that will not be included in the distribution, and employing some amount of diplomacy to try to unite developers to get them to work together on solving problems instead of forking every time someone gets their feelings hurt.

Item 2 is *going* to be done by some distribution at some point in time. It won’t be Red Hat, and it won’t be Novell. They’re not interested in you and me. Their interest is in the “enterprise”. They’re smart to go that route. It’s a large market, and it’s a market that isn’t likely to care if there are no mp3 libraries or commercial NVIDIA drivers installed by default. But this doesn’t help us home users.

It’s also not going to be Debian, because their interest is in “keeping it real”, where “real” equals “open”, not “easy to use for non-geeks” or “bleeding edge”. It won’t be Mandriva because, as much as I love Mandriva, they don’t seem to know where to put their energies from one day to the next. Someone needs to get the discussion about the desktop to include the name of their distribution instead of this nebulous “Linux” thing.

Linux is a kernel. The distribution is what makes Linux useful to normal people. We sure as heck don’t talk about “win32” on the desktop, now do we? We talk about Windows. Likewise, we should be talking about “Ubuntu vs. Windows” or “Ubuntu vs OS X” and not “Linux vs Whatever”. People aren’t going to understand comparisons between a kernel and an operating environment. They’re not going to understand a comparison between Windows and an entire movement. There needs to be something identifiable to put up there, and right now, at least in the desktop space, that’s Ubuntu.

So, flame away. Here are a couple of replies up front:

  1. I’m not saying it has to be Ubuntu, I’m just saying that right now, it *is* Ubuntu. Read the article.
  2. Debian religion aside, you have to admit that you’re not likely to hand a Debian netinstall CD to your mom and wish her the best of luck, now are you?
  3. Yes, MEPIS, KDE, Slackware, Gentoo, OpenSUSE, Fedora, {K,Edu,X}ubuntu are also very nice. That’s not the point of the article, though, so please move along.
  4. Yes, choice is good, but anyone who has ever worked in food service is no doubt familiar with people who sit there staring at the menu saying “there’s soooo many choooiiices” and not knowing which way to go. People, by and large, are indecisive. Having one tool to perform a task instead of five, just by itself, will do wonders for the perception of usability on a platform, as evidenced by Windows and OS X.

I’ll reply to the rest as they come in 🙂

November 9, 2006

My LUG/IP Presentation: The Road to Geek Authorship

Filed under: Linux — m0j0 @ 9:00 am

A few months ago, Tom Limoncelli (noted author and, now, a member of the Google team) spoke at our LUG about topics from his newest book, “Time Management for System Administrators”. It was a great talk, and it’s a pretty darn good book as well. 

Anyway, the point here is that, after the talk, there were, of course, a good number of comments, but some of us noticed that a lot of the *questions* were along the lines of “how did you get to the point where you’re writing books and stuff?”

Well, I’m a LUG regular, and I’ve written a book, and write elsewhere besides, so one of the board members suggested I give a presentation to address those questions. 

There was a pretty good crowd, and I more or less know the crowd from going to meetings, so I know when they’re completely uninterested, and I don’t think they were. Well, partly maybe, but not *completely*. I think the talk went relatively well. 🙂
