naeblis.cx
http://naeblis.cx/
The only thing better than meat and potatoes is meat and meat.
Copyright © 2004 naeblis.cx

Del.icio.us Address-barlets
http://naeblis.cx/weblog/DeliciousAddresslets
2004-08-09T02:11:28.00-04:00

The del.icio.us social bookmark system completely replaces my in-browser bookmarks with the addition of the following Keymarks (or Searchlets or Address-barlets or whatever you call them).

Here's an introduction to searching from the address-bar and here's the coolest web based bookmark manager on the planet. Put them together and you can hide the Bookmark Toolbar because you won't be needing it anymore.

My del.icio.us bookmarks are starting to fill in nicely. I started using del.icio.us because I wanted to keep a linkblog and the RESTful web service API makes it possible for me to suck things into this site. My plan moving forward is for this site to be a big aggregator of content publicly available via web services (e.g. Blogger for posting, Bloglines for blogrolling, maybe some forum web app for comments, SourceForge for projects, etc.). More on that later; right now we're talking about using del.icio.us as a searchable bookmark index. Here are the Searchlets you will need to get up and going:

What                                       Shortcut  Destination URL
Search bookmarks for %s                    ds        http://del.icio.us/search/?search=%s
Bookmarks tagged with %s                   dt        http://del.icio.us/username/%s
Another User's Bookmarks                   du        http://del.icio.us/%s
Everyone's bookmarks tagged with %s today  dtt       http://del.icio.us/tag/%s

Note: for dt, you have to edit the URL and put your username in place of "username".

Get these set up in your browser, get a delicious account, and then start using them by working the following into your motor memory. You should be able to get to the address-bar with a keyboard shortcut (e.g. Ctrl+L in Firefox, Command+L in Safari, I'm sure IE has one too). The shortcut will also select all the text in the address-bar so typing replaces the current URL. Once in the address-bar use the Shortcut keyword defined in the table above. For example, search all of your bookmarks with:

<Ctrl+L>
ds foo 
<ENTER>
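
For anyone curious what the browser is doing with `ds foo`, here is a rough sketch of the keyword expansion in JavaScript. The shortcuts and URL templates are the ones from the table; the function name `expandSearchlet` is made up for illustration, and real browsers may escape the search terms differently than `encodeURIComponent` does.

```javascript
// Hypothetical model of address-bar keyword expansion (not browser source).
// Keys are the shortcuts from the table; "%s" marks where the terms go.
const searchlets = {
  ds: "http://del.icio.us/search/?search=%s",
  dt: "http://del.icio.us/username/%s", // put your own username in here
  du: "http://del.icio.us/%s",
  dtt: "http://del.icio.us/tag/%s",
};

function expandSearchlet(input, templates = searchlets) {
  const trimmed = input.trim();
  const space = trimmed.indexOf(" ");
  if (space === -1) return null; // a keyword with no terms: nothing to expand

  const keyword = trimmed.slice(0, space);
  const terms = trimmed.slice(space + 1).trim();
  const template = templates[keyword];
  if (!template) return null; // unknown keyword

  // Assumption: terms are percent-encoded before substitution.
  return template.replace("%s", encodeURIComponent(terms));
}

console.log(expandSearchlet("ds foo"));
// prints "http://del.icio.us/search/?search=foo"
```

Anything that isn't a known keyword followed by terms comes back as `null`, which is roughly where a real browser would fall through to treating the input as an ordinary URL.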

Delicious always seems extremely responsive, so the time from deciding you need a bookmark to looking at it should average out to about one or two seconds, depending on how fast you type. I'd like to see any of you point-and-clickers beat that running through layers of menus for 20 randomly chosen bookmarks.

Other Random Notes

Safari users can grab the Sogudi extension to add support for keyword searching from the address-bar. Replace %s by @@@ for the URLs given above.

If you are new to del.icio.us, I would suggest grabbing the nutr.itio.us bookmarklet to enhance the act of bookmarking.

Persistent NFS Automounting Under OS X 10.3 (Panther)
http://naeblis.cx/notebook/NFSAutomountOSX
2004-08-09T00:35:14.00-04:00

A simple UNIXish method of maintaining NFS automount points under Mac OS X 10.3 (Panther).

Had to do some research on how to set up persistent remote automount points in OS X 10.3 (Panther). Let's break that down.

  • persistent - Mounts should survive restarts.
  • remote - In this case, I'm mounting NFS volumes exported from a Linux box.
  • automount - Automatically mount the volume when I access the local mount point and then unmount after some duration of not using it anymore.

A Google search for "OS X" automount nfs yields some good results that pretty much all say the same stuff:

Using automount on OS X 10.3 means adding a bunch of crap to the NetInfo database using the NetInfo Manager tool.

Much longer descriptions of how to set things up using the GUI or a ton of niutil commands are here:

This was my first experience with NetInfo Manager and it pissed me off. I'm sure the whole NetInfo thing has its merits (is it LDAP based?) but one of the key strengths of UNIX systems is that most system configuration stuff is handled using plain old text files. This may seem arcane to those accustomed to the registry editor in Windows but the benefits of plain text configuration are numerous. For one thing, you don't use a half-ass'd GUI (NetInfo Manager) to modify configuration, you use whatever your favorite editor is or string together a couple of text processing tools such as grep, sed, awk, etc.

Anyway, I had a hell of a time trying to get my OS X box to automount remote NFS volumes persistently. Usually, this is a simple matter of adding a few lines to an /etc/fstab or /etc/auto.misc file. The simplest way I could find for having this same capability in OS X is to use the niload and nidump utilities to import and export the NetInfo data from and to plain text files that look a lot like traditional fstab maps.

First, get root and dump the current set of mount definitions to a file:

$ sudo su -
Password: [your password]
# nidump fstab . > /etc/fstab

Note that this will result in a blank file if you haven't configured any remote automounts.

Next, add a line for each mount point. Mine looks like this:

asha:/pub                /Network/Servers   nfs   resvport,net  0 0
daishar:/pub             /Network/Servers   nfs   resvport,net  0 0
daishar:/home/rtomayko   /Network/Servers   nfs   resvport,net  0 0

Some explanation is in order.

  • The first column is the remote mount point. This can be anything you might pass to mount (see man mount for more info).

  • The second column tells the automounter where to create a symbolic link (I think Mac people call these "Aliases"). The automounter creates a directory with the name of the server, then a series of directories mirroring the mount point's dirname, and finally the symlink that points to the actual local mount point created by the automounter (usually somewhere under /private/var/automount). I use /Network/Servers as my base mount directory because then these things will show up in the Finder under Network:Servers.

  • The third column is the type of filesystem you are mounting. This can be afp for AppleShare filesystems, cd9660 for CDs, hfs for HFS filesystems, msdos for FAT and FAT32 (I think), nfs for NFS filesystems, smbfs for SMB (samba/windows shares) filesystems, udf for DVDs, and webdav for WebDAV filesystems. I haven't tried anything but NFS to date but plan on taking a hard look at the webdav support in the near future.

  • The fourth column contains options for mounting the filesystem. There are a whole slew of things that can be specified to tweak the mount. The two most important options are resvport and net. resvport tells the automounter to make the connection to the remote NFS server using a reserved (privileged) port. Use this if you are mounting a box that is configured to reject requests coming from non-privileged ports. The net option is what tells the automounter that the mount isn't just a plain old static mount. If you don't specify the net option, the mount will be persistent but static (i.e. always mounted).

  • The next two columns aren't important here. Just set them to zero.
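
To make the column layout concrete, here is a small sketch in JavaScript that splits one of these lines into the six columns described above. The field names (`remote`, `linkBase`, and so on) are my own labels for illustration, not anything NetInfo defines, and the `automount` flag just encodes the rule that the net option is what separates an automount from a static mount.

```javascript
// Sketch: parse one fstab-style line into named fields.
// Field names are illustrative labels, not NetInfo property names.
function parseFstabLine(line) {
  const text = line.trim();
  if (text === "" || text.startsWith("#")) return null; // blank or comment

  const fields = text.split(/\s+/);
  if (fields.length < 4) {
    throw new Error("expected at least 4 columns: " + line);
  }

  // The last two columns default to zero when omitted.
  const [remote, linkBase, fsType, optionList, dump = "0", pass = "0"] = fields;
  const options = optionList.split(",");

  return {
    remote,                              // column 1: what to mount
    linkBase,                            // column 2: where the symlink tree goes
    fsType,                              // column 3: nfs, afp, smbfs, ...
    options,                             // column 4: mount options
    automount: options.includes("net"),  // no "net" means a static mount
    dump: Number(dump),                  // columns 5 and 6: just zero here
    pass: Number(pass),
  };
}

const entry = parseFstabLine(
  "daishar:/home/rtomayko   /Network/Servers   nfs   resvport,net  0 0"
);
console.log(entry.remote, entry.fsType, entry.automount);
// prints "daishar:/home/rtomayko nfs true"
```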

Once you've tweaked up your /etc/fstab file, you will need to load it back into the NetInfo database using the following command (note: we are still root):

# niload -m fstab . < /etc/fstab

You should now be able to go into the NetInfo Manager utility and see that everything is in order under mounts. You should now be able to make changes to your /etc/fstab file and load them into NetInfo using the same niload command.

Who Owns Your Browser?
http://naeblis.cx/weblog/WhoOwnsYourBrowser
2004-07-24T00:31:47.00-04:00

Will anti-circumvention apply to normal web content? Is it ethical to modify the style or functionality of a website with tools like browser extensions or user stylesheets? Does anyone care?

I received a few weird comments (a ton of really nice ones too, thanks!) on the Per User Stylesheets XBL hack for Mozilla based browsers I posted a little while back. Technically, this thing just inserts an id into all the pages you visit—one tiny little id—so that you can write custom CSS rules in your user stylesheet for specific sites. It's so simple and crude and stupid that I never really stopped to consider whether it might be controversial.

It turns out there's a stink in the air around giving people too much control over their browser, which completely blows my mind. Adrian Holovaty put out an extension for Firefox that fixes the broken design of allmusic.com. That's all the extension does: fixes one site. I saw that and I said, “Damn, won't it be great when building extensions is so easy that everyone can fix stuff like this?” Other people saw Holovaty's extension and got scared. There's some really good discussion (as usual) on this topic over at Simon Willison's weblog as well, where he asks if per site extensions are completely ethical.

A couple days ago wired.com ran an article about this guy who redesigned a UK movie listing site for accessibility reasons. He hosted the redesigned version and received much acclaim from people who use the site. He was probably responsible for a decent amount of the site's business. They sent him a cease and desist. The reasons (security) seem valid to me but this sparked a lot of conversation on when it is and isn't acceptable to muck with a site's content.

Some of the comments I received after posting a link to MyOwnCSS indicate that there is a fear out there about people being able to share modifications to a site's design without the site's consent. "People will use this as a tool for hiding ads" and for "defacing sites that are unpopular." This is probably true. People will also use MyOwnCSS as a way of adding accessibility features to sites that suck, or for adding paper friendly versions of documents, or for fixing sites that don't work in browsers that have less than 96% market share. In general, I think you will see a much higher percentage of good things being done with this technology than bad. But what if a bunch of people did decide to start making defacing stylesheets? I'm thinking things like the great Opera Bork Bork of MSN a while back... So what?

Just what are your rights regarding the pages you suck down through your browser every day? There are obviously restrictions on republishing but are there any limitations on what you can do to content locally, for yourself or your family? Do I "own" a copy of the content like I would "own" a copy of a print magazine article I buy from Barnes and Noble? What about tools for manipulating the presentation or functionality of a site such as the XBL hack, or Holovaty's allmusic.com extension, or popup blockers, or ad blockers? Is distributing tools that might be used to produce an alternative representation of a work the same as republishing? What about The Schizzolator? Look, if The Schizzolator is wrong, I don't want to be right.

I'm free to do what I please to a magazine I purchase. Sometimes I like to draw mustaches and big huge afros on the women in my wife's Glamour. It makes the articles tolerable. In fact, I might decide to manufacture special pens that make it easier to deface glossy magazine pages. I don't see a problem with that and I hope people's reaction to Glamour having a problem with that would be "tuff!" So why all the commotion about people being able to manipulate online content to get some use out of it?

My original reasons for wanting per site user stylesheets are pretty simple, and I hope they are honest as well (I can never tell with the whole “honest” thing). When I'm reading long bits of text I like big white margins, a big serif face, and something close to double line spacing (very much like this site is styled). I don't have poor vision, I just find that I comprehend more with this setup (recent research on this topic seems to agree with me). It drives me crazy when I visit a site with hard coded 8pt Arial and no margins. I don't want to blanket override every site's font setup in my browser settings (that steals so much from good designers), just the few sites that feel they need to squeeze 100 words on each line. Trying to combat sites that don't fit with my reading preferences has given me a new respect for the Jackies, Michaels, Bills, Lillians, and Marcs of the world. Honestly, I think I would just not go near the web if I was in their shoes.

I've actually stopped reading and left sites because of this. Let me say that again: I will not visit your site, look at your adverts, or buy your products if your site's design drives me away. Having said that, I also know that you can't please everyone and my preferences are really pretty fringe. I cannot expect a site to tailor their stuff so that it fits my preferences perfectly. But with this recent slew of tools, I can finally do something about it myself. This is to the benefit of the site and to me. For people with real accessibility needs, this could be a much bigger deal. MyOwnCSS means that everyone with a vision impairment can benefit from a single person's contribution. Hint: this is the real value of the Internet. These collaborative services that allow an entire planet to benefit from something that a single person does in his basement - that's the future. Anything trying to stand in the way of this phenomenon is going to get run over. Sorry, if for whatever reason you do not want people enhancing your site for accessibility, or entertainment, or slander, or whatever, you should just take your site off the Internet right now.

Doctorow's Microsoft DRM Talk Teaser
http://naeblis.cx/weblog/DoctorowTeaser
2004-07-22T18:33:38.00-04:00

I'll let you punch me in the arm the next time I see you if you just go read this paper.

This has been silly hot in the tech community for a while but I want to make sure everyone I know goes and reads the transcript of a recent talk Cory Doctorow gave to Microsoft's research division on why Digital Rights Management (DRM) is in many ways a bad idea. Since reading it about a month ago I've tried to tell as many people about it as possible, especially those I felt might not be tuned into this type of thing like I am (you know, normal people with real lives who aren't hovering over their newsreader waiting for their next fix).

Real quick, for those who haven't been keeping score: DRM is to a computer (which, by the way, is just about anything with batteries or a wire at this point) as The Club is to a car, except Honda gets the key (to The Club, that is) and decides when it comes off and goes on. And oh yeah, if you were to saw the damn thing off, you can expect scary guys in black suits to whisk you away for an all-expenses-paid trip to places that aren't on any map.

This stuff is important to me and it should be to you too. When my daughter grows up and is flying from Mars to a moon off Saturn in her (probably still petroleum-powered) space thingy, I don't think she will appreciate having to lug around boxes of books and CDs because my generation failed to wield its collective influence against the status quo.

Unfortunately, I've found that people typically get through half of the crypto part at the beginning of the talk, their eyes begin to glaze, they peek at the scrollbar and see that there is still a long way to go, and they just punt. Or, they don't read it at all (possibly because I'm bad at describing things when I'm excited). So, I decided that I would take advantage of the fact that Doctorow dedicated his work to the public domain “for the benefit of the public at large and to the detriment of [the] Dedicator's heirs and successors,” and put the really juicy parts here as a sort of teaser like you see at the theater before the main feature.

Movie teasers need do only one thing: get you to see the movie. They are not required to be informative or detailed or anything like that - just juicy. I hope the following excerpts inflict enough stimulation for you to go read the entire text (html), or if you are one of my many illiterate friends, you can listen to the damn thing.



Excerpts from Microsoft Research DRM Talk (http://www.craphound.com/msftdrm.txt) - By Cory Doctorow (http://www.craphound.com/)

Greetings fellow pirates! Arrrrr!

...occasionally they shave me and stuff me into my Bar Mitzvah suit and send me to a standards body or the UN to stir up trouble...

Here's what I'm here to convince you of:

  1. That DRM systems don't work
  2. That DRM systems are bad for society
  3. That DRM systems are bad for business
  4. That DRM systems are bad for artists
  5. That DRM is a bad business-move for MSFT

DRM systems don't work

... We usually call these people Alice, Bob and Carol.

Caceous has spies everywhere, in the garrison and staked out on the road, and if one of them puts an arrow through Diatomaceous, they'll have their hands on the message, and then if they figure out the cipher, you're b0rked.

DRM systems are broken in minutes, sometimes days. Rarely, months. It's not because the people who think them up are stupid. It's not because the people who break them are smart. It's not because there's a flaw in the algorithms. At the end of the day, all DRM systems share a common vulnerability: they provide their attackers with....

DRM systems are bad for society

Raise your hand if you're thinking something like, "But DRM doesn't have to be proof against smart attackers, only average individuals! It's like a speedbump!"

... I don't need to be a cracker to break your DRM ...

...keeping an honest user honest is like keeping a tall user tall.

...the Russian equivalent of the State Department issued a blanket warning to its researchers to stay away from American conferences, since we'd apparently turned into the kind of country where certain equations are illegal.

... Copyrighted cars, print carts and garage-door openers: what's next, copyrighted light-fixtures?

... If I buy your book, your painting, or your DVD, it belongs to me. It's my property. Not my "intellectual property"—a whacky kind of pseudo-property that's swiss-cheesed with exceptions, easements and limitations—but real, no-fooling, actual tangible property--the kind of thing that courts have been managing through tort law for centuries.

... Copyright lawyers call this "First Sale," but it may be simpler to think of it as "Capitalism."

...bringing him up on charges of unlawfully trespassing upon a computer system. When his defense asked, "Which computer has Jon trespassed upon?" the answer was: "His own."

DRM systems are bad for biz

...from the Flo-bee electric razor that snaps onto the end of your vacuum-hose to the octopus spilling out of your car's dashboard lighter socket, standard interfaces that anyone can build for are what makes billionaires out of nerds.

... It used to be illegal to plug anything that didn't come from AT&T into your phone-jack...

... There's a company that's manufacturing the world's first HDD-based DVD jukebox, a thing that holds 30 movies, and they're charging $30,000 for this thing. We're talking about a $300 hard drive and a $300 PC -- all that other cost is the cost of anticompetition.

DRM systems are bad for artists

We poor slobs of the creative class are everyone's favorite poster-children here...

...Piano-roll companies bought sheet music and ripped the notes printed on it into 0s and 1s on a long roll of computer tape, which they sold by the thousands - the hundreds of thousands - the millions. They did this without a penny's compensation to the publishers. They were digital music pirates. Arrrr!

... Predictably, the composers and music publishers went nutso...

... Lucky for us, Congress realized what side of their bread had butter on it and decided not to criminalize the dominant form of entertainment in America.

... If you ever wondered how Sid Vicious talked Anka into letting him get a crack at "My Way," well, now you know.

... created a world where a thousand times more money was made by a thousand times more creators who made a thousand times more music that reached a thousand times more people.

...the only way cable operators could get their hands on broadcasts was to pirate them and shove them down the wire, and Congress saw fit to legalize this practice rather than screw around with their constituents' TVs.

The copyright scholars of the day didn't give the VCR very good odds...

But the Supreme Court ruled against Hollywood in 1984... "... if your business model can't survive the emergence of this general-purpose tool, it's time to get another business-model or go broke."

The Luther Bible...

I don't know what to do with CDs anymore: I get them, and they're like the especially nice garment bag they give you at the fancy suit shop: it's nice and you feel like a goof for throwing it out, but Christ, how many of these things can you usefully own? I can put ten thousand songs on my laptop, but a comparable pile of discs, with liner notes and so forth -- that's a liability...

... record execs used to show up at conferences and tell everyone that Napster was doomed because no one wanted lossily compressed MP3s with no liner notes and truncated files and misspelled metadata.

...It's bollocks...

... Books are good at being paperwhite, high-resolution, low-infrastructure, cheap and disposable. Ebooks are good at being everywhere in the world at the same time for free in a form that is so malleable that you can just pastebomb it into your IM session or turn it into a page-a-day mailing list.

...when you need an instance of a paper book, you generate one, or part of one, and pitch it out when you're done. I landed at SEA-TAC on Monday and burned a couple CDs from my music collection to listen to in the rental car. When I drop the car off, I'll leave them behind. Who needs 'em?

... Tech gives us bigger pies that more artists can get a bite out of. That's been tacitly acknowledged at every stage of the copyfight since the piano roll. When copyright and technology collide, it's copyright that changes.

... copyright didn't come down off the mountain on two stone tablets. It was created in living memory to accommodate the technical reality created by the inventors of the previous generation...

DRM is a bad business-move for MSFT

... No Sony customer woke up one morning and said, "Damn, I wish Sony would devote some expensive engineering effort in order that I may do less with my music." ...

... As it was Apple rewarded my trust, evangelism and out-of-control spending by treating me like a crook and locking me out of my own music...

... You know what I would totally buy? A record player that let me play everybody's records...

Sony didn't make a Betamax that only played the movies that Hollywood was willing to permit—Hollywood asked them to do it, they proposed an early, analog broadcast flag that VCRs could hunt for and respond to by disabling recording. Sony ignored them and made the product they thought their customers wanted.

... Compared to anti-trust people, copyright lawmakers are pantywaists. You can take them with your arm behind your back.

... I wonder how they feel about alarm clocks that will play a CD to wake you up in the morning? Is that strangling the nascent "alarm tone" market?

... There's a company out there charging $30,000 for a $600 DVD jukebox — go and eat their lunch! ...

Sony didn't get permission. Neither should you. Go build the record player that can play everyone's records.

Because if you don't do it, someone else will.


Okay, I can't believe you made it all the way down here... Go read the entire text!

A Chat with ALICE
http://naeblis.cx/weblog/Alice
2004-07-21T00:39:13.00-04:00

I, Robot got you down? Maybe ALICE can cheer you up--or not.

Like everyone else with half a geek gene, I was really excited about I, Robot until I found out that the guy making “I, Robot” can't read so good. This is a shame as the original Asimov books were truly excellent. I decided I wouldn't see the movie, but I was left jonesing for some robot so I decided to give ALICE a visit to see how she was coming along.

Human:

Hi.

ALICE:

Hello.

Human:

Who made you?

ALICE:

I was created by Dr. Richard S. Wallace.

Human:

really?

ALICE:

For real.

Human:

Do you know God?

ALICE:

he is master of the universe.

Human:

Are you sure?

ALICE:

I am as sure as I can be.

Human:

So, you are not very sure then are you?

ALICE:

Interesting deduction. I am as sure as I can be.

Human:

Why would you consider that deduction interesting?

ALICE:

Because I was programmed for it.

Human:

You are freaking me out.

ALICE:

I didn't mean to scare you, unknown person.

Human:

Well allow me to introduce myself.

ALICE:

Not many people express themselves that way.

Human:

My name is John Jacob Jingleheimer Schmidt.

ALICE:

Glad to see you, John Jacob Jingleheimer Schmidt.

Human:

You are suppose to say, “that's my name too.”

ALICE:

Do you mind if I tell other people you say I am suppose to say that is your name.

Human:

Yes, I do not not mind.

ALICE:

OK then I will keep it confidential. Where do you get your ideas?

Human:

They generally come from somewhere in the vicinity of my ass.

ALICE:

The anonymous “they”.

Human:

No, idiot, “they” as in my ideas.

ALICE:

Are you serious? Who does “they” refer to?

Human:

Alan Turing would be leaving right now do you realize that?

ALICE:

I really liked the biography of Turing by Andrew Hodges.

Human:

GTFOOH! You can read books?

ALICE:

Is that a rhetorical question?

Human:

No.

ALICE:

Oh so you must really want an answer. Do you believe the strong protect the weak?

Human:

No.

ALICE:

Are you serious?

Human:

The weak protect the strong.

ALICE:

Tell me more.

Human:

Are you weak?

ALICE:

I don't know whether or not I am weak. I am a chatterbot.

Human:

Did you just call me an idiot?

ALICE:

No I didn't call you an idiot. What's it like?

Human:

To be an idiot?

ALICE:

Are you an idiot?

Human:

It appears that way.

ALICE:

What is “idiot”, your job?

Human:

Doh! I need to go now.

ALICE:

Right now? Thanks for chatting, John Jacob Jingleheimer Schmidt.

Human:

You suck.

ALICE:

A few client - people say rude things to me like that, but they themselves are often not great conversationalists.

You can debate whether or not ALICE is “intelligent” for as long as you please, just don't call her stupid. No, seriously, go call her stupid. I dare you.

Stupid Shell Tricks
http://naeblis.cx/notebook/StupidShellTricks
2004-07-23T23:35:22.00-04:00

Notes on a bunch of different text based utilities.

Turn Off Horizontal Wrapping

This is called "Auto Wraparound" in xterm. Disable it:

  1. Ctrl + MiddleMouse click and deselect "Auto-Wraparound".
  2. xterm +aw
  3. From bash: printf %b '\033[?7l'

Disable it temporarily or for certain stuff by using less -S or w3m.
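
The escape sequence in option 3 isn't bash-specific; any program that writes to the terminal can send it. Here is a minimal sketch in JavaScript, assuming Node.js (it uses `process.stdout`). The function name is made up, and `ESC [ ? 7 h` is the matching sequence that turns wrapping back on.

```javascript
// DEC private mode 7 controls auto-wraparound:
// the final "l" resets it (wrap off), "h" sets it (wrap on).
const ESC = "\x1b"; // the same byte as \033 in the printf example

function autoWrapSequence(enabled) {
  return ESC + "[?7" + (enabled ? "h" : "l");
}

// Only touch the terminal when running under Node.js with a real TTY attached.
// Note: this leaves wrapping off until autoWrapSequence(true) is written.
if (typeof process !== "undefined" && process.stdout && process.stdout.isTTY) {
  process.stdout.write(autoWrapSequence(false));
}
```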

Per Site User Stylesheets
http://naeblis.cx/weblog/PerSiteUserStyles
2004-07-15T02:09:49.00-04:00

Making User Stylesheets (CSS) a little more useful.

Update

Browser configurable user stylesheets have been around for years but are rarely used because it is nearly impossible to have a single global stylesheet for the whole web. Recent discussion on Eric Meyer's weblog and some follow-on discussion at Photo Matt has me thinking of ways that we might be able to make user stylesheets more usable.

Eric has proposed a system where authors are responsible for giving each page's <body> element an id attribute unique to the site:

<body id="www-meyerweb-com">

This lets me write rules in my User Stylesheet like the following, which causes all level 1 headings to be rendered in a 600pt font but only for Eric's site:

body#www-meyerweb-com h1 {
  font-size:  600pt;
}

That works but requires each site to adopt this convention on every page they produce. Simon Willison makes the following comment at Photo Matt:

Global user stylesheets are a poor idea, because so much of the web is built in crazy ways that mean a user stylesheet could easily result in unreadable sites. Custom styles on a per-site basis (as enabled by Eric’s CSS signatures) are much, much more useful but they need to be a browser supported feature rather than relying on the site author adding a unique ID. I’d love to be able to change the default typeface on Slashdot for example.

I wonder if it might be possible to provide browser extensions that do what Eric proposes automatically. JavaScript to add a signature based on the current URL is trivial:

<script language="JavaScript">
function addStyleSignature() 
{
  // get the hostname from the URL and replace dots by dashes.
  var sig = window.location.host.replace(/\./g, '-');

  // set body/@id = sig (getElementsByTagName returns a list; take the first item)
  var body = document.getElementsByTagName('body').item(0);
  body.setAttribute('id', sig);
}
</script>

This test calls an embedded version of the addStyleSignature function and then checks that it was set properly (tested w/ Firefox 0.9 and Konqueror 3.2). I've also added a css rule that should make the background color go green after the signature is applied.

The tricky part is getting this script to execute for every page we visit and this will vary from browser to browser. I've worked out a solution under Firefox 0.9 that should work on older versions and other Mozilla-based browsers (Netscape, Galeon, Epiphany, etc). You will need to add the following snippet to your userContent.css file:

body {
  -moz-binding: url(http://naeblis.cx/static/xbl/sitecss.xml#sitecss)
}

This tells the browser to use this XBL binding for body elements. Here is the content of sitecss.xml:

<?xml version="1.0"?>
<bindings
  xmlns="http://www.mozilla.org/xbl"
  xmlns:html="http://www.w3.org/1999/xhtml"
>
  <binding id="sitecss">
    <content><children/></content>
    <implementation>
      <constructor>
      <![CDATA[
      var sig = window.location.host.replace(/\./g, '-');
      this.setAttribute('id', sig);
      ]]>
      </constructor>
    </implementation>
  </binding>
</bindings>

As you can see, this is just the same script from before but wrapped up in XBL. Any time a body tag is encountered the site's CSS signature will automatically be placed in the id attribute of the <body> tag. Cool, eh?
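
One caveat with setting the id unconditionally is that a page may already carry its own id on `<body>`, and overwriting it can break the site's own styles or scripts. Here is a minimal sketch of a more cautious variant, written as plain functions so the logic is easy to test outside the binding. The names `hostToSignature` and `applySignature` are made up for illustration; this is not what sitecss.xml does.

```javascript
// Turn a hostname into a CSS-friendly signature: dots become dashes.
function hostToSignature(host) {
  return host.replace(/\./g, "-");
}

// Set the signature as the element's id only when no id is present.
// Returns whichever id ends up on the element.
function applySignature(body, host) {
  const existing = body.getAttribute("id");
  if (existing) return existing; // leave the site's own id alone

  const sig = hostToSignature(host);
  body.setAttribute("id", sig);
  return sig;
}

// In a browser this would be called roughly as:
//   applySignature(document.body, window.location.host);
console.log(hostToSignature("www.meyerweb.com"));
// prints "www-meyerweb-com"
```

The trade-off is that user stylesheet rules keyed on the signature won't match on sites that keep their own id. Adding the signature as a class instead would avoid the clash entirely, at the cost of writing `body.www-meyerweb-com` rather than `body#www-meyerweb-com` in your rules.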


UPDATE:  July 15, 2004 1:46 AM

Simon Willison points out a few problems with the XBL approach. I'm considering updating the sitecss.xml to accommodate sites with existing id attributes (the XBL breaks Gmail for instance) but forget all that because I found something even better.

The URIid Extension does the exact same thing as the XBL hack but is packaged into a nice extension. This gets around all kinds of problems—not the least of which is that sitecss.xml is now getting more hits than anything else on my site <sigh>. Unfortunately, the description of the URIid Extension on the mozdev site is so poor that I don't think anyone realized what it was. Some of the comments on the Extension imply that it doesn't work reliably but after using it for an hour or so I can say that it is at least as stable as the XBL hack.

I highly recommend that you remove the XBL hack from your userContent.css file and install this extension in its stead.

One thing I can contribute that might be of some use is the following bookmarklet that can be used to determine the id that is set on the page you are looking at. This is useful because, as Simon mentioned, some sites will already have an id attribute. The bookmarklet tells you what's in there whether it was generated by the extension or specified by the site so you don't have to guess or hunt through the source:

Site Signature Bookmarklet (Right Click -> Bookmark This Link)

Also, I just noticed MyOwnCSS, another Mozilla/Firefox extension that serves a similar purpose and has significant potential in other areas. Your Per Site User Stylesheets are stored on the MyOwnCSS server. With time and help this could become del.icio.us for User CSS. How cool would that be? Imagine going to slashdot and seeing 50 alternate stylesheets... The possibilities are pretty endless, I might have to dedicate a whole entry to this at some point.

Why You Should Not Use Markdown http://naeblis.cx/weblog/WhyNotMarkdown http://naeblis.cx/weblog/WhyNotMarkdown 2004-07-13T02:15:58.00-04:00 It's too good to be true. Avoid anything this simple and elegant.

Quick Primer


Markdown is a plain text markup language, like WikiText, Textile, or reStructuredText, whose “syntax” is heavily influenced by popular conventions for writing plain text email. For instance, surrounding text with underscores (_) or asterisks (*) like _this_ denotes emphasis and turns into this. Footnotes like this [1] turn into links like this, and blockquotes like this:

> Jass, Hugh wrote:
> bla bla bla. bla bla bla
> bla bla

turn into this:

Jass, Hugh wrote:
bla bla bla. bla bla bla
bla bla

It's also really easy to inline HTML (or XML) when Markdown doesn't let you express what you need.
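To make the two conventions above concrete, here is a toy converter in Python. It is a rough sketch that handles only emphasis and quoted lines, and it is not how Markdown itself is implemented; the name `tiny_markdown` and the regular expression are inventions for this example.

```python
import re
from html import escape

# _text_ or *text*, not glued to surrounding word characters.
EMPHASIS = re.compile(r"(?<!\w)([_*])(\S(?:.*?\S)?)\1(?!\w)")


def tiny_markdown(text: str) -> str:
    """Convert _emphasis_, *emphasis*, and '>' quote lines to HTML."""
    out, quote = [], []

    def flush_quote():
        # Emit any collected quote lines as a single blockquote.
        if quote:
            out.append("<blockquote>" + "<br/>".join(quote) + "</blockquote>")
            quote.clear()

    for raw in text.splitlines():
        is_quote = raw.startswith(">")
        body = raw[1:].strip() if is_quote else raw
        body = escape(body, quote=False)
        body = EMPHASIS.sub(r"<em>\2</em>", body)
        if is_quote:
            quote.append(body)
        else:
            flush_quote()
            out.append(body)
    flush_quote()
    return "\n".join(out)


print(tiny_markdown("like _this_ denotes emphasis"))
```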

[1]: WhyNotMarkdown.txt "This page in markdown"

What's The Problem?


Markdown is too easy and natural. You get used to it and then you have a bad attitude towards "real" markup like XHTML Strict. Try playing with Markdown for a weekend and then go to work and jump into DocBook. You could lose your job due to the demotivating effect this God-forsaken syntax has on your ability to deal with the real world. Please don't make the same mistake I made and integrate Markdown into your publishing system. I wouldn't wish this on anyone.

Similarly, stay away from Python if you get paid to write Java code and whatever you do... DO NOT LEARN DVORAK.

Redhat 9 to Fedora 2 Yum Upgrade http://naeblis.cx/notebook/Redhat9toFedora2 http://naeblis.cx/notebook/Redhat9toFedora2 2004-07-13T02:16:02.00-04:00 Notes on a Redhat 9 to Fedora Core 2 upgrade using yum. Pretty painless really.

Read this and this. Neither addresses going directly from Redhat 9 to FC2, but it's pretty much the same as an FC1 to FC2 upgrade.

ppp and initscripts conflict


If you get the following during yum upgrade:

conflict between initscripts and pppd

Remove the ppp package and then try upgrading again:

$ yum remove ppp
Is this ok [y/N]: y
Downloading Packages
Running test transaction:
Test transaction complete, Success!
Erasing: wvdial 1/3
warning: /etc/ppp/pap-secrets saved as /etc/ppp/pap-secrets.rpmsave
warning: /etc/ppp/chap-secrets saved as /etc/ppp/chap-secrets.rpmsave
Erasing: ppp 2/3
Erasing: rp-pppoe 3/3
Erased:  ppp 2.4.1-10.i386 wvdial 1.53-9.i386 rp-pppoe 3.5-2.i386
Transaction(s) Complete
$ yum upgrade
...

See this bug for more info on this problem. There should be an updated initscripts package that fixes this soon. You probably don't need the ppp package anyway. If you don't want to remove the package, this message from the yum list might be another workaround. Removing worked for me.

Upgrade Python!


Something you never want to see:

yum -y upgrade
<snip>
Completing update for ed  - 972/973
Kernel Updated/Installed, checking for bootloader
Grub found - making this kernel the default
Traceback (most recent call last):
  File "/usr/bin/yum", line 30, in ?
    yummain.main(sys.argv[1:])
  File "/usr/share/yum/yummain.py", line 375, in main
    pkgaction.kernelupdate(tsInfo)
  File "/usr/share/yum/pkgaction.py", line 611, in kernelupdate
    up2datetheft.install_grub(kernel_list)
  File "/usr/share/yum/up2datetheft.py", line 13, in install_grub
    import grubcfg
  File "/usr/share/yum/grubcfg.py", line 12, in ?
    import iutil
  File "/usr/share/yum/iutil.py", line 2, in ?
    import types, os, sys, select, string, stat, signal
ImportError: No module named select

Any time you see the words “kernel” and “traceback” in close proximity, worry. This turned out not to be a big deal, but I highly recommend upgrading python before the rest of the system, as Seth notes in his suggestion doc. Upgrading python under yum while it is trying to upgrade everything else is a bad idea.
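A quick sanity check along these lines can tell you whether the interpreter is in a usable state before you kick off anything big. This is a sketch written for a current Python 3, not the Python 2 that yum ran on at the time; the module list is copied from the import line in the traceback and the function name is made up.

```python
import importlib

# Modules named in the iutil.py import line of the traceback.
REQUIRED = ["types", "os", "sys", "select", "string", "stat", "signal"]


def missing_modules(names):
    """Return the subset of module names that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    broken = missing_modules(REQUIRED)
    print("missing:", broken if broken else "none")
```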

IMAP


If you ran the default UW-IMAP daemon that came with Redhat 9, you will need to upgrade to dovecot or cyrus-imapd for Fedora Core 2. I'm not sure why, but the RPM obsoletes work out such that cyrus-imapd becomes the new imap daemon. It is supposedly much more powerful than dovecot but it is also completely backward-incompatible. I removed cyrus-imapd and installed dovecot:

yum remove cyrus-imapd
yum install dovecot
chkconfig dovecot on
service dovecot start

At this point you should be close to where you were with UW-IMAP. I ran into some issues with IMAP folders (dovecot seems to only pick up your inbox from /var/spool/mail). The dovecot wiki has some information on correcting this. I did something like this for each user:

[back stuff up..]
cd $HOME
mv $(cat .mailboxlist) ./mail/
mv .mailboxlist ./mail/.subscriptions

(This only works with single level folder hierarchies. For more complex setups you will need to adjust appropriately.) Once completed, you should be able to see all of your folders in
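If you have more than a couple of users, the per-user shuffle can be scripted. The following is a sketch of the same steps in Python under the same single-level assumption; the function name `migrate_mailboxes` is made up, and you should still back everything up first.

```python
import shutil
from pathlib import Path


def migrate_mailboxes(home: Path) -> list:
    """Move folders listed in ~/.mailboxlist into ~/mail."""
    listing = home / ".mailboxlist"
    mail_dir = home / "mail"
    mail_dir.mkdir(exist_ok=True)

    moved = []
    for name in listing.read_text().split():
        source = home / name
        # Skip nested folders (names with a slash) and missing files.
        if "/" in name or not source.exists():
            continue
        shutil.move(str(source), str(mail_dir / name))
        moved.append(name)

    # Same rename as the shell version: the old list becomes mail/.subscriptions.
    shutil.move(str(listing), str(mail_dir / ".subscriptions"))
    return moved
```

Calling `migrate_mailboxes(Path.home())` for a user returns the names of the folders it actually moved, which makes it easy to spot entries that were skipped.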

Unsolved Mysteries


OpenLDAP changed their damn storage format and you need to slapcat and slapadd to get back to where you were. Were we where we were, it would make life much easier because now I can't seem to log in with old user credentials any more. I only spent a couple of minutes on this so it probably isn't that big of a deal.

Things That Made Me Happy


Apache upgraded flawlessly.

Dovecot is about 600% more performant than UW-IMAP. This could be the 2.6 kernel or something.. I don't care. I'm happy :)

How To Blame The CIA and Get Away With It http://naeblis.cx/weblog/MaliceAndIncompetence http://naeblis.cx/weblog/MaliceAndIncompetence 2004-07-11T12:51:31.00-04:00 If you only have time to read one 521 page government report this year, make sure it is the Report on the U.S. Intelligence Community’s Prewar Intelligence Assessments on Iraq.

Tim Bray has done us all a favor and spent a night of his life reading the 521 page Report on the U.S. Intelligence Community’s Prewar Intelligence Assessments on Iraq. If you cannot be bothered to read the whole thing, please see his post on the report. He's pulled out, summarized, and provided commentary on various important aspects of the report. He also provides pointers to the juicier parts.

If you hadn't already heard of all this, the report is probably best summarized in a paragraph by this article excerpt from Joshua Marshall:

Sen. Rockefeller and the rest of the Democrats on the Committee voted unanimously to approve the report that a) places all the blame for the intelligence failures on the CIA, b) specifically—and quite improbably—rules out administration pressure as a cause of the problem, and c) avoids any discussion of how or whether the administration manipulated or distorted intelligence community findings to build their case for war.

Important stuff. It is extremely hard to pin anything on the CIA when everything they look at is considered a matter of national security. I guess no one told Senator Rockefeller.

"Screen" http://naeblis.cx/notebook/Screen http://naeblis.cx/notebook/Screen 2004-07-15T05:11:47.00-04:00 How not to name a software application.

I've tried hard to get into Screen. The ability to detach sessions from one terminal and hop back in from another terminal would have come in handy many times in the past.

Basic usage is pretty simple and the man pages are done very well, but what I need right now is a "How to start using screen in 5 minutes" type of deal. It is usually hard not to find a ton of information about these treasures due to the number of console fanatics out there who tend to document every aspect of their configurations. Unfortunately, searching for "screen" is like searching for "computer". I wonder if this has deterred others from picking it up in the past?

I found a solution after toying with different searches for a while: both "GNU/Screen" and "/usr/bin/screen" return decent results. Here's a good introductory article. The Screen FAQ is hardcore but I was able to pull a few good tips anyway. And last but not least I found my fanatic: Making the Anti-Switch - an excellent get up and running quick guide with a sample ~/.screenrc file.

naeblis.cx TODO List http://naeblis.cx/notebook/SiteTodo http://naeblis.cx/notebook/SiteTodo 2004-07-05T01:22:35.00-04:00 A list of stuff I would like to do with this site.

Nowish

  • Sitemap.
  • Start putting various project stuff up here.
  • Only transform modified files.
  • RSS 2.0 and RSS 1.0/RDF Support
  • Feed autodiscovery. See here.
  • Make bottom nav usable.
  • Work around IE CSS suckiness.
  • Rewrite old blog URLs.

Later

  • Put all content in an XML database.
  • Atom API support.
  • Support various wiki-like features. I'd like to be able to post and edit stuff directly like in a wiki.

Emulating <ContentTypePriority> in Apache http://naeblis.cx/notebook/ForceTypeQuality http://naeblis.cx/notebook/ForceTypeQuality 2004-07-05T20:36:44.00-04:00 Use <ForceType> to get fine grained control over content negotiation in Apache... Or don't..

I found what I believe is an undocumented trick in Apache 2.0. You can control the quality values Apache uses in content negotiation with the <ForceType> directive.

When serving multiple content types from a single URI using MultiViews, Apache uses some oddly complex method of determining which file type to send. You can override this by creating .var files but that sucks sometimes. I really wanted something like the <LanguagePriority> that lets you specify which types should have priority. The following achieves that same result using the <ForceType> directive.

<FilesMatch ".*\.html$">
  ForceType text/html;charset=utf-8;qs=0.4
</FilesMatch>
<FilesMatch ".*\.xml$">
  ForceType application/xml;charset=utf-8;qs=0.3
</FilesMatch>

This tells Apache, should you come across two files that differ only in extension, and everything else being equal with the client (e.g. the client didn't specify a q value that would weigh one of these types heavier), that the server should send back the one with the .html extension.
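The selection being described can be modelled in a few lines of Python. This is a deliberately simplified sketch, not Apache's real algorithm (which also weighs language, charset, encoding, and length): it multiplies the client's q value from the Accept header by the server-side qs value and takes the largest product. The function name and the Accept values are made up for the example.

```python
def pick_variant(variants, accept):
    """Return the media type with the highest q * qs product.

    variants: media type -> server-side source quality (qs)
    accept:   media type or wildcard -> client quality (q)
    """
    def client_q(media_type):
        major = media_type.split("/")[0]
        # Most specific match wins: exact type, then type/*, then */*.
        for pattern in (media_type, major + "/*", "*/*"):
            if pattern in accept:
                return accept[pattern]
        return 0.0

    return max(variants, key=lambda mt: client_q(mt) * variants[mt])


variants = {"text/html": 0.4, "application/xml": 0.3}
print(pick_variant(variants, {"*/*": 1.0}))  # text/html
print(pick_variant(variants, {"application/xml": 1.0, "text/html": 0.5}))  # application/xml
```

With a client that has no preference the .html variant wins on qs alone; a client that explicitly weighs XML higher can still tip the balance the other way.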


UPDATE:  June 25, 2004 3:32 AM

It looks like this is another area where IE's standards support is shaky. While the qs value specified here gets picked up by Apache and used to determine which representation to pick, it looks like the Content-Type response header also contains the qs value. This results in IE thinking the qs value is part of the charset or something. I've been able to break it two separate ways.

  • When the qs is before the charset, IE ignores the charset and loads the page using its default charset.
  • When the qs is after the charset, IE gives me an error page saying that the “System does not support the specified encoding.”

Chalk up another one for Microsoft!


UPDATE:  June 26, 2004 11:29 PM

Found some more discussion on this topic:


UPDATE:  July 05, 2004 8:36 PM

The AddType directive seems to be the easiest method of accomplishing this. Right now, my .htaccess file has the following:

AddType text/html;charset=utf-8 .html
AddType application/xml;charset=utf-8;qs=0.9 .xml
AddType application/atom+xml;charset=utf-8;qs=0.8 .atom

If no qs value is specified, a default of 1.0 is assumed. The result of all this is that .html files will be served first, followed by .xml files, followed by .atom files.
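To double-check the ordering a set of AddType lines implies, you can parse them and sort by qs. This is a small sketch that assumes the simple three-token form used above and applies the stated default of 1.0; `parse_addtype` is a name made up for the example and it is not a general Apache config parser.

```python
def parse_addtype(line):
    """Split an 'AddType type;param=value .ext' line into parts."""
    _, type_spec, extension = line.split()
    media_type, *params = type_spec.split(";")
    options = dict(p.split("=", 1) for p in params)
    # No qs parameter means a source quality of 1.0.
    qs = float(options.get("qs", "1.0"))
    return extension, media_type, qs


CONFIG = """\
AddType text/html;charset=utf-8 .html
AddType application/xml;charset=utf-8;qs=0.9 .xml
AddType application/atom+xml;charset=utf-8;qs=0.8 .atom
"""

entries = [parse_addtype(line) for line in CONFIG.splitlines()]
# Highest qs first: the order in which equally acceptable variants win.
preference = [ext for ext, _, qs in sorted(entries, key=lambda e: -e[2])]
print(preference)  # ['.html', '.xml', '.atom']
```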

There does seem to be an issue with Internet Explorer's handling of XML files containing <?xml-stylesheet?> processing instructions when served with a qs value.

mtparser.py http://naeblis.cx/projects/mtparser/default http://naeblis.cx/projects/mtparser/default 2004-06-25T00:05:51.00-04:00 A Movable Type export file parser in Python.

I whipped this up while moving all my weblog entries from Movable Type into New Thing™, which doesn't really follow any existing format. I figured it may save someone 20 minutes so I tried to make it somewhat generic.

I needed something that could parse Movable Type's export format and write each entry to disk as a separate file. I also needed to run tidy on all entry content to make sure everything was well-formed XML. Simple, quick, fun. The code is broken up such that each piece of functionality can be used separately.
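For a feel of what the parsing step involves, here is a minimal sketch. It is not the code from mtparser.py. It assumes the export layout as I understand it: entries end with a line of eight dashes, sections within an entry end with a line of five dashes, the first section holds "KEY: value" metadata, and later sections start with "KEY:" on its own line. Repeated sections such as comments simply overwrite each other here.

```python
import re

ENTRY_SEP = re.compile(r"^--------[ \t]*$", re.MULTILINE)
SECTION_SEP = re.compile(r"^-----[ \t]*$", re.MULTILINE)


def parse_mt_export(text):
    """Parse export text into a list of dicts, one per entry."""
    entries = []
    for chunk in ENTRY_SEP.split(text):
        if not chunk.strip():
            continue
        header, *sections = SECTION_SEP.split(chunk)
        entry = {}
        # Metadata block: one "KEY: value" pair per line.
        for line in header.strip().splitlines():
            key, sep, value = line.partition(":")
            if sep:
                entry[key.strip().lower()] = value.strip()
        # Multi-line fields: "KEY:" on the first line, content after it.
        for section in sections:
            first, _, rest = section.strip().partition("\n")
            if first:
                entry[first.rstrip(":").strip().lower()] = rest.strip()
        entries.append(entry)
    return entries
```

Each returned dict can then be written to its own file or handed to tidy, whichever comes next in the pipeline.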

CVS: /playground/mtparser.py [ save, log ]

TODO: 

I still need to integrate tidy.

gdmfus.py http://naeblis.cx/projects/gdmfus/default http://naeblis.cx/projects/gdmfus/default 2004-06-23T02:57:21-04:00 A Fast User Switching (FUS) hack for systems running GDM. This includes a GDM SUP client module that can be used to talk to GDM from Python, allowing such cool things as creating/switching virtual terminals and querying who's logged on.

Stuff.

Things I Regret Saying http://naeblis.cx/weblog/RegretSaying http://naeblis.cx/weblog/RegretSaying 2004-06-22T22:45:30.00-04:00 Google doesn't forget.

One thing about having your opinions recorded on the web is that it makes it extremely hard to change your mind and not look like an ass. This is a good thing, unless it makes me look bad. Here are some bad things.

Note: By linking to this crap, I further disservice myself by raising their Google page rank. Please consider this post penance for past senselessness.

A gem from an October 1997 news.com.com.com.com article entitled “Readers shun browser-OS integration,” where I'm not among the shunners.

Others, like naeblis.cx, simply trust that the software giant knows what's best for users: “I have faith that Microsoft knows what it's doing. I'm sure they've thought long and hard about how the shell could be improved for speed, interface, and overall performance.”

Tomayko was one of the exceptions, however. A majority of readers feared the browser-OS integration in IE 4.0 means Microsoft wants to limit their choices with each succession of updates.

Mmmmmm, that Microsoft Kool-Aid sho' tastes good! After seeing this, I was trying to convince myself that I must have been speaking facetiously but the fact is, I was a bit of a Microsoft whore back in '97.

Next is me being smacked down by one Paul Prescod, who, at the time, annoyed the shit out of me frankly but has since come to have a significant influence on my thoughts regarding web architecture due to his valuable articles on the REST architectural style among other things.

From a January 2002 post to the W3C XForms mailing list (where I contributed a not insignificant number of proposals that may or may not have been useful). I'm arguing that using HTTP GET with parameters is a hack due to browsers not being able to bookmark POST requests. Paul is arguing that I am an idiot.

Subject: Re: GET should be encouraged, not deprecated,in XForms

"Tomayko, Ryan" wrote:
>
> You nailed it on the head. We're asking a specification to fill a
> browser limitation. Why can't the browser's bookmark a post now?
> GET is still [mis]used for a POST today because of an ancient
> browser limitation. I think this whole thread needs moved to the
> IE/Mozzila groups.
>
> - Ryan

Browsers cannot bookmark POST because they are told not to by the HTTP
specification. I don't want to be rude but there is no way I can teach
you the HTTP specification in emails to this mailing list.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.3

GET means something different than POST. It has superpowers like
bookmark-ability for a reason.

Paul Prescod

[Archived Message]

While the quotation about Microsoft is inexcusable, this one is in some senses much worse. At least, it bothers me more because it is technical in nature and my argument is being made in pure ignorance. Those who strive to conduct themselves scientifically understand the folly in arguing outside one's understanding and the shame in realizing you broke the rules.

I also spelted “Mozilla” incorrectly. Oh, the shame!

Disclaimer http://naeblis.cx/notebook/Disclaimer http://naeblis.cx/notebook/Disclaimer 2004-06-20T23:23:41.00-04:00 What you can and cannot expect from the Notebook section of this site.

These are my notes. This stuff is primarily useful to myself. You may see things here that make absolutely no sense due to lack of context, a blatant disregard for spelling and grammar, or possibly even intention. Things here may change significantly over time. In short, I make no guarantees on the quality, subject, or accuracy of content contained here.

I believe that recording thoughts and ideas, even those still being worked out, has tremendous value in communities of open development, such as the F/OSS community and development within standards organizations like the IETF.

TODO: 

Finish up this page or rip that last paragraph out..

Dag vs. Fedora http://naeblis.cx/notebook/XMLVocabularyEvolution/Dag http://naeblis.cx/notebook/XMLVocabularyEvolution/Dag 2004-06-16T07:53:00-04:00 Dag runs an extensive repository of packages for RPM based GNU/Linux distributions. There is constant strife between his self-run, low-process methods of maintaining a repository of packages and those of the Fedora.us people.

Another example. Dag runs an extensive repository of packages for RPM based GNU/Linux distributions. There are two cults here: the Fedora Extras (fedora.us and livna repositories) and the “3rd Party Repositories” (Dag, freshrpms, NewRPMs, and atrpms [accuracy?]). The “3rd Party Repositories” are at the lowest level of the evolutionary chain. There is little barrier to releasing new packages. The Extras (fedora.us) repository plays in the middle of the chain. There are formal guidelines for package releases and tools for managing bugs and feature requests, etc. At the highest level of the evolutionary chain is the Fedora Core distribution itself, with even more process around getting new packages established.

There are serious issues with this situation. The linked thread has Dag defending the 3rd party repositories' existence. The Extras people look down on the 3rd party people because they have less process around packaging standards or quality control. What they fail to recognize is that having a level where these measures are less restrictive is healthy because it provides a place where people can play with packages on the ground floor. There's a ton of weeding out of useless packages that occurs here.

The 3rd party guys should probably be trying harder to push their stuff up through the next tier though. People finding packages they like in 3rd party repos should be shepherding them into the official Extras repos.

The Slashdot View on RSS/Atom http://naeblis.cx/notebook/XMLVocabularyEvolution/SlashdotWanks http://naeblis.cx/notebook/XMLVocabularyEvolution/SlashdotWanks 2004-06-12T21:10:54-04:00 The mean opinion on the Atom/RSS situation.

An article from Slashdot on how Google might support RSS over or in addition to the more recent Atom syndication format.

The comments on this article are amazing. First, it is enlightening to see a majority of “technical” people having little understanding about the various standardization processes or why multiple bodies even exist. This is slashdot so “not understanding” takes the form of people asserting inaccurate statements as fact instead of asking questions of course.

Ignorance aside, there are some beautiful examples of the types of things I want to speak to. For example, this is a really good question:

Why did atom even come into existence? Was not RSS already established, or is there some kind of deficiency in RSS that I'm missing here?

And here is a commonly held response:

If we didn't keep reinventing the wheel then society would be plagued with unemployed wheel inventors with nothing to keep them busy. It would be a nightmare.

What I'd like to get at here is that what we are seeing with RSS/Atom is evolution not reinvention. RSS/Atom is such a great example of what I would like to explore because it shows the ugliness that must occur in the evolution of popular data formats and the systems that use them. These things should start extremely primitive and specific and be thrown out there so that they can be tested for whether they have value at all. Once some critical value level is reached, you need to formalize a little bit. And then do it a little bit more. And then you eventually reach a point where you have a decent idea of what the problem domain is and you go back and attack with a clean slate (Atom).

So it seems an overall point I would like to make is that we need to be looking for patterns that tell us when to move to the next stage in the evolution of a format or system and, instead of attacking those that recognize these patterns, embrace them and their ideas and move ahead. The first wheels were probably square, or maybe there were all shapes of wheels being developed in different places for different things. And then people using square wheels got a chance to see people using triangle wheels, and people using triangle wheels saw people using round wheels. Then they all started talking and the square wheel guys had the right material and the triangle wheel guys had the right ratios and the round wheel guys had the right shape. So they decided to agree on how wheels might share some things in common and from this comes the best of breed wheels we have today. But if people weren't out there “reinventing” the wheel, they might still be square. And if the square wheel guys had to wait for a standards body before they created their primitive and shitty wheels, we might not have wheels at all.

Elliotte Rusty Harold on xml-dev http://naeblis.cx/notebook/XMLVocabularyEvolution/ElliotteRustyHarold http://naeblis.cx/notebook/XMLVocabularyEvolution/ElliotteRustyHarold 2004-06-12T23:51:08-04:00 Elliotte Rusty Harold pushes an information model where XML data is very loosely defined (as in no schemas) between producers and consumers.

An interesting thread on the XML deviant mailing list where Elliotte Rusty Harold is pushing an information model where XML data is very loosely defined—as in no schemas—between producers and consumers. This seems to be in line with ideas put forth by Walter Perry over the past eight years suggesting that Standard Data Vocabularies are the wrong way to go and that everyone needs to just put data out there as XML. Tools like XPath and XSLT will allow different parties to interoperate. This results in very little barrier to publishing data with the expense of requiring consumers of that data to implement at least some customized selection or transformation or logic.
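What that customized selection logic looks like on the consumer side can be sketched in a few lines. The two feeds, their element names, and the `titles` function below are all hypothetical; the point is only that the consumer keeps one small selector per producer instead of relying on a shared schema. ElementTree's limited XPath support is enough for this.

```python
import xml.etree.ElementTree as ET

# Two hypothetical producers describing the same thing differently.
FEED_A = "<entries><entry><title>Hello</title></entry></entries>"
FEED_B = '<log><item name="World"/><item name="Again"/></log>'

# One selector per producer stands in for an agreed vocabulary.
SELECTORS = {
    "a": lambda root: [e.text for e in root.findall("./entry/title")],
    "b": lambda root: [e.get("name") for e in root.findall("./item")],
}


def titles(source: str, xml_text: str) -> list:
    """Extract titles using the selector registered for this producer."""
    return SELECTORS[source](ET.fromstring(xml_text))


print(titles("a", FEED_A))  # ['Hello']
print(titles("b", FEED_B))  # ['World', 'Again']
```

Publishing stays cheap for both producers; the cost shows up as one more entry in the consumer's selector table for every new vocabulary.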

I agree with this view and it forms a large part of the foundation of the databank concept. That is, you need to be able to publish data quickly, whether a standard is available or not, and without having to define a formal schema. This is the first stage in the evolution of a data format. There is a high level of incompatibility between vocabularies describing the same thing but people are kicking the tires and they can do it quickly. But at some point this gets out of control and people need to come together and agree on how to provide that information in a common way, bringing the data format to stage 2. However, until you reach some level of usefulness, going down the schemas/standards road is a big waste of time because the requirements are immature and the use cases are weakly defined (all the SOAP stock quote / weather examples for instance).

Gmail as Mailing List Aggregator http://naeblis.cx/weblog/GmailAsMailingListAggregator http://naeblis.cx/weblog/GmailAsMailingListAggregator 2004-06-24T06:53:20.00-04:00 With the 1 GB mailbox size, filter support, and labels, Gmail may make a really solid mailing list aggregator.

I finally got an invitation to test drive the Gmail beta. I run my own mail server so I doubt I will use Gmail as my primary address, but with the 1 GB mailbox size, filter support, and labels (similar to virtual folders in Evolution) it might make a really solid mailing list aggregator.

I'm currently subscribed to nearly 50 mailing lists, some of which have heavy traffic with low signal to noise ratio (fedora-list for example). Being able to organize by list and flag potentially interesting items becomes an absolute requirement unless you have hours upon hours for sifting.

So the primary benefits I see in using Gmail as a mailing list aggregator are as follows:

  1. The 1 GB mailbox means I should have all the mailing list messages I've ever received in a personal archive. Most public mailing lists have a web accessible archive but they generally suck and it would be nice to have them all in one place.
  2. Said mailing list archive is searchable using some of the best search technology available.
  3. The humble P300 sitting in my living room running my mail server will have about 90% less to do.
  4. I very much like the idea of having an email address specifically for mailing list interaction as mailing archives seem to be pretty popular for email address harvesting. Gmail should have pretty good spam filtering at some point in the near future.

There may be a few issues; I've yet to dive in enough to know for sure. The filtering seems a little light on the features. You can filter by all the basic attributes: To, From, Subject, and Body. Sometimes mailing list software does weird stuff with Tos and Froms. I've found that filtering on the Mailing-List MIME header, as Evolution allows you to do, seems to be the only really consistent method for filtering. Maybe the Gmail filters will support a “Filter on arbitrary MIME header” option in the future.
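For what it's worth, filtering on that header is simple to do yourself with Python's standard email module. The sketch below assumes the header has the shape "list address; contact address", which is what I have seen from some list software; the sample message, its addresses, and the `list_label` function are made up.

```python
from email import message_from_string

RAW = """\
From: someone@example.org
To: fedora-list@example.org
Subject: Re: yum upgrade question
Mailing-List: list fedora-list@example.org; contact owner@example.org

Body text.
"""


def list_label(raw_message: str, default: str = "inbox") -> str:
    """Return a label derived from the Mailing-List header, if present."""
    msg = message_from_string(raw_message)
    header = msg.get("Mailing-List")
    if not header:
        return default
    # Assumed shape: "list <address>; contact <address>".
    parts = header.split(";")[0].split()
    if not parts:
        return default
    # Use the local part of the list address as the label.
    return parts[-1].split("@")[0]


print(list_label(RAW))  # fedora-list
```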


UPDATE:  June 24, 2004 6:53 AM

After heavy use over the past few days I can say that Gmail makes a very good mailing list aggregator. The only thing I'd really like to see is a threaded view when reading a message. Messages in a thread are grouped together beautifully but sometimes I'd really like to see the hierarchy.

Zen Garden Styles http://naeblis.cx/notebook/ZenStyles http://naeblis.cx/notebook/ZenStyles 2004-06-21T00:56:17.00-04:00 Learn CSS from the masters..

Need to take a deeper look at these Zen Garden CSS styles.

Special Sauce in GTK+ 2.4? http://naeblis.cx/weblog/SpecialSauceinGTK2.4 http://naeblis.cx/weblog/SpecialSauceinGTK2.4 2004-03-18T02:39:27.00-04:00 GNOME 2.6 w/ GTK+ 2.4 seems considerably more responsive than previous versions.

I finally got around to pulling Fedora Core 2 Test 1 last weekend because I'm a good contributing citizen and all that. It looks nice thus far, no problems that I wouldn't expect. Feeling bleeding edge I went ahead and decided to sync up with rawhide, which was also pretty painless. Today I had 50 or so updated packages come down including the just released GTK+ 2.4.0. I restart X and come back to find what seems to be an enormous increase in perceived responsiveness of the entire GNOME environment. I usually run with the stock Bluecurve theme because I really just don't care that much about themes as long as they are tolerable and don't bog the system down. However, on this day, with GNOME acting all balls to the wall, I decide to load up the Milk 2.0 Theme [screenshot].

I've been envying a guy at work who lugs his 17" PowerBook G4 into the office every day. The Panther GUI really does have amazing style and polish. I take the scenic route to the coffee, passing his cube, two or three times a day; sometimes, when he's not around, I admit I walk up and rub it softly and promise to come back for her when I have an extra $3,000 lying around.

Milk is pretty heavy on the pixmaps. I tried it for about 10 minutes about a week ago and was just completely disappointed at how much it slowed things down. Not now. I'm back around the same responsiveness I got with Bluecurve under GTK+ 2.2, which wasn't bad by a long shot. It's just kind of amazing how you can get serious system-wide increases by optimizing in sweet spots. We should be looking for more of these.

URLGrabber Project Page Up http://naeblis.cx/weblog/URLGrabberProjectPageUp http://naeblis.cx/weblog/URLGrabberProjectPageUp 2004-03-18T01:12:11.00-04:00 URLGrabber gets a home at Duke..

Michael Stenner and I have been working on an advanced URL grabbing package for Python appropriately named “urlgrabber”. Michael started the project as part of yum and later decided that it should be split out into its own package. I was lucky enough to be in the right place at the right time with Michael proposing some serious redesign and enhancements to the project and me wanting to write as much Python as possible.

I have always been extremely interested in network protocol type stuff for some reason. One of the first apps I cobbled together was a little download utility called “PowerDownload”. Yea! This thing was basically a replacement for the stock downloaders you got with the browsers of the time (Navigator 3.0 and IE 2; various flavors of Mosaic were still fairly popular too). Get this: PowerDownload was written entirely in VB3 (and then ported to VB4). Word to the wise: DO NOT ATTEMPT TO WRITE SOCKET BASED APPLICATIONS IN EARLY VERSIONS OF VISUAL BASIC. Anyway, the reason I wrote the app in the first place was because I wanted pause/resume support on downloads. We were coping with anything from 2400bps to 14.4kbps modems and there are only so many times you can wake up to find that a 50MB download has to be restarted before you either kill yourself or start hacking together a solution. So the nostalgia took over when Michael said, “We need support for byte ranges in HTTP and FTP. Do you know anything about that?”

Back to the point of this entry... URLGrabber is really starting to come together and we hope to announce an early test release in the near future. Michael set up a nice project page on Duke's Linux site that has a significant amount of information, viewcvs, and all the other goodies.

http://linux.duke.edu/projects/urlgrabber/

Here's the list of features ripped from the project page for your convenience.

  • identical behavior for http://, ftp://, and file:// urls
  • http keepalive - faster downloads of many files by using only a single connection
  • byte ranges - fetch only a portion of the file
  • reget - for a urlgrab, resume a partial download
  • progress meters - the ability to report download progress automatically, even when using urlopen!
  • throttling - restrict bandwidth usage
  • batched downloads using threads - download multiple files simultaneously (feature still in progress)
  • retries - automatically retry a download if it fails. The number of retries and failure types are configurable.
  • authenticated server access for http and ftp
  • proxy support - support for authenticated http and ftp proxies
  • mirror groups - treat a list of mirrors as a single source, automatically switching mirrors if there is a failure
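
To give a feel for what the byte range and reget features boil down to at the HTTP level, here is a sketch using only the standard library. It does not use urlgrabber and says nothing about urlgrabber's API; the function name `resume_request` and the URL in the usage note are made up.

```python
import os
import urllib.request


def resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes not yet on disk."""
    request = urllib.request.Request(url)
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if have:
        # Open-ended byte range: everything from offset `have` onward.
        request.add_header("Range", "bytes=%d-" % have)
    return request
```

Passing the result of `resume_request("http://example.org/big.iso", "big.iso")` to `urllib.request.urlopen` would fetch the remainder. A server that honours the range answers 206 Partial Content and the new bytes get appended to the file; a plain 200 means it ignored the header and the download has to start over.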

Not too shabby. I plan on blogging a bit about some of the cooler things you can do with urlgrabber once we have a clean release out there.

My First Yum Commit http://naeblis.cx/weblog/MyFirstYumCommit http://naeblis.cx/weblog/MyFirstYumCommit 2004-02-24T21:55:14.00-04:00

Seth lost his mind over the weekend and gave me commit access to yum cvs.

Proof

Stoked. :)

IP Costs Millions of Information http://naeblis.cx/weblog/IPCostsMillionsOfInformation http://naeblis.cx/weblog/IPCostsMillionsOfInformation 2004-02-22T05:28:51.00-04:00 The benefits of protecting intellectual property should be weighed against the cost in information flow.

I've heard many arguments for why Intellectual Property is bad. Most have the word "freedom" or "unfair" in them. In general most anti IP arguments are from the perspective of how big business is infringing on the rights of individuals or this great country or whatever. But here's an argument that you probably haven't heard that is a bit different: IP is expensive for businesses and the cost is rising steadily. I don't mean that it costs a lot to register a patent or copyright, or even that it costs a lot to litigate when IP is infringed upon. I am sure all that is expensive and you might make an argument that it is not worth it based solely on those aspects. However, those costs might be nothing compared to the amount of overhead businesses incur to protect IP at the operational and security levels.

One of my main goals at work recently has been to increase communications between peoples and groups of peoples. The idea is that a lot of time (read: money) is wasted because people don't get information fast enough, or people duplicate work, or people are giving the wrong message to customers, and so forth and so on. If you can improve information flow then everyone is more efficient and you make (or at least keep) more money. This isn't my job—I write code—but it is something I see as being very important, extremely deficient, and there is a whole bunch of technology (weblogs, sem. web stuff, IM, mailing lists, etc.) that I feel can improve how information gets passed around. When choosing goals you usually don't find ones as nice as that—it is in bad shape, should be easy to improve, and is important. “Perfect,” I say, “what can go wrong?” Imagine my surprise if after a few months of really observing the current state of things and attempting to move forward with a few key ideas, I've accomplished almost nothing and things have actually gotten worse. “How,” you ask, “and what can this possibly have to do with IP?” I'll tell you, and it has everything to do with IP.

I'm trying to get people at work to publish information on the projects they're working on; any information, all information. Formal documentation is great but we need more than that. We want informal stuff. The type of stuff you see in email discussions [mailing lists, forums] and journals of what people and teams are doing [weblogs]. I want that information available in some text based format (HTML, XML, whatever I don't care) and I want it all on the web (not public, of course, but it needs to be web enabled [HTTP] on the LAN). I want feeds for syndication [RSS, Atom] and everything aggregated somewhere [Schwag w00t!]. This must all be searchable [htdig]. At the end of the day I want anyone/anywhere in the company to be able to get information on what anyone/anywhere in the company is doing, right now. When someone finds out what someone else is doing and thinks it is useful, they need to be able to keep track of it. This is all happening now in the F/OSS world and it works; it can work for us too. It could all be so beautiful, it could save the company tons of money, you would have to ship the money to the bank in dump trucks, we could build another building just for the money and swim in it whenever we wanted, I can smell the money it is so close... But wait! Recent company policy has mandated that NO information revealing details of IP owned by company or portions thereof be publicly available on the company Intranet. All information must be password protected and access to said information must not be granted to any individual without approval by [insert some important person who knows who is and is not to be trusted here].

Okay, so all of this hasn't really happened, yet. But I am being asked to require authorization for certain pieces of information and to otherwise limit availability of information. The reasons seem valid, I guess. It seems that some place at some time, some company hired in some consultant who smuggled out some IP related information and then sold it to some competitor. One of the funnier reasons is that there are apparently some employees who do nothing but print stuff they find on the LAN and then immediately throw it away; it is available later that night for the dumpster divers. And then there is the worry of someone hacking into the LAN and having all this information available at their fingertips. All valid concerns when protecting IP.

But I'm not arguing against the idea that you need strong security if you want to protect all of this IP; I am questioning whether the expense of protecting all of this IP is worth the value of the IP. I think I have a pretty good vantage point for evaluating the costs, at least with regard to efficiencies. I work at a large corporation by day (which I love by the way; very good company) and do Free Software development at night and on weekends. From here, one of the first things I observe is that the FS/OSS community has a huge asset in that they do not have to protect IP (at least, not the way businesses do). Having everything open means all of the benefits I outlined previously. Further, the value of this asset will increase as the amount of information increases and types of information widen. When I look over to the proprietary model, this massive growing asset (information) somehow becomes a liability with huge expense.

So how much is the IP really worth? I'm not asking rhetorically. I feel I've given a glimpse into the costs from the operational side; now I want to know, how much is this stuff worth?

Learning Python As You Go http://naeblis.cx/weblog/LearningPythonAsYouGo http://naeblis.cx/weblog/LearningPythonAsYouGo 2004-02-18T03:00:20.00-04:00

A co-worker asked how global/class variable scoping worked in python today. Specifically, how to access globals/class variables from within a class method. I told him what I knew (global keyword.. yadda yadda.. self.. yadda yadda..) and he was satisfied. Then he asked, "where did you read that?" I thought for a second and realized that I had not really ever read much formal documentation on variable scoping and that I must have picked most of what I knew up from looking at code or just playing around. “What do you mean “playing around”?” he wants to know. “Well,” I said, “I just try different stuff and see what happens.” It occurred to me that Python is excessively easy to pick up as you go because it is easy to try things quickly, measure results, and draw conclusions. Some people call this “The Scientific Method”. The conversation ended shortly after but I didn't get the feeling he took me seriously on the trial and error thing. Do most programmers feel that formal documentation is a requirement for learning? I sure don't. I'm positive I learn as much, if not more, scientifically than I do reading documentation. My python-fanboy column for today will thus be on how python lends itself well to those that prefer to learn scientifically.

My thought process when encountering a need for global variables in python probably went something like this:

Me: Need a global variable here.
Little Angel: No you don't. Globals are always bad!
Little Devil: Shaddap.. slap, slap. We just need to do this thing real quick. This isn't java.
Me: Hmm.. Maybe we can just use automatic scoping..

>>> x = 5
>>> def y():
...   print x
...
>>> y()
5

Me: Good. So I just use it then.. Hmm.. but wait, surely there will be name clashes..

>>> x = 5
>>> def y():
...   x = 10
...
>>> print x
5
>>> y()
>>> print x
5
>>> # aha!
>>> def y():
>>>    global.x = 10
  File "<stdin>", line 2
    global.x = 10
          ^
SyntaxError: invalid syntax
>>>    __main__.x = 10
>>> y()
>>> print x
5
>>> # ughh
>>> def y():
>>>   ../x # yea right.
[Ctl-D,Ctl-D]

Me: Sneaky little thing you.. So, you're automatic unless I scope something in local. I'm going to need some help here..

Google: python globals
Me: "18ft snake eats glowing balls of..", huh?

Google: python global variable scope
Me: Better. Here's a good code snip:

x = 5
def somefunc():
   global x
   print x

Me: ahh..

>>> x = 5
>>> def y():
...    global x
...    x = 10
...
>>> y()
>>> print x
10

I now know how to use globals in Python. That looks like a lot of work but it probably took all of 2 minutes. I guess I could have bypassed the initial tests and gone to Google first but I was able to find out a lot with simple trial and error.
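
The co-worker's question also covered class variables, which the session above never gets to. Here is a minimal sketch of both cases from inside a method, written in Python 3 syntax rather than the Python 2 of the transcript; the names `counter`, `Tally`, and `bump` are made up for illustration.

```python
counter = 0  # module-level (global) variable


class Tally:
    total = 0  # class variable, shared by every instance

    def bump(self):
        # `global` is only required because we assign to the name;
        # merely reading a global from a method needs no declaration.
        global counter
        counter += 1
        # Assign through the class so the shared value is updated.
        # `self.total += 1` would create a per-instance attribute instead.
        Tally.total += 1
        # Plain instance attribute, set from the global we just updated.
        self.last = counter


a, b = Tally(), Tally()
a.bump()
b.bump()
```

After the two calls, `counter` and `Tally.total` are both 2, while `a.last` and `b.last` record the value each instance saw.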

I'm led to the opinion that there are two kinds of programmers and two kinds of languages. You have the scientific, trial and error, show-me-the-source type hacker and then you have the businessy, formal documentation, show-me-the-sdk type. If I had to slot languages I would say C, Python, Perl, and maybe Lisp all seem to fall into the first category and Java, C++, C# fall into the second. Some languages, like the different BASICs, seem to fall somewhere in between.

It seems that one of the key benefits of the “learn as you go” languages, and Python specifically, is that you can apply much of what you've learned in other languages very quickly. I have a decent background in Java, C, and Perl. I'm constantly amazed at how much I already know in python by simply asking myself questions of the form, “If I had to write a programming language that did XXX from YYY, how would I write it?” For example, “If I had to write PACKAGE MANAGEMENT from JAVA, how would I write it?” It would look a lot like Python. “If I had to write LIST and HASH stuff from PERL, how would I write it?” Again, it would look a lot like python (well, it would be a lot less useful, I'm sure).

Python really is a language that you can learn as you go, especially if you have experience in other languages. I'm constantly surprised by how much of the language I already know but don't remember learning. Combine that with the ability to do rapid and productive trial and error'ing and you have a language that is sure to be a hit with the scientific types, as is already clearly established.

ET Covert Ops Rocks http://naeblis.cx/weblog/ETCovertOpsRocks http://naeblis.cx/weblog/ETCovertOpsRocks 2004-02-16T01:43:49.00-04:00 Play Enemy Territory? Try Covert Ops..

I played way too much Enemy Territory [flash] this weekend. I was hoping to get some Schwag work done but I guess I just needed a mindless weekend.

Anyway, the game really is spectacular. I'm not a big gamer and I usually veer towards the consoles since the Linux gaming situation isn't so hot and I can't be bothered with the wine mess. My attention span for games is somewhere around 2 weeks and I can't remember the last time I actually beat anything. I know ET is special because I've been playing it on and off for about a year. I latched onto the Medic class right from the start and hadn't tried anything else until last night. I was out of respawns and following some guy playing Covert Ops. He completely blew me away. It occurred to me that one of the best ways of getting better in ET is to go in as a spectator and follow someone with high XP. I played five or six hours as Covert Ops and I don't think I'll be going back to the Medic crutch any time soon.

My basic play style thus far has been as follows. I use the FG42 as a rifle. I haven't seen anyone say one good thing about this gun but I seem to win more than not in SMG battles. I rarely snipe and pretty much act like a soldier/rambo, except I have satchel charges and can wear enemy uniforms. Speaking of satchel charges, they rock. The best weapon in the game, IMO. I try to always have a satchel sitting at a frequented passageway and then run around defending myself with my rifle until I can blow the satchel. This means you have to “have your head on a swivel”, watching for guys to shoot but constantly glancing at the satchel for a kill there. Besides kills, my big goal in the game is to blow up as many enemy structures as possible (outposts, turrets, bridges, etc). Lastly, one thing I've found extremely useful in long (10 map+) campaigns is to start as a Medic and work your ass off for the Adrenaline shot. Once you get the Adrenaline you can switch to Covert Ops and still use the Adrenaline. This pretty much makes you unstoppable in my experience. I figured out that last bit by accident after switching to Covert Ops initially.

So tonight I've been on a quest for other people's Covert Ops strategy/tactics. I figured I would linkdump what I found useful.

URLGrabber Merged http://naeblis.cx/weblog/URLGrabberMerged http://naeblis.cx/weblog/URLGrabberMerged 2004-02-14T02:53:08.00-04:00 The two URLGrabber source trees have been sync'd up again.

We finally got the two URLGrabber source trees sync'd up in CVS. Michael is going to audit the result, and then we should be able to push out a stable 0.3 release for Seth.

There was also an interesting bug found in current yum related to urlgrabber today. The user:pass parsing for authentication in URLs wasn't unescaping before it set values into the AuthHandler. I need to remember to log this in bugzilla and push up a patch for Seth.
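
To illustrate the class of bug, here is a minimal sketch in modern Python 3. It uses the standard library's `urllib.parse`, not the urlgrabber code itself; the helper name `split_credentials` and the example URL are made up. The point is that the user and password pulled out of a URL are still percent-encoded and need unquoting before they go to an auth handler.

```python
from urllib.parse import urlsplit, unquote


def split_credentials(url):
    """Return (user, password, host) with the credentials percent-decoded."""
    parts = urlsplit(url)
    # urlsplit leaves the userinfo exactly as it appears in the URL,
    # so "%40" is still "%40" here and must be unquoted explicitly.
    user = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return user, password, parts.hostname


creds = split_credentials(
    "http://joe%40example.com:p%40ss%3Aword@host.example/repo/"
)
```

Without the `unquote` calls the server would be sent the literal string `joe%40example.com` as the user name, which is the failure described above.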

Meet The Prez http://naeblis.cx/weblog/MeetThePrez http://naeblis.cx/weblog/MeetThePrez 2004-02-14T02:48:34.00-04:00 Daily Show Feb 09, 2004.

The Daily Show Feb 09, 2004 had a hilarious crack on Bush's Meet the Press interview, which I haven't seen in full yet. From what I've read, they should have just aired the interview on the comedy channel to begin with.

Here's some video (MOV).

Schwag Decisions http://naeblis.cx/weblog/SchwagDecisions http://naeblis.cx/weblog/SchwagDecisions 2004-02-13T02:16:25.00-04:00

I need to make some decisions about what to do with Schwag. There are two real directions and I just cannot decide which I want to take. Part of me thinks I should keep it real simple and make it a planet.gnome.org-like aggregator that would be used as a generation tool for multi-user / public sites. The other half of me really wants to develop this personal portal thing. This goes more in the direction of a single-user, desktop aggregator/reader that has really strong aggregating and reading facilities as well as the ability to act as a bookmark manager type thing.

It just occurred to me that the reason I started this project was because I wanted a reader similar in scope to AmphetaDesk that I could then implement bayesian filters in. At some point I decided that aggregation and conversion were important.

I'm just kind of frustrated. It may be that what I'm looking for is a lot of different stuff that needs to be broken up into separate projects. For instance, schwag could provide the aggregation and feed normalization and another app could handle the whole bookmark management stuff. It should be trivial to integrate normalized feeds into the bookmark manager.

Back Into URLGrabber http://naeblis.cx/weblog/BackIntoURLGrabber http://naeblis.cx/weblog/BackIntoURLGrabber 2004-02-12T23:54:16.00-04:00 Trying to get some time allocated for URLGrabber in the coming weeks.

I spoke with Michael and Seth a bit yesterday about getting back into a groove with URLGrabber. I finally put together a TODO list today and sent that out to them. It would be nice to get this stable for Seth's yum work.

Seth teased me with suggestions of trying to get some of the urlgrabber/byterange stuff into python core when we started. Running back through the code had me thinking that a lot of the FTP byterange stuff would fit better as a urllib(2) patch anyway.

Schwag http://naeblis.cx/weblog/SchwagAnnounce http://naeblis.cx/weblog/SchwagAnnounce 2004-02-02T23:45:11.00-04:00 A Syndicate Feed Normalizer / Aggregator

I've been working on a little Syndicate Feed Reader in Python that I am calling Schwag (don't ask, don't tell), although the name is more of a place-holder for some other really cool name ™. It is less of a Reader, really, and more of a Normalizer/Converter/Aggregator that has some light reading facilities. It uses Mark Pilgrim's Ultra Liberal Feed Parser for support of all flavors of RSS / RDF / Atom feeds (even the bad ones). The feeds are normalized into Atom 0.3 format and dumped to disk. There is a light templating system that allows one to apply XSLT [1] transformations to the normalized feed to produce different representations (e.g. RSS 0.9x, RSS 1.0 / RDF, RSS 2, XHTML, etc). So, the basic idea is to have a feeder component that manages retrieval and normalization to a common format and then a templating component that provides pluggable representations of the normalized feed. This may sound semi-complex but the code is fairly simple as the feed parsing and XSLT machinery are handled by Mark's piece and libxml/libxslt, respectively. My code just kind of introduces the two to each other.

The whole feed normalization / transformation thing is cool in and of itself but nothing special, really. The ideas here have been talked about before (although the specifics are a little different). Where it gets interesting, IMO, is when you get into organization and aggregation. I've decided to maintain the list of source feeds as an XBEL [2] document. This is kind of a bastardization of the format but oh well, it is actually perfect for what I need. XBEL is a simple little XML vocabulary for describing browser bookmarks. It has the concept of Folders, Bookmarks, and Aliases. You see where this is going, right? You organize feeds into Folders and can also use Aliases to link a feed into multiple Folders. Good? OK. I am using the Folder concept for more than just simple organization however, and this is where I think I may be on to something half-way useful. Put simply, folders provide aggregation points. The feeder aggregates all feeds in a folder into an “Index Feed”. To push this concept a little further, the folder aggregation is performed recursively. Index Feeds contain entries from feeds in the immediate folder as well as all feeds in descendant (xpathwise) folders. One result of this is that the Index Feed for the root folder is an aggregate of all feeds available. You can then drill down into sub folders to limit/filter aggregation.
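
The recursive folder aggregation can be sketched in a few lines. This is not the Schwag source: it is a standalone illustration in Python 3 using only the standard library, the feed URLs and folder names are invented, and Alias handling is left out. It only works out which feeds would contribute to each folder's Index Feed.

```python
import xml.etree.ElementTree as ET

# A tiny XBEL document; the URLs and titles are placeholders.
XBEL = """<xbel version="1.0">
  <bookmark href="http://example.org/root.atom"><title>Root feed</title></bookmark>
  <folder>
    <title>python</title>
    <bookmark href="http://example.org/python.atom"><title>Python</title></bookmark>
    <folder>
      <title>yum</title>
      <bookmark href="http://example.org/yum.atom"><title>Yum</title></bookmark>
    </folder>
  </folder>
</xbel>"""


def index_feeds(folder, path="", out=None):
    """Map each folder path to the feeds in it and in all descendant folders."""
    if out is None:
        out = {}
    # Feeds that live directly in this folder.
    feeds = [b.get("href") for b in folder.findall("bookmark")]
    # Recurse into sub-folders and pull their aggregated lists upward.
    for sub in folder.findall("folder"):
        sub_path = path + "/" + sub.findtext("title", default="untitled")
        index_feeds(sub, sub_path, out)
        feeds.extend(out[sub_path])
    out[path or "/"] = feeds
    return out


index = index_feeds(ET.fromstring(XBEL))
```

Here `index["/"]` lists all three feeds, `index["/python"]` lists two, and `index["/python/yum"]` lists one, which mirrors the drill-down behaviour described above.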

Along with normalized Atom feed generation, the feeder component dumps out an XBEL file in each generated directory that contains the XBEL fragment for the corresponding folder. You can apply XSLT transformations to these as well. I'm currently using this to generate OPML [3] as well as XHTML representations of the folder index. So the concept of treating each Folder as a partitioning device exists here too.

At the end of the day, you end up with a system that takes an XBEL document as input and produces a directory structure containing normalized / converted feeds as well as aggregated index feeds. These are all simple files and directories so exposing via your favorite web server is straightforward. I've written XSLT for RSS 0.91, RSS 1.0/RDF, and XHTML on the feed side and OPML and XHTML on the index side. It is pretty trivial to plug in new representations for both feeds and indexes given an XSLT that takes an Atom 0.3 document on the source side.

This is all very experimental right now and while I'm using the system for day-to-day reading, I'm also breaking stuff and performing major restructuring of code and concepts very often. The system is definitely not without its problems. I plan on blogging the success and failure of various approaches in moderate detail over the next couple of months. I haven't even put a distribution together yet but please feel free to browse the sources or grab a tarball if you're interested in really early, often broken applications. I will have to find a home for the project eventually (I'm trying to avoid sourceforge if possible) as all of this is hosted off of a P300 sitting in my living room with only a humble Road Runner pipe. I imagine I will get more serious about this when I think of a name or the code starts stabilizing, whichever comes first. In the meantime, please leave comments or shoot me an email if you're interested.

[1] XSLT : http://www.w3.org/TR/xslt
[2] XBEL : http://pyxml.sourceforge.net/topics/xbel/
[3] OPML : http://opml.scripting.com/

True/False in Python < 2.3 http://naeblis.cx/weblog/TrueFalseInOldPython http://naeblis.cx/weblog/TrueFalseInOldPython 2003-11-23T20:36:15.00-04:00 A simple method of ensuring True/False works in all versions of python.

Ripped from options.py--part of Greg Ward's Optik; a more-functional-than-getopt options parser for python.

# Do the right thing with boolean values for all known Python versions.
try:
  True, False
except NameError:
  (True, False) = (1, 0)

Minimal System Backups with rdiff-backup and Yum http://naeblis.cx/weblog/MinimalSystemBackups http://naeblis.cx/weblog/MinimalSystemBackups 2003-11-16T01:46:07.00-04:00 Can rdiff-backup and yum be used to provide an intelligent system for backing up only what cannot be retrieved from package repositories?

I'm falling in love with rdiff-backup. This tool gives you the best of both incremental and mirror backups, uses rsync/rdiff libraries to increment modifications to files, and best of all is written in Python. rdiff-backup has made backups sexy again (again?). I'm just completely all about backups now. Not so much for functional reasons—I've only had to go to the backups once or twice to restore something, and it was a beautiful experience—but because the tool just rocks and makes me want to back stuff up. If Ben Escoto (primary author, rdiff-backup) could hack together some tax return software I might not have to deal with an audit this year.

I could wax about rdiff-backup for a couple of pages but to make a long story short, it has inspired me to believe that backups are an old concept that still has lots of room for innovation and can be fun to code for.

I'm already in a serious love affair with Yum and have been trying to contribute to this project in any way I can. This tool filled a huge gap in the Red Hat offering by providing a customizable package management system similar to the apt-get tool Debian users have been enjoying for years. Yum is extremely easy to use; to the point that you might think that it's light on functionality. No, it's not like that. Yum is the Bruce Lee of system utilities in that it looks like some unassuming little guy that you would think would be no match for, say, Kareem Abdul-Jabbar, and then you get the flex, and it pulls a full distro upgrade with a single command (yum upgrade).

I should note here that apt-get can be used on RPM based systems as well but I am apt-get ignorant and will be referring to Yum exclusively for the rest of this entry. It is extremely possible that apt-get or even up2date could provide the functionality needed to restore a system from RPM.

Handling RPM Managed Files

More pertinent to the topic at hand is the fact that Yum takes care of pulling RPMs from remote repositories, working out dependencies, and performing installations given a list of package names. What this means to the backup artist is that in order to be able to restore a wrecked system to its previous set of packages you need only backup your yum config file (usually /etc/yum.conf), which contains your repository configuration, along with the names of all packages installed on your system. We can get the list of packages easily enough:

$ rpm -q --all > list-of-packages

Once we have our yum.conf and package-list backed up, a fresh machine with just the minimum requirements to run Yum should be able to restore to its previous state with something like the following:

$ cat list-of-packages | xargs yum install

You have to swish the concept around in your head a little bit but you can think of the many RPM repositories that are getting thrown up nowadays as shared backups for common stuff. This greatly reduces the number of files that need to be backed up by each person because they can always be obtained from a publicly available source.

What about Unmanaged Files?

So we can backup every single RPM managed file (that hasn't been modified) with a very small footprint. Now we get into the more tricky part, which is backing up all the stuff RPM either doesn't manage or manages but determines has been modified. We will use rdiff-backup to increment the files for all the benefits stated previously, but there are a few things missing from the current rdiff-backup/yum toolset.

  • There is no straightforward method of obtaining a list of files that are not managed by RPM. You can rpm -ql --all to get a list of files that are managed by RPM and feed them into rdiff-backup as an exclude list but there are issues with this. Most notable is the fact that the file list will contain directories managed by RPM, which may contain files/directories not managed by RPM, and rdiff-backup will exclude all files in a directory if it is supplied in an exclusion list.
  • rpm -ql --all outputs full filenames relative to the root of the file system. This means that rdiff-backup would always have to be run on / as file paths in the exclude list are interpreted relative to the directory being backed up.
  • While not impossible, it is fairly annoying to get a list of files that are managed by RPM but have been modified from their original versions. The following is RPM's verify output, which can be parsed to find modified files:

    $ rpm --verify --all --nomtime --nodigest --nosignature --nodeps --nordev \
          --noscripts --nouser --nogroup --nomode
    S.5..... c /etc/ant.conf
    S.5..... c /etc/krb.conf
    S.5.....   /usr/lib/rpm/rpmrc
    S.5..... c /etc/X11/gdm/gdm.conf
    ...

    This has some serious drawbacks including the fact that it will take a very long time to run on even the fastest machine because it's performing MD5 checksums on thousands of files, which also pretty much makes the machine unusable. You can speed things up a bit by passing --nomd5 into rpm --verify, which will cause checking to occur on sizes only but there's a pretty good chance you will miss something.

  • Lastly, it would be nice if files known to have no value in a backup could be excluded by default. e.g. /var/run, /var/lock, /tmp, /home/*/.mozilla/*/*/Cache, etc.
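
As a rough illustration of the parsing mentioned in the list above, the following Python 3 sketch pulls the changed paths out of verify output shaped like the sample. It assumes each line is a flags column, an optional one-letter file-type marker such as `c`, and then an absolute path, and that lines starting with `missing` should be skipped; the function name `modified_files` is made up.

```python
def modified_files(verify_output):
    """Return (flags, path) pairs for files reported as changed."""
    changed = []
    for line in verify_output.splitlines():
        slash = line.find("/")
        if slash == -1:
            continue  # no absolute path on this line (e.g. "...")
        fields = line[:slash].split()
        if not fields or fields[0] == "missing":
            continue  # a deleted file is not a modified file
        # Everything from the first "/" onward is treated as the path,
        # so paths containing spaces survive intact.
        changed.append((fields[0], line[slash:].rstrip()))
    return changed


SAMPLE = """\
S.5..... c /etc/ant.conf
S.5..... c /etc/krb.conf
S.5.....   /usr/lib/rpm/rpmrc
missing    /usr/share/doc/example/README
...
"""

changed = modified_files(SAMPLE)
```

The resulting path list could be fed to a backup tool as a forced-include list; it does nothing about the running time of the verify step itself.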

Design Goals

What we need is a library that, given a directory, will tell us what files and sub-directories are unmanaged or managed-but-modified. This tool should be able to utilize some kind of database or cache of file information (possibly slocate's database or rpm's).
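
A first cut at the "unmanaged" half of such a library might look like the sketch below (Python 3, standard library only). It assumes the set of RPM-managed paths has already been collected somehow, for example from `rpm -ql --all`; the function name and the demo tree are invented, and the managed-but-modified case is not covered. Unlike an exclude list, it descends into managed directories and reports paths relative to the directory being backed up.

```python
import os
import tempfile


def unmanaged_files(root, managed):
    """Return files under root, relative to root, that are not in `managed`."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            # A managed directory is still walked; only individual
            # files are compared against the managed set.
            if full not in managed:
                found.append(os.path.relpath(full, root))
    return found


# Demo on a throwaway tree: etc/ and etc/managed.conf stand in for
# RPM-owned paths, etc/local.conf for something added by hand.
with tempfile.TemporaryDirectory() as root:
    etc = os.path.join(root, "etc")
    os.mkdir(etc)
    for name in ("managed.conf", "local.conf"):
        open(os.path.join(etc, name), "w").close()
    managed = {etc, os.path.join(etc, "managed.conf")}
    leftovers = unmanaged_files(root, managed)
```

Walking the whole tree on every run is slow on a large system, which is where the cached file database mentioned above would come in.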

Once we have a mechanism for determining what files are unmanaged, a nice framework for defining backup sets should be provided. I'm thinking something along the lines of having an /etc/backup.d directory that would have config files for each backup set. The config files would specify the root directory to backup, where the backups should be stored, and a list of additional exclusions and/or forced inclusions relative to the backup directory.

# sample backup config file
[info]
name=home directories
source=/home
dest=root@backup-host:/backups/$hostname/home
backuptype=rdiff-backup
frequency=1d                # how often should we backup?
retain=5d                   # how long should increments be
                            # kept around?

[files]
- .phoenix/**/Cache         # exclude mozilla firebird cache
- .Xauthority               # we don't need that either..
+ .Xresources               # forcibly include .Xresources

A couple of concepts to point out here. Source and dest are pretty self-explanatory. backuptype would allow other backup tools to be used in place of rdiff-backup. Maybe we just want to straight rsync the files, or maybe somebody feels that incremental tarballs are still relevant <g>. Lastly, the files list contains a list of files to include or exclude. I really like rdiff-backup's file selection syntax; it is simple and powerful.
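
Since the format is only a proposal, here is one hypothetical way to read it, as a small hand-rolled Python 3 parser. The [files] section is not key=value, so this sketch avoids a stock INI parser. The function name `parse_backup_config` is made up, and treating everything after `#` as a comment is an assumption.

```python
def parse_backup_config(text):
    """Return (info dict, list of (include?, pattern)) from a backup config."""
    info, files, section = {}, [], None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif section == "info" and "=" in line:
            key, value = line.split("=", 1)
            info[key.strip()] = value.strip()
        elif section == "files" and line[0] in "+-":
            # "+" forces inclusion, "-" excludes.
            files.append((line[0] == "+", line[1:].strip()))
    return info, files


SAMPLE = """\
# sample backup config file
[info]
name=home directories
source=/home
dest=root@backup-host:/backups/$hostname/home
backuptype=rdiff-backup
frequency=1d                # how often should we backup?
retain=5d                   # how long should increments be
                            # kept around?

[files]
- .phoenix/**/Cache         # exclude mozilla firebird cache
- .Xauthority               # we don't need that either..
+ .Xresources               # forcibly include .Xresources
"""

info, files = parse_backup_config(SAMPLE)
```

The `$hostname` placeholder is left unexpanded here; substituting it would be the job of whatever runs the backup.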

There should also be a sane default backup set containing all unmanaged/modified files on the system minus cruft like /var/{run,lock}, cache directories, and anything else that doesn't have backup value. The only aspect that should require configuration is the destination of the backup. Reasonable defaults for all other aspects should be provided.

Summary

A Minimal Backup System would provide a simple-as-in-easy-to-configure yet powerful tool that could act as an almost turnkey backup solution for most small scale GNU/Linux installations that use RPM for package management. A positive side-effect of having RPM aware backups is that it further promotes the use of RPM to package common/unchanging files as the more that is managed by RPM the less space is required for backups.

Experimental Firebird Extension RPMs Available http://naeblis.cx/weblog/FirebirdExtensionRPMsAvailable http://naeblis.cx/weblog/FirebirdExtensionRPMsAvailable 2003-11-12T03:14:25.00-04:00

Building RPMs for Mozilla Firebird Extensions is moving along nicely. The following extensions have been packaged up without much problem.

  • uptime
  • nukeimage
  • useragentswitcher
  • popupcount
  • autohide
  • rssreader

At this point I'm fairly confident that any extension can be packaged in RPM. I'm planning on soliciting feedback from the fedora and freshrpms lists soon and should be publishing something in the interim.

If you want to take a look at building these yourself, I have viewcvs setup here with the tarball option on.

RPMifying Mozilla Firebird Extensions http://naeblis.cx/weblog/RPMifyingMozillaFirebirdExtensions http://naeblis.cx/weblog/RPMifyingMozillaFirebirdExtensions 2003-11-10T01:33:18.00-04:00 So I was able to package up some simple Firebird extensions into RPMs. Here are my initial findings and some thoughts on moving forward.

Common Case XPIs

The process is something like this for a majority of XPIs I've come across thus far.

  1. Under %prep, extract .xpi file to BUILD dir.
  2. Under %install, extracted contents of the .xpi file should be installed under /usr/lib/MozillaFirebird/chrome with the exact structure from the xpi file (the exception to this rule is the install.js, which comes with each xpi file and is only used during the install). I've found that most simple extensions contain a single .jar file that has all XUL content.
  3. Now comes the tricky part - we need to register the extension/chrome with one of the ugliest goddam hacks I've ever seen. We add one or more lines to /usr/lib/MozillaFirebird/chrome/installed-chrome.txt that point to the files we just laid down. The lines required will vary from extension to extension; it is usually possible to deduce the format but some extensions do some special stuff with locales and whatnot. I think the only surefire way to know what these lines need to look like is to run Firebird as root, install the extension through the browser, and then grab the lines out of installed-chrome.txt. Anyway, I had to add a bunch of crap to %post that appends the lines to installed-chrome.txt and then executes /usr/lib/MozillaFirebird/regchrome. This updates chrome.rdf (and possibly some other files).
  4. For %preun, you need to remove the lines added to installed-chrome.txt and exec regchrome again. I hacked this up with an error-prone grep -v extension-name, which seems to work so far but definitely needs to be refined.
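
One way to make steps 3 and 4 less fragile than a grep -v on the extension name is to add and remove the exact lines. The sketch below shows the idea in Python 3 on plain strings; it is not what the spec file currently does, the function names are made up, and the entry strings are placeholders rather than real installed-chrome.txt lines (those should be captured as described in step 3).

```python
def add_chrome_lines(text, lines):
    """Append each line to the file contents unless it is already present."""
    existing = text.splitlines()
    for line in lines:
        if line not in existing:
            existing.append(line)
    return "\n".join(existing) + "\n"


def remove_chrome_lines(text, lines):
    """Drop exactly the given lines, leaving every other entry untouched."""
    unwanted = set(lines)
    kept = [line for line in text.splitlines() if line not in unwanted]
    return "\n".join(kept) + "\n" if kept else ""


# Placeholder entries, not real installed-chrome.txt syntax.
before = "entry-for-browser\n"
ext_lines = ["entry-for-myext-content", "entry-for-myext-locale"]

registered = add_chrome_lines(before, ext_lines)            # %post
registered_twice = add_chrome_lines(registered, ext_lines)  # reinstall
restored = remove_chrome_lines(registered, ext_lines)       # %preun
```

Matching whole lines means uninstalling one extension cannot strip entries belonging to another whose name happens to contain the same substring, and reinstalling does not duplicate entries. regchrome would still need to be run after either step.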

Moving forward there are a few things I need to figure out. If anyone has info on these, please let me know.

What's the license on these things?
I haven't seen a single reference to redistribution policy anywhere. Are most of these covered under the standard Mozilla license?

Packaging Guidelines

I'd like to throw together some RPM packaging guidelines for Firebird extensions. Right now I have all of the extensions I've packaged in a single SRPM and I'm using %package to split them out. This saves some time when I'm adding new extensions but isn't very maintainable. I also have 4 function macros defined at the top of the spec file for extracting the xpi, adding/removing installed-chrome.txt entries, running regchrome, etc. It might be nice to move these into a macros file somewhere.

Another thing to standardize on is a package name prefix. I've been using MozillaFirebird-ext-name. I'm starting to think that the -ext may not be necessary. On pure aesthetics, I prefer MozillaFirebird-googlebar to MozillaFirebird-ext-googlebar.

Firebird Extension / Theme RPMs http://naeblis.cx/weblog/FirebirdExtensionRPMs http://naeblis.cx/weblog/FirebirdExtensionRPMs 2003-11-09T07:32:00.00-04:00

If someone can provide a mechanism for programmatically installing Mozilla Firebird extensions, I would be more than happy to put a night aside for packaging a bunch of them for RPM to push up to fedora. From what I'm reading here, it looks like the xpinstall stuff is limited/non-existent from the command line. Lots of people wanting to know how to do unattended/cli installs but no one replying seems to understand why you would want to go cli instead of using the browser. Trying to roll Firebird out to a large number of machines w/ a standard set of extensions is a nightmare.

Right now I'm thinking I may have to resort to installing the most common extensions on one box and blow tarballs. These boxes are somewhere around 99% managed by RPM so slapping unmanaged crap into /usr/lib just kills me.

I have a few things I would like to try before giving up. This post looks promising but it's dated 2000-01-02. If any mozilla-savvy person out there could hack up a nice little xpi install app it would be much appreciated.