Tag Archives: Ruby

Ditz – the distributed issue tracker

If you are using a distributed version control system [1], you get some really cool benefits and some really strange problems.

When I was toying around on a project with Aaron, I fell in love with ditz. We needed a quick way to keep track of bugs, without taking the time to set up a central bug repository. We wanted a bug tracker that could live in the same place as the code, where adding a friend to kick in on code didn’t require more accounts being set up and maintained.

Ditz does all of that, works straight from the command line, and even outputs some sweet HTML pages for display to the world.

Ditz is kind of abandonware right now, as the original author has gone on to other things, but the state it is in right now is just perfect for my personal projects. If you are using it, I’ve added an RSS feed for the HTML output.

And the really good news? I just convinced the maintainer to make me a co-maintainer. So that means that I can integrate features! Once we get enough in for a new release, I’ll post an update right here!

  1. like git or darcs or mercurial, etc.

The Magic Ruby vs the Url of Excel: a short story with long code

Once there was a hacker who needed to rescue a beautiful princess from her prison on a creaking ship moored in the middle of the sky.   He made it onto the noisy old ship, slipped past the guards and tiptoed down the swaying halls to the room where she wept, chained to an excel spreadsheet.

Using Ruby for command line web lookups

Common Problem

You frequently have to look up customer information on the company website.  Firing up a web browser takes time and invites you to start dawdling away on Facebook and such.  If only someone in the company had bothered to write a decent web service or command line utility to look up customer information.

Find the right url

The first step is to dig into the company website and find out what happens when you click search.  You are looking for an element in there of type “form”.  A form is what gets submitted when you click search.  It will submit information to a page, and that page is the “action” attribute of the form element.  Then you need to find the inputs to that form.  Look for elements of type “input”.  These guys are the information you are sending to the action page.  Once you have the “action” and the “input” names, you can come up with a URL that represents this lookup.  It’s dead easy and it always follows the same pattern.

If you have a form like this:


<form action="/admin/clientsearch.asp" method="post">
<table border="0" align="center">
<tbody>
<tr>
<th class="QueryHeader">Search Options</th>
</tr>
<tr>
<td align="center">
<table border="0">
<tbody>
<tr>
<td><strong>Search For:</strong>

<input name="SEARCHPARAM" size="15" type="text" /></td>
</tr>
</tbody></table>
</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td align="center"><input type="submit" value="Search" /><input type="reset" value="Reset" /></td>
</tr>
</tbody></table>
</form>

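You can see the “action” is “/admin/clientsearch.asp” and that the input is named “SEARCHPARAM”.  From this we know that the URL is going to be “http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=”, followed by whatever you want to search for.  The form itself uses method=“post”, so this assumes the page also accepts the parameter in the query string, which is worth confirming in a browser first.  That’s how simple it is.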
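As a minimal sketch of that pattern, the Ruby below pulls the action and the input names out of a form with regular expressions and assembles the lookup URL. The helper name lookup_url and the sample client name are made up for illustration, and regular expressions are fragile against real-world HTML, so treat this as a quick way to check your reading of the form rather than as a parser.

```ruby
require 'uri'

# Hypothetical helper (not from any library): builds a GET-style lookup URL
# from a form's HTML. Assumes double-quoted attributes, as in the form above.
def lookup_url(form_html, base, values)
  action = form_html[/<form[^>]*\baction="([^"]*)"/i, 1]
  # Only inputs with a name attribute are submitted; submit and reset have none here.
  names = form_html.scan(/<input[^>]*\bname="([^"]*)"/i).flatten
  params = names.map { |name| [name, values.fetch(name, '')] }
  # encode_www_form escapes spaces and punctuation so the URL stays valid.
  "#{URI.join(base, action)}?#{URI.encode_www_form(params)}"
end

form = <<~HTML
  <form action="/admin/clientsearch.asp" method="post">
  <input name="SEARCHPARAM" size="15" type="text" />
  <input type="submit" value="Search" /><input type="reset" value="Reset" />
  </form>
HTML

url = lookup_url(form, 'http://www.example.com', { 'SEARCHPARAM' => 'Acme Widgets' })
puts url
# prints http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=Acme+Widgets
```

Escaping the search term matters as soon as a client name contains a space or an ampersand, which is why the sketch goes through URI.encode_www_form instead of plain string interpolation.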

Automate the lookup using Ruby

If this is a task you have to do often, try using Ruby to automate it.  Ruby has a utility for running repetitive tasks called Rake, a companion called Sake that makes Rake tasks available system wide, and a library for parsing web pages called Hpricot.  Install them like so:


gem install rake hpricot sake

Now we write up a file called “Rakefile.rb” and put in a rake task:

desc "sets up the following tasks"
task :setup do
  require 'open-uri'
  require 'hpricot'
end

desc "lookup a client"
task :clients, :client_name do |t, args|
  doc = Hpricot(open("http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=#{args.client_name}", :http_basic_authentication => ['username', 'password']))
  puts doc.search("//center[2]/table")[0].to_plain_text
end
task :clients => :setup # declared separately because named args seem to conflict with dependencies

Most of what’s going on there happens on the line that starts with “doc =”. We open a URL, then pass the response to our parser. If the website doesn’t require a username and password, you can remove the whole “http_basic_authentication” argument to open.

Where is my data

In the line that calls “doc.search” you’ll see a handy little XPath expression that narrows the document down to what we care about. If you aren’t so hot with XPath, there is an easy way to find the right expression. In Firefox, install an extension called Firebug. Do a search on your webpage, then activate Firebug by clicking the bug icon in your status bar. Click inspect in Firebug and then click where your data is. Firebug will display a row of elements along the top. Move your mouse along them and you’ll find one element that highlights your data in blue. Right click on this and choose “Copy XPath”. That’s what you will put in “doc.search()”.
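To see what an expression like the one in the task selects, here is a small sketch that runs the same XPath against a made-up page. It uses REXML, which is bundled with standard Ruby installs, instead of Hpricot, purely so the example runs without extra gems. The sample markup and the client row are invented for illustration, and REXML needs well-formed XML, which real pages often are not.

```ruby
require 'rexml/document'

# Invented stand-in for a search results page: two <center> blocks,
# with the data table inside the second one.
page = <<~XML
  <html><body>
    <center><h1>Client Search</h1></center>
    <center>
      <table>
        <tr><td>Acme Widgets</td><td>555-0100</td></tr>
      </table>
    </center>
  </body></html>
XML

doc = REXML::Document.new(page)

# Same expression as in the rake task: the table in the second <center>.
table = REXML::XPath.first(doc, '//center[2]/table')

# Collect the text of each cell in that table.
cells = REXML::XPath.match(table, 'tr/td').map(&:text)
puts cells.join(' | ')
```

If the expression copied from Firebug returns nothing, trying it against a saved copy of the page in a script like this is a quick way to find out which step of the path is wrong.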

Using it

From the command line type rake clients[myclient] and Ruby will do the lookup and return the information you care about.  That will only work if you are in the same directory as your Rakefile.rb.  We can install these tasks into Sake by typing sake -i Rakefile.rb. This makes the tasks available system wide, so you can call sake clients[myclient] from any directory.

A couple of caveats

  1. You may have to do a little tweaking to get open-uri to play nice with expired https certificates. Shouldn’t be a problem for most folks.
  2. The world of screen scraping, as it is called, doesn’t end there. If you need more advanced techniques for screen scraping a page, behold the power of the internet.

The beauty of Ruby’s array subtraction operator

Today I had a set of email addresses, one per line, from which I had to remove the addresses of folks who had said “Don’t email me.” Those addresses were in a separate file, one address per line.

I figured I’d have to do this again, so I wrote a Ruby script to automate it. Here it is, stripper.rb:

# put each address in an array, remove whitespace and make it all lowercase
potential_emails = IO.readlines("potentials.txt").map! {|email| email.strip.downcase}
delete_emails = IO.readlines("donotemail.txt").map! {|email| email.strip.downcase}

# use the beauty of ruby's array subtraction operator
puts potential_emails - delete_emails

Simple, terse and readable.  Lovely!
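A couple of details of the subtraction operator are worth knowing before relying on it. This sketch uses made-up addresses in place of the two files: Array#- removes every occurrence of a matching address, keeps the remaining addresses in their original order, and does not de-duplicate what is left.

```ruby
# Made-up addresses standing in for potentials.txt and donotemail.txt.
potential_emails = ["Ann@Example.com ", "bob@example.com", "cy@example.com",
                    "bob@example.com", "cy@example.com"].map { |e| e.strip.downcase }
delete_emails = ["BOB@example.com"].map { |e| e.strip.downcase }

remaining = potential_emails - delete_emails
p remaining       # both copies of bob are gone; cy still appears twice
p remaining.uniq  # add uniq if each address should be mailed only once
```

Normalizing both lists the same way before subtracting is what makes “BOB@example.com” match “bob@example.com”; without the strip and downcase the comparison is exact.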