Category Archives: Ruby

The Magic Ruby vs the Url of Excel: a short story with long code

Once there was a hacker who needed to rescue a beautiful princess from her prison on a creaking ship moored in the middle of the sky. He made it onto the noisy old ship, slipped past the guards and tiptoed down the swaying halls to the room where she wept, chained to an Excel spreadsheet.

Using Ruby for command line web lookups

Common Problem

You frequently have to look up customer information on the company website. Firing up a web browser takes time and invites you to start dawdling away on Facebook and such. If only someone in the company had bothered to write a decent web service or command line utility to look up customer information.

Find the right url

The first step is to dig into the company website and find out what happens when you click search.  You are looking for an element in there of type “form”.  A form is what gets submitted when you click search.  It will submit information to a page, and that page is the “action” attribute of the form element.  Then you need to find the inputs to that form.  Look for elements of type “input”.  These guys are the information you are sending to the action page.  Once you have the “action” and the “input” names, you can come up with a URL that represents this lookup.  It’s dead easy and it always follows the same pattern.

If you have a form like this:


<form action="/admin/clientsearch.asp" method="post">
<table border="0" align="center">
<tbody>
<tr>
<th class="QueryHeader">Search Options</th>
</tr>
<tr>
<td align="center">
<table border="0">
<tbody>
<tr>
<td><strong>Search For:</strong>

<input name="SEARCHPARAM" size="15" type="text" /></td>
</tr>
</tbody></table>
</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td align="center"><input type="submit" value="Search" /><input type="reset" value="Reset" /></td>
</tr>
</tbody></table>
</form>

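You can see the “action” is “/admin/clientsearch.asp” and that the input is named “SEARCHPARAM”. From this we know that the URL is going to be “http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=”, with your search term tacked on the end. Note that this form submits with “post”, so the URL trick only works if the page also accepts the parameter in the query string; many pages do, but try it in a browser first. That’s how simple it is.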
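Here is a minimal sketch of building that URL in Ruby using only the standard library. The helper name lookup_url and the sample client name “Acme Corp” are made up for illustration; the point is that URI.encode_www_form escapes spaces and other awkward characters in the search term for you.

```ruby
require 'uri'

# Hypothetical helper: combine the site, the form's "action" and its
# input names/values into a single lookup URL.
def lookup_url(base, action, params)
  uri = URI.join(base, action)              # e.g. http://www.example.com/admin/clientsearch.asp
  uri.query = URI.encode_www_form(params)   # escapes spaces, ampersands, etc.
  uri.to_s
end

url = lookup_url("http://www.example.com",
                 "/admin/clientsearch.asp",
                 "SEARCHPARAM" => "Acme Corp")
puts url
```

If the search term is always a single plain word, simple string concatenation works too; the escaping only matters once names contain spaces or punctuation.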

Automate the lookup using Ruby

If this is a task you have to do often, try using Ruby to automate it. Ruby has utilities for running repetitive tasks called Rake and Sake, and a library for parsing web pages called Hpricot. Install them like so:


gem install rake hpricot sake

Now we write up a file called “Rakefile.rb” and put in a rake task:

desc "sets up the following tasks"
task :setup do
  require 'open-uri'
  require 'hpricot'
end

desc "lookup a client"
task :clients, :client_name do |t, args|
  doc = Hpricot(open("http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=#{args.client_name}", :http_basic_authentication => ['username', 'password']))
  puts doc.search("//center[2]/table")[0].to_plain_text
end
task :clients => :setup # declared here because named args seem to conflict with dependencies

Most of what’s going on there is happening on line 9. We are opening a url, then passing it to our parser. If you don’t have a username and password for this website, you can remove the whole “http_basic_authentication” argument to open.

Where is my data

In line 10 you’ll see a handy little XPath expression that narrows the document down to what we care about. If you aren’t so hot with XPath, there is an easy way to find the right one. In Firefox, install an extension called Firebug. Do a search on your webpage, then activate Firebug by clicking the bug icon in your status bar. Click “Inspect” in Firebug and then click where your data is. Firebug will display a row of elements along the top. Move your mouse along them and you’ll find one element that highlights your data in blue. Right click on it and choose “Copy XPath”. That’s what you will put in “doc.search()”.

Using it

From the command line, type rake clients[myclient] and Ruby will do the lookup and return the information you care about. That will only work if you are in the same directory as your Rakefile.rb. We can install these tasks into Sake by typing sake -i Rakefile.rb. This makes the tasks available system wide, so you can call sake clients[myclient] from anywhere.

A couple of caveats

  1. You may have to do a little tweaking to get open-uri to play nice with expired https certificates. Shouldn’t be a problem for most folks.
  2. The world of screen scraping, as it is called, doesn’t end there. If you need more advanced techniques for screen scraping a page, behold the power of the internet.
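For the first caveat, one possible tweak is sketched below. open-uri accepts an :ssl_verify_mode option, and passing OpenSSL::SSL::VERIFY_NONE skips certificate checks entirely. The helper name lookup_options is hypothetical, and turning verification off is only reasonable for an internal site you already trust.

```ruby
require 'openssl'

# Hypothetical helper: build the options hash handed to open-uri's open.
# With verify_ssl: false, certificate verification is skipped, which is
# insecure and meant only for a trusted internal site.
def lookup_options(username, password, verify_ssl: true)
  opts = { :http_basic_authentication => [username, password] }
  opts[:ssl_verify_mode] = OpenSSL::SSL::VERIFY_NONE unless verify_ssl
  opts
end

# Usage inside the rake task (needs open-uri loaded; not run here):
#   open(url, lookup_options('username', 'password', verify_ssl: false))
p lookup_options('username', 'password', verify_ssl: false).keys
```

If you can get hold of the site’s certificate, pointing open-uri at it with the :ssl_ca_cert option is the safer route.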

The beauty of Ruby’s array subtraction operator

Today I had a set of email addresses, one per line, from which I had to remove the addresses of folks who had said “Don’t email me.” Those addresses were in a separate file, also one per line.

I figured I’d have to do this again, so I wrote a Ruby script to automate it. Here is stripper.rb:

# put each address in an array, remove whitespace and make it all lowercase
potential_emails = IO.readlines("potentials.txt").map! { |email| email.strip.downcase }
delete_emails = IO.readlines("donotemail.txt").map! { |email| email.strip.downcase }

# use the beauty of ruby's array subtraction operator
puts potential_emails - delete_emails

Simple, terse and readable.  Lovely!
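To see what the subtraction does without any files on disk, here is a small sketch with made-up addresses held in plain arrays. The normalize helper is hypothetical and just mirrors the strip and downcase step from the script.

```ruby
# Made-up sample data standing in for the two files.
potentials   = ["Ann@Example.com ", "bob@example.com", "carol@example.com\n"]
do_not_email = ["BOB@example.com"]

# Hypothetical helper: same cleanup the script applies to each line.
def normalize(lines)
  lines.map { |line| line.strip.downcase }
end

# Array#- returns a new array with every element that also appears
# in the right-hand array removed.
remaining = normalize(potentials) - normalize(do_not_email)
puts remaining
```

Normalizing both lists first matters, because the subtraction compares strings exactly: “BOB@example.com” and “bob@example.com” would otherwise count as different addresses.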

The shotgun approach to recruiting is counterproductive

> Hi Matthew,
>
> I'm very impressed with your rating/experience.
> At the current time, I have a few clients that are looking for experienced
> Ruby Rails developers (both contract & permanent) like yourself. There is
> lots of room for creativity and growth at these places.
>
> So, my contacts are below if you're still looking. Also, if possible, can
> you pls forward me your current resume?
>
> Thanks much!
>
> Leslie Doan
> ******************
> Managing Partner
> MINDSPHERES, INC.
> 2570 North First Street
> Suite 200
> San Jose, CA 95131
> C: 408.386.7246
> E: leslied@mindspheres.com
> W: www.mindspheres.com

Hi Leslie Doan,

I know it's tough to be a recruiter. Cold calling is difficult and much of recruiting is a volume game.

But you aren't doing yourself any favors with this email.

You're starting this relationship with me by lying to me. You say you've looked at my Working With Rails profile and been impressed. But that can't be true. I've never worked on a real Rails or Ruby project. I have no ratings or experience, so how can you be impressed?

My resume is also clearly linked from my Working With Rails profile. If I can't rely on you to know that, how can I trust you with my career? I've worked with a lot of recruiters, and I know that high volume folks treat you like a tiny number. They are usually more interested in getting you hired anywhere at any price so they can collect a commission. I smell that big time in this email.

If you show any level of familiarity with who I am or any of the many links on my profile, I'm so much more likely to work with you.

Hope this helps you, and good luck in the recruiting game,

Matt

p.s. To make this letter worth my time as well as yours, I'm putting it up for my pals on my website. Don't worry, I don't have a very high readership.



As you might imagine, I haven’t gotten a response from Leslie. I don’t think she’s interested in investing time in my career.