All posts by MattK

About MattK

I like you.

How to get around a proxy system

This sounds complicated but it is really simple.  That it is so simple is why the internet is amazing and awesome.

Photo from flickr user Bright Tal with a CC license

Proxies are used by people in positions of authority who want to control what you view on the internet.  Such groups include the governments of Turkey and China, as well as the internet security teams of most major corporations.  Some of these motives are good:

  • Blocking you from visiting websites that will infect your computer with spyware.
  • Blocking you from looking at naked people at work and totally creeping your coworkers out.
  • Blocking you from using webmail or instant messaging to communicate with customers in insecure ways or in ways that can’t be audited for a lawsuit.

Some of these motives are bad:

  • Blocking you from learning about problems at the group.
  • Blocking you from “wasting” company time or resources.

Generally you will eventually find a situation where you want to look at a website that has been blocked improperly.  I’ve often seen sites that discuss internet security vulnerabilities classified as “hacking” – but I need to know if those sites affect my work.

Photo kindly sourced from flickr user Dazzie D

Whether your intentions are pure or not, here is a simple way to give yourself internet freedom.

Download CGIProxy and install it on something that faces the unfiltered internet.  This might be your web host if you have one.  If not, you can install a web server on your home computer.  It is easier than you might think, and with DynDNS, you can have your own domain name for your home computer.

You are done.  Now you can navigate in your browser to where you installed CGIProxy.  It will surf the sites you are blocked from.  Doing that is a hassle, though.  You have to go back to CGIProxy every time you want to go to a different site.  Lame.

Let’s make it easier through the magical power of bookmarklets.  We will put two little buttons in your browser that let you proxy blocked sites and unproxy them when you are somewhere safe again.

I wrote up a little page for you that generates proxy and unproxy bookmarklets for CGIProxy.  Go there, put in the URL of your CGIProxy, and choose your options.  I’ll automagically generate the bookmarklets for you.  You just drag them up to your browser quick links and now you have the keys to the kingdom.

Let me know if anything isn’t clear – I did the extra work so that it could be useful for you.

Ofanya

As you come out of the forests and first spy Ofanya, you mistake it for a ruin or a bombsite.  Closer to the crumbling towers and half-roofed houses it becomes apparent that the people hurrying about are not in peril or panic.  They are going about their business calmly but quickly among the wasted blocks of Ofanya.  There is no danger, save for when a building collapses.

Spend time in the falling, failing city of Ofanya and the people reveal themselves to be full of great ambition.  No one is a banker or a grocer or a shopkeeper.  Everyone is a writer or a musician.  All are working on projects of staggering beauty and terrible deep complexity, so they have no time to spend on day jobs.  Talk to anyone and they will tell you about the three-hour underwater dance cycle they are dedicating to the Battle of Normandy.  A shy young man will show you his preliminary sketches for a full-body tattoo of his life, the lives of his ancestors and the predicted lives of his someday children.

They cannot stay with you long, these poets and sculptors.  There is no one who will keep a shop in Ofanya, so there is nowhere to buy bread.  In the morning, the artists all wake up and scour the countryside for wild wheat they can handmill to eat.  The muralists go to the river to catch trout.  All the time everyone complains about how they can’t get a cup of coffee.  They greet each other mainly by asking for cigarettes.

You grow weary of Ofanya as everyone you meet asks you for favors and loans, promising they will remember you when their script gets made.

Ofanya is a city of infinite desire and little execution.  The houses and towers are not destroyed; they were never finished by balladeers who are writing songs about love and death.  The shit and piss stink in the streets as there are no sewers dug, no street cleaners.  Everywhere the thin starving artists plead with you that they cannot delay their art to move to another city, but they cannot complete their art as they have to spend all day searching for food, firewood, and shelter.  Something must be done about this hell that is Ofanya.  With a little planning and cooperation this could be a great bohemia.  Ofanya just wants you to set up a bakery, where your labor will be repaid in songs of glory and monuments to your industry.  Ofanya wants you to build an apartment block, which would be covered in heroic murals in tribute to you.

Don’t ponder this ridiculous proposal for too long.  You will begin to compose an essay in your head, a masterful argument that will strike the people of Ofanya with reason and put them into a well ordered society.  While you prepare this powerful rhetorical thunderbolt, you will grow hungry and make your way to the woods to hunt for some walnuts or blackberries.

Using Ruby for command line web lookups

Common Problem

You frequently have to look up customer information on the company website.  Firing up a web browser takes time and invites you to start dawdling away on Facebook and such.  If only someone in the company had bothered to write a decent web service or command line utility to look up customer information.

Find the right url

The first step is to dig into the company website and find out what happens when you click search.  You are looking for an element in there of type “form”.  A form is what gets submitted when you click search.  It will submit information to a page, and that page is the “action” attribute of the form element.  Then you need to find the inputs to that form.  Look for elements of type “input”.  These guys are the information you are sending to the action page.  Once you have the “action” and the “input” names, you can come up with a URL that represents this lookup.  It’s dead easy and it always follows the same pattern.

If you have a form like this:


<form action="/admin/clientsearch.asp" method="post">
<table border="0" align="center">
<tbody>
<tr>
<th class="QueryHeader">Search Options</th>
</tr>
<tr>
<td align="center">
<table border="0">
<tbody>
<tr>
<td><strong>Search For:</strong>

<input name="SEARCHPARAM" size="15" type="text" /></td>
</tr>
</tbody></table>
</td>
</tr>
<tr>
<td></td>
</tr>
<tr>
<td align="center"><input type="submit" value="Search" /><input type="reset" value="Reset" /></td>
</tr>
</tbody></table>
</form>

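You can see the “action” is “/admin/clientsearch.asp” and that the input is named “SEARCHPARAM”.  From this we know that the URL is going to be “http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=” followed by whatever you are searching for.  (This form uses method “post”, so the browser sends the input in the request body; many search pages accept the same parameter in the query string too, but if yours doesn’t, you’ll need to send a POST instead.)  That’s how simple it is.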
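Here is a minimal sketch of that pattern in Ruby, using only the standard library.  The helper name lookup_url and the search value are made up for illustration; the host, action and input name come from the example form above.  It assumes the page accepts the parameter in the query string.

```ruby
require 'uri'

# Build a GET-style lookup URL from a form's action and its input values.
# base   - root of the site (example host from this post)
# action - the form's "action" attribute
# inputs - hash of input name => value to send
def lookup_url(base, action, inputs)
  uri = URI.join(base, action)
  # encode_www_form escapes spaces, ampersands and other special characters
  uri.query = URI.encode_www_form(inputs)
  uri.to_s
end

puts lookup_url("http://www.example.com", "/admin/clientsearch.asp",
                "SEARCHPARAM" => "Acme & Sons")
```

Escaping matters as soon as a client name has a space or an ampersand in it; pasting the raw name onto the end of the URL would break the lookup.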

Automate the lookup using Ruby

If this is a task you have to do often, try using Ruby to automate it.  Ruby has utilities for running repetitive tasks called Rake and Sake, and a utility for parsing web pages called Hpricot.  Install them like so:


gem install rake hpricot sake

Now we write up a file called “Rakefile.rb” and put in a rake task:

desc "sets up the following tasks"
task :setup do
  require 'open-uri'
  require 'hpricot'
end

desc "lookup a client"
task :clients, :client_name do |t, args|
  doc = Hpricot(open("http://www.example.com/admin/clientsearch.asp?SEARCHPARAM=#{args.client_name}", :http_basic_authentication => ['username', 'password']))
  puts doc.search("//center[2]/table")[0].to_plain_text
end
task :clients => :setup # put in here because named args seem to conflict with dependencies

Most of what’s going on there is happening on line 9.  We are opening a URL, then passing it to our parser.  If you don’t have a username and password for this website, you can remove the whole “http_basic_authentication” argument to open.  (On Ruby 3.0 and later, open-uri no longer hooks into plain open for URLs; call URI.open instead.)
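If you want the login to be optional, something like the following sketch might do.  The helper names open_options and fetch_page are made up for illustration, and it assumes Ruby 2.5 or later, where URI.open is available.  Nothing here touches the network until you call fetch_page.

```ruby
require 'open-uri'

# Options for open-uri; basic auth is included only when both
# a username and a password are given.
def open_options(username: nil, password: nil)
  return {} if username.nil? || password.nil?
  { http_basic_authentication: [username, password] }
end

# Fetch the raw HTML of a page as a string.
# Pass username: and password: only if the site asks for a login.
def fetch_page(url, **credentials)
  URI.open(url, open_options(**credentials)) { |io| io.read }
end

p open_options
p open_options(username: 'username', password: 'password')
```

The string that fetch_page returns is what you would hand to your parser.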

Where is my data

In line 10 you’ll see a handy little XPath going on to narrow down the document to what we care about.  If you aren’t so hot with XPath, there is an easy way to find it out.  In Firefox, install an extension called Firebug.  Do a search on your webpage, then activate Firebug by clicking the bug icon in your status bar.  Click inspect in Firebug and then click where your data is.  Firebug will display a bunch of elements along the top.  Move your mouse along them and you’ll find one element that highlights your data in blue.  Right click on this and “Copy XPath”.  That’s what you will put in “doc.search()”.
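To see what an XPath like that actually selects, here is a small sketch.  It swaps in REXML, which comes with most Ruby installs, in place of Hpricot, and it runs against a made-up, well-formed page rather than a real site.  REXML wants tidy markup, so for messy real-world pages a forgiving parser like Hpricot is the better fit.

```ruby
require 'rexml/document'

# A tiny, well-formed stand-in for a results page (made-up data).
page = <<~HTML
  <html><body>
    <center><h1>Client Search</h1></center>
    <center>
      <table><tr><td>Acme Corp</td><td>555-0100</td></tr></table>
    </center>
  </body></html>
HTML

doc = REXML::Document.new(page)

# Select the table inside the second <center> element of the body.
table = REXML::XPath.first(doc, "/html/body/center[2]/table")

# Collect the text of every cell in that table.
cells = REXML::XPath.match(table, "tr/td").map(&:text)
puts cells.join(" | ")
```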

Using it

From the command line type rake clients[myclient] and Ruby will do the lookup and return the information you care about.  That will only work if you are in the same directory as your Rakefile.rb.  We can install these tasks into sake by typing sake -i Rakefile.rb.  This makes these tasks system wide, so you can call sake clients[myclient] from anywhere.

A couple of caveats

  1. You may have to do a little tweaking to get open-uri to play nice with expired https certificates. Shouldn’t be a problem for most folks.
  2. The world of screen-scraping, as it is called, doesn’t end there. If you need more advanced techniques for screen scraping a page, behold the power of the internet.
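For the first caveat, one possible tweak is open-uri’s ssl_verify_mode option, sketched below.  The helper name lenient_ssl_options is made up for illustration.  Turning verification off removes the protection against someone impersonating the site, so this only makes sense for a host you already trust, like an internal server with an expired certificate.

```ruby
require 'open-uri'
require 'openssl'

# open-uri options that skip certificate verification.
# Any extra options (such as basic auth) are merged in.
def lenient_ssl_options(extra = {})
  { ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE }.merge(extra)
end

opts = lenient_ssl_options(http_basic_authentication: ['username', 'password'])
p opts

# The actual request would then look something like:
# URI.open("https://www.example.com/admin/clientsearch.asp?SEARCHPARAM=myclient", opts)
```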

Your facebook applications suck

Hey there buddy. I want to talk to you about all these applications you’ve been making and putting on the Facebook. This is going to be a difficult conversation, so take a seat.

[Image: your_facebook_apps_suck_what_type_of_person_quiz.jpg]

I know you want to go viral like that goddamn werewolf/ zombie/ vampire/ coprophage army thing. But please. You have to offer value to me first. Then I will recommend you to my friends.

My daddy taught me never to be held hostage. I know the type of person I attract. It appears to be blond Londoners named Sam.  Lucky me!  But shame on you for playing on people’s insecurities!

I’d love to find out what type of disaster I am, but apparently the “skip” button is broken and I can only find out by inviting my friends.

Also, how sad is this got love application? I have to invite folks before it will tell me that I am loved for some randomly generated reason.  Kids, do we trust programmers that can’t master subject-verb agreement?

[Image: your_facebook_apps_suck_got_love.jpg]

It’s ridiculous.   Why do they not give you anything for free unless you install the application and invite your friends?  Because these applications get access to your personal information, your friendlist, etc.  And then they sell them.  Shocked?  Here’s the thing: Facebook doesn’t host these applications.  All the hard work gets done on outside servers – paid for by the guy who wrote the application.   So the guy who is displaying pieces of flair for your Facebook page is also scraping out your friendlist and your contact info, anything you’ve allowed him access to.  And he’s selling it to his pals.  Once the info is on the market, you can’t get it back.

Moral of the story?  I like Facebook so that I can find out you had a kid or that your car was stolen, but I don’t want it to lead to you getting your identity stolen.  Be good out there.

Books: Anathem by Neal Stephenson

Includes a CD and some geometry lessons!

Neal Stephenson’s Anathem has been called a space opera, but that seems inaccurate.  The characters eventually make it out of the atmosphere, but time is the subject of the book – not space.  Some of the best parts are about contemplation, piecing together puzzles and following the threads of deductive logic through to a conclusion.  A long character-building scene has to do with an art project where the characters recreate a famous battle by planting a garden full of weeds that will battle for dominance and advance their growth in predictable ways.  The kind of ideas where events play out over months, years and centuries hardly belong in space opera.  It is Long Now fiction.

The Setup

In a world far far away, the monasteries are a place where mathematicians, scientists, and rhetoricians have sequestered themselves away from the working world.  They practice a method of separation that allows for regulated exchange of ideas between the monasteries (called concents) and the outer world.  Each concent has a series of gates and subdivisions, all regulated by an enormous clock designed to run on a millennial scale.  For some, the gate opens once a year for a week.  For others, the time between openings is 10 years, 100 years, or even 1000.  This lets the secluded folks work away at their ideas without being interrupted or polluted by popular culture.  The setting projects timelessness, order, safety and ritual.  Obviously, that isn’t going to last, but it’s an idyllic sort of world for nerds, one where you can devote yourself to a higher purpose, abandon ambition, and be recognized solely for the worth of your mental work.  There are analogues between much of what monks do and what these guys do, and lots of the same sort of psychological motivation.

The Gripes

Actually, let’s take a moment to discuss the biggest failing of the book.  The vocabulary is tedious.  There’s a lot of vocabulary and world building going on here, and most of it is a waste, a distraction from the ideas and the characters.  Sure, it’s set in a faraway world and they have different words for different things.  But why?  In the end, there’s no real need for this story to take place on a different planet: if set here on Earth you’d have a history for free, you could reference the work and ideas of folks like Plato or Pythagoras directly and you’d only need to invent new words for concepts that are actually new.  Too much of the book is spent on giving alternate histories for ideas like Platonic ideals, too little on explaining the actual new ideas in the work.  Stephenson’s books are generally not great storehouses of characterization – they are a box of whizzy fireworks for your brain to set off.  That’s great – it’s fun.  But if that’s what you are going for, get to it.  The reader doesn’t benefit from learning that in this world the science monks are called “avout” rather than “devout” and their convents are called “concents”.  With so many analogues between the avout and the monasteries that we know, why not just use those words and explain the differences?  Stephenson’s path means he’s got to explain both the similar and the dissimilar, which draws the plot to a stop.  That’s why it takes a third of the book before our hero gets moving and the action starts forward.

The Push

Or rather, it shoots off like a rocket.  Once things start moving, they pulse on for 600 pages.  Ah, there you go.  That’s the rush you were waiting for.  Once it starts moving you’ve got dashing stories of survival, ninjas, instructional parables of math and geometry, explorations of Graham-Everett-Wheeler Cosmology, etc.  There’s a lot going on.  Like “Snow Crash” or “The Diamond Age”, Stephenson’s technique is to ramp up the book in a hyperbolic fashion.  Picture an asymptotic curve where there is a long flat head as the book builds the world and characters it needs, then a sudden rising motion when the real story begins to show. As you near the end, the drama, intensity and stakes have risen to staggering heights.  Unlike the previous books, this one actually seems to end.  With a real ending.  And there is resolution for the characters.  This is a pleasant surprise, given the past performance of Stephenson’s novels.

The Good

Once the action begins, it kicks hard and continuously.  Danger and excitement are ever-present, the nature of reality is challenged, exploded, put back together, and then smashed to bits again.

The Bad

You have to read 300 pages of setup.  This isn’t unpleasant; in fact these are some thoroughly written stories and they lay a great foundation for the rest of the book.  The vocabulary choice is also grating.

The Ugly Conclusion

I dug it, but I’m a fan of everything the guy’s written.  Some of the best bits of the book surface once you’ve completed it.  The length and complexity speak to the ideas of the Long Now, which apparently inspired the book.  The constant mapping between concepts and words of our culture and the book world brings to mind the Gödelian mappings that I finally began to understand in Doug Hofstadter’s “I Am a Strange Loop”.  Also, no, Enoch Root doesn’t appear in this book.