Category Archives: Python

Automated export of your goodreads library

Goodreads used to have an API but they stopped giving access and it looks like they are shutting it down. A real garbage move.

I like to be able to use my data that I put in so I wrote a script to automatically download my data regularly. Then I can do stuff like check to see if books I want are in the library or keep my own list or analytics, etc.

Here’s the Python script to export your Goodreads library; I hope it helps you. I’ll put it in the public domain.

Updated to add: I got tired of dealing with places that do garbage moves. I left Goodreads for BookWyrm and it’s better.

Week 2206

Politics

I donated money to the Alex Morse campaign, a progressive candidate who’s trying to unseat Richard Neal, a greedhead Democrat. That happened earlier, but recently it appears that there was a sex scandal accusation against Morse. He’s accused of having consensual sex with adult students at the university he teaches at that are not in his class and also messaging people he’s met on Tinder. Sexual harassment and consent are incredibly important, but weaponized accusations are exactly the sort of thing that conservatives have professed concern over. In Alex’s case, the investigation by the Intercept certainly makes it seem like people who want to work for Richard Neal have been manufacturing a scandal instead of uncovering one.

Other campaigns I’m looking closely at:

  • The State Slate – The great slate didn’t do great in 2018, but I still like these ideas and I’m willing to give again. These candidates are all good chances to flip a district and any campaigning they do is good for upballot races.
  • Donna Imam – an engineer who might be able to flip a Texas district.
  • Dani Brzozowski

Family

The fam out at the Esopus Creek trail

We’ve been doing more hikes again. I’m trying to make sure the little monsters leave the house every day. We’ve been going out to the village a little bit as well. I haul the kids in our expandable wagon and we can eat at an outside restaurant called The Partition.

We’re getting an eensy bit more social (in safe and measured ways).

ZZ had an extended encounter with a nice lady named Alexa and her dog Chacho. They spent an hour hanging out and I can’t recall having a nicer meal in ages. Here’s a pro tip – if you hang out with the children and amuse them while we have drinks and dinner, I’m grabbing your bill!

Beer Club had a mini executive retreat when Ray showed up in Rhinebeck! We took the Ho’s to the FallingWater trail where I finally got to meet Finley! He loves Max and Zelda loves him.

The Scotts dropped by! We took them out to Fallingwater as well, where Max and Ben got along really well and explored up the waterfall all the way to its source. Zelda is in love with Zoe and asks about her.

Max and Ben don’t usually play together, but for some reason this day was just perfect. Everyone got along famously.

DIY

Around the house, we’ve been struggling a little to knock out more projects. It just seemed like we lost steam. So we dug out the back yard next to our house and put in a bunch of marble rock chips over garden cloth. Now things are better looking and won’t require any weeding – instead of a dirt patch next to the house we have clean white stone which doesn’t need maintenance.

We cleaned out the trampoline, which had been under a mulberry tree, trying to become a mulberry jam strainer. Yecchh.

We spent a couple of meetings looking at adding solar panels to our roof – I really like it for a lot of reasons including my predilection for distributed systems over centralized ones. Sadly, the tradeoffs right now don’t seem worth it. Even with incredible financing and all sorts of incentives it would take forever to pay off the panels and require trimming trees.

This helped me feel like we can really start getting going again. I’m gonna finish that Patio!

Code and nerdery

Great news here! I’ve been thinking at work about ways to better handle and test documentation across multiple languages. The key here is to make sure that you can extract code samples from documentation and then push it out to a testable format.

I found mkcodes, an excellent tool for pulling code out of markdown documents. It worked great, but only for Python. I submitted a pull request extending it to work with multiple language code blocks and it was a real treat to work with Ryne Everett on getting this live. Which is to say, now it can handle java, dotnet, any other language you like that’s embedded in your docs. Expect more on this as I make progress building a docsite with eleventy.

I also type this on the linux laptop as I managed to resize partitions without destroying anything. I thought 80 gigs would be enough for my Ubuntu partition, but it seems to be growing and I had to give it a few hundred more gigabytes to grow.

One more thing to do after upgrading python

This is mainly a note-to-self to remind me for my next upgrade – but hope it helps you too. TL;DR – install python-dev tools, export old installs into a requirements file, import into new python site-packages.

I upgraded the python on my laptop from 3.6 to 3.7 so I could use dataclasses for a little project. Then I got on a plane to Zurich and planned to get some work done on the 9 hour flight. Unfortunately I spent most of that time wrangling my python install – little cli tools I like to use like black, glances, sphinx, cookiecutter, etc – none of them worked!

When you do a python pip install of a library, it puts the library in a directory called site-packages under lib/python3.<your-version>/site-packages/<your-package-name>. If the library has defined command line entry-points, you will find it has also installed a file under bin/<your-package-name>. If bin is in your PATH, it means you can just type something like cookiecutter and it will call a function in the library to do stuff for you! Here’s what one of those files looks like:

➜  ~ bat ~/.local/bin/cookiecutter
───────┬───────────────────────────────────────────────────────────────────────────────────────────────
       │ File: /home/matt/.local/bin/cookiecutter
───────┼───────────────────────────────────────────────────────────────────────────────────────────────
   1   │ #!/usr/bin/python3
   2   │
   3   │ # -*- coding: utf-8 -*-
   4   │ import re
   5   │ import sys
   6   │
   7   │ from cookiecutter.__main__ import main
   8   │
   9   │ if __name__ == "__main__":
  10   │     sys.argv[0] = re.sub(r"(-script\.pyw?|\.exe)?$", "", sys.argv[0])
  11   │     sys.exit(main())
───────┴───────────────────────────────────────────────────────────────────────────────────────────────
➜  ~

It tells the shell to execute it using /usr/bin/python3 – that’s a link to the latest version of python3 – in this case it’s now python3.7, while the library was installed under python3.6. So the command line now fails! Python3.7 doesn’t know about libs installed for 3.6 at all.
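To see which interpreter each of your launcher scripts is pinned to, here is a minimal sketch. The helper names shebang_interpreter and scripts_by_interpreter are made up for illustration, and it only assumes the scripts are plain text files starting with a #! line like the one above.

```python
from pathlib import Path


def shebang_interpreter(script_path):
    """Return the interpreter named on a script's #! line, or None."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


def scripts_by_interpreter(bin_dir):
    """Map each launcher script in bin_dir to the interpreter it asks for."""
    found = {}
    for path in sorted(Path(bin_dir).iterdir()):
        if path.is_file():
            interpreter = shebang_interpreter(path)
            if interpreter:
                found[path.name] = interpreter
    return found
```

Calling scripts_by_interpreter(Path.home() / ".local" / "bin") should list every tool whose first line points at /usr/bin/python3, which are the ones that break when that link moves to a new version.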

On the flight I realized I could either copy from the python3.6 site-packages to the python3.7 one or just switch python3 to point back to python3.6 to make things work. But once you have internet access, here’s the magic to get it all working.

First, install the python3.7 dev stuff so you can compile anything that needs compilation.

➜  ~ sudo apt-get install python3.7-dev

Then let’s export everything we used to use under 3.6 and then reinstall it under 3.7.

➜  ~ python3.6 -m pip freeze > old_requirements.txt
➜  ~ python3.7 -m pip install --user -r old_requirements.txt 
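If you would rather script those two steps for the next upgrade, here is a rough sketch that wraps the same pip commands. The function names migration_commands and migrate_packages are hypothetical, and it assumes both interpreters are on your PATH and each has pip available.

```python
import subprocess


def migration_commands(old_python, new_python, requirements_file="old_requirements.txt"):
    """Build the freeze and install commands without running them."""
    freeze = [old_python, "-m", "pip", "freeze"]
    install = [new_python, "-m", "pip", "install", "--user", "-r", requirements_file]
    return freeze, install


def migrate_packages(old_python, new_python, requirements_file="old_requirements.txt"):
    """Freeze the old interpreter's packages, then install them under the new one."""
    freeze, install = migration_commands(old_python, new_python, requirements_file)
    # Capture the freeze output and save it, like the shell redirect does.
    frozen = subprocess.run(freeze, check=True, capture_output=True, text=True)
    with open(requirements_file, "w") as handle:
        handle.write(frozen.stdout)
    subprocess.run(install, check=True)
```

Something like migrate_packages("python3.6", "python3.7") would then do the whole dance. It needs Python 3.7 or later to run, since capture_output and text were added to subprocess.run in 3.7.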

Week 3008

Welp, I’m getting this in a bit late, so I’ll also put in some Music Monday.

But since the world is a pit, you get Cotton Eye Joe Gregorian Chant Nightcore Hardcore Dubstep remix. Do better this week, world! The Senate confirmation hearings for Brett Kavanaugh were a reminder of how little women’s pain matters. The guy is also clearly lying.

Code

My team won one of the 2nd place slots in the Hackathon as “Most Complete Hack”! We are talking about bringing it through to complete prod deployment.

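Read up on Conflict-free Replicated Data Types (CRDTs), which are a really interesting way to do mergeable data structures: every replica can take writes on its own, and merging always converges to the same result. They underlie lots of interesting stuff like Redis.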
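To make the idea concrete, here is a small sketch of one of the simplest CRDTs, a grow-only counter. It is a textbook illustration rather than how any particular product implements it, and the class and replica names are made up.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Taking the max per replica makes merges safe to repeat and reorder.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)


laptop = GCounter("laptop")
phone = GCounter("phone")
laptop.increment(3)
phone.increment(2)
laptop.merge(phone)
phone.merge(laptop)
phone.merge(laptop)  # merging a second time changes nothing
print(laptop.value(), phone.value())  # 5 5
```

Each replica only ever bumps its own slot, so no merge order can lose an update.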

Experimented with Mozilla’s Configman – the docs do NOT do it justice, because it actually has a great drop-in replacement for the standard ArgumentParser in there.

Bike

Got in all but 1 day last week. Rode in the rain 2 days! The first convinced me that I need to either own rainproof pants or just plan on riding in shorts in the rain. I bought a pair of Showers Pass transit pants and they worked great on Friday. Also got to randomly ride in with my buddy Lance again and was reminded how nice that is.

Family

Zeebus is really smart and understands a lot of what we say! She picks things up and puts them where you ask, she understands that we need sleep, she’s got a lot going on upstairs.

She puts on Max’s bike helmet to let us know she wants to go riding!

When she slapped Sam hard in the face she realized she had hurt her – then Z gave her a big hug and patted her. That’s what Max does when he plays too rough with her!

Max and I did a really good hike together. We went a little crazy and climbed up next to a little waterfall and did some semi-safe bouldering nearby.  Max and I actually climbed way up a small cliff and got to the top. We also found a cool Puffball mushroom. But we didn’t eat it because we didn’t know how to tell if it was poisonous or not. (now we know and we’ll eat it next time!)

Week 3006

Work

We are having a hackathon! I’m excited. It’s my first since Music Hack Day NYC. We’re going to try out Amazon Lex, Lambda, and some containers.

Some folks are already using my code for working with ELK, so that’s nice.

I’ve been focusing on automating my testing and such, and I’m really centering on just learning to write better makefiles. Make is installed everywhere, like vim and other things I like. It just works. And makefiles do almost everything you want. The downside is the ugly syntax. The upside is that if you learn one ugly syntax, you don’t need to learn everything about Rake, Yarn, etc.

Around the web

Best Practices for Staging Environments – this article by the excellent Alice Goldfuss came up at work as we wrestle with big calcs and datasets for our clients.

Why you hate Contemporary Architecture – This is grrrreat. One little note for the computer nerds. If you’ve heard of the Gang of 4 Design Patterns book, read the article and come back. The Design Patterns book was based in part on Christopher Alexander’s “A Pattern Language” which is a great guide to things that seem to work in architecture.

I discovered two things that are similar to a POC I was working on (voracious-etl):

  • Datasette provides a readonly JSON api for any SQLite DB.
  • Dataset provides a simple dict-style layer over any SQL database, so tables read and write about as easily as a CSV or JSON file

The plugin system at the core of pytest is a library: Pluggy. I like that and I might use it earlier.

A good podcast: Flash Forward. Explores a new future every week. The most recent focuses on fungal enslavement, which I love. If you like that one, I highly recommend the mindblowing Parasite Rex by Carl Zimmer or Sensation by Nick Mamatas

 

 

Politics

None of the big progressive candidates made it in the Democratic primaries. Cuomo still won. Tish won, which is OK. However, I see that some progressive candidates made it through.

I hope they can do some real work and win in November. I’ll be calling and doing work to support them. I long for a day when I can start moaning about free speech vs hate speech and trying to rein in some liberal excesses. However, right now, the work has to get done.

I also note that children are still in cages and parents are getting deported without hearings, but it isn’t in the headlines anymore. I’m still pissed about it. I’m still pissed that the probable governor of NY seems to only work for progressive issues when pushed and won’t use his clemency powers.

Exercise

I biked around 37 miles this week! Had to take off Monday and Tuesday due to rain and being pretty sick. But otherwise, I rode in. Got to stop on Christie street and help a guy who’d been knocked down by a cab.

Worked to a slightly lower pistol squat and my butt hurts sooo bad. Also, my dragon flag work is getting better. I can kind of hold it.

I started doing partial handstand pushups against the wall and they feel pretty good. I can do 5 at a time part way down and up. Next I’ll try lower and lower, then try freestanding ones.

 

Dataclasses coming in Python 3.7

I’ve been loving my time writing in Python. I started with 3.4 I think, and every release has brought something new and useful to the table. All the speed and async improvements are great, but the thing that I loved most in Python 3.6 was the new f-string formatting. Removing boilerplate and providing the simplest, easiest path just makes every task easier. Less code on a page means fewer places to make mistakes. So it’s much better to see simple code than complex code.


foo = 'bar'
# this is so clear and direct
message = f'Meet me at the {foo}'
# versus
message = 'Meet me at the {location}'.format(location=foo)

In 3.7, I’m excited about dataclasses. It’s like the attrs library – just a simple place to store data where you don’t have to re-implement all the standard dunder methods (__init__, __repr__, __eq__ etc). Adding a dataclass decorator and a list of the fields gives you a class with a standard constructor and all the other bells and whistles. The more you can use the standard library to accomplish high level concepts without having to type more code and write more bugs, the better. It’s coming in Python 3.7, but you can use dataclasses today with Python 3.6 using this backport on github – totally same functionality.
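Here is a small sketch of what that looks like in practice. The Book class and its fields are invented for illustration; only the dataclasses module itself comes from the standard library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Book:
    title: str
    author: str
    rating: int = 0
    # Mutable defaults need default_factory so instances don't share one list.
    shelves: List[str] = field(default_factory=list)


first = Book("Parasite Rex", "Carl Zimmer", rating=5)
second = Book("Parasite Rex", "Carl Zimmer", rating=5)

print(first)            # Book(title='Parasite Rex', author='Carl Zimmer', rating=5, shelves=[])
print(first == second)  # True
```

The constructor, the readable repr and the field-by-field equality all come from the decorator, with no hand-written dunder methods.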

I’ve been playing around with using them here.

Backing up a SalesForce instance

SalesForce is an interesting beast. You gotta work within its limits, and it is great within them. As soon as you want to venture outside of the normal flow, things get complicated.

Data gets sucked into SalesForce, but never out – it’s designed as a lobster trap for your information.

Weirdly, there’s not much on the SalesForce AppExchange that helps you easily back up your data on site. There are some tools that help you easily back up to another cloud, but little that helps you get your data back within your own walls.

Still, there’s a little layer over the SalesForce API in Python called simple-salesforce. Here’s a quick script I threw together to help put all your data into csv files.
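As a rough sketch of the CSV-dumping half of such a backup, under stated assumptions: client is anything with a query_all(soql) method that returns a dict with a "records" list, which is the shape simple-salesforce's query_all gives back. The function names, object name and field list are illustrative, not taken from the original script.

```python
import csv
from pathlib import Path


def records_to_csv(records, csv_path):
    """Write a list of record dicts to csv_path; return how many rows were written."""
    # Each REST record carries an "attributes" metadata entry; drop it.
    rows = [{k: v for k, v in record.items() if k != "attributes"} for record in records]
    if not rows:
        return 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def backup_object(client, object_name, field_names, out_dir):
    """Query every row of one object and save it as <out_dir>/<object_name>.csv."""
    soql = "SELECT {} FROM {}".format(", ".join(field_names), object_name)
    result = client.query_all(soql)
    out_path = Path(out_dir) / f"{object_name}.csv"
    return records_to_csv(result["records"], out_path)
```

With simple-salesforce installed, client would be a Salesforce(username=..., password=..., security_token=...) instance, and you would call backup_object once per object you care about. Relationship fields come back as nested dicts, so a real backup would want to flatten those first.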

I love PETL

When I started at my current job I noticed we had lots of room for improvement in how we imported and exported data. Folks had been using the Microsoft SSIS platform to Extract, Transform and Load data in and out of our database to various files.

SSIS is great for lots of things and has a lot of upsides. It is very drag and drop, folks don’t have to know a lot of programming to get it to do things, and it has lots of functions built in. If you need more programming power, you can execute C# or VB scripts to do the fiddly bits.

But I hate it. (Don’t worry, we’ll get to the love soon.)

My biggest problems with SSIS:

  • It is unversionable. Try reading a git diff of an SSIS change. The xml is designed for a machine to read, not a human. If you want to know what has changed over time in your world, it’s a problem.
  • You can only use Visual Studio to edit it. Many of our SSIS packages include VB or C# scripts. That sounds fine – but apparently these compile to an undiffable, uneditable blob in the xml that is only recompiled if you save using visual studio. So if you want to change something across many SSIS package scripts, you have to open and resave each one.
  • It hides options under rocks. Finding out how something works requires lots of delving into lotsa windows and dialogues.
  • It changes things unexpectedly. Click in the wrong dialogue and it helpfully re-infers datatypes from a file for you. You don’t know until you go to execute.
  • It slapped my momma. Etc.

I wanted to move my team to something that was better for people.

We need something:

  • That we can diff
  • That we can do code reviews and pull requests on
  • That is simple, expressive and clear.
  • That is powerful.

To me that sounds like a programming language. I encouraged folks on the team to try Python for a couple of tasks that would otherwise have used an SSIS package. Immediately, things got better. Our code reviews made sense. Code quality improved with every single pull request.

We used pymssql to connect to SQL Server and inserted records as needed after processing them. Navigating and transforming XML docs was easy, and CSV files were eaten up by the standard library’s csv.DictReader.

And then Derrick found PETL. It’s beautiful. You point it at data and make simple moves to completely transform it. I’m smitten.

I had dozens of files to read from, each a quarterly file for a year – only noted in the file name. Each had a crappy heading line that preceded column headers. I needed to put them into 1 file for loading into SalesForce Wave. Whacking together a solution with PETL was effortless. Line 36 is where the PETL starts, and it’s so small and good that it is nice to see how much it encapsulates.
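For a feel of the task itself, here is a sketch using only the standard library's csv module. To be clear, this is not PETL and not the original script, and PETL does the same job in far fewer lines. The file naming scheme in NAME_PATTERN is an assumption, as are the function names.

```python
import csv
import re
from pathlib import Path

# Hypothetical naming scheme: the year and quarter appear only in the file
# name, e.g. "report_2016_Q3.csv".
NAME_PATTERN = re.compile(r"(?P<year>\d{4})_Q(?P<quarter>[1-4])")


def read_quarterly_file(path):
    """Yield the rows of one file as dicts, tagged with year and quarter."""
    match = NAME_PATTERN.search(Path(path).name)
    if match is None:
        raise ValueError(f"no year/quarter in file name: {path}")
    with open(path, newline="", encoding="utf-8") as handle:
        next(handle, None)  # skip the junk heading line above the column headers
        for row in csv.DictReader(handle):
            row["year"] = match["year"]
            row["quarter"] = match["quarter"]
            yield row


def combine_quarterly_files(paths, out_path):
    """Concatenate all files into one CSV; return the number of data rows."""
    rows = [row for path in sorted(paths) for row in read_quarterly_file(path)]
    if not rows:
        return 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

This assumes every file shares the same column headers; files with differing columns would need the field names unioned first.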

How to migrate your WordPress Blog between hosts.

My boss Mike needed to move his wine review blog from a friend’s hosting on lunarpages. I suggested he try dreamhost and he liked it – in a few minutes he had signed up for a free trial and used their 1-click install to set up a new install of WordPress.

Before he moved his domain to point from lunarpages to dreamhost I got him to prep by writing down a few important pieces of info. I’m trying to make sure I make this easier for other friends like I did when I helped Tove’s Thread For Thought move from WordPress.org to her own host.

Things to do before you change your domain to point to your new hosting

      Write down the name of your THEME. If you want to use the same theme, it’s important to write this down before you make the switch.
      Export your blog content from WordPress.
      Download your images. The WordPress export guide pretends this is easy, but it isn’t. If you are using the same domain name, I’m not sure what the easy way to do this is.

How to download your images

I wrote a python script that does this for you.
Make sure your system supports Python. Next, install BeautifulSoup – a great HTML parser for Python.
Once that’s done, download this little script and change home and filesUrl to be your domain name.
Run the script; it should crawl your domain and download all of the images you host. Now follow the same steps of editing your export if needed and upload it all into your new blog at your new domain.
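The heart of a script like that is pulling image URLs and further links out of each page. Here is a minimal sketch of that step. It swaps in the standard library's html.parser for BeautifulSoup so it runs without installing anything, and the names LinkAndImageParser, hosted_images and files_url (standing in for the script's filesUrl) are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndImageParser(HTMLParser):
    """Collect absolute page links and image URLs from one HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = set()
        self.images = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.add(urljoin(self.page_url, attributes["href"]))
        elif tag == "img" and attributes.get("src"):
            self.images.add(urljoin(self.page_url, attributes["src"]))


def hosted_images(html, page_url, files_url):
    """Return (image URLs under files_url, all links found on the page)."""
    parser = LinkAndImageParser(page_url)
    parser.feed(html)
    images = {url for url in parser.images if url.startswith(files_url)}
    return images, parser.links
```

A full crawler would fetch each page under home, feed the HTML through hosted_images, save every image it returns, and queue any returned links that still start with home.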

Hope that helps!

Project Idea – Syncing ebook reader

The bookworm logo

Here’s the setup. O’Reilly hosts a Django-based open source ebook reading website called bookworm. You can run bookworm on your own server. I opened a ticket on bookworm’s bugtracker to provide an API method to update where you are in a book. Next you update Aldiko (not open source, but perhaps we can write a plugin for it) and FbreaderJ to call that method when they exit, recording where you stopped reading.

Upshot: You open a book on your phone and read it. It syncs with your server with a bookmark of where you stopped reading. Then you go to your website, and begin reading from where you left off. And so on. Perhaps your phone also detects when it gets a new ebook and uploads that to your server, or downloads a new book from your server when one shows up.