I recently switched from a dedicated Windows 7 PC for my home server to a Mac Mini, mostly for the lower electricity consumption and the fact that the PC was having nightly bluescreen crashes and restarting.
I have always been a fan of RDP, and my office uses PCs - so to keep the convenient RDP access to home, I installed VirtualBox and created a Windows VM. This has a bridged network adapter, so it just looks like another computer on my home network.
However, when Mac OS restarts, or after a power failure, the virtual machine is powered off. This won’t do.
Daemons in Mac OS
Mac OS has the usual suspects like cron, but has a neat daemon launching system, appropriately called launchd, introduced in 10.4.
Launchd works by “loading” (think of it like a soft install) objects called plists. A plist (property list) is a serialized object format, typically written as XML, that tells launchd how to execute a particular daemon.
If you want to play with creating your own plists, head over to http://launched.zerowidth.com, where Nathan Witmer has created a plist generator.
Automatic Tasks in VirtualBox
VirtualBox comes with a command-line interface to automate tasks on VMs. My need is simple - just boot the box:
This follows the syntax for VBoxHeadless:
VRDE is the VirtualBox Remote Desktop Extension, which allows RDP connections to the VM out of the box. It ships as part of Oracle’s VirtualBox Extension Pack.
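A minimal sketch of what that boot step could look like when driven from Python. The VM name "Windows Server" is a placeholder rather than the name used in this setup, and `--vrde on` assumes the Extension Pack is installed.

```python
import shlex
import subprocess


def headless_command(vm_name, vrde=True):
    """Build the argv list for starting a VM without a GUI window."""
    # --startvm takes a VM name or UUID; --vrde accepts on, off, or config.
    return ["VBoxHeadless", "--startvm", vm_name, "--vrde", "on" if vrde else "off"]


def start_vm(vm_name, vrde=True):
    """Launch the VM; this call blocks for as long as the VM keeps running."""
    return subprocess.call(headless_command(vm_name, vrde))


if __name__ == "__main__":
    # "Windows Server" is a hypothetical VM name used only for illustration.
    print(shlex.join(headless_command("Windows Server")))
```

Keeping the command as a list avoids quoting problems when the VM name contains spaces.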
Booting my VM at Login
launchd has multiple “runlevels” - there are System level daemons, and daemons for whenever a given user logs in. User daemons are stored at ~/Library/LaunchAgents/.
With the help of the launched tool, I made a plist for my command:
Notable options in here:
KeepAlive = True - If the VBoxHeadless process crashes for some reason, or someone manually shuts down the box (perhaps from within RDP), the box restarts itself.
RunAtLoad = True - Run this daemon when the daemon is loaded. “Loading” occurs at login, and when a user manually loads the daemon.
WorkingDirectory & UserName - These are special directives needed because of the peculiar way that VirtualBox runs. Set WorkingDirectory to the home folder of the user running the daemon, and UserName to that user’s account name.
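For readers who would rather generate the file than write it by hand, here is a hedged sketch that builds a plist with those options using Python's standard `plistlib`. The label, the `/usr/local/bin/VBoxHeadless` path, the VM name, and the user details are all assumptions for illustration, not values taken from the original plist.

```python
import plistlib


def build_launch_agent(label, vm_name, user, home):
    """Return XML plist bytes describing a launchd job that boots a VM."""
    job = {
        "Label": label,
        # Assumed install path for VBoxHeadless; adjust to match your system.
        "ProgramArguments": ["/usr/local/bin/VBoxHeadless",
                             "--startvm", vm_name, "--vrde", "on"],
        "KeepAlive": True,   # restart the process if it exits
        "RunAtLoad": True,   # start as soon as the job is loaded
        "WorkingDirectory": home,
        "UserName": user,
    }
    return plistlib.dumps(job)  # XML format by default


if __name__ == "__main__":
    # All four arguments below are hypothetical example values.
    xml_bytes = build_launch_agent("com.example.vboxheadless",
                                   "Windows Server", "me", "/Users/me")
    print(xml_bytes.decode("utf-8"))
```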
This XML is saved into a file in /Library/LaunchAgents. Navigate to that directory and load the file with launchctl.
launchctl is the program that Mac OS uses to control launchd processes. Once the plist has been loaded, it should persist after reboot.
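A small sketch of the loading step, wrapped in Python for consistency with the other examples here. The plist file name is a made-up placeholder; `launchctl load` is the subcommand assumed for registering the job.

```python
import os
import subprocess


def launchctl_load_command(plist_path):
    """Build the argv list that asks launchd to load a job definition."""
    return ["launchctl", "load", os.path.expanduser(plist_path)]


def load_agent(plist_path):
    """Run launchctl and raise CalledProcessError if it reports failure."""
    subprocess.run(launchctl_load_command(plist_path), check=True)


if __name__ == "__main__":
    # Hypothetical file name; use whatever name the plist was saved under.
    print(launchctl_load_command("/Library/LaunchAgents/com.example.vboxheadless.plist"))
```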
A similar plist can be used for the command ‘vagrant up’ to launch Vagrant VMs.
I switched from Octopress to Jekyll. I find that it is cleaner, easier to understand, and doesn’t require me to puzzle through Git branches and remotes.
There’s a great package manager for Mac called Homebrew that acts sort of like Aptitude on Ubuntu; it makes it really easy to quickly install software for your Mac from the command line.
The Casks platform is built on top of Homebrew - it allows you to install your commonly-used desktop apps for Mac. This makes writing a setup script for when you reformat easy.
It also makes Casks a great place to explore to find new Mac software you didn’t know existed. Unfortunately, the most you can see from the Casks master list on GitHub is the name of the software - you have to go into each .rb file to find the URL for the software’s website.
I wrote a quick and dirty script to go through all of these Casks and make a text file that I can put in Excel to open all the websites at once.
First, we download all the files in the GitHub repo to our target folder, then we run:
Now we have casks.txt, ready to import into Excel to easily open all the links within and browse for new Mac apps.
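A script along these lines could produce casks.txt. This is a sketch, not the original quick-and-dirty script: it assumes each cask file contains a line of the form `homepage 'https://...'` and writes one tab-separated name and URL per line.

```python
import csv
import re
from pathlib import Path

# Matches a cask's homepage line, with either single or double quotes.
HOMEPAGE_RE = re.compile(r"""^\s*homepage\s+['"]([^'"]+)['"]""", re.MULTILINE)


def collect_homepages(casks_dir):
    """Return (cask name, homepage URL) pairs for every .rb file in a folder."""
    rows = []
    for cask_file in sorted(Path(casks_dir).glob("*.rb")):
        match = HOMEPAGE_RE.search(cask_file.read_text(encoding="utf-8"))
        if match:
            rows.append((cask_file.stem, match.group(1)))
    return rows


def write_casks_txt(rows, out_path="casks.txt"):
    """Write the pairs as tab-separated text that Excel can import."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)


if __name__ == "__main__":
    # "Casks" is an assumed name for the folder holding the downloaded files.
    write_casks_txt(collect_homepages("Casks"))
```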
An enterprising fellow from Milwaukee built a site to automatically pay parking tickets in his city and in Madison. Unfortunately, the city of Madison changed around their ticket-payment website, so the site is not working. I thought it would be a good programming challenge for me.
How Do You Work with Web Services When They Have No API?
Since the City of Madison doesn’t have an API on their website, it’s not a simple matter of sending a query and getting back a nicely-formatted JSON response to use in my app. So how can I get the information that I need from the website?
Fortunately, all websites have at least one available point of entry - via the web, as a user! There’s a Python package to emulate user behavior in a browser and return the HTTP response, which we can further manipulate as we want.
This is Mechanize, originally developed for Perl. I install the Python package and make a script to go to the City of Madison’s parking ticket payment site at https://www.cityofmadison.com/epayment/parkingTicket/. I based this on an example at http://stockrt.github.io/p/emulating-a-browser-in-python-with-mechanize/.
This returns the HTML of the city’s parking ticket page as one big string.
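A minimal sketch of that first request, assuming the third-party `mechanize` package is installed. The helper names are invented for this example, and the import is deferred so the fetch function can also be exercised with any object that has an `open()` method.

```python
URL = "https://www.cityofmadison.com/epayment/parkingTicket/"


def make_browser():
    """Create a stateful Mechanize browser (requires: pip install mechanize)."""
    import mechanize  # imported lazily so the rest of the module loads without it
    return mechanize.Browser()


def fetch_html(browser, url=URL):
    """Open a page in the browser and return its raw HTML."""
    response = browser.open(url)
    return response.read()


if __name__ == "__main__":
    page = fetch_html(make_browser())
    print(page[:200])
```

Under Python 3 the returned HTML is a bytes object, so decode it before treating it as text.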
Searching for Tickets
In order to search for a parking ticket, it looks like I need to first accept the terms and continue.
Since Mechanize is ‘stateful’ - it acts like a browser tab that a user would have open - I can submit this form and continue working with my browser class.
Before we do that, here are some other things you can do with Mechanize:
Back to the main objective. We need to first:
Find the form on the page with the checkbox
Select the checkbox
Submit the form to continue
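The three steps above can be sketched with Mechanize's form API. The form and checkbox names passed in are placeholders; the real names have to be read from the page's HTML.

```python
def accept_terms(browser, form_name, checkbox_name):
    """Tick a checkbox on a named form and submit it.

    `browser` is expected to behave like a mechanize.Browser that has
    already opened the page containing the form.
    """
    # 1. Find the form on the page.
    browser.select_form(name=form_name)
    # 2. Select the checkbox (the first item of the checkbox control).
    checkbox = browser.form.find_control(checkbox_name)
    checkbox.items[0].selected = True
    # 3. Submit the form; the browser keeps its cookies and current page.
    return browser.submit()


# Hypothetical usage; "termsForm" and "acceptTerms" are invented names:
# response = accept_terms(browser, "termsForm", "acceptTerms")
```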
In order to do this, I needed to find the name of the form and the checkbox control. I used Chrome Developer tools to inspect it:
So we follow that action in our simulated browser and are brought to the Search page:
Great news! This is where we can enter our first dynamic input - a license plate to test. I do a similar inspection and find the names of the form and controls I need to manipulate:
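Filling in the search form might look like the sketch below. The form and field names are assumptions, since they depend on the markup the city serves, and the real form may require more fields than the plate alone.

```python
def search_plate(browser, form_name, plate_field, plate):
    """Submit a license plate through the search form and return the result HTML.

    `browser` is expected to behave like a mechanize.Browser that is
    currently on the search page.
    """
    browser.select_form(name=form_name)
    # Text inputs are set by assigning to the form like a dictionary.
    browser.form[plate_field] = plate
    response = browser.submit()
    return response.read()


# Hypothetical usage; the form name, field name, and plate are invented:
# html = search_plate(browser, "searchForm", "plate", "ABC123")
```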
Here’s what the HTML of the search returns for a plate with existing tickets:
Manipulating the HTML Search Results to Work With Them in Python
We need to take the HTML output and parse the page. The problem is that the output is not always regular. We need something like regular expressions that will let us recognize certain, sometimes repeated parts of the page.
The LXML library is a good starting place to process XML and HTML in Python.
We’ll also take advantage of XPATH, which is like regex for XML:
Let’s inspect the HTML of the search results to find what is regular about these ticket results.
I’ve skipped over the header of the page and gone right to the meat of where the tickets are:
We see that each discrete piece of data about each ticket is contained in a table cell <td>, and that each of these is the child of a <tr> that stands for each ticket. To get each ticket, we’ll figure out what is common to those <tr>s.
Each is the direct child of a form. Since there’s only one form on the page, we can start there. Note that XPATH requires some strict syntax and won’t necessarily follow the design of your element nesting here.
Here’s the XPATH syntax that finally worked to return each ticket and nothing else:
Here’s how we read this, from right to left:
//text()
Two slashes: Select ALL of
text() - Select the text inside this element (as opposed to the class, or the href, or another attribute)
//td - Select ALL <td> elements (even if they’re not the direct child)
/tr[position()>2] - This is the complicated one. Select <tr>s that are the direct children of the parent (one slash). Only select <tr>s that are the 3rd or later <tr> child of their parent. This is to account for these two <tr>s in our page that come before the tickets:
//form/table - Select tables that are the direct children of ALL forms on the page.
Whew.
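Putting the pieces of that walkthrough back together from left to right gives an expression like the one in this sketch. It is reconstructed from the description above, so treat it as an approximation, and the sample page is an invented stand-in for the real search results. It assumes the third-party `lxml` package is installed.

```python
# Reassembled from the right-to-left explanation of each step.
TICKET_TEXT_XPATH = "//form/table/tr[position()>2]//td//text()"

# Invented stand-in page: two leading rows, then one row per ticket.
SAMPLE_PAGE = (
    "<html><body><form><table>"
    "<tr><td>Search results</td></tr>"
    "<tr><td>Ticket</td><td>Amount</td></tr>"
    "<tr><td>T1</td><td>$20.00</td></tr>"
    "<tr><td>T2</td><td>$35.00</td></tr>"
    "</table></form></body></html>"
)


def ticket_cell_text(page_html):
    """Return the text of every <td> in the ticket rows, in document order."""
    from lxml import html  # third-party: pip install lxml
    tree = html.fromstring(page_html)
    return tree.xpath(TICKET_TEXT_XPATH)


if __name__ == "__main__":
    print(ticket_cell_text(SAMPLE_PAGE))
```

The result is one flat list of cell text, so the rows still need to be regrouped per ticket afterwards.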
This gives us (semi) nice lists like:
There’s a lot of garbage in there, like carriage returns, tabs, and spaces. Additionally, we’re only interested in some of the list elements:
Apart from the neatly printed output, we also now have a dictionary of tickets. We’ve achieved the goal of a fake ‘API’ for searching for parking tickets.
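The cleanup could be sketched as follows. The column positions and field names are hypothetical, since the layout of the real results is not shown here; the idea is to collapse whitespace, drop empty cells, and keep only the positions of interest.

```python
def clean_cells(raw_cells):
    """Collapse carriage returns, tabs, and runs of spaces; drop empty cells."""
    cleaned = (" ".join(cell.split()) for cell in raw_cells)
    return [cell for cell in cleaned if cell]


def to_ticket(cells, wanted):
    """Pick the cells of interest by index and label them."""
    return {key: cells[index] for key, index in wanted.items()}


if __name__ == "__main__":
    # Invented raw output for a single ticket row.
    raw = ["\r\n\t A123456 ", "\r\n", " 01/02/2014\t", "  $20.00 \r\n"]
    # Hypothetical mapping of field names to positions in the cleaned list.
    wanted = {"number": 0, "issued": 1, "amount": 2}

    ticket = to_ticket(clean_cells(raw), wanted)
    tickets = {ticket["number"]: ticket}  # dictionary of tickets keyed by number
    print(tickets)
```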
Part 2: Getting the Python EWS Client to Send an Email
Now that I’ve got a client connected to the Exchange server, I can actually use the SOAP API methods as documented in the WSDL and in Microsoft’s documentation.
Suds has great built-in methods and classes for working with SOAP, but as this post confirms, bugs in both Suds and EWS mean that I’ll have to manually build the XML and inject it directly into the message.
So far, I have code that will connect a Suds client to my exchange server and send an XML message:
Now, we just need to tell it what to send in that message.
Building the XML Message
There are all kinds of things you can talk to EWS about, but I just want to talk about emails.
First, I wanted to place a copy in my SentItems folder, so I added
In my first iteration, I had the body sent as text but I wanted to replace it with an HTML email template to make it prettier. This was tough for me to figure out; I kept getting XML validation errors. Eventually, I learned about CDATA, a section marker that tells the XML parser to treat whatever’s inside it as literal text rather than markup.
I replaced the Body tag with:
I’ll replace #Body# with some HTML from an email template later.
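To see why CDATA gets rid of the validation errors, here is a small self-contained sketch using Python's standard XML parser. The bare `Body` element is a simplified stand-in for the real EWS element, which is namespaced, and the HTML snippet is invented.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(html_body):
    """Wrap HTML in a CDATA section so the XML parser reads it as plain text."""
    # The only sequence that cannot appear inside CDATA is its own terminator.
    if "]]>" in html_body:
        raise ValueError("CDATA content must not contain ']]>'")
    return "<![CDATA[" + html_body + "]]>"


if __name__ == "__main__":
    html_body = "<p>Pay <b>now</b> & save</p>"  # bare '&' and tags are fine here
    body_xml = '<Body BodyType="HTML">' + wrap_cdata(html_body) + "</Body>"
    parsed = ET.fromstring(body_xml)
    print(parsed.text)  # the HTML comes back as text, not as child elements
```

Without the wrapper, the same string would fail to parse because of the unescaped ampersand.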
Lastly, I put in some information about an automatic Reminder to get the final message:
Each of those #Variable# tags I put in the XML will be string-replaced on the Python side.
Putting Variables into the Message
I wrote a quick little function to take a list of tuples and a template text and replace the text in the template:
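A function of that shape might look like this sketch. Only #Body# appears in the message above; the other placeholder names and values are invented for illustration.

```python
def fill_template(template, replacements):
    """Replace each placeholder in the template with its value, in order."""
    for placeholder, value in replacements:
        template = template.replace(placeholder, value)
    return template


if __name__ == "__main__":
    # Hypothetical template fragment and values.
    template = "<Subject>#Subject#</Subject><Body><![CDATA[#Body#]]></Body>"
    message = fill_template(template, [
        ("#Subject#", "Parking reminder"),
        ("#Body#", "<p>You have a ticket due.</p>"),
    ])
    print(message)
```

Values substituted outside a CDATA section still need XML escaping (for example with `xml.sax.saxutils.escape`), since plain string replacement does none.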
Now that I have XML as a big string, I can send my message: