A How-To on Setting Back Diversity, or, “we hired women to bring you beer”
Posted: 2012-03-21 00:18

What a horrible month it has been for diversity in technology. March started off with an awful article on Brogrammers, and today the Boston API Jam, hosted by Sqoot of New York, took their turn at ruining things. If you thought the "brogrammer" article and everyone involved with it was beyond stupid, check out what the API Jam was promoting: the details of their event.

As ReadWriteWeb noted, they weren't short on apologies, but it's too little, too late. Rather than tweet a weak apology after the fact, try thinking ahead of time about promoting an environment of equality. I would certainly hope they learn from their mistake and begin to take an active stance towards diversity, but given some of their responses on Twitter, I doubt that will happen.

As word got around about the event's original description, which included "Need another beer? Let one of our friendly (female) event staff get that for you," many of their early responses indicated they didn't really care. Responding that the text was just a little humor shows them missing the point early on. Plus, it wasn't even funny. Shortly after that, they responded with "boom" to back up a commenter who marginalized the female role in a technology event as "a perk". A few hours went by, and then the stream of "we're sorry" messages started, sent to seemingly anyone who mentioned them in relation to this blunder.

The message links to an "apology" letter which includes:

While we thought this was a fun, harmless comment poking fun at the fact that hack-a-thons are typically male-dominated, others were offended. That was not our intention and thus we changed it.

I'm not sure why poking fun at the men who attend these events needs to objectify women, but maybe that's what makes it "fun" for them? The worst part of this is the "others were offended" piece. They still don't acknowledge that it's not just that their words are wrong, but that their views are damaging to the community. The message effectively says, "We think objectifying women is fun and harmless, but some of you were offended." It's an unacceptable position to take, and the great news is that they've lost sponsorship because of it.

Apigee pulled their sponsorship because the API Jam's message wasn't consistent with their values. Heroku did the same. CloudMine went on to write a post of their own about not only their withdrawal of sponsorship, but their feelings on sexism in tech. Good on all of these organizations for removing their support of this event.

The next time you plan an event like this or do any sort of outreach, please think with diversity in mind. It's mind-boggling how far behind the times people in technology can be. There shouldn't have to be a women's tech suffrage, but as long as events like the API Jam promote the idea that women aren't first-class citizens at their event, diversity in technology will keep taking one step forward, two steps back.

minidumper - Python crash dumps on Windows
Posted: 2011-09-29 08:04

If you're writing software on Windows, you've likely come across minidumps. They're a great help when your project encounters a crashing scenario, as they record varying levels of information to help you reproduce the problem.

The main product I work on at my day job, a server written in C++, has had minidump functionality since the beginning. We keep PDBs around for our releases, then when customers encounter a crash, we grab the minidump, match it up with the binaries and PDBs, then try to figure out what the scenario was. I think that's fairly standard operating procedure, and it tends to work alright. Release crash dumps are obviously less helpful than debug dumps, but you can still get enough out of them to get started in the right direction. So while one part of my job has that, the other part - the Python part - has had me wishing for it. So I wrote it.

The extension modules I maintain internally for our server's APIs occasionally come crashing down during our test automation. That's fairly alarming at first since the tests just drop out and you don't get much of an indication of why. Was it the extension? The underlying C++ API? Python itself? The unittest logs are all we have to go off of, so then it's a matter of piecing together what was happening at the time, then manually re-running it from the REPL and/or attaching the Visual Studio debugger to catch the problem.

In comes minidumper. By importing minidumper and enabling it, you can receive crash dump files whenever your Python process goes down. It's there for you.

    import minidumper
    minidumper.enable()

Now if you do some crazy stuff and cause a crash in your extension code...

    int x = 1;
    int b = x / 0;

...you'll get a crash dump that will tell you exactly what just happened. In my case, I got example_20110929-071529.mdmp. Now if you open that up in Visual Studio, ideally the one that Python was compiled with, you'll get a look into what happened once you hit F5 (or Debug > Start Debugging).


The first thing you'll see is a popup telling you what the problem was and where it occurred, then Visual Studio will show you exactly where in the code the issue lies. As we all know, division by zero is a no-no, and it crashed. If you hit the break button, you can poke around in a ton of information that was gathered from your crashed process. Depending on what value you gave to the type parameter of minidumper.enable(type=...), which defaults to MiniDumpNormal (the full list of options is in the MSDN documentation for the MINIDUMP_TYPE enumeration), you'll have different amounts of information.
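
To make the "different amounts of information" point concrete, here is a minimal sketch of how the Win32 MINIDUMP_TYPE values behave as bit flags. The numeric values are the ones from the Windows SDK; the MiniDumpType class itself is a hypothetical stand-in for illustration, and whether minidumper exposes these constants under the same names is an assumption.

```python
import enum


class MiniDumpType(enum.IntFlag):
    """A few MINIDUMP_TYPE values from the Windows SDK (illustrative subset)."""
    MiniDumpNormal = 0x00000000          # stacks and basic module info only
    MiniDumpWithDataSegs = 0x00000001    # adds data segments (global variables)
    MiniDumpWithFullMemory = 0x00000002  # adds all accessible process memory
    MiniDumpWithHandleData = 0x00000004  # adds OS handle information


# Flags combine with bitwise OR; more flags generally means a bigger dump.
detailed = MiniDumpType.MiniDumpWithDataSegs | MiniDumpType.MiniDumpWithHandleData
print(int(detailed))
```

MiniDumpNormal is zero, so OR-ing it with anything else changes nothing; it is simply the baseline.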


You can walk around the call stack and see what functions were called with what values, and from there you can inspect variables within a function by hovering over them with the cursor. The Debug > Windows menu contains a whole bunch of other pieces of information, including memory, disassembly, value watching, and more.

As far as examples and tests go, I only have some of the basics down, although I plan on bulking those areas up and coming up with more useful and interesting code to prove this extension's worth. I just threw the source up on https://bitbucket.org/briancurtin/minidumper, but I'm going to wait on getting it on PyPI until I figure out the best way to organize and distribute it.

If you're looking for more info on minidumps, http://www.debuginfo.com/articles/effminidumps.html and http://www.codeproject.com/KB/debug/postmortemdebug_standalone1.aspx were helpful sites, as well as the various MSDN documentation.

The following setup steps are what I do to get started, using the CPython default branch, aka, CPython 3.3. Also note that I'm using a debug-built Python, and telling the minidumper extension to do a debug build as it's what I usually use at work, as well as when I'm working on CPython.

  1. hg clone https://YOURNAMEHERE@bitbucket.org/briancurtin/minidumper minidumper-dev
  2. C:\python-dev\cpython-main\PCbuild\python_d.exe setup.py build --debug install
  3. C:\python-dev\cpython-main\PCbuild\python_d.exe -m tests

Running the tests will build a tester extension, which contains two crashing functions. Right now, the few tests just call the crash functions with different minidumper.enable settings in order to make sure the right dumps are being created in the right places.

Hope it helps.

Note: Until I fix http://bugs.python.org/issue11732, the crash windows asking you to debug or close the program will stay around until you click something. Ideally I'll be able to add functionality to temporarily disable Windows Error Reporting for the faulthandler module (http://docs.python.org/dev/library/faulthandler.html), which currently requires manual intervention while running the CPython test suite on Windows, just as minidumper does.
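
For readers who haven't used faulthandler, this is a minimal sketch of turning it on so that a fatal error also leaves a Python-level traceback behind. It is separate from minidumper and doesn't touch Windows Error Reporting; the log file name and location are assumptions picked for the example.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "python_crash.log")

# Keep the file object alive for as long as faulthandler is enabled,
# since it writes to the underlying file descriptor when a crash happens.
crash_log = open(log_path, "w")
faulthandler.enable(file=crash_log, all_threads=True)

print(faulthandler.is_enabled())
```

The traceback shows where the Python code was at the time of the crash, which pairs well with the native call stack you get out of a minidump.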

PyCon 2011 CPython Sprint Newcomers
Posted: 2011-03-16 15:35

Following up two tutorial and summit days, then three days of the conference, the sprints got off to a great start on Sunday evening. I'm back at home now but wanted to put together a summary of the first two days.

A lot of great projects got up on stage to pitch their sprint ideas, including Brett Cannon speaking for CPython, letting people know where the sprint would be, mentioning the "dev-in-a-box" CDs, and encouraging people to come out and hack. Within 15 minutes of the end of announcements, we had 7 first-time sprinters eager to dive in and get going right away.

The new developer guide was instrumental in getting everyone through the initial setup. The plan was to get a Mercurial checkout and coverage.py as a starting point, as one of the suggested sprint targets was increasing test coverage. By 6:30 on the first day, we were up to 9 people fully up and running, poring over the coverage results (which were handily pre-generated on the "dev-in-a-box" CD), and diving into code. Here's what everyone worked on:

  • Alicia Arlen started tackling the expansion of string tests and got a patch written and checked in within the first day.
  • Scott Wilson noticed some failing urllib tests on his Mac and got to work on fixing them. After that he started on increasing urllib test coverage.
  • Denver Coneybeare mentioned a dbm patch he made a few days before the sprint, then got it reviewed and checked in. He followed that up with test coverage patches to fileinput and _dummy_thread.
  • Jeff Ramnani came up with several documentation and code changes, along with some tracker triage to get a few older issues closed.
  • Michael Henry spent some time on the email package, including some documentation updates and a port of test_email_codecs to Python 3. He's also working on timeit test coverage.
  • Natalia Bidart noticed several test failures after the initial build and test, then wrote up a few patches to make sure her configuration passes all of the tests. She's also working on logging test coverage.
  • Matias Bordese read the dev guide pretty closely and patched a step that didn't jibe with his system. He's currently expanding coverage of the dis module.
  • Robbie Clemons started by reviewing a few issues, then took cgitb up to 75% test coverage by starting a test suite for it.
  • Evan Dandrea came up with patches to posixpath, shutil, and tarfile for test coverage and a few bugs.
  • Jonathan Hartley looked into a unittest issue and wrote up a fix plus tests that got checked in pretty quickly. He's also working on site.py coverage.
  • Piotr Kaspyrzyk used a tool he made to find typos in his research work and applied it to the Python documentation, coming up with several patches and many more on the way.
  • Tim Lesher spent time investigating a pydoc issue about named tuples that was being discussed on the mailing list.
  • Brandon Craig Rhodes started by running coverage and ended up diving into the order of imports on interpreter startup to fix coverage results before going further with them. He took the new results and is working on copy test coverage.

[Photo: some of the group, hard at work]

Many thanks to those listed and everyone else who came out to sprint. Hopefully you learned something new and had a fun time contributing -- the effort is definitely appreciated and we look forward to working with you in the future!

FileSystemWatcher on Python 2
Posted: 2011-02-25 19:11

Alright, alright, you guys win. Enough people emailed me to say they would use a 2.x version of watcher from my previous post, so here it is: version 0.2 now supports Python 2.

The changes are pretty simple. The "biggest" part of this happened in changeset 96f3f9e4511c, where I handle a few 2- and 3-specific parts split by ifdefs. It's a few sections of handling Unicode/strings/bytes, and then a small change for 2.x to receive the action number as an int rather than a long. I think I did all of this correctly since it works, but that's a poor definition of "correctly" and my Unicode knowledge is definitely lacking.

I haven't done a ton of testing on it, but it seems to work alright in my simple test running between 2.7 and 3.1. If you have any issues with it, feel free to submit them or email me.

The five year project: .NET FileSystemWatcher clone for Python
Posted: 2011-02-18 16:51

In the time I've been using Python, no project has started and stopped, and started and stopped again, more than my goal of writing a file system monitor. Sure, it's a small and simple project in the grand scheme of things that could be accomplished over that time, but I like to finish what I start.

The idea originally came from my father, also a Python user, suggesting something to work on, likely to help me learn but it'd also help him out. Years ago he wanted a multi-tabbed text editor with tail -f functionality. I think I was reading through a wxPython book at the time and figured, "sure, I can learn this and make that tool." Started it up, had the shell of a simple GUI written, then came time to get the file system updates. I probably got distracted by something, got hooked on something else, then totally forgot about the whole thing. For whatever reason, this happened every few months.

About three years ago I tried to rejuvenate the whole thing and found Tim Golden's great "How do I..." page (pretty sure Dad sent this to me before). He has an example, three of them to be precise, covering exactly what I wanted to do: watch a directory for changes using Mark Hammond's pywin32. Awesome. I got something coded up pretty quickly and took the library in a different direction, using it at work to write a Windows service that would monitor our servers and look for crash dumps and email the team. It was super simple and paid off big time, but I kinda just whipped it together and it was poorly designed.

Fast forward to a few months ago. I was bored and looking for something fun to work on -- ah, that file system watcher I've been half-assing for years. I thought to myself, "now that I actually know wtf I'm doing, I should do that, and I'm sure my Dad would get a kick out of it."

Somewhere in the middle of all of this I was writing C# and used the System.IO.FileSystemWatcher API, which was really nice.
I've always wanted the same functionality in Python and liked what they had, so it would be cool to do what they did. A few blogs around the web claimed the Win32 ReadDirectoryChangesW API was behind the scenes of FileSystemWatcher. True or not, it made sense and I was familiar with that from the Tim Golden examples and my watcher service.

I've been writing and reading a lot of C code lately so I started hacking. After reading up on a few things, I came up with a much better C equivalent of what I had in that Windows service. It's multi-threaded, uses IO Completion Ports, and seemed to work pretty well. Pass in a directory and a callable, call the start method, then you'll get callbacks for creating files, renaming files, etc. Sweet, we're on the way.

After fiddling around with that a bit, I figured it was good enough to build on. I started writing some tests and had simple things like the following working.

[code lang="python"]
>>> import watcher
>>> import os
>>> callback = lambda action, path: print(action, path)
>>> w = watcher.Watcher(os.getcwd(), callback)
>>> w.flags = watcher.FILE_NOTIFY_CHANGE_FILE_NAME
>>> w.start()
# Then I opened up vim and created a file called "hurf.durf"
1 .hurf.durf.swp
1 hurf.durf
2 .hurf.durf.swp
[/code]

That was cool and all, but I want to be able to follow one specific file, or files that match a certain pattern. I also want to be able to set callbacks for specific actions. Hmm, FileSystemWatcher can do that. Maybe I'll just build out a clone and see how it works.

One of the first things I wanted to figure out was how to emulate the callback attaching and detaching like on Changed events. I needed a container that supplies += and -=, which is none of them. Easy enough, just inherit from one and provide the __iadd__ and __isub__ operators. Before you get outraged: I know that's "unpythonic", but I'm going for a clone here. Filling in the rest was pretty easy.

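
To illustrate the += / -= idea, here is a minimal sketch of a list subclass that attaches and detaches callables the way C# events do. The class names, the fire method, and the demo are hypothetical, written for this example; this is not the container from the watcher source.

```python
class EventHandlers(list):
    """A list of callables supporting C#-style += and -= (hypothetical name)."""

    def __iadd__(self, handler):
        # `handlers += func` attaches one callable
        # (plain list += would try to iterate over it instead).
        self.append(handler)
        return self

    def __isub__(self, handler):
        # `handlers -= func` detaches it; unknown handlers are ignored.
        if handler in self:
            self.remove(handler)
        return self

    def fire(self, *args):
        # Iterate over a copy so handlers may detach themselves while running.
        for handler in list(self):
            handler(*args)


class FakeWatcher:
    """Stand-in object exposing a Created event like the clone does."""

    def __init__(self):
        self.Created = EventHandlers()


seen = []


def on_created(name):
    seen.append(name)


fsw = FakeWatcher()
fsw.Created += on_created
fsw.Created.fire("hurf.durf")    # delivered to on_created
fsw.Created -= on_created
fsw.Created.fire("ignored.txt")  # nobody is listening anymore
print(seen)
```

Both operators return self, so the attribute assignment that Python performs behind `fsw.Created += ...` just rebinds the same container.
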
There's a bunch of properties in FileSystemWatcher that map to the attributes and methods of the underlying Watcher. For example, FileSystemWatcher.NotifyFilter sets Watcher.flags, which is an OR'ed group of NotifyFilters, which are constants exposed by watcher from Win32.

The weirdest part of the whole thing is that starting and stopping FileSystemWatcher is done by setting EnableRaisingEvents to True or False. It's not a method called start or stop like in the underlying Watcher (or anything else that needs to start and stop). It felt wrong perpetuating this weirdness, and again I know it's "unpythonic", but I'm going for a clone here.

As for translating Watcher callbacks into FileSystemWatcher callbacks that work with all of the fancy filtering, it's just a simple queue, a regex, and a big if/elif block. Watcher calls its callback which puts the action and relative path into the queue. FileSystemWatcher pulls it out, sees if it matches the filter, then we figure out from the action which callback to call. If it's a rename, do a special dance, but otherwise create an update object, fill in the details, then start calling back to the user.

[code lang="python"]
>>> from FileSystemWatcher import FileSystemWatcher, NotifyFilters
>>> import os
>>> callback = lambda event: print(event.ChangeType, event.Name)
>>> fsw = FileSystemWatcher(os.getcwd())
>>> fsw.Created += callback
>>> fsw.NotifyFilter = NotifyFilters.FileName
>>> fsw.EnableRaisingEvents = True
>>> # Opened up Explorer and right clicked to create a new file
1 New Text Document.txt
[/code]

There you have it. It took 235 lines of pure Python for FileSystemWatcher and 466 lines of C for watcher for this five year project to be completed. If any future employers are reading this, I'm capable of writing more than 140 lines of code per year to complete a five year project, I swear. The project is now on PyPI under the name watcher, complete with a few binary installers.
It's 3.x only because 2.x is dead, but I'll do a backport if people are interested (email me: first name at python.org). The project is up on bitbucket: https://bitbucket.org/briancurtin/watcher. It's not really complete but it works pretty well for most usages. I know of a bunch of bugs that I'll eventually fix, but feel free to report more or even fix some of them. Thanks for the idea, Dad.
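
For the curious, the queue, regex, and if/elif dispatch described above might look roughly like the sketch below. It is a simplified stand-in, not the code from the repository: the MiniDispatcher class, its attribute names, and the use of fnmatch.translate to build the regex are assumptions. The action numbers are the Win32 FILE_ACTION_* values, which match the 1 and 2 printed in the earlier session.

```python
import fnmatch
import queue
import re

# Win32 FILE_ACTION_* values reported by ReadDirectoryChangesW.
FILE_ACTION_ADDED = 1
FILE_ACTION_REMOVED = 2
FILE_ACTION_MODIFIED = 3
FILE_ACTION_RENAMED_OLD_NAME = 4
FILE_ACTION_RENAMED_NEW_NAME = 5


class MiniDispatcher:
    """Hypothetical, stripped-down version of the dispatch step."""

    def __init__(self, pattern="*"):
        self.events = queue.Queue()
        # Assumption: turn a glob-style filter into a regex.
        self.pattern = re.compile(fnmatch.translate(pattern))
        self.created, self.deleted = [], []
        self.changed, self.renamed = [], []
        self._old_name = None

    def low_level_callback(self, action, path):
        # What the underlying watcher would call: enqueue and return quickly.
        self.events.put((action, path))

    def _fire(self, handlers, *details):
        for handler in handlers:
            handler(*details)

    def drain(self):
        """Handle everything queued so far (a real loop would block on get())."""
        while True:
            try:
                action, path = self.events.get_nowait()
            except queue.Empty:
                return
            if action == FILE_ACTION_RENAMED_OLD_NAME:
                # The "special dance": hold the old name until the new one shows up.
                self._old_name = path
            elif not self.pattern.match(path):
                continue
            elif action == FILE_ACTION_ADDED:
                self._fire(self.created, "created", path)
            elif action == FILE_ACTION_REMOVED:
                self._fire(self.deleted, "deleted", path)
            elif action == FILE_ACTION_MODIFIED:
                self._fire(self.changed, "changed", path)
            elif action == FILE_ACTION_RENAMED_NEW_NAME:
                self._fire(self.renamed, "renamed", path, self._old_name)
                self._old_name = None


log = []
d = MiniDispatcher("*.txt")
d.created.append(lambda kind, name: log.append((kind, name)))
d.renamed.append(lambda kind, name, old: log.append((kind, old, name)))

d.low_level_callback(FILE_ACTION_ADDED, ".hurf.durf.swp")  # filtered out
d.low_level_callback(FILE_ACTION_ADDED, "notes.txt")
d.low_level_callback(FILE_ACTION_RENAMED_OLD_NAME, "notes.txt")
d.low_level_callback(FILE_ACTION_RENAMED_NEW_NAME, "todo.txt")
d.drain()
print(log)
```

Keeping the low-level callback down to a single put() means the watching thread never waits on user code; all of the filtering and callback work happens on whichever thread drains the queue.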

Contents © 2014 Brian Curtin