Posted: May 31st, 2013 | Author: Pete Hunt | Filed under: Uncategorized | No Comments »
Hey all -
Time to stop kidding myself that I have enough time to maintain PyMySQL (http://pymysql.org/). Since I stopped using Python at work I haven’t really had the time to maintain this package, even though I kept thinking I’d get around to it. And with the release of React (http://facebook.github.io/react) this week it’s only going to get harder to find time. I think it’s best if I let someone else who’s more active take the reins.
Anyone interested? Send me an email (floydophone@gmail.com)
Thanks!
Pete
Posted: April 1st, 2013 | Author: Pete Hunt | Filed under: Uncategorized | No Comments »
The debate about “web assembly” rages on.
The problem with this debate is that JavaScript is a multifaceted piece of technology. Here are a few different ways to look at it.
JavaScript is a kind of okay programming language.
JavaScript gets some things right and some things very wrong, but if used correctly (i.e. if you read “The Good Parts,” use a module system and a few high-quality libraries) you can actually build something big and useful.
JavaScript is a platform.
If you take only “The Good Parts” you can start to think of JavaScript as a small language similar to Scheme. Its semantics are well-understood and it’s easy to target as a compiler writer.
So in this sense you can think of JavaScript as “web assembly” but I prefer to think of it as Scheme with C syntax and poor support for macros.
And the VMs are really, really good these days.
JavaScript is a packaging format.
The quintessential example of JavaScript as a packaging format is JSON, but it can be used for so much more. JSONP, the de facto standard way to do cross-domain communication, uses JavaScript as its protocol. At work we wrap our CSS files in a bit of JavaScript so we can require() them as JavaScript modules. And JavaScript is even a packaging format for itself: lots of people write CommonJS modules and then run a tool to package them for the browser.
So what?
Technologies like source maps and hygienic macros are steps in the right direction. We need to embrace that JavaScript is not just a programming language and build tooling and libraries that support each facet of this technology.
Posted: November 20th, 2012 | Author: Pete Hunt | Filed under: Uncategorized | No Comments »
It’s a widely accepted truism that you should write semantic markup whenever possible. It’s unfortunate, but the traditional semantic markup techniques that many subscribe to present lots of problems. Here are some ways to work around those problems.
1. Stick to classes in your CSS selectors, and avoid tag selectors.
There’s three main reasons for this. The first is that they’re slow! Believe it or not, CSS performance is often an issue in modern Web apps and style calc is a big part of that. Since selectors are evaluated right-to-left, if you select for, say, the img tag at the end of one of your selectors in your CSS you have to check every img rule for every img tag in the document rather than only the nodes that contain the class you’ve specified.
The second is that they make specificity confusing. If you stick to classes, all you need to do is count the number of class selectors to determine the specificity, so it makes debugging conflicting CSS rules a bit easier.
Finally, if you’re using a component architecture and refactoring lots of markup, you can quickly scan the class attributes of your markup to figure out which rules you need to update. If you’re targeting tags everywhere this can be a bit more problematic depending on how your markup has been structured. In my opinion, this is an instance of “explicit is better than implicit” from The Zen of Python.
2. <span>s and <div>s behave more consistently than semantic tags, even with a CSS reset. And they don’t kill accessibility!
Check out this jsfiddle: http://jsfiddle.net/floydophone/8rSSj/. Adding a <div> as a child of a <p> at any depth implicitly closes the <p>, so <p><div>test</div></p> actually gets rendered as <p></p><div>test</div><p></p>. It also happens when you try to programmatically construct the DOM: http://jsfiddle.net/floydophone/N3wuY/
You may say that I’m an idiot for embedding a <div> inside of a <p> because that isn’t semantically correct. But if you’re using a component architecture with multiple teams, someone will eventually nest a <div> inside someone else’s <p> without realizing it. That means every time you use one of these tags in a component you’d need to check every possible place it’s used, which quickly becomes prohibitively difficult. And when the issue does manifest, an engineer who doesn’t know about this browser quirk will find the inner <div> inserted in a really weird place in the DOM.
Finally, check out this fiddle: http://jsfiddle.net/floydophone/9MKCX/. There’s a WebKit bug that cuts off the borders at the corners; it works fine in Firefox. The workaround I’ve used here is to use background-image instead of <img>.
But what about accessibility?
I’ve talked to some accessibility people at work, and they say the most important things to worry about are links, form elements, and headers. Fortunately those all behave pretty normally, so keep using them!
Posted: January 23rd, 2012 | Author: Pete Hunt | Filed under: Uncategorized | 5 Comments »
There’s a lot of discussion on HN right now in response to Paul Graham’s call to disrupt Hollywood.
A few commenters hit the nail on the head: the movie business is no longer about distribution. The music business has been disrupted even further, to the point where distribution was (sort of) solved in the late 90s and production is now so cheap and accessible that it’s no longer relevant.
The problem is filtering. How do I cut through all the crap to find the music I actually want to consume?
Traditionally, it’s been solved by tastemakers like radio DJs and syndicated label-sponsored playlists. Many think that it’ll be disrupted by an Amazon-style personalization system.
I think they’re missing the point. For the average person, entertainment is as much about creating a shared culture as it is about the content itself. Most people want to feel like they’re part of a movement. They want to go to a huge arena filled with people similar to them and enjoy an event together. They want to find common ground with each other and gossip about celebrities in their spare time. Or they want everyone else to do these things, and differentiate themselves by enjoying more obscure artists or genres.
Pandora, perhaps the most well-known music personalization service, realized this a long time ago; that’s why most of their non-niche channels very closely resemble the Billboard charts of their respective genres.
This is why the music industry is consolidated into a few powerful multinational corporations that own or influence the full stack. This consolidation has traditionally been the only way to muster enough marketing firepower to create a shared culture around an artist. If you want to disrupt this industry, you need to solve this problem — otherwise you’ll never create a product that fits the masses. Right now, the only things that come close are Pandora and social media like Facebook and YouTube, but a service that deftly walks the line between personalizing for the listener and being opinionated at the same time could disrupt this industry in a major way.
Posted: January 11th, 2012 | Author: Pete Hunt | Filed under: Uncategorized | 2 Comments »
Recently the following presentation has been circulating the Python community: http://python-for-humans.heroku.com/#1
I think it hits the nail on the head regarding the usability of Python packages, and towards the end he talks about a lot of places for improvement in the Python world. Specifically, we need to get back to our philosophy of “there should be only one obvious way to do it” and come up with a set of conventions for common tasks. Here’s my take on it.
Packages and distribution/dependency management
pip and requirements.txt are great, but they feel crude. Additionally, it’s a well-known best practice to use virtualenv for all of your projects, but it feels like a glued-on Python feature (which it is). What we’re really trying to do with virtualenv is specify a small set of required Python packages, at specific versions, for the package we’re working on. There’s no reason we can’t build a better way of doing this.
I propose adding a __dependencies__ attribute to __init__.py which specifies dependencies and module names, effectively running each Python package in its own pseudo-virtualenv. It’d look something like this:
# __init__.py
# include Django 1.3.1 and any version of PyMySQL
__dependencies__ = (('Django>=1.3.1', 'django'), ('PyMySQL', 'pymysql'), ('deb:apache2', None))
Now, any Python file in this package can import modules from “django” and “pymysql” and they’ll go to the right version. Note that we specify a name as the second item of the tuple — this is the name that we’ll use for importing. This allows us to use multiple distributions that have the same package names.
The last one (apache2) is an idea that I’m kicking around. It would be nice if we could specify platform-specific dependencies as well (using a tool like yum or apt-get to resolve them). This part requires more thought.
When importing this package, we have enough information to automatically install dependencies from PyPI (no more python setup.py install or easy_install!), which makes the whole getting started process extremely easy. Additionally, if we try to import a module that isn’t there, we can rig the ImportError to include a message that says “hey, we couldn’t resolve pymysql, maybe you should add the PyMySQL dependency to __init__.py”
You need to be careful with this approach (for example, you could accidentally end up using two different versions of the same package and passing their objects around haphazardly), but I think this is a tractable problem, perhaps solvable through runtime warnings.
I also think setup metadata belongs in the package, rather than in a setup.py script. We could accomplish this by including a __meta__ attribute that holds the package metadata.
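To make the proposal concrete, here’s a minimal sketch of how a tool might split a package’s __dependencies__ into PyPI requirements and platform-specific packages. Only the __dependencies__ format comes from the proposal above; the resolve() function and its return shape are invented for illustration:

```python
# Hypothetical sketch: split a package's __dependencies__ into PyPI
# requirements and platform-specific (apt/yum) packages. Only the
# __dependencies__ tuple format is from the proposal; resolve() is invented.

def resolve(dependencies):
    pypi, system = [], []
    for spec, import_name in dependencies:
        if spec.startswith('deb:'):
            # Platform-specific dependency, to be handed off to apt-get/yum.
            system.append(spec[len('deb:'):])
        else:
            # PyPI distribution plus the name it will be imported under.
            pypi.append((spec, import_name))
    return pypi, system

# The example __init__.py from above:
deps = (('Django>=1.3.1', 'django'), ('PyMySQL', 'pymysql'), ('deb:apache2', None))
pypi, system = resolve(deps)
```

A real implementation would then feed the pypi list to an installer and use the import names to set up the per-package pseudo-virtualenv.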
“Blessed” packages
Since the above system makes it so easy to install any package from PyPI, we should have a drop-dead-obvious place that provides concrete answers to questions like “what is the best process manipulation library” and “what is the best HTTP library.” This would ideally replace the notion of an included standard library.
You could also imagine a package or author “karma” system to decide how good a package is. This could be based on test coverage, lint information and user reviews.
Configuration management
There are infinite ways people create configs for their Python packages, but there isn’t a clear standard for this yet. We need a standard way to do this, while at the same time maintaining flexibility.
I propose adding a __configure__() method to every package which takes an arbitrary set of parameters; this is how you’d pass in all of the configuration data for a given package. Most of the time you won’t need it, since most packages are just code you call without configuration, but many projects, like Django, do need it.
Often, we’ll want to create configs specific to an instance of our project. Our project will have its own __configure__() method which may or may not call __configure__() on its dependencies. But we’ll need to put the initial configuration data somewhere. I propose a file called __configure__.py which is auto-loaded when running a package. Its sole purpose is to call __configure__() on the correct packages.
If you don’t want to do your configs in Python, that’s fine; just call __configure__() with the path to your custom config file format.
You’ll notice that the system I’ve just described is almost a complete rip-off of Django. They have django.conf.settings.configure() and settings.py files. I’m proposing a generalized version of this.
Providing services
Check this out: http://blog.ianbicking.org/2011/03/31/python-webapp-package/. The key takeaway for me is: “an application requests services, and the container tells the application where those services are.” We already have a way for configuring those services, we just need a way to provide them. The canonical examples for this are DB-API modules and WSGI applications.
I propose a new package method, __service__(), which takes a service name and a set of unspecified arguments and returns an object implementing that service. You can imagine how great it would be to just __configure__() your web app in __configure__.py in the current directory and then call __service__('wsgi') to get the WSGI app.
We would come up with a standard set of service names as well (like wsgi, db, unittests etc), and standard __configure__() parameters for these services.
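Here’s one way a package might wire this up. The __service__ name and the 'wsgi' service name follow the proposal; the registry and the toy WSGI app are invented for illustration:

```python
# Hypothetical sketch of the proposed __service__() hook: a package maps
# standard service names to factory functions. Everything except the
# __service__ and 'wsgi' names is invented.

def _make_wsgi_app():
    # A toy WSGI application (see PEP 3333 for the real interface).
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello']
    return app

_services = {'wsgi': _make_wsgi_app}

def __service__(name, **kwargs):
    if name not in _services:
        raise LookupError('this package provides no %r service' % name)
    return _services[name](**kwargs)

app = __service__('wsgi')
```

A container could then look up __service__('wsgi') on the configured package and hand the result to any WSGI server.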
System integration
Last, I’d like to briefly touch on system integration. I find myself dealing with roughly three pain points when doing platform-specific integration:
- Where do my command-line scripts go and how do I ensure they’re set up correctly?
- My package needs a cron job. Where do I set this up? How do I do it in a platform-agnostic way?
- Ditto daemons
I think providing ‘scripts’, ‘cron’, and ‘daemon’ services as described above would be a natural way to provide this functionality. The API for this needs to be fleshed out more, but I would love to just be able to do this for a package:
python --install --package=mywebapp --configuration=__configure__.py
Or even:
python --install --distribution="MyWebApp==1.0.1" --configuration=/home/web-user/__configure__.py
and it would call __service__() appropriately, install the scripts to /usr/local/bin and /etc/init.d/ and ensure that __configure__.py was being used.
In closing
So that’s it. I may play around with building a prototype when work slows down. If you’re interested in building this let me know, I think it could be done in 1-2 day sprints.
Posted: December 27th, 2011 | Author: Pete Hunt | Filed under: Uncategorized | 1 Comment »
I made this when I was bored during the holidays. Maybe someone out there finds it funny.
http://hipstergrammers.tumblr.com/
Posted: November 8th, 2011 | Author: Pete Hunt | Filed under: Uncategorized | No Comments »
I just released PyMySQL 0.5. This version should be much more stable than previous versions as I’ve fixed a lot of the unicode handling.
As always, we support an extremely broad range of Python versions, spanning the 2.x and 3.x series. The 2.x version is on PyPI as PyMySQL; the 3.x version is there as PyMySQL3.
Check it out at http://www.pymysql.org/
Posted: October 18th, 2011 | Author: Pete Hunt | Filed under: Uncategorized | 13 Comments »
Over a year ago, when I was a lowly graduate student, I wrote a blog post about web templating engines. I’ve recently seen some related posts on HN, so I’d like to clarify and expand on my position.
I’ll come right out and say it: templating languages have no valid use cases. Well, except for “platform X doesn’t have any better tools,” which is a lame excuse.
Most web templating languages are designed to be accessible to non-programmers or something. I think the rationale is to allow designers, who are presumed to be inferior engineers, to write the templates, and to let the engineers just write the backend code.
This is a mythical use case and it has never happened. Either your designer is not an engineer, in which case you need to have engineers adapt the design to the project you’re working on, or your designer has engineering chops, in which case you should trust them to work with real tools (provided that you have good ones).
At this point I’m sure someone will point out that ERB and its kin are designed for engineers. Yeah, maybe, but it’s really just a crude way of generating text that isn’t much better than string concatenation.
So now if you’ve committed to using one of these templating engines, you have condemned your front-end engineers to use a language that throws out most of the lessons we’ve learned about programming language design, best practices, and encapsulation over the past 20 years.
Damn. So what are we supposed to do? There isn’t a one-size-fits-all solution; here are two examples that I think fit a lot of use cases.
Use case #1: you are prototyping and you don’t have a huge front-end engineering staff
Chances are your designer hands you a bunch of HTML, CSS, and images, and you need to turn that into a dynamic template as quickly as possible because you’re strapped for time, engineering resources, or both. This is where something like PyQuery (or another user-friendly DOM manipulation library) comes in: you programmatically manipulate the markup. Your designer doesn’t need to know anything, and your engineers get a tool that is actually good at its job: a real programming language rather than half-baked templating language constructs.
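PyQuery itself is a third-party library, but the idea can be sketched with just the standard library: take the designer’s static markup and rewrite it in code instead of inventing a template syntax. The markup, class names, and data here are made up for illustration:

```python
# A stdlib sketch of "manipulate the designer's markup in code" --
# PyQuery would give you a friendlier jQuery-like API for the same thing.
from xml.etree import ElementTree as ET

# Static markup handed over by the designer, with placeholder content.
template = '<ul class="users"><li class="user">placeholder</li></ul>'

root = ET.fromstring(template)
root.remove(root.find('li'))  # throw away the placeholder row

# Fill the list in programmatically: plain Python, no template language.
for name in ['alice', 'bob']:
    li = ET.SubElement(root, 'li', {'class': 'user'})
    li.text = name

html = ET.tostring(root, encoding='unicode')
```

Loops, conditionals, and escaping all come from the host language, which is exactly the point.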
Use case #2: you are building at scale
Facebook has a ton of front-end code. We use XHP to write it. Long story short, markup becomes a first-class expression in PHP and you can leverage all of the battle-tested object-oriented techniques to manage it. We build XHP components in a modular fashion which allows incredible levels of code reuse, protection from XSS attacks, and easy cross-team collaboration. Working with XHP vs something like Smarty (or even worse, vanilla PHP) is like building a huge project in Python vs C. Your level of abstraction is much higher, it’s safer, and you can move much faster.
The important thing to take away here is that we need to stop thinking of this as generating raw text. Instead, we need to understand that we’re working with markup that has semantic meaning. This lets us tap into the power of abstraction and encapsulation to make our job easier.
It just pains me to see all of these projects jump through hoops to make their projects fit into their templating engine’s idea of what the world should be like. Why not let your engineers leverage every tool in their toolbox to build scalable front-end code, rather than stick with these crude tools and perpetuate this myth of non-programmers building front-end templates?
Posted: July 28th, 2011 | Author: Pete Hunt | Filed under: Uncategorized | Tags: github, pymysql | 1 Comment »
PyMySQL, a pure-Python drop-in replacement for MySQLdb that works across all major Python implementations including PyPy, IronPython, Jython and Python 3, has moved to GitHub. The old domain name should be redirecting there soon (http://www.pymysql.org/), or you can just check out http://www.github.com/petehunt/pymysql. With GitHub it will be much easier to manage community contributions, and after a few months at Facebook I’ve entirely forgotten how to use SVN.
Posted: April 8th, 2011 | Author: Pete Hunt | Filed under: Uncategorized | 2 Comments »
I’ve been meaning to write a blog post on Node.js for a long time, but it wasn’t until now that I found an excuse to make myself sit down and write it. This afternoon the author of Node.js, Ryan Dahl, gave a talk at Facebook. It was pretty well attended, with around half of the audience having used Node.js beyond “hello world.”
Pythonistas could think of Node.js as the JavaScript equivalent of CPython, Twisted, and setuptools packaged together in a single binary residing server-side. It includes a package repository (npm), and is all about event-driven I/O. That means that every time you make a call that would block, you pass a callback to it, kind of like how Twisted’s Deferreds work.
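The callback-passing model can be sketched in a few lines of Python. This toy Deferred is invented for illustration and is far simpler than Twisted’s real one (which also chains errbacks for error handling):

```python
# A toy Deferred illustrating the "pass a callback instead of blocking"
# model. Invented for illustration; not Twisted's actual implementation.

class Deferred:
    def __init__(self):
        self._callbacks = []
        self._result = None
        self._fired = False

    def add_callback(self, fn):
        if self._fired:
            # Result already arrived; run the callback immediately.
            self._result = fn(self._result)
        else:
            self._callbacks.append(fn)
        return self

    def fire(self, result):
        # Called by the event loop when the I/O completes.
        self._fired = True
        self._result = result
        for fn in self._callbacks:
            self._result = fn(self._result)

# A "nonblocking" call returns a Deferred instead of blocking:
def fetch_user():
    d = Deferred()
    d.fire({'name': 'pete'})  # real code would fire later, from the event loop
    return d

results = []
fetch_user().add_callback(lambda user: results.append(user['name']))
```

Node.js bakes this style into every API, whereas in Twisted it’s a convention layered on top of Python.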
Anyway, going into the presentation I was cautiously positive about the technology and had a pretty negative opinion about the Node.js community. I’m an avid reader of Hacker News and the Node.js spam is almost unbearable. But to his credit, Ryan acknowledged that in his presentation and on his Twitter.
What strikes me about Node.js is that it’s not particularly innovative. On the surface, it’s just a reimagined Twisted Python – but Twisted’s Deferred functionality is much cleaner in terms of error handling with errbacks and flow control with DeferredList. The syntactic advantages JavaScript has with anonymous functions are, to a degree, mitigated with inlineCallbacks() (the developers of Node.js had no solution to nested “callback hell”).
With that said, I’m totally stoked about Node.js, and I think it’s going to explode in growth. JavaScript has matured so much in recent years and there is a powerful set of best practices that turn it into a rather nice language, especially when coupled with Node.js’s implementation of CommonJS.
The design of Node.js is beautiful, largely because of JavaScript’s anonymous functions. JavaScript just seems like a better fit for this sort of framework than Python is. Additionally, one of the lesser-advertised features of Node.js is that it exposes a lot of V8’s internals to the JavaScript side (i.e. raw I/O buffers instead of strings), which results in awesome performance. Before the talk, I had no idea that Node.js was tied to V8 for any particular reason; now I know.
There’s also a ton of JavaScript developers who could be enlisted on the server-side and develop the client and the server without switching mental gears between languages. During the talk, Ryan cited a bunch of interesting npm packages, and apparently the library is extensive.
And finally, Node.js can do things that Twisted can’t: it explicitly forbids blocking calls, which makes it a lot harder for developers to accidentally block the main thread, and all of the Node.js packages are designed with this in mind. In contrast, in the Python world you need every one of your dependencies to expose a nonblocking API. Because of this, I think Node.js could revolutionize parallel I/O much as garbage collection revolutionized memory management.
As for me, I’m really excited. I plan on using Node.js for “real projects” in the future, but I’m holding out for V8 to support “use strict”. I am growing to like JavaScript, but until “use strict” comes out with its support for strong dynamic typing and lack of semicolon insertion, I’m going to stick with Python for my real projects.