With last night's storms rolling through, I thought something from the lighter side of life would be fun:
While I've been writing my Git book, I keep trying to pick up new tricks and figure out places to use them. I came across a post on git rebase --interactive and knew I had to find an opportunity to play with it. This morning, that opportunity came.
I've been working on some performance enhancement at work this week. I use Git locally for all of my development, then push back to our central Subversion repo once my changes are done and approved for the main repository. Timing for the changes I made earlier in the week was bad, so I was going to hang on to them for a week and push them when our engineering/QA department can absorb them more easily.
... the observant reader should be picking up on the past-tense here ...
So as part of my changes I created a Registry to share a single instance of an object around instead of going the Singleton route or putting the data in the global scope. Pretty useful all round, but silly me didn't think to put that in a branch by itself so I could pull that change around as necessary. No worries.
Light Bulb! "Ah ha! I'll just switch into my branch and use git rebase --interactive to rewrite all of those commits and squash them together. A quick git cherry-pick later and I'll have my Registry object in this new branch on some other functionality I'm working on."
If you've ever used git rebase you can probably see where this is going. If you haven't, it's probably one of the more dangerous commands—if not the most—in Git. It will literally let you reshape the history of the repository. Running with scissors is child's play. No, rebase can really screw you over. Imagine riding a unicycle while juggling four meat cleavers and you'll have the idea. Really, and I mean really impressive to watch, but one false move and you're screwed.
So back to my experience. I proceeded to weed through the 80 or so commits from Monday afternoon and Tuesday morning and pull out everything that didn't relate to the code I wanted. "Git will just ignore those" I naively thought. I got it down to the 8 or so commits that had to do with my registry object and its tests and squashed them all into one commit. I knew I was rewriting the history, but it never occurred to me that I was rewriting history. I let rebase do its thing, then cherry-pick'd my newly created revision and was off.
Until this afternoon. Rumor of my changes had made it around the office and someone else wanted to see them. Since I only track our trunk and not all of our branches, and I'm the only one there using Git, the easiest solution was to just create a patch to send to him. Simple enough. Use git merge --squash and git diff --no-prefix --cached and I'd have a patch he could apply directly to his copy and we're set.
Needless to say, when I started reviewing the patch, I noticed there was a lot missing. :-(
Lesson learned. Don't play with live data when trying new stuff out. When you miss a bean bag while you're juggling, you don't lose your toe...
I can't believe I haven't seen this on planet-php, but um... did everyone miss that Project Zero is now running SugarCRM?
Am I the only one that's thinking a non-Zend Engine PHP might be kind of cool? Maybe I'm biased since I use a lot of Jython in my day job scripting The Grinder, but hooking a dynamic language into the JVM is pretty cool. Imagine not having to use tools like memcached 'cause all you need for a cache that's not mission critical is to instantiate a HashTable object.
I definitely hope IBM has plans beyond the commercial WebSphere sMash. More interpreters is definitely a good thing. Hopefully some others will take the cue and get PHP on the DLR (I'm looking at you Liz).
Either I'm making up for time I spent slacking, or I just took 4 years off of my life today. Today I managed to:
- Have my git book from Pragmatic Bookshelf announced: Pragmatic Version Control Using Git - It'll be on a shelf near you this fall and in PDF form before then.
- Filed my taxes (not completely today - I'd already done them, just hadn't submitted them yet)
- Proved conclusively that an installation stack wasn't stable for production (which is always fun when it's a random failure and you're having to play by the Law of Large Numbers)
- Proved conclusively that another stack was stable
- Should be able to prove or disprove a third stack configuration by the end of the night
- Got PHPT set up for inclusion into PHP as a replacement for the run-tests.php code, which means it'll be distributed as part of the core PHP product
- Got PHPT onto the Google Summer of Code project list for the upcoming summer (as part of it becoming part of php-src)
- Fleshed out a killer business idea with a buddy of mine
- Wrote what I think was a pretty good article on relational databases
- Spent an hour at the gym this morning doing a lower-body workout
And it's just now 5. ;-)
There's been a lot of buzz around cloud computing with Google's new App Engine environment. I'm not going to discuss the merits or issues with their new product and instead focus on the fact that they're giving developers access to the BigTable via its DataStore.
This is the second major player to release access to an internal storage system that's document based, Amazon's SimpleDB being the other. In the open-source space we have CouchDB, which is a distributed database that's completely document based. I see a definite trend away from creating databases with relationships and toward storing everything in one big blob.
And this makes sense. There are three motivators for normalized databases: reducing storage space, reducing processing required to find multiple instances of data, reducing number of places to update when something changes. The first two are hardware related. Computers and disks are ridiculously fast and cheap right now. For less than $10k you can have a machine that can outpace a half million dollar machine from less than a decade ago.
That leaves having to update multiple places when you change data. This is a technology problem and one that a lot of developers will recognize. It looks eerily similar to what we've been doing with caches for a long time. We grab the data, cache it the way we want, then retrieve that and ignore the shiny RDBMS that we have.
The problem with this is the same one you face with updating data in a de-normalized database. The default way is to just time out and grab fresh data once it's old enough that we consider it stale. It's rudimentary, but works. The next step is having your data layer smart enough to expire the cache itself when the data changes. Taking that further, you end up with code that expires and primes the cache whenever there's an update.
Any route you take with caching, you end up with two data layers: the raw, relational database and the cached views of it. So it seems that document databases are the next logical step. Instead of creating this layer on top of your data to cache it, just store it as a blob of data.
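Here's a rough sketch of those three stages in Python, just to make them concrete. Everything in it is made up for illustration: TimedCache, DataLayer, and the dict standing in for the relational database are hypothetical names, not anything from memcached or a real ORM.

```python
import time


class TimedCache:
    """Hypothetical cache that treats entries older than ttl_seconds as stale."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # called to fetch a value on a miss
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so it can be faked in tests
        self._entries = {}         # key -> (value, time stored)

    def get(self, key):
        # Stage 1: serve from cache until the entry times out.
        entry = self._entries.get(key)
        if entry is not None:
            value, stored_at = entry
            if self._clock() - stored_at < self._ttl:
                return value
        value = self._loader(key)
        self._entries[key] = (value, self._clock())
        return value

    def expire(self, key):
        # Stage 2: drop the entry so the next read reloads it.
        self._entries.pop(key, None)

    def prime(self, key, value):
        # Stage 3: store the fresh value right away.
        self._entries[key] = (value, self._clock())


class DataLayer:
    """Hypothetical data layer that keeps its cache in step with writes."""

    def __init__(self, ttl_seconds=60, clock=time.monotonic):
        self._rows = {}  # assumption: a dict stands in for the database
        self.cache = TimedCache(self._rows.get, ttl_seconds, clock)

    def read(self, key):
        return self.cache.get(key)

    def write(self, key, value, prime=True):
        self._rows[key] = value
        if prime:
            self.cache.prime(key, value)
        else:
            self.cache.expire(key)
```

Writing through DataLayer.write() keeps readers from ever seeing stale data, while anything that changes the underlying rows behind its back only shows up after the timeout.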
The one thing I haven't seen yet with these document-based systems is some sort of trigger mechanism in place, though, to make changes in one place ripple through the system. That can exist at the database level or the ORM level. It just has to be transparent to the developer.
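To show what I mean by a trigger, here's a minimal Python sketch. It is not how App Engine's DataStore, SimpleDB, or CouchDB work; DocumentStore, on_update, and the author/post documents are all hypothetical, and the store is just an in-memory dict.

```python
from collections import defaultdict


class DocumentStore:
    """Hypothetical in-memory document store with update triggers."""

    def __init__(self):
        self._docs = {}                       # (kind, doc_id) -> document dict
        self._triggers = defaultdict(list)    # kind -> list of callbacks

    def on_update(self, kind, callback):
        # Register a callback to run after any document of this kind is saved.
        self._triggers[kind].append(callback)

    def get(self, kind, doc_id):
        return self._docs[(kind, doc_id)]

    def find(self, kind):
        return [(doc_id, doc)
                for (k, doc_id), doc in self._docs.items() if k == kind]

    def put(self, kind, doc_id, doc):
        self._docs[(kind, doc_id)] = doc
        for callback in self._triggers[kind]:
            callback(self, doc_id, doc)


def copy_author_name_into_posts(store, author_id, author):
    # Posts carry a de-normalized copy of the author's name; keep it current.
    for _post_id, post in store.find("post"):
        if post["author_id"] == author_id:
            post["author_name"] = author["name"]


store = DocumentStore()
store.on_update("author", copy_author_name_into_posts)
store.put("author", 1, {"name": "Ann"})
store.put("post", 10, {"author_id": 1, "author_name": "Ann", "title": "Hello"})
store.put("author", 1, {"name": "Anne"})   # the post's copy is updated too
```

The code that saves the author never has to know which other documents hold a copy of the name, which is the transparency I'm after.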
Another interesting approach is what MySQL is planning for MySQL 6. They'll have communication with memcached from within MySQL via a user-defined function (UDF). There's already a project on forge.mysql.com that brings this functionality, but it's still in alpha. Something like this coupled with the use of triggers could address these issues completely by making the MySQL server simply a data storage and backup mechanism that you only touch during development.
My second year participating in the CSS Naked Day. Getting a bit of a late start (it's already 11), but my site will be sans-CSS for the rest of the day to show how readable sites can be that use CSS for all of their layout.
Looking at my site, I need to move the menu so it's positioned via CSS instead of falling semantically at the top. With that change, this will be as good as the previous incarnation of my site which was 100% semantic. :-)
How's that for a loaded title? ;-) First off, let me state that I am by no means a PHP hater. I've actually made quite a good living off of it, and unlike some people in other languages, I've never ended up on the streets homeless 'cause of PHP (yet).
I made a comment on Twitter about DHH's PHP post.
@chartjes that's sort of a backhanded comment, but it's dead on... PHP is ok until you start getting big, then it generally gets ugly
He took that as a bash and decided to challenge me to a duel. We nominated seconds, planned to cross the river into New Jersey, then realized that Twitter without loaded pistols would probably be a much better solution. And that's how the world's first twebate was born.
My final tweet on the debate I think sums up my view of PHP right now:
[PHP]'s like that super smart slacker we all knew in HS. If it would just apply itself, it could be really great.
I'm not exactly sure what it needs to do to "apply itself" as a language. It could start by realizing "because we've always done it that way" isn't a good reason not to change broken behavior. That's what compatibility modes are for. It could also look to Python to see how everything as an object can be liberating unlike Java where it can hinder you. It could realize that extensions are the next obvious step for anyone trying to make PHP hum and make the command to load a function/module—whether it's parsed or compiled—the same.
Every time I see a mysql_real_escape_string(), have to look up whether the string function I'm using has an underscore between str and what it does, or have to read a manual entry to figure out if my array function modifies the array I pass it or returns a new one, I sigh and can't help but be reminded of that anti-social kid in the back of the class, drawing furiously in his notebook. He was a genius, he just needed some encouragement and the guts to grow up.
Here's a pic ruthlessly stolen from a friend's iPhone last night. It's the main drag in downtown Lawrence, Massachusetts St.

I've heard rumors that some of the bars are not going to be serving bottles of beer next weekend to keep the amount of sprayed beer to a minimum. Unfortunately I wasn't partaking in the partying. Next week though. By then this cold should have run its course and I'll participate in the mob-making... :-)
