2008-11-27

Python vs Ruby on beautiful code, Red Beauty, Green Beauty

 Comparing Python and Ruby is kind of a sport; this time I'll talk about code beauty. Python's huge advantages are its mature and insightful libraries and its faster run time. Grammar-wise, they are awfully similar.

 Ruby is basically a slower Python where there are no functions; methods can't be freely passed around and are called on reference, without parens; lambdas can be defined in-line (blocks); monkey-patching runs wild; and there is a virtually endless stock of little conveniences and shortcuts.

 It's actually no small loot: the niceties cut the character count, and the more obscure shortcuts you know, the more compact you can make your code. You can see dramatic differences in code length between beginner and expert Ruby devs.

 This is what I call Red Beauty: Ruby focuses on making code easier to write.

 Some of the features are simple trade-offs, and I feel Python makes the right choices more often: I prefer the slot-based philosophy of object orientation, and having to use parens on methods is a small price to pay.

 Functions vs blocks is a false dichotomy; multiline lambdas could solve both problems. But if I have to choose, I prefer first-class function objects: you can't pass more than one block to a method in Ruby.
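 To make the point about first-class functions concrete, here is a minimal Python sketch. The names 'process' and 'double' are made up for illustration; the only claim is that a plain function can take any number of callables as ordinary arguments.

```python
def process(items, on_item, on_done):
    """Apply on_item to each element, then hand the results to on_done."""
    results = [on_item(item) for item in items]
    return on_done(results)


def double(x):
    return x * 2


# Both callbacks are ordinary objects: a named function and a builtin here,
# but lambdas or bound methods would be passed in exactly the same way.
total = process([1, 2, 3], double, sum)
print(total)
```

 Nothing special is needed to accept the second callable; it is just another parameter.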

 The near ban on monkey-patching can be painful, mostly to your pride, since it's cooler to use your own methods on strings than wrappers, but I'll argue that it's thanks to this that Python has better libraries. Python libraries will always be superior, period. Expect me to bite my tongue in six years, but I think the philosophies of Python make for better module writing.
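 A small sketch of what the near ban looks like in practice, assuming CPython: assigning a new method to the builtin 'str' type is refused, so you end up with a subclass or wrapper instead. 'Shouty' and 'shout' are hypothetical names.

```python
# Trying to bolt a method onto the builtin str type fails in CPython.
try:
    str.shout = lambda self: self.upper() + "!"
    patched = True
except TypeError:
    patched = False


class Shouty(str):
    """The less cool alternative: a subclass that carries the extra method."""

    def shout(self):
        return self.upper() + "!"


print(patched)
print(Shouty("hello").shout())
```

 User-defined classes can still be patched at runtime; it is the builtin types that are locked down.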

 So what about the niceties and shortcuts? Well, it's a mixed bag... I miss string interpolation, but that's about it. The different ways to turn a string into a hash actually bother me, because it means I have to learn many ways of doing the same thing just to understand somebody else's code, and code written with special-case shortcuts can need a complete rewrite when the specs change.

 So this is what I call Green Beauty: Python focuses on making code easier to maintain.

 So which is more beautiful? Both; they just have different shades of beauty.
 

2008-11-24

The Bible is bullshit

I know it, you know it, they know it, but they're damn funny while knowing it. Enjoy:



Penn & Teller; The Bible is Bullshit.

2008-11-15

PHP5 Iterators. MySQL iterator example.

PHP is stupid, enough said. Recently I wanted to abstract a table-printing function so it could work with both arrays and MySQL results. In Python this screams iterator, and since I had heard PHP5 supports iterators, I had always wanted to write one. So before I get to the PHP, let me explain the Python way first:

The Pythonic Iterator Protocol:
  1. Take the object to traverse and call its '__iter__()' method to obtain the iterator.
  2. Call the iterator's 'next()' to obtain the next item.
  3. Exit from the iteration when 'next()' raises 'StopIteration'.
Simple, isn't it? All the work is done in 'next()', and all it has to do is return a value or raise 'StopIteration'.
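
The three steps can be sketched as follows. This is an illustration, not library code: 'Countdown' and 'manual_for' are made-up names, and the sketch uses the Python 3 spelling, where the method is called '__next__()' and is reached through the builtin 'next()'.

```python
class Countdown:
    """Hypothetical iterator that yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        # Step 1: hand back the object that will do the iterating.
        return self

    def __next__(self):
        # Steps 2 and 3: return a value, or signal the end.
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value


def manual_for(iterable):
    """Roughly what a for loop does behind the scenes."""
    iterator = iter(iterable)          # calls __iter__()
    items = []
    while True:
        try:
            item = next(iterator)      # calls __next__()
        except StopIteration:
            break
        items.append(item)
    return items


print(manual_for(Countdown(3)))
```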


The PHP Iterator Protocol:


  1. Call 'rewind()' to make sure we are iterating from the beginning.
  2. Call 'valid()'; if it returns false, exit from the iteration.
  3. Take the first element by calling 'current()'.
  4. Optionally get the key of the first element by calling 'key()'.
  5. Call 'next()' to do whatever is necessary to fetch the next item; ignore the return value.
  6. Call 'valid()'; if it returns false, exit from the iteration.
  7. Take the next element by calling 'current()'.
  8. Optionally get the key of the next element by calling 'key()'.
  9. Call 'next()' to do whatever is necessary to fetch the next item; ignore the return value.
  10. Repeat steps 6 to 9.

"Wait a minute!" you say, "steps 2-5 are the same as steps 6-9!" No, they aren't. Steps 6-9 operate on the "next" item, the one 'next()' fetched for us. Steps 2-5 operate on some ghostly "first" item that nobody has fetched yet.

So 'valid()', 'current()' and 'key()' have to behave differently on the first run. In practice it's enough to call 'next()' from within 'valid()' the first time. But the two reasons why this is horrible are because...

OOP and semantic purity are like M. Night Shyamalan and plot twists:

One implies the other, and it hurts when it doesn't match our expectations. In OOP, methods are named so that you know what they do just from looking at their names. The boolean method 'valid()' suggests a simple check that the currently selected item is part of the iteration; you don't expect it to also fetch the first item. The other problem is efficiency: for an array with N elements, 'valid()' has to test the first-run flag on every call, and the test evaluates the same way every time except the very first.

No, we have to take the initialization out of the loop. OOP principles tell us the constructor is the place for this kind of setup. But there is a problem: 'rewind()' is called just before the iteration begins! So we find ourselves in a dilemma:
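
To see the hack in action, here is a rough Python model of it (Python only so that the sketch is easy to run; it is not PHP's actual implementation). 'php_foreach' imitates the call order from the numbered list, and 'LazyFirstFetch' is a hypothetical iterator over a forward-only source that fetches its first item inside 'valid()'. The 'flag_checks' counter is there only to show how often the first-run test is repeated.

```python
def php_foreach(it):
    """Model of the call order: rewind, then valid/current/key/next in a loop."""
    pairs = []
    it.rewind()
    while it.valid():
        pairs.append((it.key(), it.current()))
        it.next()  # return value ignored, as in step 5
    return pairs


class LazyFirstFetch:
    """Hypothetical iterator that fetches its first item inside valid()."""

    def __init__(self, source):
        self.source = source   # any Python iterable standing in for a result set
        self.flag_checks = 0   # how many times valid() tested the first-run flag

    def rewind(self):
        self.stream = iter(self.source)
        self.started = False
        self.is_valid = False
        self.pos = -1
        self.item = None

    def next(self):
        try:
            self.item = next(self.stream)
            self.is_valid = True
            self.pos += 1
        except StopIteration:
            self.is_valid = False

    def valid(self):
        self.flag_checks += 1
        if not self.started:   # true only on the very first call
            self.started = True
            self.next()        # the surprising side effect
        return self.is_valid

    def current(self):
        return self.item

    def key(self):
        return self.pos


it = LazyFirstFetch(["a", "b", "c"])
pairs = php_foreach(it)
print(pairs, it.flag_checks)
```

It works, but the flag is tested on every single call to 'valid()' for the sake of the first one.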

  1. Fetch the first item in '__construct()', make 'rewind()' do nothing.
  2. Fetch the first item in 'rewind()', that is, call 'next()' after rewinding.
Either way 'rewind()' is a liar, because it doesn't do what its name says it does. Now, if I have to choose the lesser evil, option 2 is the way to go, because it makes the iterator reusable, which is the purpose of calling 'rewind()' in the first place. And so I hereby present:

A simple PHP MySQL Iterator:


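class mysqlIter implements Iterator, Countable{
    private $resource;        // result resource, e.g. from mysql_query()
    private $pos = -1;        // zero-based key of the current row
    private $valid = false;   // true while $curval holds a fetched row
    private $curval;          // current row as an associative array

    public function __construct($resource){
        $this->resource = $resource;
    }
    // Fetch the next row, if there is one.
    public function next(){
        if ($value = mysql_fetch_assoc($this->resource)){
            $this->valid = true;
            $this->curval = $value;
            $this->pos++;
        } else {
            $this->valid = false;
        }
    }
    public function valid(){
        return $this->valid;
    }
    public function current(){
        return $this->curval;
    }
    public function key(){
        return $this->pos;
    }
    // Go back to the start and fetch the first row (option 2 above).
    public function rewind(){
        $this->pos = -1;   // reset keys so a second traversal starts at 0 again
        // Seeking in an empty result set raises a warning, so skip it.
        if (mysql_num_rows($this->resource) > 0){
            mysql_data_seek($this->resource, 0);
        }
        $this->next();
    }
    public function count(){
        return mysql_num_rows($this->resource);
    }
}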


Aftermath.

At first I wasn't aware that 'next()' was not going to get called until the second pass; then 'rewind()' started to mess up the result, so it took me a little longer to implement the iterator. I blame the PHP way and its documentation.

A PHP-head will tell me that this is a case of PHP just being a different language, not a stupid one, but the devil is in the details. For instance, it is a good argument to say that there is nothing inconsistent about 'rewind()' calling 'next()', because it means a manually rewound iterator always points to its first item. But this raises the question: why would you manually access the first item of an iterator? The answer is that you aren't exactly handling an iterator but a data structure that is iterable. In PHP, iteration happens directly on the object; in Pythonland most iterables actually use a proxy iterator object (that's the purpose of '__iter__()'), which means, among other things, that iterable objects don't need to contain iterator-related attributes or methods.

Iterable objects in Python don't usually carry an internal pointer or implement 'next()'; they simply have an '__iter__()' method that returns an object that does.

Another implication is that, because a new iterator is instantiated on demand every time, the same data structure can be traversed by multiple clients without conflicts, unlike PHP iterators.
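
A small sketch of the difference, with made-up class names. 'Table' hands out a fresh iterator on every '__iter__()' call, while 'SharedCursor' is a Python imitation of an object that is its own iterator and rewinds itself, which is roughly the situation described above for PHP. Traversing each one from two nested loops shows the conflict.

```python
class Table:
    """Iterable only: every traversal gets its own iterator object."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __iter__(self):
        return iter(self.rows)


class SharedCursor:
    """Hypothetical object that is its own iterator, with one internal pointer."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.pos = 0

    def __iter__(self):
        self.pos = 0           # behaves like rewind()
        return self

    def __next__(self):
        if self.pos >= len(self.rows):
            raise StopIteration
        row = self.rows[self.pos]
        self.pos += 1
        return row


table = Table(["a", "b"])
shared = SharedCursor(["a", "b"])

# Two simultaneous traversals of the same object.
table_pairs = [(x, y) for x in table for y in table]
shared_pairs = [(x, y) for x in shared for y in shared]

print(table_pairs)
print(shared_pairs)
```

With 'Table' all four combinations come out; with 'SharedCursor' the inner loop moves the only pointer there is, so the outer loop ends early.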

But there are other problems with the argument that 'rewind()' calling 'next()' ensures the internal pointer is at the right position. One of them is that, if directly accessing an iterable is so desirable, then one would expect people to access freshly instantiated iterators. That means '__construct()' should also call 'next()', just in case.

But if an iterator is instantiated and then used (a very common pattern) then the first item would have been fetched twice!!
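
The double fetch is easy to reproduce in a toy model. Everything below is hypothetical: 'FakeResult' stands in for a seekable result set and counts how many fetches it serves, and 'EagerIter' is an iterator whose constructor and 'rewind()' both call 'next()'.

```python
class FakeResult:
    """Hypothetical stand-in for a seekable result set; counts fetches."""

    def __init__(self, rows):
        self.rows = rows
        self.cursor = 0
        self.fetches = 0

    def seek(self, offset):
        self.cursor = offset

    def fetch(self):
        self.fetches += 1
        if self.cursor < len(self.rows):
            row = self.rows[self.cursor]
            self.cursor += 1
            return row
        return None


class EagerIter:
    """Iterator that fetches the first row in the constructor AND in rewind()."""

    def __init__(self, result):
        self.result = result
        self.row = None
        self.next()                # the "just in case" fetch

    def next(self):
        self.row = self.result.fetch()

    def valid(self):
        return self.row is not None

    def current(self):
        return self.row

    def rewind(self):
        self.result.seek(0)
        self.next()


result = FakeResult(["r1", "r2"])
it = EagerIter(result)             # first fetch of r1
it.rewind()                        # what a loop does first: r1 fetched again
fetches_before_loop = result.fetches

rows = []
while it.valid():
    rows.append(it.current())
    it.next()

print(rows, fetches_before_loop, result.fetches)
```

The rows still come out right, but the first one has been fetched twice before the loop body ever runs.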

In short, iterators in PHP5 suck.