Tuesday, 27 October 2009

2to3 for map "fixed"!

The issue about 2to3 and map has been resolved with the following changeset. The fix amounts to saying "fix your code yourself" rather than fixing the issue. Adding a warning is not an adequate fix for any issue. It only helps people who assume their code will run after 2to3; it does not address the core problem in the slightest. Every bug in 2to3 (and every non-feature like the one added in this changeset) hinders the widespread adoption of Python 3.x; I think the CPython developers should take a second to think about that before committing such "fixes". They could at least leave the issue open, in the hope that someone will patch it.

Sunday, 25 October 2009

The world's biggest hard-drive

Having ruined my Vista installation (another story I will blog about soon), I want to set up Windows again so that I can play games. Before doing so, I have to back up my Vista user files. As these are relatively large (~100G), I had to clear some space on my Linux system and created a new partition for them (~150G). Having asked some folks on IRC which file system to use, I picked ext2. Copying slowed my system down like hell (of course), but df -h verified that there was progress. Suddenly, my tty spat out some weird messages.
EXT2-fs error (device sda6):
ext2_new_blocks: Allocating block in system zone - blocks from 35586561, length 1

It wasn't long until the copy operation failed with the following error message.
No space left on device.

Okay... I thought I had gone insane and asked cfdisk about the size of the partition and du -hs about the size of the source files: about 150G and 100G, respectively. So everything was okay. Now the weird part starts. I asked df -h how much of the partition was used and got the following output (quite amusing, considering my hard drive is 1TB in total).
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda2    92G  6.9G    81G    8%  /
tmpfs       1.5G     0   1.5G    0%  /lib/init/rw
varrun      1.5G  120K   1.5G    1%  /var/run
varlock     1.5G     0   1.5G    0%  /var/lock
udev        1.5G  152K   1.5G    1%  /dev
tmpfs       1.5G  832K   1.5G    1%  /dev/shm
lrm         1.5G  2.2M   1.5G    1%  /lib/modules/2.6.28-16-generic/volatile
/dev/sda5   184G  139G    36G   80%  /home
/dev/sda6    16T   16T      0  100%  /mnt/media
/dev/sda1   489G  258G   231G   53%  /media/disk

Note the /dev/sda6 line. 16T... This means 16 terabytes. Apparently some bug must have occurred. But hang on, the drive gets even larger. In disbelief, I decided to see what du -hs would say about the directories on the device (I would be very thankful for any tips about what the error messages were trying to tell me).

name@gollum:/mnt/media$ du -hs name
du: cannot access `name/.gimp-2.6/.idlerc': Stale NFS file handle
du: cannot access `name/.gimp-2.6/Anwendungsdaten': Stale NFS file handle
du: cannot access `name/.gimp-2.6/AppData/.idlerc': Stale NFS file handle
du: cannot access `name/.gimp-2.6/AppData/Anwendungsdaten': Stale NFS file handle
du: cannot access `name/.idlerc': Stale NFS file handle
du: cannot access `name/Anwendungsdaten': Stale NFS file handle
du: cannot access `name/AppData/.gimp-2.6/.idlerc': Stale NFS file handle
du: cannot access `name/AppData/.gimp-2.6/Anwendungsdaten': Stale NFS file handle
du: cannot access `name/AppData/.idlerc': Stale NFS file handle
du: cannot access `name/AppData/Anwendungsdaten': Stale NFS file handle
19T name
name@gollum:/mnt/media$ du -hs lost+found
2.0T lost+found

So 19T and 2T, doesn't that make 21T? Either df lied or my maths is really rusty. Right now I'm trying to copy the same data onto an ext3 partition of the same size; hopefully this will work better.

UPDATE: With ext3, it worked fine.

Of Python3's map

Hello, it's been a long time since the last post, but here we go.

So what is wrong with map? Its semantics have silently changed between Python 2.x and Python 3.x, without even 2to3 knowing. The old, non-lazy map of 2.x stopped only when the longest sequence was depleted (padding the shorter ones with None), while the lazy one of 3.x (like itertools.imap in 2.x) stops as soon as the shortest sequence is depleted. Whether this change is reasonable can be debated, but that is not really the topic here.

The real topic is that not even 2to3 is aware of this subtle change. An easy fix would be to translate map(fun, a, b, ...) to list(map(fun, *zip(*itertools.zip_longest(a, b, ...)))) (itertools.izip_longest in 2.6). Another problem with 2to3 and map is that None can no longer be given as the function argument in 3.x, and 2to3 is unaware of this as well. It should therefore translate map(None, ..) to map(lambda *a: a, ..), which it doesn't.

So here is what 2to3 does. It accounts for neither of the semantic changes.
--- map_test.py (original)
+++ map_test.py (refactored)
@@ -1,1 +1,1 @@
-map(None, [1, 2, 3], [1, 2, 3, 4])
+list(map(None, [1, 2, 3], [1, 2, 3, 4]))
Python 2.x:
>>> map(None, [1, 2, 3], [1, 2, 3, 4])
[(1, 1), (2, 2), (3, 3), (None, 4)]
Python 3.x:
>>> list(map(None, [1, 2, 3], [1, 2, 3, 4]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
>>> list(map(lambda *a: a, [1, 2, 3], [1, 2, 3, 4]))
[(1, 1), (2, 2), (3, 3)]
>>> import itertools
>>> list(map(lambda *a: a,
... *zip(*itertools.zip_longest([1, 2, 3], [1, 2, 3, 4]))))
[(1, 1), (2, 2), (3, 3), (None, 4)]



This shows that the proposed fixes work.
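
For illustration, both proposed translations can be combined in one small Python 3 helper. This is only a sketch: the name map_py2 is made up for this post and is not part of 2to3 or the standard library. It assumes that the only 2.x behaviours worth emulating are the eager list result, the None padding of shorter sequences, and None as the function argument.

```python
import itertools


def map_py2(func, *iterables):
    """Hypothetical helper: emulate Python 2's eager map() on Python 3."""
    if func is None:
        # Python 2 treated None as the identity function: a single
        # sequence came back unchanged, several came back as tuples.
        if len(iterables) == 1:
            return list(iterables[0])
        func = lambda *args: args
    # zip_longest pads the shorter inputs with None, as map() did in 2.x.
    return [func(*args) for args in itertools.zip_longest(*iterables)]


print(map_py2(None, [1, 2, 3], [1, 2, 3, 4]))
```

Called with the inputs from the session above, it should print the same list that Python 2.x returned. Unlike the real 2.x map, this sketch does not complain when it is given no sequences at all.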

Issue created by birkenfeld after discussing the issue with me