After a long holiday for my wedding preparations, the big day itself, and the honeymoon, I am back in shape.
A few things happened while I was away, such as the acquisition of SpringSource by VMware, Inc.
I tried to run a piece of XML parsing code like this in Python 3.1:
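import xml.parsers.expat

filename = "companies.xml"
parser = xml.parsers.expat.ParserCreate()
f = open(filename, "r")  # text mode: read() returns str in Python 3
parser.ParseFile(f)      # this is the call that raises the error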
The code above throws an error like this:
TypeError: read() did not return a bytes object (type=str)
This code works in Python 2.6 but fails when run on Python 3.1.
After comparing the documentation of the built-in open() function in the two versions (2.6 and 3.x), I found that its behaviour changed in Python 3.x in a way that is not backward compatible. Unlike the transitions between 2.x releases, the move from 2.x to 3.x is allowed to break backward compatibility.
The mode characters themselves are not new: "b" for binary, "+" for updating (read/write) and the "U" modifier were already accepted in 2.6. What Python 3.x changes is what they mean: "t" (text) and "b" (binary) now decide the type of object that read() returns.
In Python 2.6, read() returns a str, which is really a string of bytes, whether or not "b" is given, so the parser is happy either way. In Python 3.1 the default mode is text ("t"): the file contents are decoded and read() returns a Unicode str. Python 3.x's open() is also built on the new io module, rather than being a thin wrapper around the C stdio library as it was in 2.6.
So in Python 3.x we must specify "b" explicitly to get binary access, so that the read() method returns bytes; otherwise it returns str.
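To see the difference directly, here is a minimal sketch that reads the same file in both modes under Python 3 and prints the type that read() returns. The temporary file and its one-element content are made up for illustration; they are not my real companies.xml.

```python
import os
import tempfile

# Create a small throwaway XML file (hypothetical content, illustration only).
with tempfile.NamedTemporaryFile("wb", suffix=".xml", delete=False) as tmp:
    tmp.write(b"<companies/>")
    sample_path = tmp.name

# Default text mode: the contents are decoded and read() returns str.
with open(sample_path, "r") as f:
    text_data = f.read()

# Binary mode: read() returns the raw bytes.
with open(sample_path, "rb") as f:
    binary_data = f.read()

print(type(text_data).__name__)    # str
print(type(binary_data).__name__)  # bytes

os.remove(sample_path)
```

The str result in the first case is exactly what ParseFile() complains about in the error message.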
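The corrected version of the code looks like this:
import xml.parsers.expat

xmlFilename = "companies.xml"
p = xml.parsers.expat.ParserCreate()
f = open(xmlFilename, "rb")  # binary mode: read() returns bytes
p.ParseFile(f)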
Now the ParseFile() method gets what it asks for: a file object whose read() method returns bytes instead of str.
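For completeness, here is a slightly fuller sketch of the same fix, assuming Python 3. The helper name collect_element_names and the sample XML are my own inventions for this example; the with statement is there only so the file gets closed even if parsing fails.

```python
import os
import tempfile
import xml.parsers.expat


def collect_element_names(path):
    """Parse the XML file at `path` and return its element names in document order."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Called once for every opening tag the parser encounters.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    # "rb" is the important part: ParseFile() needs read() to return bytes.
    with open(path, "rb") as f:
        parser.ParseFile(f)
    return names


# Write a tiny sample document (hypothetical content, illustration only).
with tempfile.NamedTemporaryFile("wb", suffix=".xml", delete=False) as tmp:
    tmp.write(b"<companies><company name='Example'/></companies>")
    sample_path = tmp.name

element_names = collect_element_names(sample_path)
os.remove(sample_path)
print(element_names)  # ['companies', 'company']
```

Changing "rb" back to "r" in this sketch should bring back the TypeError shown earlier.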