Tuesday, April 22, 2008

Google's Summer of Code

PyPy got one proposal accepted for Google's Summer of Code under the Python Software Foundation's umbrella. We welcome Bruno Gola into the PyPy community. He will work on supporting all Python 2.5 features in PyPy and will also update PyPy's standard library to support the modules that were modified or new in Python 2.5.

Right now PyPy supports only Python 2.4 fully (some Python 2.5 features have already sneaked in, though).

Thursday, April 17, 2008

Float operations for JIT

Recently, we taught the JIT x86 backend how to produce code for the x87 floating point coprocessor. This means that the JIT is able to nicely speed up float operations (this is not true for our Python interpreter yet, as we have not integrated it). This is the first time we have gone beyond what is feasible in psyco: it would take a lot of effort to make floats work on top of psyco, way more than it will take on PyPy.

This work is at a very early stage and lives on the jit-hotpath branch, which includes all our recent experiments on JIT compiler generation, including tracing JIT experiments and a huge JIT refactoring.

Because we don't encode Python's semantics in our JIT (which is really a JIT generator), it is expected that our Python interpreter with a JIT will become fast "suddenly", once our JIT generator is good enough. If this point is reached, we would also get fast interpreters for Smalltalk or JavaScript with relatively low effort.

Stay tuned.


Tuesday, April 8, 2008

Wrapping pyrepl in the readline API

If you translate a pypy-c with --allworkingmodules and start it, you will probably not notice anything strange about its prompt - except when typing multiline statements. You can move the cursor up and continue editing previous lines. And the history is multiline-statements-aware as well. Great experience! Ah, and completion using tab is nice too.

Truth be told, there is nothing new here: it was all done by Michael Hudson's pyrepl many years ago. We had already included pyrepl in PyPy some time ago. What is new is a pure Python readline.py which exposes the most important parts of the API of the standard readline module by wrapping pyrepl under the hood, without needing the GNU readline library at all. The PyPy prompt is based on this, benefitting automagically from pyrepl's multiline editing capabilities, with minor tweaks so that the prompt looks much more like CPython's than a regular pyrepl prompt does.
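To give an idea of what "the most important parts of the API" means in practice, here is a minimal sketch of a program using the standard readline interface. It relies only on `set_completer` and `parse_and_bind` from the standard readline module; whether a given readline.py wrapper supports exactly these calls is an assumption on our side, and the `WORDS` list and the `complete` function are hypothetical names made up for the illustration.

```python
# Sketch only: uses the documented API of the standard readline module.
# The import is guarded because readline is not available on every platform.
try:
    import readline
except ImportError:
    readline = None

# Hypothetical vocabulary to complete from (illustration only).
WORDS = ["import", "in", "input", "int", "print"]


def complete(text, state):
    """Completer following the readline protocol.

    readline calls this repeatedly with state = 0, 1, 2, ... and expects
    the state-th candidate for `text`, or None when there are no more.
    """
    matches = [word for word in WORDS if word.startswith(text)]
    if state < len(matches):
        return matches[state]
    return None


if readline is not None:
    readline.set_completer(complete)
    # GNU readline syntax; libedit-based builds may need a different binding.
    readline.parse_and_bind("tab: complete")
```

Code written against this interface does not need to know whether GNU readline or a pure Python replacement sits underneath, which is the point of exposing the same API.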

You can also try and use this multiline prompt with CPython: check out pyrepl at http://codespeak.net/svn/pyrepl/trunk/pyrepl and run the new pythoni1 script.

Wednesday, April 2, 2008

Other April Fools' Ideas

While discussing what to post as an April Fool's joke yesterday, we had a couple of other ideas, listed below. Most of them were rejected because they are too incredible, others because they are too close to our wish list.

  • quantum computer backend
  • Perl6 interpreter in RPython
  • Ruby backend to allow running "python on rails"
  • mandatory static typing at app-level, because it's the only way to increase performance
  • rewrite PyPy in Haskell, because we discovered that dynamic typing is just not suitable for a project of this size
  • a C front-end, so that we can interpret the C source of Python C extensions and JIT it. This would work by writing an interpreter for LLVM bytecode in RPython.
  • an elisp backend
  • a TeX backend (use PyPy for your advanced typesetting needs)
  • an SQL JIT backend, pushing remote procedures into the DB engine

Tuesday, April 1, 2008

Trying to get PyPy to run on Python 3.0

As you surely know, Python 3.0 is coming; recently, they released Python 3.0 alpha 3, and the final version is expected around September.

As suggested by the migration guide (in PEP 3000), we started by applying 2to3 to our standard interpreter, which is written in RPython (though we should call it RPython 2.4 now, as opposed to RPython 3.0 -- see below).

Converting was not seamless, but most of the resulting bugs were due to the new dict views, str/unicode changes and the missing "reduce" built-in. After forking and refactoring both our interpreter and the 2to3 script, the Python interpreter runs on Python 3.0 alpha 3!
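For readers who have not met these three changes yet, the following is a small sketch of how they look in Python 3. It is purely illustrative and is not code from PyPy or from 2to3; the `prices` dictionary and the other variable names are made up for the example.

```python
# Illustration of three Python 3 changes: dict views, the str/bytes
# split, and reduce moving out of the built-ins. Not PyPy code.
from functools import reduce  # reduce is no longer a built-in

prices = {"apple": 3, "pear": 5}  # hypothetical data

# keys() returns a live view, not a list: later changes show through.
keys = prices.keys()
prices["plum"] = 7
assert "plum" in keys

# Code that needs a real list (indexing, sorting) must ask for one.
key_list = sorted(keys)

# reduce still works once it is imported from functools.
total = reduce(lambda acc, n: acc + n, prices.values(), 0)

# str is always text; encoded data is a separate bytes type.
text = "caf\u00e9"
data = text.encode("utf-8")
assert isinstance(data, bytes)
assert len(text) == 4 and len(data) == 5
```

Code that indexes the result of `keys()`, mixes text with encoded data, or calls a bare `reduce` is the kind that tends to break after conversion.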

The next step was to run 2to3 over the whole translation toolchain, i.e. the part of PyPy which takes care of analyzing the interpreter in order to produce efficient executables. After the good results we got with the standard interpreter, we were confident that it would be relatively easy to run 2to3 over it. Unfortunately, it was not :-(.

After letting 2to3 run for days and days uninterrupted, we decided to kill it: we assume that the toolchain is simply too complex to be converted in a reasonable amount of time.

So, we needed to think of something else; THE great idea we had was to turn everything upside-down: if we can't port PyPy to Py3k, we can always port Py3k to PyPy!

Under the hood, the 2to3 conversion tool operates as a graph transformer: it takes the graph of your program (in the form of a Python 2.x source file) and returns a transformed graph of the same program (in the form of a Python 3.0 source file). Since the entire translation toolchain of PyPy is based on graph transformations, we could reuse it to modify the behaviour of the 2to3 tool. We wrote a general graph-inverter algorithm which, as the name suggests, takes a graph transformation and builds the inverse transformation. We then applied the graph inverter to 2to3, getting something that we called 3to2. It is important to underline that 3to2 was built by automatically analysing 2to3 and reversing its operation with only the help of a few manual hints. For this reason, and because we are not keeping generated files under version control, we do not need to maintain this new tool in the Subversion repository.

Once we built 3to2, it was relatively easy to pipe its result to our interpreter, getting something that can run Python 3.0 programs.

Performance-wise, this approach has the problem of being slower at import time, because it needs to run 3to2 (automatically) every time the source is modified. In the future, we plan to apply our JIT techniques to this part of the interpreter as well, trying to reduce the slowdown until it is no longer noticeable to the end user.

In the coming weeks, we will work on the transformation (and probably publish the technique as a research paper, with a title like "Automatic Program Reversion on Intermediate Languages").

UPDATE: In case anybody didn't guess or didn't spot the acronym: The above was an April Fool's joke. Nearly nothing of it is true.