Friday, April 27, 2018

How to ignore the annoying Cython warnings in PyPy 6.0


If you install any Cython-based module in PyPy 6.0.0, it is very likely that you get a warning like this:
>>>> import numpy
/data/extra/pypy/6.0.0/site-packages/numpy/random/__init__.py:99: UserWarning: __builtin__.type size changed, may indicate binary incompatibility. Expected 888, got 408
  from .mtrand import *
The TL;DR version is: the warning is a false alarm, and you can hide it by doing:
$ pypy -m pip install pypy-fix-cython-warning
The package does not contain any module, only a .pth file which installs a warning filter at startup.

Technical details

This happens because whenever Cython compiles a pyx file, it generates C code which does a sanity check on the C size of PyType_Type. PyPy versions up to 5.10 are buggy and report the incorrect size, so Cython includes a workaround to compare it with the incorrect value, when on PyPy.
PyPy 6 fixed the bug and now PyType_Type reports the correct size; however, Cython still tries to compare it with the old, buggy value, so it (wrongly) emits the warning.
Cython 0.28.2 includes a fix for it, so that C files generated by it no longer emit the warning. However, most packages are distributed with pre-cythonized C files. For example, numpy-1.14.2.zip include C files which were generated by Cython 0.26.1: if you compile it you still get the warning, even if you locally installed a newer version of Cython.

There is not much that we can do on the PyPy side, apart for waiting for all the Cython-based packages to do a new release which include C files generated by a newer Cython.  In the mean time, installing this module will silence the warning.

Thursday, April 26, 2018

PyPy2.7 and PyPy3.5 v6.0 dual release

The PyPy team is proud to release both PyPy2.7 v6.0 (an interpreter supporting Python 2.7 syntax), and a PyPy3.5 v6.0 (an interpreter supporting Python 3.5 syntax). The two releases are both based on much the same codebase, thus the dual release.
This release is a feature release following our previous 5.10 incremental release in late December 2017. Our C-API compatibility layer cpyext is now much faster (see the blog post) as well as more complete. We have made many other improvements in speed and CPython compatibility. Since the changes affect the included python development header files, all c-extension modules must be recompiled for this version.
Until we can work with downstream providers to distribute builds with PyPy, we have made packages for some common packages available as wheels. You may compile yourself using pip install --no-build-isolation <package>, the no-build-isolation is currently needed for pip v10.
First-time python users are often stumped by silly typos and omissions when getting started writing code. We have improved our parser to emit more friendly syntax errors, making PyPy not only faster but more friendly.
The GC now has hooks to gain more insights into its performance
The default Matplotlib TkAgg backend now works with PyPy, as do pygame and pygobject.
We updated the cffi module included in PyPy to version 1.11.5, and the cppyy backend to 0.6.0. Please use these to wrap your C and C++ code, respectively, for a JIT friendly experience.
As always, this release is 100% compatible with the previous one and fixed several issues and bugs raised by the growing community of PyPy users. We strongly recommend updating.
The Windows PyPy3.5 release is still considered beta-quality. There are open issues with unicode handling especially around system calls and c-extensions.
The utf8 branch that changes internal representation of unicode to utf8 did not make it into the release, so there is still more goodness coming. We also began working on a Python3.6 implementation, help is welcome.
You can download the v6.0 releases here:
We would like to thank our donors for the continued support of the PyPy project. If PyPy is not quite good enough for your needs, we are available for direct consulting work.
We would also like to thank our contributors and encourage new people to join the project. PyPy has many layers and we need help with all of them: PyPy and RPython documentation improvements, tweaking popular modules to run on pypy, or general help with making RPython’s JIT even better.

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7 and CPython 3.5. It’s fast (PyPy and CPython 2.7.x performance comparison) due to its integrated tracing JIT compiler.
We also welcome developers of other dynamic languages to see what RPython can do for them.
The PyPy release supports:
  • x86 machines on most common operating systems (Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)
  • newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux,
  • big- and little-endian variants of PPC64 running Linux,
  • s390x running Linux

What else is new?

PyPy 5.10 was released in Dec, 2017.
There are many incremental improvements to RPython and PyPy, the complete listing is here.
 
Please update, and continue to help us make PyPy better.

Cheers, The PyPy team

Tuesday, April 10, 2018

Improving SyntaxError in PyPy

For the last year, my halftime job has been to teach non-CS uni students to program in Python. While doing that, I have been trying to see what common stumbling blocks exist for novice programmers. There are many things that could be said here, but a common theme that emerges is hard-to-understand error messages. One source of such error messages, particularly when starting out, is SyntaxErrors.

PyPy's parser (mostly following the architecture of CPython) uses a regular-expression-based tokenizer with some cleverness to deal with indentation, and a simple LR(1) parser. Both of these components obviously produce errors for invalid syntax, but the messages are not very helpful. Often, the message is just "invalid syntax", without any hint of what exactly is wrong. In the last couple of weeks I have invested a little bit of effort to make them a tiny bit better. They will be part of the upcoming PyPy 6.0 release. Here are some examples of what changed.

Missing Characters

The first class of errors occurs when a token is missing, often there is only one valid token that the parser expects. This happens most commonly by leaving out the ':' after control flow statements (which is the syntax error I personally still make at least a few times a day). In such situations, the parser will now tell you which character it expected:

>>>> # before
>>>> if 1
  File "<stdin>", line 1
    if 1
       ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> if 1
  File "<stdin>", line 1
    if 1
       ^
SyntaxError: invalid syntax (expected ':')
>>>>

Another example of this feature:

>>>> # before
>>>> def f:
  File "<stdin>", line 1
    def f:
        ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> def f:
  File "<stdin>", line 1
    def f:
         ^
SyntaxError: invalid syntax (expected '(')
>>>>

Parentheses

Another source of errors are unmatched parentheses. Here, PyPy has always had slightly better error messages than CPython:

>>> # CPython
>>> )
  File "<stdin>", line 1
    )
    ^
SyntaxError: invalid syntax
>>>

>>>> # PyPy
>>> )
  File "<stdin>", line 1
    )
    ^
SyntaxError: unmatched ')'
>>>>

The same is true for parentheses that are never closed (the call to eval is needed to get the error, otherwise the repl will just wait for more input):

>>> # CPython
>>> eval('(')
  File "<string>", line 1
    (
    ^
SyntaxError: unexpected EOF while parsing
>>>

>>>> # PyPy
>>>> eval('(')
  File "<string>", line 1
    (
    ^
SyntaxError: parenthesis is never closed
>>>>

What I have now improved is the case of parentheses that are matched wrongly:

>>>> # before
>>>> (1,
.... 2,
.... ]
  File "<stdin>", line 3
    ]
    ^
SyntaxError: invalid syntax
>>>>

>>>> # after
>>>> (1,
.... 2,
.... ]
  File "<stdin>", line 3
    ]
    ^
SyntaxError: closing parenthesis ']' does not match opening parenthesis '(' on line 1
>>>>

Conclusion

Obviously these are just some very simple cases, and there is still a lot of room for improvement (one huge problem is that only a single SyntaxError is ever shown per parse attempt, but fixing that is rather hard).

If you have a favorite unhelpful SyntaxError message you love to hate, please tell us in the comments and we might try to improve it. Other kinds of non-informative error messages are also always welcome!