Shed Skin
From Wikipedia, the free encyclopedia
Shed Skin | |
---|---|
Developer: | Mark Dufour |
Latest release: | 0.0.21 / March 26, 2007 |
OS: | Cross-platform |
Use: | Compiler |
Website: | mark.dufour.googlepages.com |
Shed Skin is an experimental compiler that translates pure but implicitly statically typed Python programs into optimized C++. Because of this typing restriction, programs often have to be changed in minor ways before they compile, and some are not supported at all. Shed Skin is currently limited to smallish programs that do not make heavy use of the Python standard library, although an increasing number of modules are supported. The largest program supported so far is 1,600 lines long.
Variables can only ever have a single type, so a = 1; a = '1', for example, is not allowed. A single type can, however, be abstract or generic (as in C++), so that a = A(); a = B(), where A and B have a common base class, is allowed.
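The rule can be illustrated with a short pure Python program. This is only a sketch of the restriction as described above: the class names Shape, Square and Circle and the function total_area are hypothetical, and whether a given Shed Skin release accepts this exact program has not been verified here.

```python
class Shape:
    """Common base class, so Square and Circle share one abstract type."""
    def area(self):
        return 0.0


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius * self.radius


def total_area(shapes):
    # 'total' is always a float and 'shape' is always a Shape subclass.
    total = 0.0
    for shape in shapes:
        total += shape.area()
    return total


# Allowed by the rule above: 'shape' is rebound, but both values
# share the base class Shape.
shape = Square(2.0)
shape = Circle(1.0)

# Not allowed by the rule above: the same name holds an int, then a str.
# count = 1
# count = '1'
```

In ordinary CPython both forms run without complaint. The restriction only matters when every variable has to be given a single C++ type declaration.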
Code generated by Shed Skin typically runs 2-40 times faster than the same program run with Psyco. In some cases it is still far from manually optimized C++ code, but several optimizations have yet to be added. Additionally, generated code is completely independent of a Python run-time such as the CPython interpreter, which makes it suitable for hardware-constrained embedded systems. Another advantage is better obfuscation: it is much more difficult to reverse-engineer machine language generated by a C++ compiler than Python bytecode.
To deduce the type information needed to generate C++ type declarations, such as int, Shed Skin uses advanced type inference techniques. It combines Ole Agesen's Cartesian Product Algorithm with John Plevyak's Iterative Class Splitting technique. It is as yet unclear whether these techniques will allow Shed Skin to scale much further than the 1,600 lines mentioned above. However, the author still plans to make significant improvements, which should improve scalability dramatically. Additionally, type profiling, or remembering analysis results between compile sessions, is expected to make a large difference.
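The core idea of the Cartesian Product Algorithm can be sketched in a few lines: when a call site may receive several concrete types per argument, each combination of concrete argument types is analyzed separately, and the results are merged. The code below is a toy illustration under that assumption, not Shed Skin's implementation. The names infer_add and cpa_call, and the representation of types as strings, are invented for the example.

```python
from itertools import product


def infer_add(left, right):
    """Toy transfer function: result type of 'left + right' for concrete types."""
    if left == right and left in ("int", "float", "str"):
        return left
    if {left, right} == {"int", "float"}:
        return "float"
    return None  # this combination would be a type error


def cpa_call(transfer, arg_type_sets, cache):
    """Analyze one call site, one tuple of concrete argument types at a time.

    arg_type_sets holds one set of possible concrete types per argument.
    cache maps each argument type tuple to its result, so a combination
    that was analyzed before is reused rather than analyzed again.
    """
    results = set()
    for combo in product(*arg_type_sets):
        if combo not in cache:
            cache[combo] = transfer(*combo)
        if cache[combo] is not None:
            results.add(cache[combo])
    return results


if __name__ == "__main__":
    cache = {}
    # The first argument may be an int or a float; the second is always an int.
    print(cpa_call(infer_add, [{"int", "float"}, {"int"}], cache))
```

Because every analyzed combination is monomorphic, the result type for each one is precise. The cost is that the number of combinations grows with the product of the argument type set sizes, which is one reason scalability is a concern for this style of analysis.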
There is currently no easy way to connect compiled code with arbitrary Python code, other than through files or standard input and output. The author hopes that someone else will look into this aspect. The author also hopes that others will join in to add memory optimizations (transforming heap allocation into stack allocation and static preallocation), look into wrap-around check elimination, improve the C++ implementation of the Python builtins, or add support for other library modules, possibly by running Shed Skin over pure Python implementations from the PyPy and Jython projects.
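The standard input and output route mentioned above could look roughly like the following from the Python side. This is a minimal sketch: run_compiled is a hypothetical helper, and the demonstration uses a second Python process as a stand-in child, since no particular compiled executable is assumed to exist.

```python
import subprocess
import sys


def run_compiled(command, text):
    """Send 'text' to a child process on stdin and return what it writes to stdout.

    'command' is the argument list for the child, for example the path to an
    executable built from Shed Skin output (hypothetical; none is assumed here).
    """
    completed = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in for a compiled program: a child Python process that
    # upper-cases whatever it reads from standard input.
    stand_in = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_compiled(stand_in, "hello"))
```

All data crossing the boundary has to be serialized as text or bytes, which is why this approach is workable for batch-style programs but awkward for fine-grained calls between the two sides.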
Shed Skin currently consists of only 6,000 lines of code, excluding the C++ implementation of the Python builtins (as of December 8, 2006). It is this small because it does not try to support several (sometimes important) features of Python. As such, typical future usage may be to optimize only the speed-critical parts of larger Python projects. To a certain extent, this makes it similar to projects such as Pyrex, Boo and Cinpy. The difference is that it allows one to write extensions using pure Python.
Current Limitations
- Programs cannot freely use the Python standard library, but several common imports are supported (see lib/*.py).
- The type analysis currently does not scale well beyond a few hundred lines of code.
- It is not (yet) possible to build extension modules.