Python 3.7 is out. If you want to learn all about it, watch my Pluralsight.com course.
Python 3.7 benefitted from both new functionality and optimizations. From what we know so far about 3.8, it’s going to be a similar story. This time, most of the new functionality is targeted at C extension and module development.
Based on the existing Python Enhancement Proposals, or “PEPs”, submitted for 3.8, we have a good grasp of which features are likely to be included. I’ve put together a PEP-Explorer UI here for 3.8.
Many of the submitted PEPs are in draft status, which means the implementation details have not been finalised. PEPs also have to be accepted by the BDFL (Benevolent Dictator For Life) before they are final.
We’ll see the first beta in early 2019, and the release will be “feature-frozen” around June 2019.
Interpreter startup times are going to be improved
Python startup times have always been “slow”, as is common with interpreted languages. Even for pre-compiled scripts (i.e. Python files with an existing .pyc cache), the time taken for the Python interpreter to start can be a problem if you’re launching multiple processes.
As shown in this graph (where lower is faster), Python 3 is slower to start than 2.7, and PyPy even slower still because of the JIT initialisation process.
Attempts have been made to optimise startup, but nothing “drastic” has made a significant difference. 3.8 has been rumoured to be the target version for such improvements, and PEP 432 lays out a clear strategy for splitting the startup process into stages. Today, whether you run python from the command line or via a WSGI process, the interpreter runs the same initialisation sequence regardless of whether you want to run unit tests, explore the REPL, run a single function, or execute a pre-compiled script.
Research has shown that the majority of the Python startup time is dominated by I/O because of the complexity of Python’s import paths and the number of libraries within a typical installation.
PEP 432 on its own is not going to improve startup performance, but combined with another PEP (yet to be written, though I’m guessing it will be) proposing how pre-compiled scripts can have their import sequence and state cached or configured, it could make a drastic difference.
Having multiple interpreters
Related to startup time, PEP 554 proposes a new standard library module, interpreters, which will expose already-existing C APIs for running multiple Python interpreters within a single process. This allows isolation of code with less overhead than an entire Python process.
PEP 554 also proposes extending the existing APIs to allow better sharing of data between interpreters.
Here’s what that might look like:
interp = interpreters.create()
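The PEP’s draft API goes further than create(). Here’s a fuller sketch based on the draft at the time of writing; the module doesn’t exist yet, so treat this as illustrative pseudocode whose names may change before (or if) the PEP is accepted:

```python
import interpreters  # proposed stdlib module, not yet available

# Create a new, isolated interpreter inside this process
interp = interpreters.create()

# Run source code in it; the interpreter has its own modules and globals
interp.run("print('hello from a subinterpreter')")

# Interpreters can be enumerated, and destroyed when finished
print(interpreters.list_all())
interp.destroy()
```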
Any C#, Perl, PHP or Swift developers may be familiar with null-aware operators, which can be used for many purposes. One of my favourites in C# is the null-coalescing operator. In this example, fruit is assigned the value of val, unless val is null, in which case it is assigned "watermelon":
var fruit = val ?? "watermelon";
PEP 505 proposes three None-aware operators for Python, similar to this C# example but with a Python flavour. The first is None-coalescence:
if val is None:
    fruit = "watermelon"
else:
    fruit = val

# now becomes in PEP 505..
fruit = val ?? "watermelon"
None-aware attribute access
if val.fruit is not None:
    fruit = val.fruit.name()

# now becomes in PEP 505
fruit = val.fruit?.name()
And similarly, for indexing a value when you’re not sure whether it is set:
list_of_things = get_values()  # could be a ``list`` or None
first = list_of_things?[0]
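Until (and unless) PEP 505 is accepted, the same patterns can be written today with conditional expressions. A small, runnable comparison (the function names are mine, not from the PEP):

```python
def pick_fruit(val):
    # Today's spelling of the proposed `val ?? "watermelon"`
    return val if val is not None else "watermelon"

def first_item(list_of_things):
    # Today's spelling of the proposed `list_of_things?[0]`
    return list_of_things[0] if list_of_things is not None else None

print(pick_fruit(None))      # watermelon
print(pick_fruit("apple"))   # apple
print(first_item(["kiwi"]))  # kiwi
print(first_item(None))      # None
```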
Another PEP looking at similar behaviours is PEP 532.
Generator-sensitive Context Variables
PEP 567, included in Python 3.7, introduced Context Variables, which are context-local state and similar to thread-local storage. They work nicely with thread-like environments such as asyncio tasks.
PEP 568, by Nathaniel Smith (of Trio fame), builds on PEP 567 by adding context sensitivity for generators, which is great news for those working with asyncio who want to use generators. I don’t use asyncio a great deal right now, so this one was a bit over my head.
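For those who haven’t used them, here’s what PEP 567’s context variables (already in 3.7) look like; PEP 568 refines how these behave inside generators. The variable and function names below are my own example:

```python
import contextvars

# A context variable behaves like thread-local storage, but is scoped to a
# logical context (such as an asyncio task) rather than an OS thread
request_id = contextvars.ContextVar("request_id", default="none")

def handler():
    return request_id.get()

# Copy the current context and set the variable only inside the copy
ctx = contextvars.copy_context()
ctx.run(request_id.set, "abc-123")

print(ctx.run(handler))  # abc-123
print(handler())         # none  (the outer context is untouched)
```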
Assignment Expressions
This has been the most controversial proposal for 3.8, and a form of it has already been approved. PEP 572 proposes changes to the Python grammar to enable “assignment expressions”. Understanding this change requires understanding the difference between a statement and an expression in Python.
Python has many types of simple statement, each ending in a line break (unless you use a semicolon, as in import pdb; pdb.set_trace()). These include:
- Import statements, e.g. import pdb
- Flow and pass statements, e.g. break, continue, pass
- Assignment statements, e.g. x = y, x += y
Python also has expressions, found within certain types of statement:
- If statements have the syntax if TEST: SUITE, where SUITE is a set of statements nested with whitespace and TEST is a single comparison expression, or a series of them combined using and, or and not
- For statements have the syntax for EXPRESSION_LIST in TESTS: SUITE
- Delete statements, del TARGET_LIST
- With statements, with TEST as EXPRESSION: SUITE
- List and dictionary comprehensions
What you can’t do is put statements into expressions, because statements don’t return anything. So if x = y: doesn’t work:
>>> x = 1
>>> y = 2
>>> if x=y:
  File "<stdin>", line 1
    if x=y:
        ^
SyntaxError: invalid syntax
PEP 572 proposes to change this with a new := operator and a new syntax for assignment expressions.
Take this example: you have a list of products and you want to calculate the shipping total. Currently in Python, if you use an if clause within a list comprehension, you can’t assign the result of an expression to a name within that comprehension. With this new syntax, you can. The important part of the example is the creation of a new name, cost, within a list comprehension, as the result of the to_usd function call.
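The original code sample hasn’t survived in this copy, so here is my reconstruction of that shipping example using the := syntax (products, to_usd and the exchange rate are all made up for illustration):

```python
def to_usd(amount_aud):
    """Hypothetical currency conversion at a fixed, made-up rate."""
    return round(amount_aud * 0.65, 2)

products = [
    {"name": "widget", "shipping_aud": 10.0},
    {"name": "gadget", "shipping_aud": 20.0},
    {"name": "gizmo", "shipping_aud": 0.0},  # free shipping
]

# The := operator binds `cost` inside the comprehension, so the result of
# to_usd() is computed once and reused in the same expression
costs = [
    cost
    for p in products
    if (cost := to_usd(p["shipping_aud"])) > 0
]

shipping_total = sum(costs)
print(shipping_total)  # 19.5
```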
Want to play with this new syntax?
I built a branch with the assignment expression grammar and the None-value coalescence operators grammar: https://github.com/tonybaloney/cpython/tree/python38_wonderland
Changes to Built-in function classes
“PEP576 proposes to extend the classes for built-in functions and methods to be more like Python functions. Specifically, built-in functions and methods will gain access to the module they are declared in, and built-in methods will have access to the class they belong to.”
Why is this needed? Well, if you’re developing a Python module in C (as Cython does) and want to implement functions in C, you have 2 choices:
- Use the built-in CPython function classes
- Build your own versions of the function classes, which is generally a bad idea
This proposal adds 2 changes to the C API:
- A new function, PyBuiltinFunction_New(PyMethodDef *ml, PyObject *module), is added to create built-in functions
- The PyCFunction_New() constructors are deprecated and will return a PyBuiltinFunction where possible
The PEP also proposes a new builtin class for these functions.
Extending the API for C extension methods
A related PEP, 573, looks to extend the API available to extension methods written in C (a @classmethod, for example, is typically defined in pure Python but can also be written in C), enabling them to see the state of their module without having to call an expensive PyState_FindModule operation. Again, the implementation is mostly useful to Cython; mileage may vary.
Another related PEP, 580, concerns the built-in function and method classes. None of these classes is subclassable, so any optimisations based on assumptions about method, builtins, etc. cannot be extended to user-defined types. PEP 580 proposes that the special-casing of method_descriptor and friends is replaced with a new “C Call” protocol.
Use of this new protocol means that user-developed extension types would get the same optimisation benefits as builtins, such as the new 20% faster LOAD_METHOD opcode added in Python 3.7.
Python Runtime Audit Hooks
PEP 578 proposes adding hooks into the CPython runtime. These hooks will enable developers of:
- Security Software
- Debugging Software
- Profiling Software, and likely some other examples I can’t think of
to “hook” into core runtime events and execute extra code. The API will be added to the
sys module, with the ability both to raise audit events and to add your own hooks. Once added, hooks cannot be removed or replaced.
# Add an auditing hook
sys.addaudithook(hook: Callable[[str, tuple], None])

# Raise an event with all auditing hooks
sys.audit(str, *args)
Example and proposed events include exec, import, compile and object.__setattr__. The PEP makes some recommendations on basic low-level hooks such as the execution of code objects, but also higher-level hooks such as the opening of network sockets and calling of URLs.
I can hear the eyebrows of security buffs everywhere raising at that last statement. I am a huge fan of this PEP, as it could lead to some excellent third-party plugins for CPython to lock down execution environments, similar to what SELinux does for the Linux kernel.
Hooks will implement responses to the event, typical responses will be to log the event, abort the operation with an exception, or to immediately terminate the process with an operating system exit call.
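As a minimal sketch of what a hook might look like (assuming the PEP’s proposed sys.addaudithook and sys.audit APIs, which landed in Python 3.8; the demo.open_url event name is made up):

```python
import sys

seen = []

def audit_hook(event, args):
    # Log only the events we care about; a real hook might instead raise
    # RuntimeError here to abort the audited operation
    if event.startswith("demo."):
        seen.append((event, args))

# By design (per PEP 578), a hook cannot be removed once added
sys.addaudithook(audit_hook)

# Raise a custom event; args arrive at the hook as a tuple
sys.audit("demo.open_url", "https://example.com")
print(seen)  # [('demo.open_url', ('https://example.com',))]
```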
Here are some example uses I could think of
- Detect monkey-patching of core objects and functions
- Disable/Log opening of network sockets by default for all non-root users
- Trap/Proxy opening of remote URL connections
- Detect import operations to capture the import-tree of the runtime and its tests.