Providing attribute access to Python dictionary entries

Acknowledgment

Thanks to Asher Sterkin, BlackSwan Technologies SVP of Engineering, who proposed the idea for this project and provided invaluable guidance on its development. The project is an offshoot of BlackSwan's work developing a Cloud AI Operating System, or CAIOS, which is intended to provide 10x productivity improvements when coding for cloud/serverless environments.

Problem Statement

JavaScript has advantages over native Python when it comes to accessing attribute values in a dictionary object. In this article, we will demonstrate how to achieve the same level of usability and performance in Python as with JavaScript.

JavaScript Dictionary Access

With JavaScript, key/value pairs can be accessed directly from a dictionary object either through the indexer or as a property of the object.

    var dict = {FirstName: "Chris", "one": 1, 1: "some value"};
    // using indexer
    var name = dict["FirstName"];
    // as property
    var name = dict.FirstName;

In other words, in JavaScript, one can use dict.x, dict['x'], or dict[y] where y='x' interchangeably.

Python Dictionary

Even though Python supports the obj.attr notation for object attributes, that notation does not work for dictionary entries. A dictionary value can be retrieved only in the following ways:

    dict = {"Name": "Chris", "Age": 25}
    dict['Name']                # literal key
    dict[x]                     # where x = 'Name'
    dict.get('Name', default)   # or dict.get(x, default)

Web API and Configuration Files

In Python, almost all external documents end up converted into dictionaries: JSON/YAML configuration files, messages exchanged via Web API, and AWS Lambda events. XML is sometimes used as well.

AWS SDK

Our team often has to work with deeply nested structures, such as data coming from the AWS SDK or passed as the event parameter of the Lambda function handler.

    {
        'Buckets': [{
            'Name': 'string',
            'CreationDate': datetime(2015, 1, 1)
        }],
        'Owner': {
            'DisplayName': 'string',
            'ID': 'string'
        }
    }

Code Write/Read Speed Optimization

The problem is work efficiency. JavaScript's dot notation needs one character per access where Python's index notation needs four (two brackets and two quotes), which is a 75% reduction in the writing and reading overhead of each access.

Attribute Access

In order to provide non-trivial access to attributes in Python, one has to implement two magic methods: __getattr__ and __setattr__.

Based on the discussion above, we need to extend the behavior of the existing dict class with these two magic methods. The adapter design pattern accomplishes this task. There are two options to consider: Object Adapter or Class Adapter.

Evaluating Object Adapter

Applying the Object Adapter design pattern means wrapping the original dict object with an external one and implementing the required magic methods.

Python collections.abc

One possibility is to implement the Mapping and MutableMapping abstractions from the collections.abc module, then to add the __getattr__ and __setattr__ magic methods to them. Indeed, that was how the initial version of jdict was implemented.

This method turned out to be heavyweight and inefficient:

- It required reproducing all the methods of the Python dictionary.
- It behaved even worse when we needed to deal with deeply nested data structures.

To learn how we finally addressed nested structures, see the JSON hook and botocore patch sections below.

UserDict

UserDict is another possible form of Object Adapter for a Python dictionary. In this case, it comes from the Python standard library.

Using this option does not offer any significant advantage, since:

- After Python 2.2, it's possible to inherit directly from the built-in dict class.
- We would still have to implement the attribute magic methods ourselves.
- It incurs the overhead of regular __getitem__ and __setitem__ operations.
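To make the cost of the Object Adapter concrete, here is a minimal sketch of a wrapper built on collections.abc.MutableMapping. The class name AttrMapping and its _data field are hypothetical and chosen only for illustration; this is not the initial jdict implementation, which is not shown in this article.

```python
from collections.abc import MutableMapping
from typing import Any, Iterator, Optional


class AttrMapping(MutableMapping):
    """Hypothetical Object Adapter: wraps a dict instead of inheriting from it."""

    def __init__(self, data: Optional[dict] = None) -> None:
        # Bypass our own __setattr__ so the wrapped dict is stored on the instance.
        object.__setattr__(self, "_data", {} if data is None else data)

    # The five abstract methods that MutableMapping requires.
    def __getitem__(self, key: Any) -> Any:
        return self._data[key]

    def __setitem__(self, key: Any, value: Any) -> None:
        self._data[key] = value

    def __delitem__(self, key: Any) -> None:
        del self._data[key]

    def __iter__(self) -> Iterator:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)

    # Attribute access layered on top of the mapping protocol.
    def __getattr__(self, name: str) -> Any:
        if name == "_data":
            # Guard against infinite recursion if _data was never set.
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name: str, value: Any) -> None:
        self._data[name] = value


m = AttrMapping({"Name": "Chris"})
m.Age = 25
print(m.Name, m["Age"])        # Chris 25
print(isinstance(m, dict))     # False: the wrapper is not a dict
m.Owner = {"ID": "string"}
print(type(m.Owner))           # <class 'dict'>: nested values stay plain dicts
```

Even in this reduced form, every mapping operation has to be forwarded by hand, the wrapper is no longer a dict for code that checks isinstance, and nested dictionaries remain untouched.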
Named Tuples

Another idea was to make the dictionary behave like named tuples, which support attribute-level access.

This approach also turned out to be ineffective:

- It created a complete copy of the original dictionary and thus was impractical from a performance point of view.
- It did not solve the nested data structure problem.

Jdict Class Adapter

After completing our research, we came to the conclusion that applying the Class Adapter design pattern has the best potential. The class adapter uses inheritance: it can only extend the base class and supply additional functionality to it.

This is how our Class Adapter code looks:

    from typing import Any, Union
    from copy import deepcopy
    import json

    class jdict(dict):
        """
        The class gives access to the dictionary through the attribute name.
        """
        def __getattr__(self, name: str) -> Union[Any]:
            # Called only when normal attribute lookup fails.
            try:
                return self.__getitem__(name)
            except KeyError:
                raise AttributeError(name + ' not in dict')

        def __setattr__(self, key: str, value: Any) -> None:
            # Attribute assignment is stored as a dictionary entry.
            self.__setitem__(key, value)

__deepcopy__

        def __deepcopy__(self, memo):
            return jdict((k, deepcopy(v, memo)) for k, v in self.items())

We also added the __deepcopy__ method to the adapter. Without this magic method, calling deepcopy() on a jdict object will produce a dict object, thus losing the advantage of attribute-level access.

    from caios.jdict import jdict
    import copy

    py_dict = dict(a = [1, 2, 3], b = 7)
    j_dict = jdict(a = [1, 2, 3], b = 7)
    py_copy = copy.deepcopy(py_dict)
    j_copy = copy.deepcopy(j_dict)
    print(type(py_copy))
    <class 'dict'>
    print(type(j_copy))
    <class 'caios.jdict.jdict.jdict'>

Dealing with nested data structures

While applying the Class Adapter design pattern turned out to be the optimal starting point, it still left open the question of how to deal with nested data structures. In other words, what should be done when a jdict contains another, plain dict?
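The nested-structure problem can be reproduced in a few lines. The sketch below uses a reduced stand-in for the jdict class above (without __deepcopy__) so that it runs on its own; the response data mirrors the AWS-style structure shown earlier.

```python
class jdict(dict):
    """Reduced stand-in for the Class Adapter described above."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name + ' not in dict')

    def __setattr__(self, key, value):
        self[key] = value


response = jdict(Owner={'DisplayName': 'string', 'ID': 'string'})

# The outer level supports attribute access...
owner = response.Owner
print(type(owner))              # <class 'dict'>

# ...but the nested value is still a plain dict, so the chain breaks here.
try:
    owner.ID
except AttributeError as error:
    print('nested access failed:', error)

# Wrapping each level by hand works, but defeats the purpose.
print(jdict(owner).ID)          # string
```

Only the object that was explicitly constructed as a jdict gains attribute access; everything created inside it by other code stays a regular dictionary.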
In order to solve this problem, we need to consider two cases separately: JSON object deserialization, and explicit creation of a dict somewhere in the underlying SDK.

JSON Decoding

When working with data that we receive from external sources in JSON format, Python's json module applies a set of default translations when decoding; in particular, every JSON object becomes a Python dict.

An object_pairs_hook, if specified, will be called with the result of every JSON object decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders. If object_hook is also defined, the object_pairs_hook takes priority.

Thus, we utilize this hook in order to create a jdict instead of a dict during JSON decoding. This approach covers 80% of the cases we have to deal with in practice.

Botocore Patch

The object pairs hook mentioned above, however, does not help with the Boto3 SDK. The reason is that AWS service APIs return XML instead of JSON, and the results are parsed by the BaseXMLResponseParser, which creates and populates the dict object directly.

Structure of Python

Since in this case the JSON hook does not help, we need to look at automatically rewriting the library's Python code. To understand how this can be done, let's look at the full path of a program from source code to execution: the source code is parsed into an abstract syntax tree, the tree is compiled to bytecode, and the bytecode is executed by the interpreter.

Abstract Syntax Tree (AST)

Given this path from source code to execution, the place to intervene is the AST. By traversing the AST, we replace every regular dictionary with a jdict. Thus, the Boto3 SDK will return a jdict, as required.

Below is the code of the class that walks through the abstract syntax tree and changes the Python dictionary to jdict.
    import ast
    from typing import Any

    class jdictTransformer(ast.NodeTransformer):
        """
        The visitor class of the node that traverses the abstract syntax tree
        and calls the visitor function for each node found.
        Inherits from class NodeTransformer.
        """
        def visit_Module(self, node: Any) -> Any:
            node = self.generic_visit(node)
            # Prepend "from caios.jdict.jdict import jdict" to the module body.
            import_node = ast.ImportFrom(module='caios.jdict.jdict', names=[ast.alias(name='jdict')], level=0)
            node.body.insert(0, import_node)
            return node

        def visit_Dict(self, node: Any) -> Any:
            node = self.generic_visit(node)
            # Wrap every dict literal {...} in a call: jdict({...}).
            name_node = ast.Name(id='jdict', ctx=ast.Load())
            new_node = ast.Call(func=name_node, args=[node], keywords=[])
            return new_node

Patch Module

Using the AST, we created a patch for the botocore module that converts XML to jdict at runtime:

    import sys

    def patch_module(module: str) -> None:
        parsers = sys.modules[module]
        filename = parsers.__dict__['__file__']
        src = open(filename).read()
        # transform() is defined elsewhere in the project and is not shown here.
        inlined = transform(src)
        code = compile(inlined, filename, 'exec')
        # Re-execute the rewritten code inside the already imported module.
        exec(code, vars(parsers))

In this case, we are patching the botocore parsers file.

    import boto3
    import caios.jdict

    caios.jdict.patch_module('botocore.parsers')

Limitations of the method

There are several limitations to the method above:

- Each jdict instance actually stores two dictionaries: the inherited one and another one in __dict__.
- If a dictionary key is not a valid Python name, attribute-based access won't work. Consider, for example, dict.created-at. It has to be written either as dict['created-at'] or as dict.created_at (which would require a schema change).
- Another limitation is encountered when a field name is the value of another variable. One can write dict[x] but not dict.x, because dict.x means dict['x'], not the value of the x variable.
- If a dictionary contains a key that matches the name of a dict class method (e.g. keys), then accessing it via the dot notation will return the dict method, while accessing it via __getitem__ will return the dictionary value. In other words, d.keys will not be equal to d['keys'].
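The limitations listed above can be observed directly. The sketch below again uses a reduced stand-in for jdict so that it is self-contained; the sample keys are made up for illustration.

```python
class jdict(dict):
    """Reduced stand-in for the Class Adapter described above."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name + ' not in dict')

    def __setattr__(self, key, value):
        self[key] = value


d = jdict({'created-at': '2015-01-01', 'keys': ['a', 'b'], 'x': 1})

# 1. Attribute writes are redirected into the mapping, so the per-instance
#    __dict__ exists alongside the inherited storage but stays empty.
d.y = 2
print(vars(d))            # {}
print(d['y'])             # 2

# 2. 'created-at' is not a valid identifier: d.created-at would be parsed
#    as (d.created) - at, so the indexer is the only readable option.
print(d['created-at'])    # 2015-01-01

# 3. When the field name is held in a variable, use the indexer;
#    d.field would look up the literal key 'field'.
field = 'x'
print(d[field])           # 1

# 4. A key that collides with a dict method is shadowed by that method,
#    because __getattr__ runs only when normal lookup fails.
print(callable(d.keys))   # True
print(d['keys'])          # ['a', 'b']
```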
To Be Pursued

At the moment, our program does not handle YAML configuration files, since we do not need them yet. It also does not support CSV and tables. We are currently developing a program that will work with AWS tables.

Third Party Libraries

While working on this project, we did not discover any suitable third-party libraries to utilize. At the time of final writing for this article, I did, in fact, encounter several possibilities, namely:

- Attrdict
- Attributedict
- pyATS

In our project, we conceivably could use any of these options. All three are based on the idea of creating an adapter and overriding the dictionary functions in it. In addition, some of them add functionality that is not required for our work.

Conclusions

- This effort demonstrates that dictionary objects can be managed nearly as efficiently in Python as in JavaScript. Our team believes it will noticeably increase productivity.
- This work simplified working with Python dictionaries. Now, we do not need to "break fingers" by typing endless sequences of brackets and quotes.
- The JSON hook solved the nested data structures problem for JSON decoding. This covers 80% of the cases we encounter in practice.
- The botocore patch solved the problem of results coming from the Boto3 SDK, which parses AWS service API results arriving in XML and builds dict objects on the spot. If required, the same patch could be applied to other libraries.
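As a closing illustration of the JSON hook summarized in the conclusions, here is a minimal sketch that passes a jdict class as object_pairs_hook to the standard json.loads. The helper name loads and the sample document are assumptions made for this example, and the jdict class is a reduced stand-in rather than the full caios implementation.

```python
import json


class jdict(dict):
    """Reduced stand-in for the Class Adapter described in this article."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name + ' not in dict')

    def __setattr__(self, key, value):
        self[key] = value


def loads(text: str):
    """Hypothetical helper: decode JSON so that every object becomes a jdict."""
    # The hook receives a list of (key, value) pairs for each decoded object,
    # which the dict constructor accepts directly.
    return json.loads(text, object_pairs_hook=jdict)


doc = loads('{"Owner": {"DisplayName": "string", "ID": "string"},'
            ' "Buckets": [{"Name": "string"}]}')

print(doc.Owner.ID)            # string
print(doc.Buckets[0].Name)     # string
print(type(doc.Owner))         # <class '__main__.jdict'> when run as a script
```

Because the decoder calls the hook for every JSON object, including those nested inside arrays, attribute access works at every level without any post-processing.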