The open-source ideology serves as a powerful mechanism not only for uniting individuals with similar interests but also for exchanging experiences and ideas, embodying the principle that sharing is indeed caring. Open source is a great example of how a community of enthusiasts can change the world. Its Linux usage statistics alone are highly impressive, and we can claim that most of the Internet runs on Linux's horsepower.

"Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
Linus Torvalds

Managing an open-source project carries significant weight, particularly as its popularity grows: it puts the spotlight on technical expertise and architectural choices. Popular projects can be a treasure trove of knowledge for anyone who wants to boost their skills. Let's delve into a few notable projects.

Why focus on open source? First of all, it embodies top-tier coding standards, integrating best practices championed by industry leaders. My firm belief is that genuine programmers are fueled by passion and curiosity, so why not learn something new and integrate it into our daily practice? Secondly, hacking on real code can take your skills to a whole new level if you are consistent and tenacious. Last but not least, it can help you jump right into a project you like: find new buddies, brand yourself as a skilled developer, and land the job of your dreams. Help the community solve its problems, and the reward will not keep you waiting.

Grab your cup of coffee, and let's get straight to work.

Linux Kernel

The very heart of the Linux system is its kernel. Indeed, this is not merely a singular project but rather a constellation of diverse projects unified by a central idea: tying together different pieces of hardware and providing a sustainable, secure API for end users. The Linux codebase is huge and contains a lot of vendor-specific code, but it is well organized and structured.
I strongly recommend that everyone interested read this. It might be a bit outdated, but it will foster a solid understanding of how the kernel internals work. The kernel version I have on the table is 6.3.

Radix tree for memory management

At the very beginning of main.c we can spot calls to functions like radix_tree_init. It is responsible for initializing the data structure used in memory management.

In the codebase, the radix_tree_root struct is an alias for the XArray struct. XArray holds a lock (a spinlock), xa_flags (the gfp_t value that tells how memory allocations will be performed), and the pointer to xa_head:

```c
struct xarray {
    spinlock_t  xa_lock;
    /* private: The rest of the data structure is not to be used directly. */
    gfp_t       xa_flags;
    void __rcu *xa_head;
};

/* Simplified: in the actual kernel, gfp_t is a bitwise unsigned int
 * and these allocation modes are macros rather than enumerators. */
typedef enum {
    GFP_KERNEL,
    GFP_ATOMIC,
    __GFP_HIGHMEM,
    __GFP_HIGH
} gfp_t;
```

If all of the entries in the array are NULL, xa_head is a NULL pointer. If the only non-NULL entry in the array is at index 0, xa_head is that entry. If any other entry in the array is non-NULL, xa_head points to an xa_node, the internal struct that handles offset, counts, slots, and other fields.

```c
struct xa_node {
    unsigned char  shift;             /* Bits remaining in each slot */
    unsigned char  offset;            /* Slot offset in parent */
    unsigned char  count;             /* Total entry count */
    unsigned char  nr_values;         /* Value entry count */
    struct xa_node __rcu *parent;     /* NULL at top of tree */
    struct xarray  *array;            /* The array we belong to */
    union {
        struct list_head private_list; /* For tree user */
        struct rcu_head  rcu_head;     /* Used when freeing node */
    };
    void __rcu *slots[XA_CHUNK_SIZE];
    union {
        unsigned long tags[XA_MAX_MARKS][XA_MARK_LONGS];
        unsigned long marks[XA_MAX_MARKS][XA_MARK_LONGS];
    };
};
```

The radix tree data structure contains "nodes" that link with each other based on an integer key (the index). Each node contains several slots, each of which can hold a pointer.
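To make the "nodes with slots" layout tangible, here is a heavily simplified, self-contained sketch in C. Everything in it (the names, the fixed two-level height covering 12-bit indices, the 6-bit chunks) is illustrative only; the kernel's real implementation handles arbitrary heights, RCU, locking, and tags.

```c
#include <stdlib.h>

/* Toy radix tree: 64 slots per node (6 bits of the key per level),
 * so the tree stays wide and shallow. For simplicity the height is
 * fixed at two levels, covering indices 0..4095. */
#define CHUNK_SHIFT 6
#define CHUNK_SIZE  (1UL << CHUNK_SHIFT)   /* 64 slots per node */
#define CHUNK_MASK  (CHUNK_SIZE - 1)

struct toy_node {
    void *slots[CHUNK_SIZE];   /* child nodes, or items at leaf level */
};

struct toy_tree {
    struct toy_node *root;
};

static struct toy_node *node_alloc(void)
{
    return calloc(1, sizeof(struct toy_node));
}

/* Store `item` under `index`, allocating nodes along the path. */
int toy_insert(struct toy_tree *t, unsigned long index, void *item)
{
    struct toy_node *leaf;
    unsigned long top = (index >> CHUNK_SHIFT) & CHUNK_MASK;

    if (!t->root && !(t->root = node_alloc()))
        return -1;                       /* out of memory */
    if (!t->root->slots[top] && !(t->root->slots[top] = node_alloc()))
        return -1;
    leaf = t->root->slots[top];
    leaf->slots[index & CHUNK_MASK] = item;
    return 0;
}

/* Look up the item stored under `index`, or NULL if absent. */
void *toy_lookup(struct toy_tree *t, unsigned long index)
{
    struct toy_node *leaf;

    if (!t->root)
        return NULL;
    leaf = t->root->slots[(index >> CHUNK_SHIFT) & CHUNK_MASK];
    if (!leaf)
        return NULL;
    return leaf->slots[index & CHUNK_MASK];
}
```

The point to notice: a lookup costs one array indexing per level, and because each level consumes several bits of the key, very few levels are needed even for large indices.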
The main idea here is to keep the tree wide, not deep; that makes for fast addressing. When the radix_tree_insert function is called by a user (it can be a device driver), the following actions occur:

- __radix_tree_create creates a slot in the tree if necessary, based on the passed index, which is checked against the size of the tree; it raises an error if there is no available memory;
- insert_entries passes the item to the slot;
- some asserts check that offsets are applied correctly to the data structure.

Whenever a user of the tree wants to look up a specific page, they look it up by index. The lookup phase is quite efficient because of the broad nature of the structure; it is driven by the key computed on the user side. Take a look at the article for more insight into how it works.

Users of this data structure include GPU and network device drivers, which maintain memory page pointers in a structured format rather than juggling them. It's like having a pocket dimension! This is where Linus's earlier quote comes into play: consider the essence of the data you handle, and aim for efficiency, compactness, and cleanliness in its usage, regardless of the language or project you're involved in.

In 2021, Liam Howlett (no, another Liam) and Matthew Wilcox introduced the maple tree data structure. I encourage you to read more about the intentions, motivation, and implementation details here and here.

Kubernetes

Let's turn our attention to architectural design. Kubernetes is one of the most famous orchestrators, invented by Google and spread around the world at rocket pace. There are a lot of fairly new resources out there covering the internal details; for example, here is a repository that you can start exploring as you delve deeper. The whole point of this article is to get useful insights that answer not just "how" but "why".
Understanding the motivation is crucial, in my opinion; that is how we can "catch the mindset" and re-implement all those cool things in our daily jobs.

Let's briefly recall what Kubernetes is. It is a system that manages containers within applications and provides orchestration features such as scaling and high availability. Kubernetes' main components are the API server, the scheduler, etcd (the key-value storage), the controller service, and the agent service, "kubelet", which manages containers on each node. The guys from Google developed it to utilize the compute resources of their cloud as efficiently as possible. Their main intention was a "great container management system".

Architectural highlights

The very first thing I'd like to highlight is the distributed nature of execution. To synchronize the states of all the user's entities, the orchestrator uses etcd, a key-value store that keeps information in a tree data structure. Using a single source of truth to manage state shared between multiple entities is a common pattern, with one slight difference here: the storage is also used to "instruct" all of the components about the final cluster state. So the whole system uses the storage as a source of truth to keep maintaining the desired state of the cluster.

One more nice architectural decision is the CRI, the container runtime interface. It allows Kubernetes to work not just with Docker but with many other runtimes (CRI-O, containerd) by defining an interaction protocol, just like one from an RFC. The kubelet agent uses gRPC to connect to a runtime, so anyone who would like to implement their own runtime should implement that exact interface on their side. The concept aims to loosen the coupling between the kubelet agent and the runtime that manages containers.

At the end, let's briefly look at the "sidecar" pattern, which was pioneered by third-party services like Istio and has recently been implemented within the core Kubernetes functionality.
The whole idea is to embed a helper service into the main application's pod to handle specific concerns such as proxying or load balancing. Kubernetes manages this container as a distinct entity within the pod: the container can influence the entire pod status; it can be restarted and updated separately from the main application; and it shares resources with the main app. The main point here is that the main application may know nothing about the sidecar, while the behavior of the entire service can be changed. Bless open source: we can trace how new functionality was implemented by the community.

PostgreSQL

PostgreSQL is not just a database but a whole platform. After a client connects to the server, a new backend process is established. A query goes through many stages while being executed: parsing, the rewriting system, planning, and execution.

The PSQL server has a bunch of interfaces that allow developers to interact with different internal components. For example, we can write our own scan provider, aimed at certain optimizations at each stage, or create an index add-on. Even though the overall process requires specific knowledge and the documentation can be a bit knotty, the loosely coupled architecture empowers developers to vastly expand functionality. There are many extensions that we can use within the platform, and the "bloom index" realization is one of them.

Bloom index

I faced a problem of growing indexes back in the day. The task was to implement a custom search filter: every column in the database, and any combination of them, could be a key for a search. Building indexes for each and every column would be a bad decision because of the complexity and size. For example, if we are sure that item==2 is false, there is no reason to check all the other fields. I remembered the Bloom filter: the probabilistic data structure that can return a false-positive result (because of hash collisions) but is quite sure when the data is absent.
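Before looking at the PSQL extension, here is a minimal, self-contained Bloom filter sketch in C. The bit-array size and the two multiplicative hash functions are arbitrary picks for illustration, not what any production implementation uses.

```c
#include <stdint.h>

/* Toy Bloom filter: one bit array plus two hash functions.
 * The "hash functions" are cheap multiplicative hashes chosen
 * purely for illustration. */
#define FILTER_BITS 1024u

struct bloom {
    unsigned char bits[FILTER_BITS / 8];
};

static unsigned hash1(uint64_t key)
{
    return (unsigned)((key * 0x9E3779B97F4A7C15ULL) % FILTER_BITS);
}

static unsigned hash2(uint64_t key)
{
    return (unsigned)((key * 0xC2B2AE3D27D4EB4FULL) % FILTER_BITS);
}

static void set_bit(struct bloom *b, unsigned i)
{
    b->bits[i / 8] |= (unsigned char)(1u << (i % 8));
}

static int get_bit(const struct bloom *b, unsigned i)
{
    return (b->bits[i / 8] >> (i % 8)) & 1;
}

/* Adding an item sets one bit per hash function. */
void bloom_add(struct bloom *b, uint64_t key)
{
    set_bit(b, hash1(key));
    set_bit(b, hash2(key));
}

/* Returns 0 when the key is definitely absent; 1 means "maybe present"
 * (false positives are possible, false negatives are not). */
int bloom_maybe_contains(const struct bloom *b, uint64_t key)
{
    return get_bit(b, hash1(key)) && get_bit(b, hash2(key));
}
```

This asymmetry (a certain "no", a probabilistic "yes") is exactly what makes the structure useful for pruning: a definite miss lets you skip the expensive check entirely.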
As I've mentioned earlier, PSQL has an extension for this data structure. The very first idea that comes to mind is to reduce the dimension of possible variations by removing the query values that we are sure are not in the database.

A Bloom filter in a nutshell: it's just a bit array. Every item is hashed with a certain number of different functions, each of which translates it to some index. That index sets the corresponding bit in the filter's array. In the case of 3 different functions, an item sets 3 different bits. At the lookup phase, the query is also hashed and compared against the bits in the array. If at least one of the bits is not set, we can be sure the item is absent. If all of them are set, it is still possible that the item was never placed there, because each bit we found could have been set by other items: a collision takes place. Check out this great explanation of how it works, with examples.

To start using the bloom index within PSQL, we need to create it properly:

```sql
CREATE INDEX bloom_idx_bar ON foo.bar USING bloom (id, dept_id, zipcode)
    WITH (length=80, col1=4, col2=2, col3=4);
```

Here "length" is the length of the bit array for each row, and the "colN" numbers describe how many bits will be occupied by each column (how many hash functions will be applied). By increasing these numbers we can reduce false-positive hits.

Index creation in PSQL is a bit complicated, and we should understand the internal data structures and protocols, so let's keep the overview short and sweet. The blhandler function exposes the data structure's functions as an IndexAmRoutine pointer: a pointer to the struct that represents the API for index access methods. The blbuild function creates the new index: it allocates memory pages, scans the table for tuples that will be indexed, and accumulates state in the BloomState structure.
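The reloptions above map naturally onto a per-row signature: each indexed row gets its own small bit array of `length` bits, and each column sets `colN` bits in it. The sketch below is hypothetical code, not the extension's internals; sign_value here is a stand-in for the real signing step, shown only to illustrate what the parameters control.

```c
#include <stdint.h>

/* Illustrative per-row signature, sized as in WITH (length=80). */
#define SIG_BITS 80

typedef struct {
    unsigned char bits[(SIG_BITS + 7) / 8];
} signature;

/* Derive the j-th bit position for a column value via double hashing.
 * The constants are arbitrary; this is not PostgreSQL's hashing. */
static unsigned bit_for(uint64_t value, int column, int j)
{
    uint64_t h = value * 0x9E3779B97F4A7C15ULL + (uint64_t)column;
    uint64_t g = value * 0xC2B2AE3D27D4EB4FULL | 1;   /* odd step */
    return (unsigned)((h + (uint64_t)j * g) % SIG_BITS);
}

/* Set `nbits` bits for one column value: WITH (col1=4) means nbits=4. */
void sign_value(signature *sig, uint64_t value, int column, int nbits)
{
    for (int j = 0; j < nbits; j++) {
        unsigned b = bit_for(value, column, j);
        sig->bits[b / 8] |= (unsigned char)(1u << (b % 8));
    }
}

/* A query value can only match a row if every one of its bits is set. */
int sign_matches(const signature *sig, uint64_t value, int column, int nbits)
{
    for (int j = 0; j < nbits; j++) {
        unsigned b = bit_for(value, column, j);
        if (!((sig->bits[b / 8] >> (b % 8)) & 1))
            return 0;                    /* definitely not a match */
    }
    return 1;                            /* maybe a match */
}
```

Raising `length` or `colN` spends more space per row to drive the false-positive rate down, which is exactly the trade-off the CREATE INDEX parameters expose.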
Here is the BloomState struct:

```c
typedef struct BloomState
{
    FmgrInfo     hashFn[INDEX_MAX_KEYS];
    Oid          collations[INDEX_MAX_KEYS];
    BloomOptions opts;           /* copy of options on index's metapage */
    int32        nColumns;

    /*
     * sizeOfBloomTuple is index-specific, and it depends on reloptions, so
     * precompute it
     */
    Size         sizeOfBloomTuple;
} BloomState;
```

When we insert data into the index, we pass it as an array of values along with an array of booleans that marks NULL values, so they can be skipped at the indexing phase. The BloomFormTuple function forms the BloomTuple struct from the data; this is the part where the data gets signed. After that phase, the tuple is placed on a memory page.

```c
BloomTuple *
BloomFormTuple(BloomState *state, ItemPointer iptr, Datum *values, bool *isnull)
{
    int         i;
    BloomTuple *res = (BloomTuple *) palloc0(state->sizeOfBloomTuple);

    res->heapPtr = *iptr;

    /* Blooming each column */
    for (i = 0; i < state->nColumns; i++)
    {
        /* skip nulls */
        if (isnull[i])
            continue;

        signValue(state, res->sign, values[i], i);
    }

    return res;
}
```

At the scan phase, blbeginscan forms the IndexScanDesc struct, which contains the index's data, and blgetbitmap builds the TIDBitmap struct from it (the internal data structure that holds item pointers), reading the index page by page.

I hope the explanation isn't too confusing; PSQL internals are quite specific. The Bloom filter is a pretty common data structure, and PSQL uses the same approach for some internal purposes. It is also broadly used in many different situations: from cache-hitting cases to kernel packet filter implementations.

More to know

In this article, I've just lifted the veil on the shiny open-source world. There are many more great implementations you could meet in the wild: CFS and CFQ, if you'd like to catch the magic of "fair" resource allocation; operational transform and CRDT, to understand simultaneous editing in Google Docs.
The whole software world is in motion: sidecars and operators are claimed to be the new model for software delivery; eBPF is in a hurry to change the service mesh, as this post argues.

Take a look at the elegance of great things' internals.