Hello everyone!
I recently became interested in the Rust programming language and began learning its basics. While exploring Rust, I came across open-source projects written in the language. I even made a pull request to the rust-analyzer project, drawing on my knowledge of compilers and static analysis. I enjoy learning by creating projects based on my interests and ideas.
While reading the book "Building Git", I learned what each file inside the .git folder does and how Git stores commits, branches, and other data, effectively managing its own database. This gave me a new and exciting idea:
What if we could develop a query language that operates on these files?
I decided to implement this query language and named it GQL. I was very excited to start this project because it was my first time implementing a query language. I decided to implement it from scratch rather than converting the .git files into an SQLite database and running normal SQL queries. I also thought about how cool it would be if, in the future, I could use the GQL engine as part of a Git client or analyzer.
The goal is to implement it in two parts.
The first part converts the GQL query into an Abstract Syntax Tree (AST) composed of nodes. This AST is then passed to the engine, which walks through the nodes and executes them as an interpreter. In the future, the interpreter could potentially be replaced by a virtual machine that executes GQL bytecode instructions.
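To make the two-part idea more concrete, here is a minimal sketch of it. This is not the actual GQL code: the names Statement, Row, parse, and execute are made up for illustration, the parser only understands `select <fields> from <table> [limit N] [offset N]`, and the engine reads from in-memory tables instead of a real .git folder.

```rust
use std::collections::HashMap;

// Hypothetical row type: one record with string keys and string values.
type Row = HashMap<String, String>;

// Hypothetical AST nodes for a tiny subset of the language.
#[derive(Debug, PartialEq)]
enum Statement {
    Select { fields: Vec<String>, table: String },
    Limit(usize),
    Offset(usize),
}

// Part one: turn the query text into a list of AST nodes.
fn parse(query: &str) -> Result<Vec<Statement>, String> {
    let tokens: Vec<&str> = query.split_whitespace().collect();
    if tokens.is_empty() || !tokens[0].eq_ignore_ascii_case("select") {
        return Err("expected `select`".to_string());
    }
    let from = tokens
        .iter()
        .position(|t| t.eq_ignore_ascii_case("from"))
        .ok_or("expected `from`")?;
    // Field names may be separated by commas, spaces, or both.
    let fields: Vec<String> = tokens[1..from]
        .iter()
        .flat_map(|t| t.split(','))
        .filter(|f| !f.is_empty())
        .map(str::to_string)
        .collect();
    let table = tokens.get(from + 1).ok_or("expected a table name")?.to_string();
    let mut statements = vec![Statement::Select { fields, table }];

    // The remaining tokens come in `keyword number` pairs.
    for pair in tokens[from + 2..].chunks(2) {
        let number = pair
            .get(1)
            .and_then(|n| n.parse::<usize>().ok())
            .ok_or_else(|| format!("expected a number after `{}`", pair[0]))?;
        match pair[0].to_lowercase().as_str() {
            "limit" => statements.push(Statement::Limit(number)),
            "offset" => statements.push(Statement::Offset(number)),
            other => return Err(format!("unexpected token `{}`", other)),
        }
    }
    Ok(statements)
}

// Part two: walk the AST nodes and evaluate them against in-memory tables.
fn execute(statements: &[Statement], tables: &HashMap<String, Vec<Row>>) -> Vec<Row> {
    let (mut fields, mut table) = (Vec::new(), String::new());
    let (mut limit, mut offset) = (usize::MAX, 0);
    for statement in statements {
        match statement {
            Statement::Select { fields: f, table: t } => {
                fields = f.clone();
                table = t.clone();
            }
            Statement::Limit(n) => limit = *n,
            Statement::Offset(n) => offset = *n,
        }
    }
    let rows = match tables.get(&table) {
        Some(rows) => rows,
        None => return Vec::new(), // unknown table: nothing to select
    };
    let select_all = fields.iter().any(|f| f == "*");
    rows.iter()
        .skip(offset) // offset is applied before limit, whatever order they were written in
        .take(limit)
        .map(|row| {
            // Keep only the requested attributes of each row.
            row.iter()
                .filter(|(key, _)| select_all || fields.contains(*key))
                .map(|(key, value)| (key.clone(), value.clone()))
                .collect()
        })
        .collect()
}
```

With a "commits" table holding two rows, parsing `select name from commits limit 1 offset 1` gives three nodes, and executing them returns only the second row with only its name attribute. Keeping the parser and the engine separate like this is what would later allow swapping the tree-walking step for something else without touching the parser.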
The engine works with the .git files through git2, the Rust bindings for the libgit2 library, so it can perform select, update, and delete operations. It can also store the selected data in a data structure to facilitate filtering or sorting.
To simplify this implementation, I created a struct called GQLObject that can represent a commit, branch, tag, or any other object in this engine. It also makes it easy to implement sorting, searching, and filtering once, with functions that operate on this single type.
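use std::collections::HashMap;

// A generic record: attribute names mapped to their values as strings.
pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}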
The GQLObject is essentially a map from string keys to string values, where each entry is one attribute of the object. This design keeps the storage flexible and general enough for any type of data. With this structure in place, implementing features such as comparisons, filtering, or sorting becomes much simpler, because we can operate directly on the values within the map.
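As a rough illustration of why this helps, here is a small sketch of filtering and sorting written once for GQLObject. The helpers object, filter_by, and sort_by_field are hypothetical names I am using for this example, not functions from the GQL code base, and values are compared as plain strings.

```rust
use std::collections::HashMap;

pub struct GQLObject {
    pub attributes: HashMap<String, String>,
}

// Hypothetical convenience constructor used only in this example.
pub fn object(pairs: &[(&str, &str)]) -> GQLObject {
    GQLObject {
        attributes: pairs
            .iter()
            .map(|(key, value)| (key.to_string(), value.to_string()))
            .collect(),
    }
}

// Keep the objects whose `field` exists and satisfies `predicate`.
// One function covers conditions such as equality or "contains".
pub fn filter_by<F>(objects: Vec<GQLObject>, field: &str, predicate: F) -> Vec<GQLObject>
where
    F: Fn(&str) -> bool,
{
    objects
        .into_iter()
        .filter(|o| o.attributes.get(field).map_or(false, |v| predicate(v.as_str())))
        .collect()
}

// Sort in place by the string value of `field`.
// Objects that lack the field sort first, because None orders before Some.
pub fn sort_by_field(objects: &mut [GQLObject], field: &str) {
    objects.sort_by(|a, b| a.attributes.get(field).cmp(&b.attributes.get(field)));
}
```

Because commits, branches, and tags all share the same type, a call like `filter_by(tags, "name", |v| v.contains("2023"))` works the same way as `filter_by(branches, "ishead", |v| v == "true")`. The trade-off in this sketch is that everything is a string, so numeric or date ordering would need extra handling.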
Over the last week, I implemented the select feature with conditions, filtering, and sorting. I also added optional limit and offset, so you can write queries like these:
select * from commits
select name, email, title, message, time from commits
select * from commits order by name limit 1
select * from commits order by name limit 1 offset 1
select * from branches where ishead = "true"
select * from tags where name contains = "2023"
The next step is to optimize the code and start supporting more features. For example, imagine a query that deletes all branches except master:
delete * from branches where name ! "master"
Or pushing all or some branches to a remote repository with a single query. Maybe even grouping commits to see how many each user made this month, and a lot of other things.
The GQL project is free and open source, so everyone is welcome to contribute, suggest features, or report bugs.
GQL on GitHub: https://github.com/amrdeveloper/gql/
I am looking forward to your opinion and feedback 😋.
I hope you enjoyed my article!
Enjoy Programming 😋.