I follow some very bright people on Linkedin and Twitter (about the only reason I’m on Twitter), and I learn about interesting tech and interesting use cases by random chance. The latest is , and as an added benefit, the example is going to include . The intro paragraph from the Malloy GitHub page says: Malloy DuckDB “ ” Malloy is an experimental language for describing data relationships and transformations. It is both a semantic modeling language and a querying language that runs queries against a relational database. Malloy currently connects to BigQuery and Postgres, and natively supports DuckDB. We've built a Visual Studio Code extension to facilitate building Malloy data models, querying and transforming data, and creating simple visualizations and dashboards. Malloy was developed by the creator of , which Google bought back in 2019. Malloy is meant to be a more evolved and open-source version of Looker essentially. Looker has some clever ideas in it obviously, but I’ve also seen it referred to as: Lloyd Tabb Looker " " feels like JSON made a baby with SQL, in the worse possible way To be clear, I know Looker, but I haven’t done any real work with it. So that brings us to Malloy. about What Is Malloy? Lloyd describes Malloy on the as GitHub page “ .” also stating: “ ” an experimental language for describing data relationships and transformations. It is both a semantic modeling language and a querying language that runs queries against a relational database SQL is complete but ugly…Everything is expressible, but nothing is reusable; simple ideas are complex to express; the language is verbose and lacks smart defaults. Malloy is immediately understandable by SQL users, and far easier to use and learn. Malloy natively supports , which I find exciting, and it can currently connect to and . There is a Visual Studio Code extension for Malloy, as well as npm modules for javascript and typescript, in addition to a . Malloy can (from the docs): DuckDB BigQuery Postgres Composer Queries compile to SQL, optimized for your database. Has both a semantic data model and a query language. The semantic model contains reusable calculations and definitions, making queries short and readable. Excels at reading and writing nested data sets. Things that are complicated in SQL are simple to express in Malloy. For example, level of detail calculations, percent of the total, aggregating against multiple tables across a join safely, date operations, reasonable ordering by default, and more. First, a couple of examples taken from the docs so you can see what a Malloy description looks like and the SQL it generates: query: table('malloy-data.faa.airports') -> {
  group_by: fac_type
  aggregate: airport_count is count()
  where: state = 'CA'
  order_by: airport_count desc
} SELECT
   base.fac_type as fac_type,
   COUNT( 1) as airport_count
FROM `malloy-data.faa.airports` as base
WHERE base.state='CA'
GROUP BY 1
ORDER BY 2 desc This will translate to this SQL: And from the dataset, it will produce this result: Note that you can display this as HTML, JSON, or SQL. The basic structure of a Malloy Query takes the form of: query: <source> {
  join_one: <source> with …
  join_many: <source> on …
} -> {
   group_by:
    <field/dimension>
    <field/dimension>
   aggregate:
    <aggregation/measure>
    <aggregation/measure>
   nest:
    <named_query OR query_def>
    <named_query OR query_def>
   where: <filter_expression>, <filter_expression>, …
   having: <aggregate_filter_expression>, <aggregate_filter_expression>
   order_by: <field/dimension>, <aggregation/measure>, …
   limit: <limit>
} This maps to the following SQL query structure: SELECT
   <group_by>, <group_by>, …
   <aggregate>, <aggregate>, …
   <nest>, <nest>, …  		       -- very much a simplification; read more in Nesting Queries doc.
FROM <source>
LEFT JOIN <source> ON …
LEFT JOIN <source> ON …
WHERE (<filter_expression>) AND (<filter_expression>) AND …
GROUP BY <group_by>, <group_by>, …
HAVING <aggregate_filter_expression> AND <aggregate_filter_expression> AND …
ORDER BY <group_by> | <aggregate>
LIMIT <limit> If you don’t know Mimoune Djouallah, you should definitely follow him on or , he does lots of very clever things in the data and analytics space. I’m going to shamelessly make use of the example he just put out for this article, just because it is so elegant. Twitter Linkedin So, I’m going to summarize what Mimoune has going on here on his . Malloy Fiddle In the left-hand pane, we have the schema information, which is pretty obvious. Then to the right of that on top is our Malloy Query, in which you can see the nested aggregations easily described in just a few lines. Below that is the Semantic Model, and to the right is the generated bar chart that was created from the query. You can try out various saved queries from the “Queries” on the top row. Here is what is really wild, however, look at the below screenshot: That is the generated SQL from the Malloy Query, which was only 6 actual lines of code; it replaced 40 lines of SQL. You can see the embedded DuckDB connection that is accessing the various Parquet files that have the data for the query in the Semantic Model. It’s totally worth clicking into the Fiddle and playing around a bit. There is another really fun Fiddle from Lloyd that connects to the IMDB dataset that is absolutely worth checking out . here What I really like here is how you can create nested data sets that Mallow will run together in a single query; you can aggregate pretty much anything into a single query and simplify your SQL. Summary The main purpose of this article is to bring Malloy and some ideas to your mind; it’s not meant to be an exhaustive tutorial by any means. Yes, it’s another language, but it’s similar enough that it should be easy to pick up on. What really excited me about this whole affair was doing full analytics and visualizations on a ton of data in a browser, with no servers running, with DuckDB and Malloy. I’ve seen a lot of use cases where analysts would love to have this speed and ease of use. I hope I got you thinking a little bit and you’ll check it all out, but keep in mind that Malloy isn’t an actual product at this time and no one is offering support.

Google

Twitter

What the Heck is PRQL?

Data Mesh - A Contrarian View

Check out my library

Nominated for 2022 - HackerNoon Contributor of the Year - Climate Change

Nominated for 2022 - HackerNoon Contributor of the Year - Hiring

Developer Advocate

Too Long; Didn't Read

What the Heck Is Malloy?

What the Heck Is Malloy?

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

Announcing the COBOL Streamhouse

10 Ways Stand Out as a Java Developer and Land that Dream Job

10 Fantastic JavaScript Projects for Beginners

11 Great Tips From A Guy Who Leveled Up From Intern To Dev

125 Stories To Learn How To Do X

3 Free Python Courses For Beginners: 2020 Edition

Announcing the COBOL Streamhouse

10 Ways Stand Out as a Java Developer and Land that Dream Job

10 Fantastic JavaScript Projects for Beginners

11 Great Tips From A Guy Who Leveled Up From Intern To Dev

125 Stories To Learn How To Do X

3 Free Python Courses For Beginners: 2020 Edition

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps