Last time I spoke about the need for context in order to make a proper judgment about the semantics of a simple list of names. This time I want to show a little of how messy the situation is with date and time values. There is so much “wrong” with them that I don’t even know where to begin. I will do this in three separate parts, just to keep each piece relatively short.
Let me start with the notion that the very concept of “date” as an absolute thing simply does not exist.
What we learned in childhood, living in Western (Christian) culture, is just one of many alternatives coexisting in the modern world.
Probably every culture on this planet has a concept of a year (Earth moves around the Sun for all cultures), a month (the Moon moves around the Earth for everyone), and a day (you see the pattern here, right?). But this is the point where the similarities basically end. Anything beyond objective astronomical events is subject to cultural interpretation.
Let’s look at a few examples:
Western Date: April 27, 2021
Chinese: Mar 16, Xin Chou Year, Year of Ox
Umm-Al-Qura: Ramadan 15, 1442 AH
Hebrew: 15 Iyar 5781
Please also note that all of these are written using Latin characters, while in their own cultural contexts they would be written in different scripts. That gets completely confusing for Western readers, so I will give just one example here, the same date in Hebrew:
ט״ו בְּאִיָיר תשפ״א
I grabbed this here, and I won’t lie, I myself have no clue how this works and how I could possibly recognize it if I happened to see it somewhere in the real world.
Ok, anyways, computers are a product of western culture, so for simplicity, let’s limit the scope of this discussion to western dates, shall we?
When we say “western date,” we actually mean the Gregorian calendar, which is currently accepted by all, let’s say, traditionally Christian countries in the world.
But if we happen to look at a database containing historical events, we must ask ourselves: is this particular date really Gregorian? In fact, there was quite a long historical transition period between the Julian and Gregorian calendars, when both were in use concurrently and the transition happened at different times and at different speeds in different places.
The Gregorian calendar was first adopted in 1582 in some European countries. Russia and its satellites switched only in 1918, and Greece, one of the last, in 1923. It took roughly 340 years to more or less standardize calendars across Christian countries.
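For a historical database, the practical consequence is that a stored date may have to be converted before it can be compared with modern ones. Here is a minimal sketch in Python, assuming we already know the source date is Julian. The function name julian_to_gregorian is made up for illustration, and the arithmetic is the standard Julian Day Number formula, not the method of any particular product:

```python
from datetime import date

# The Julian Day Number (JDN) of the proleptic Gregorian date 0001-01-01
# is 1721426, while Python's date.toordinal() calls that same day 1.
JDN_MINUS_ORDINAL = 1721425


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian-calendar date to the equivalent Gregorian date.

    Only works when the result fits Python's date range (years 1 to 9999).
    """
    a = (14 - month) // 12      # 1 for January and February, otherwise 0
    y = year + 4800 - a         # shifted year that starts in March
    m = month + 12 * a - 3      # March = 0 ... February = 11
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    return date.fromordinal(jdn - JDN_MINUS_ORDINAL)


# The day after 4 October 1582 (Julian) was 15 October 1582 (Gregorian).
print(julian_to_gregorian(1582, 10, 5))    # 1582-10-15
# 25 October 1917 in the Julian calendar fell in November in the Gregorian one.
print(julian_to_gregorian(1917, 10, 25))   # 1917-11-07
```

The hard part is not the arithmetic but knowing which calendar a given record used, and that depends on the place as much as on the year.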
Let’s limit our scope even further and assume we only deal with the Gregorian calendar. Are we done with our problems? Well, no. Let’s look at this:
What does it actually mean? US citizens will tell you that it is the 2nd of March 2021. UK citizens will argue that it is the 3rd of February 2021. I might argue that no matter where you live, this date refers to 1921. Alternatively, you can see other styles of numerical date representation:
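To make the ambiguity concrete, here is a small Python sketch that reads one numeric literal under several conventions. The literal "03/02/21" is a hypothetical one picked for illustration, and the three format strings are only a sample of the styles found in real data:

```python
from datetime import datetime

literal = "03/02/21"  # hypothetical literal, chosen for illustration

# A few conventions for the same three numbers (not an exhaustive list).
conventions = {
    "US, month/day/year": "%m/%d/%y",
    "UK, day/month/year": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

readings = {
    name: datetime.strptime(literal, fmt).date()
    for name, fmt in conventions.items()
}

for name, value in readings.items():
    print(f"{name:20} -> {value.isoformat()}")
# US, month/day/year   -> 2021-03-02
# UK, day/month/year   -> 2021-02-03
# year/month/day       -> 2003-02-21
```

Note that the two-digit year is resolved by yet another convention: Python's %y follows the POSIX rule that maps 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068, so the 1921 reading never even shows up unless the caller supplies that context.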
English is not the only language used in countries that adopted the Gregorian calendar. The same date can look very different depending on the country:
November 13, 1976
13 ноября 1976
13 листопада 1976
13 listopada 1976r
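A rough Python sketch of what reading such literals involves, assuming the language of each value is already known from context. The function parse_long_date and the MONTHS table are hypothetical, and the table is deliberately partial: it holds only the month used in the examples above.

```python
import re
from datetime import date

# Deliberately partial month tables, just enough for the examples above.
MONTHS = {
    "en": {"november": 11},
    "ru": {"ноября": 11},
    "uk": {"листопада": 11},
    "pl": {"listopada": 11},
}


def parse_long_date(text: str, lang: str) -> date:
    """Parse a day, a month name, and a four-digit year written in `lang`."""
    # Split into runs of digits and runs of letters; punctuation is dropped.
    tokens = re.findall(r"\d+|[^\W\d_]+", text)
    numbers = [t for t in tokens if t.isdigit()]
    words = [t.casefold() for t in tokens if not t.isdigit()]

    year = next(int(n) for n in numbers if len(n) == 4)
    day = next(int(n) for n in numbers if len(n) <= 2)
    # Words outside the table (such as the Polish "r" suffix) are skipped.
    month = next(MONTHS[lang][w] for w in words if w in MONTHS[lang])
    return date(year, month, day)


samples = [
    ("November 13, 1976", "en"),
    ("13 ноября 1976", "ru"),
    ("13 листопада 1976", "uk"),
    ("13 listopada 1976r", "pl"),
]
for text, lang in samples:
    print(parse_long_date(text, lang).isoformat())  # 1976-11-13 each time
```

The lang argument is the whole point: without knowing the language, the parser has no way to pick the right table, and a literal with no recognizable month name raises StopIteration here instead of guessing.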
Then there is, of course, ISO 8601, which standardizes the literal representation of dates, making it fairly universal and pretty convenient to use:
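For instance, a short Python sketch using only the standard library shows the ISO 8601 calendar date form (year, month, day, separated by hyphens) and one of its convenient side effects. The sample dates are the ones already used in this text:

```python
from datetime import date

d = date(2021, 4, 27)
iso = d.isoformat()
print(iso)  # 2021-04-27

# The literal parses back to the same date without any format hints.
round_trip = date.fromisoformat(iso)
print(round_trip == d)  # True

# Because the fields run from most to least significant, sorting the
# strings alphabetically also sorts them chronologically.
literals = ["2021-04-27", "1976-11-13", "2021-03-02"]
ordered = sorted(literals)
print(ordered)  # ['1976-11-13', '2021-03-02', '2021-04-27']
```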
Obviously, in many cases it is simply impossible to recognize and normalize date literals universally without bringing in a wider context. Often the necessary context can be derived from other pieces of data. In other cases a human has to step in and supply the missing piece of information.
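One way context can come from other pieces of data: if a whole column of values is assumed to share one convention, a single value with a number above 12 in one position settles the field order for all of them. The Python sketch below is only an illustration of that idea under this assumption. The function infer_day_first and the sample columns are hypothetical, and this is not a description of how any particular product works:

```python
from datetime import datetime


def infer_day_first(literals):
    """Guess the field order of slash-separated numeric dates.

    Returns True for day/month, False for month/day, and None when the
    values cannot decide (all ambiguous, or mutually inconsistent).
    """
    day_first_possible = True
    month_first_possible = True
    for text in literals:
        first, second, _ = (int(part) for part in text.split("/"))
        if first > 12:           # cannot be a month, so it must be a day
            month_first_possible = False
        if second > 12:          # cannot be a month, so it must be a day
            day_first_possible = False
    if day_first_possible == month_first_possible:
        return None              # a human (or more context) must decide
    return day_first_possible


column = ["03/02/21", "14/02/21", "28/02/21"]   # hypothetical column
day_first = infer_day_first(column)
print(day_first)  # True, because 14 and 28 cannot be months

fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
normalized = [datetime.strptime(t, fmt).date().isoformat() for t in column]
print(normalized)  # ['2021-02-03', '2021-02-14', '2021-02-28']

print(infer_day_first(["03/02/21", "04/05/21"]))  # None, still ambiguous
```

Returning None instead of a default is deliberate: when the data cannot settle the question, guessing silently would be worse than asking.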
As soon as context is sufficient to resolve ambiguity, datuum.ai will correctly detect date literals and convert them into some standard representation (like ISO 8601).
In Part II, we will look at time literals and discuss time zones and DST for a bit, and in Part III we will dip our toes into machine representations of date/time values.
Thanks to Dmytro Zhuk, Founder & CTO of datuum.ai for this text.
- Art Credit, Salvador Dali "Melting Clocks" -