Work in Progres: Timezones. #109
lloydtabb
started this conversation in
What's Next: Language Proposals
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Timezones, Dates, Timestamps, Datetimes and more
Timezones are hard.
Timezones are one of those easy to screw up thing. Most data revolves around transactions and most transactions are sequenced in time. Time is linear, but we experience it in small chunks and large chunks and large chunks end up being 'days' and the definition of a 'day' is the hard thing. A day begins and ends depending on where you are. "Happy New Year" is not shouted all at once on earth, but is shouted over and over as the earth spins relative to the sun.
Some Type Definitions.
timestamp - data that can tracked to some universal time. This can be a 'timestamp' field in a database or a unix time that is the number of seconds/milliseconds since the epoch.
timezone - a string that represents the offset of the way to convert a timezone into a 'date'
timeblock - a string that represents a starttime and duration. It has no timezone and it form implies its duration. '2021-03-05', '2002-01', '2022-02-02 02:02". Dates are a form of timeblock.
Eye of the Beholder (Data is in my Timezone)
Most of the time, when working with data, you don't want to think about timezones. You simply want to look at yesterdays data. Yesterday should automatically be something. I could be where I live, where the company I work for is headquartered. In this case, converting a timestamp into yesterday is a simple conversion into a date at a single timezone. In some cases, you want yesterday to be based on what the user thinks of as a day and sometimes you want it to be what the company (data owner entity) thinks of as a day.
Where Ever you go, there you are (Time is relative to where it happened)
Suppose we wanted to look at all the sales in for Burger King Stores yesterday. Burger King stores open at 6a and close at 10p. In this case, conversion to a date makes sense relative to where the store is, not some universal time. I can compare NYC sales vs Los Angeles on the same day even though the stores opened and closed 3 hours apart. The opening time for a Burger King might be a datetime "2021-01-10 06:00:00' and this could represent both Los Angeles and NYC opening times even though the timestamps for these opens happened 3 hours differently in real time.
Design Points
Physical Representations of time
There are many storage formats for both timestamp and datetime.
Examples Physical Representation of Timestamp
timestamp - postgres field of type 'timestamp with timezone
unixtime - integer seconds since epoch
unixtime_ms - integer milliseconds since epoch
Examples of Physical Representations of datetime
date - date field in a database
datetime - BigQuery's datetime field time
timestamp without timezone - Postgres field type
YYYYMMDD - an integer where each set digits represent the year, month and day
YYYY - the year stored as an integer.
Timestamp Primitives
Granularity - Time can be chunked into millisecond, second, minute, hour, date, month, year and more.
Truncate - Take any timestamp (and timezone) and return the minimum time in that granularity. Returns a timestamp.
ToTimeBlock - Take a timestamp (and timezone) and convert it into a timeblock string.
Add/Subtract - any granularity to a timestamp
Beta Was this translation helpful? Give feedback.
All reactions