Oh wow... this is a great idea. Good idea sending it round. Ought to make things a bit easier when discussing and avoiding misunderstandings. =]
Hi all,
Collaborating around the subject of "time" is difficult; subtleties
abound in implementation, the perspectives people come from,
and the language used in discussions. I'm going to provide a glossary to
establish common terminology, use these terms to analyze our current
state, offer a roadmap for solving the problem in stages, suggest a
format for timestamps, urge compatibility with "exotic" use cases, and
finally call for outside help with implementing a timezone aware agenda
system.
Summary and references are at the end.
This is an initial glossary compiled from various standards and sources;
it's incomplete, probably incorrect, and open to critique, but is useful
in articulating a possible road map forward.
• Time
Time (concept)
What clocks measure (Einstein)
Time axis
Mathematical representation of the succession in time according
to the space-time model of instantaneous events along a unique
axis (ISO).
Instant (object)
A single point on time axis (ISO).
Moment in time
See: instant.
Mark
A set of symbols related to the object, or carrying some
symbolic meaning
Time scale
System of ordered marks which can be attributed to instants on
the time axis, one instant being chosen as the origin. e.g.,
GMT, UTC, TAI.
Basis time
See: time scale.
Time (mark)
The designation of an instant on a selected time scale, used in
the sense of time of day.
Time interval (object)
Part of the time axis limited by two instants and, unless
otherwise stated, the limiting instants themselves; a part of
time limited by two instants or moments in time (ISO). The
elapsed time between two events (NIST).
Duration (object)
A quantity characterizing a time interval. These can be
written in different formats.
UTC
Time scale with the same rate as International Atomic Time
(TAI), but differing from TAI only by an integral number of
seconds.
Offset
Constant duration difference between times of two time scales
(ISO). i.e., a quantity to combine with a time scale to produce
a wall time. e.g., Nepal uses a +5:45 offset from the UTC time
scale.
Time shift
See: offset.
• Calendar and civil time
Wall time
What shows on the clock on the wall at a location. Like "local
system time" but needn't reference a computer to do the
calculation.
Standard time
Time scale derived from UTC, by a time shift established in a
given location by the competent authority (ISO).
Local system time
Local system time is determined by applying the system's time
zone offset and year offset values to UTC. The Time of day
system value displays the local system time. Local system time
and system time are used interchangeably.
Time Zone
A place/region. Can map between wall time and a time scale with
a table and an offset. A set of rules for determining the local
observed time (wall time) as it relates to incremental time (as
used in most computing systems) for a particular geographical
region. e.g., Brasília time presently has an offset of −03:00
from the UTC time.
Calendar event
A calendar object that is commonly used to represent things that
mark time or use time. Examples include meetings, appointments,
anniversaries, start times, arrival times, closing times.
• Implementation: these concern how we actually decide to record,
reference, or manipulate time.
Representation
Expression indicating a time point, time interval or recurring
time interval. e.g., [2023-02-02 Thu 12:58 +1w], "this next
Sunday at 2pm EST", 3600 seconds from Unix epoch.
Format
A description of the abstract form used for a representation.
e.g., [YYYY-MM-DD], 'Explain in prose relative to this moment in
time using locale and include your timezone'
Encoding
The method of storing a representation of time e.g., datestruct
in memory, Org timestamp in body of heading, value of a
"created" key in a database
Format syntax
Rules that allow for parsing an encoding unambiguously into some
time scale.
Timestamp (mark)
An encoded representation in a selected format. e.g., 24/01/2023
or 2023-01-24
Delimiting syntax
Rules that allow for detection and extraction of an encoding.
Necessary for encodings embedded in prose. e.g., '[]' for org
timestamps.
Displayed time
The formatting of a representation exposed to a user.
Calculating
Manipulating a set of time points, time intervals, or recurring
time intervals. e.g., determining instant from an offset,
comparing two representations in some lattice.
Incremental time
A datetime value consisting of monotonically increasing integer
units measured from a specific moment in time (epoch). For
example, the moment 1970-01-02T00:00:00.000Z might have an
incremental time value (measured in milliseconds) of 86400000,
since there are 86,400 seconds in a day and 1000 ms in a second.
Floating time
A wall time value without time zone or offset information. E.g.,
2023-01-24 or 6:45pm.
Fixed time
A representation of a (past or future) UTC time.
Absolute time
See: fixed time.
Unfixed time (from UTC)
A representation which is not referenced to a past or future UTC
time. e.g., Future time given as a local time in some specified
time zone, where changes to the definition of that time zone
(e.g., a political decision to enact or rescind daylight saving
time) affect the instant in time corresponding with the
timestamp.
• Time formats
Incremental timestamp
Timestamps that can be directly compared: their integer values
determine which is earlier or later. e.g., Unix seconds since
epoch.
Field-Based timestamp
Timestamps which must be normalized and their individual fields
compared. Field based times can have certain kinds of logical
operations performed on them (for example, rolling the month
forward or back), while incremental time requires a logical
transformation. e.g., ISO8601 style timestamps.
ISO Basic format
A format which omits hyphen separators e.g., YYYYMMDD
ISO Extended format
A format which includes hyphen separators e.g., YYYY-MM-DD
Extended Date/Time Format EDTF
An extension of the ISO 8601 created by the Library of Congress
to cover date formats and conditions useful in metadata systems
but not handled in the ISO standard.
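To make a few of these glossary terms concrete, here is a minimal sketch in Python (chosen only for illustration; nothing here is Org code). It reuses the incremental time example and the Nepal offset from the definitions above. The variable names are my own.

```python
from datetime import datetime, timedelta, timezone

# Fixed time: an instant referenced to the UTC time scale.
instant = datetime(1970, 1, 2, 0, 0, 0, tzinfo=timezone.utc)

# Incremental time: integer milliseconds counted from the Unix epoch.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
incremental_ms = (instant - epoch) // timedelta(milliseconds=1)

# Offset: a constant duration combined with the time scale to give a wall time.
nepal_offset = timezone(timedelta(hours=5, minutes=45))
wall_time = instant.astimezone(nepal_offset)

# Floating time: the same fields with the offset information dropped.
floating = wall_time.replace(tzinfo=None)

print(incremental_ms)         # 86400000
print(wall_time.isoformat())  # 1970-01-02T05:45:00+05:45
print(floating.isoformat())   # 1970-01-02T05:45:00
```

Note that the fixed instant and the wall time compare equal (they are the same point on the time axis), while the floating value can no longer be placed on the time axis without outside knowledge.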
What format does Org have now?
• The format currently in use for timestamps is floating, field-based,
and has a resolution/precision of minutes.
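As a rough illustration of "floating, field-based, minute resolution", the sketch below parses a timestamp of the current form into a Python datetime with no time zone attached. The regular expression is a simplified assumption of mine, not Org's actual timestamp regexp, and it ignores repeaters and ranges.

```python
import re
from datetime import datetime

# Hypothetical, simplified pattern for [YYYY-MM-DD Day HH:MM]; the day name
# and the time are both optional. This is not Org's own regexp.
LEGACY_RE = re.compile(
    r"\[(\d{4})-(\d{2})-(\d{2})(?: [^\d\s\]]+)?(?: (\d{1,2}):(\d{2}))?\]"
)

def parse_legacy(text):
    """Return a naive (floating) datetime with minute resolution."""
    m = LEGACY_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a legacy timestamp: {text!r}")
    year, month, day = int(m.group(1)), int(m.group(2)), int(m.group(3))
    hour = int(m.group(4) or 0)
    minute = int(m.group(5) or 0)
    # No tzinfo is attached: the fields are all we know.
    return datetime(year, month, day, hour, minute)

stamp = parse_legacy("[2023-01-25 Wed 12:12]")
print(stamp.isoformat(), stamp.tzinfo)  # 2023-01-25T12:12:00 None
```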
What kinds of representations would a calendar system capable of
handling timezones require?
• Instant (fixed)
• This is referring to an unambiguous moment in time
• e.g., 2007-02-03T05:00:00.000Z
• Offset (fixed)
• This captures the idea of "when did it happen for the person who
made the observation"
• e.g., 2007-02-03T04:00:00.000+01:00
• Instant with explicit offset and zone (fixed)
• e.g., 2007-01-01T02:00:00.000+01:00[America/Chicago]
• Zoned local date time (floating)
• Tricky, requires decisions about how to interpret timestamps after
political changes.
• e.g., 2007-01-01T01:00:00.000[America/Chicago]
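The four representations above can be told apart mechanically. The following is a sketch only: the pattern and the function name parse_stamp are my own assumptions, it accepts just the millisecond form used in the examples, and it treats the bracketed zone as an opaque string rather than looking it up in a time zone database.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed pattern: local fields, optional offset, optional [Zone/Name] suffix.
STAMP_RE = re.compile(
    r"(?P<local>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d{3})?)"
    r"(?P<offset>Z|[+-]\d{2}:\d{2})?"
    r"(?:\[(?P<zone>[^\]]+)\])?"
)

def parse_stamp(text):
    """Return (datetime, zone_name). The datetime is naive when floating."""
    m = STAMP_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"unrecognised timestamp: {text!r}")
    local = datetime.fromisoformat(m.group("local"))
    offset = m.group("offset")
    if offset is None:
        return local, m.group("zone")  # floating: no offset given
    if offset == "Z":
        tz = timezone.utc
    else:
        sign = -1 if offset[0] == "-" else 1
        tz = timezone(sign * timedelta(hours=int(offset[1:3]),
                                       minutes=int(offset[4:6])))
    return local.replace(tzinfo=tz), m.group("zone")

for example in ("2007-02-03T05:00:00.000Z",
                "2007-02-03T04:00:00.000+01:00",
                "2007-01-01T02:00:00.000+01:00[America/Chicago]",
                "2007-01-01T01:00:00.000[America/Chicago]"):
    stamp, zone = parse_stamp(example)
    kind = "fixed" if stamp.tzinfo is not None else "floating"
    print(kind, zone)
```

Only the last example comes back without an offset, which is exactly the case that needs a policy decision before it can be converted to an instant.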
I claim that before dealing with the nuances of calendar appointments,
repeating events, agenda displays, etc., Org must first support
fixed/absolute time instead of just floating time. Without some basis
time scale, converting from time zones and offsets to an incremental
time point is impossible. Resolving this prerequisite will
also simplify the timezone discussion because we won't be mixing
calendar issues with time issues.
What would a roadmap be?
• Design and implement an absolute and offset timestamp system
• Decide on a time scale
• Decide on a format and syntax
• Implement instant timestamps
• Implement offset timestamps
• Design and implement the time zone aware calendar system. This is a
separate project.
What time scale should Org use?
There are only two decent options, either TAI or UTC. The rest of the
world has agreed upon UTC; we should too. Conversion to TAI can be done
by users or on export.
What format and syntax should Org use?
A heretical suggestion: We should abandon the day of week abbreviation
and use a new format.
The current format generates a three letter abbreviation of the day of
the week [2023-01-25 Wed 12:12]. I suggest supporting this as a
legacy/simple format but switch to a format/encoding like
[2023-01-25T15:13:42Z] for the new system. Specifically I'm advocating
for an extended ISO 8601 format, compatible with expanded dates and
Level 2 of the EDTF, with some (bracket?) notation surrounding it such
that Org can parse the syntax as a timestamp. I advocate further for the
use of durations and repeating intervals to follow the same standard
format. Thus instead of a range being formatted as:
[2023-01-25 Wed 13:57]–[2023-01-26 Thu 13:57]
it would be:
[2023-01-25T16:57:42Z/2023-01-26T16:57:42Z].
If the square bracket delimiter syntax is insufficient or too difficult
to parse unambiguously, we could just encapsulate the ISO format in a
sub-syntax (e.g., [&&(ISO format)] similar to the [%%(diary sexp)]
technique). This is ugly, but perhaps a stepping stone during
development to separate syntax parsing concerns from calculating etc.
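As a sketch of how little is needed once the brackets are stripped, the following handles only the two shapes shown above (a single UTC instant, or a start/end pair separated by "/"). The function names are hypothetical, and a real implementation would have to cover the rest of the ISO interval and duration forms.

```python
from datetime import datetime, timezone

def parse_utc(stamp):
    """Parse YYYY-MM-DDTHH:MM:SSZ; only the 'Z' (UTC) form is handled here."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)

def parse_bracketed(text):
    """Return (start, end); end is None for a single instant."""
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"missing delimiters: {text!r}")
    parts = text[1:-1].split("/")
    if len(parts) == 1:
        return parse_utc(parts[0]), None
    if len(parts) == 2:
        return parse_utc(parts[0]), parse_utc(parts[1])
    raise ValueError(f"too many '/' separators: {text!r}")

start, end = parse_bracketed("[2023-01-25T16:57:42Z/2023-01-26T16:57:42Z]")
print(end - start)  # 1 day, 0:00:00
```

The point of the design is that everything between the delimiters is a standard string, so the delimiter detection and the time parsing stay separate concerns.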
What are the problems with the day of the week in existing format?
• The day of the week is redundant information and can be rebuilt from
an ISO date. Any user who wishes to display a format with the day of
the week can do so.
• It's a nonstandard format. Although the Org documentation says that the
timestamps are "inspired by the standard ISO 8601 date/time format"
the use of a day name is not contained in the ISO specification. The
present Org format is actually two ISO components, the date and the
time, with a non-standard day name sandwiched between them with space
separators. Spaces are no longer allowed in the ISO format. By
deviating from an existing standard we place the burden of parsing on
ourselves and make sharing more difficult.
• Day of the week is irrelevant in many situations. Looking at timestamps
from a year ago, it's often the case that the day of the week on which
one was created is unimportant.
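To illustrate the redundancy point: the abbreviation can be recomputed from the date alone whenever it is wanted for display. A minimal sketch, assuming English three letter abbreviations (the helper name is mine):

```python
from datetime import date

DAY_ABBREVIATIONS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

def day_abbreviation(iso_date):
    """Rebuild the day name from a YYYY-MM-DD date string."""
    # date.weekday() counts Monday as 0 and Sunday as 6.
    return DAY_ABBREVIATIONS[date.fromisoformat(iso_date).weekday()]

print(day_abbreviation("2023-01-25"))  # Wed
```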
What are the advantages of switching to a standard format for the new
system?
• We can allow the legacy/simple system to coexist and interpret it as a
floating timestamp. This simplifies the issues of maintaining
compatibility with existing org documents. It also placates those who
have single user systems in a single time zone who do not want to have
any calendar complexity imposed on them.
• We have a way of distinguishing new timestamps from legacy/simple ones.
By making a change in syntax we reduce (or eliminate?) the possibility
of ambiguity between "which version" of a timestamp is being parsed. A
legacy timestamp can be treated as such, and new timestamps are easily
identified by the 'T' present instead of spaces, or in the delimiters
wrapping the representation.
• We free ourselves from the constraints of the legacy timestamp format.
Trying to engineer a new syntax which also parses as an extension of
the legacy one is more complex and embeds things like "day of the
week" and the use of spaces as separators into this new system. Easier
to have two side by side.
• We can defer to existing parsing and calculating systems. There are
already programs written which support the ISO standard and EDTF.
• We can directly (or nearly directly) import the regular expressions
and parsing mechanisms already written.
• These enable decent testing suites as we build the system, as we can
check against existing packages to see if our parsing and
calculations agree.
• Users who wish to use external libraries (irrespective of language
or license) can extract the new timestamp and parse or calculate
externally.
• Org is part of a standard
• We are able to defer to experts and 35 years of knowledge rather
than debate among ourselves
• Interfacing with other programs is simplified as the area inside the
delimiter notation can be passed as a string without parsing.
• New users and collaborators can be onboarded faster without needing
to learn a new system
• Org documentation can refer to the standard instead of bearing the
burden of exposition.
• The move to include time zones in the format is simplified
• The ISO standard has recently adopted a format for time zones from
RFC3339 and JAVAZDT; we can adhere to 8601 and keep things
consistent.
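On the point about telling the two versions apart, here is a sketch of how a parser could dispatch on the 'T' separator. Both patterns are deliberately loose assumptions of mine rather than a proposed grammar.

```python
import re

# Legacy: a date followed by an optional space-separated tail (day name, time).
LEGACY = re.compile(r"\[\d{4}-\d{2}-\d{2}(?: [^\]]*)?\]")
# New: a date immediately followed by 'T' and no whitespace before the ']'.
NEW = re.compile(r"\[\d{4}-\d{2}-\d{2}T[^\]\s]+\]")

def classify(text):
    """Return 'new', 'legacy', or 'unknown' for a bracketed timestamp."""
    if NEW.fullmatch(text):
        return "new"
    if LEGACY.fullmatch(text):
        return "legacy"
    return "unknown"

print(classify("[2023-01-25 Wed 12:12]"))   # legacy
print(classify("[2023-01-25T15:13:42Z]"))   # new
```

Because the two patterns cannot match the same string, legacy documents keep parsing exactly as they do today.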
What other perspectives should the new format support?
In addition to the representations necessary for a timezone aware
calendar system, I suggest the new format be compatible with two other
representations: finer/arbitrary resolution for scientific work, and
Level 2 of the Extended Date/Time Format for bibliographic and metadata
systems.
Although most implementations come from the computer/database
perspective, where precision is limited by clock speed, scientific data
may be finer grained. Adopting a format which allows for arbitrary
precision enables Org to be useful in more scenarios. This would allow
data of higher frequency to be collected and stored into org headings as
a plain text database. Even if the data was stored externally it would
be convenient to be able to comment or discuss collected data by
referencing its time point.
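A sketch of what "not excluding" arbitrary precision could look like: keep the whole seconds in an ordinary datetime and carry the fractional part separately as an exact decimal, since Python's datetime itself stops at microseconds. The timestamp below and the function name are made up for illustration.

```python
import re
from datetime import datetime, timezone
from decimal import Decimal

PRECISE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z"
)

def parse_precise(text):
    """Return (whole-second UTC datetime, exact fractional seconds)."""
    m = PRECISE_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"unrecognised timestamp: {text!r}")
    whole = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    whole = whole.replace(tzinfo=timezone.utc)
    # Decimal keeps every digit that was written; nothing is rounded away.
    fraction = Decimal("0." + m.group(2)) if m.group(2) else Decimal(0)
    return whole, fraction

whole, fraction = parse_precise("2023-01-25T15:13:42.123456789012Z")
print(whole.isoformat(), fraction)
```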
The Extended Date/Time Format (EDTF) was designed by the Library of
Congress to address limitations of the ISO standard for metadata and
archival purposes. A draft specification was created in 2012 and EDTF
functionality has now been integrated into ISO 8601-2019. Of great
interest is the ability to express the concepts of uncertainty and
approximation. Archival work includes scenarios where the precise date
may be unknown, so a format was created with qualifiers capable of
handling these situations. In the EDTF format '1984?' expresses possibly
the year 1984, but not definitely, while '2004-06~' expresses year-month
approximate. This format has been implemented by multiple library
systems and in 2021 Wikibase added an extension to support EDTF.
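To show that merely recognizing these qualifiers is cheap, here is a sketch that splits a trailing '?' (uncertain) or '~' (approximate) off a date string. It also accepts '%', which EDTF uses for "both uncertain and approximate". It does not validate the date part and ignores the other Level 1 and Level 2 features; the function name is my own.

```python
# Trailing qualifier -> (uncertain, approximate)
QUALIFIERS = {
    "?": (True, False),
    "~": (False, True),
    "%": (True, True),
}

def split_qualifier(edtf):
    """Return (date_part, uncertain, approximate) for a simple EDTF string."""
    if edtf and edtf[-1] in QUALIFIERS:
        uncertain, approximate = QUALIFIERS[edtf[-1]]
        return edtf[:-1], uncertain, approximate
    return edtf, False, False

print(split_qualifier("1984?"))     # ('1984', True, False)
print(split_qualifier("2004-06~"))  # ('2004-06', False, True)
```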
The initial technical or code burden to support these perspectives is
minimal. Both can be parsed and calculated with by existing libraries,
and the functionality to actually calculate with them can be delayed.
The important thing is selecting a format which won't exclude them.
These features are omitted in many systems as a result of the
restricted domain and the data types used for storage; Org does not have
these constraints. Further, both of these communities tend to attract
people who are talented and sympathetic with (even occasionally funded
to support!) open source projects. By expanding Org's format to be more
inclusive we provide a haven rather than shutting them out.
The calendar implementation should elicit help from experts
Everyone seems in agreement that leveraging existing libraries is
desirable. We should also read and defer to documentation and
recommendations available from legitimate projects (e.g., W3, ISO). But
I think these are still insufficient for architecting an elegant time
system capable of satisfying the various perspectives. Calendar
applications in particular contain many edge cases and decisions about
display and interface etc. The knowledge concerning these is more likely
tacit than explicit, so I suggest we reach out to people who have
already designed/engineered solutions and get their input.
Here are some projects, organizations, or perspectives we could seek
help from:
• Calendar applications
• ical standard
• CalConnect standard
• Thunderbird/lightning calendar
• Google calendar
• Outlook
• Lotus notes
• Standard organizations
• NIST
• ISO
• Database or computer applications
• SQL
• Oracle
• Java's time system
• Numpy
• Rust
• Archival or research users
• Library of congress
• Metadata systems
• Academic users
• History
• Scientific users
• Astronomers
• Physicists
• Chemists
• Geologists
• Metrologists
To summarize:
Org presently only supports simple floating timestamps. A calendar
system capable of handling time zones requires some form of fixed or
incremental timestamp with offsets. We can solve the absolute timestamp
system first, and deal with calendar concerns after. If we're
implementing a new time system the format and syntax should allow for
"exotic" use cases like arbitrary precision, uncertainty, and expanded
dates. The mechanics for calculating with those exotic cases needn't be
implemented by Org immediately.
We should adopt UTC as the time scale, EDTF (an extension of ISO 8601)
as the time format, and merely encapsulate this format with a delimiting
syntax (using brackets if possible) that Org can parse and distinguish
from the present format. The existing Org format should be considered
simple/legacy and can be interpreted or translated internally into the
new system as calculations require. The new format can be implemented
alongside the simple/legacy system.
This discussion of absolute offset timestamps should be split off from
timezone, calendar, and display concerns. Implementing a calendar
application with timezones is complicated and we should seek help from
those who have built such systems before.
References:
Time
https://www.iso.org/obp/ui/#iso:std:iso:8601:-1:ed-1:v1:en
https://www.w3.org/International/articles/definitions-time/
https://www.ibm.com/docs/en/i/7.3?topic=concepts-time
https://tc39.es/proposal-temporal/docs/ambiguity.html
EDTF
https://www.loc.gov/standards/datetime/ Main page on EDTF
https://edtf.wikibase.wiki/wiki/Property:P1 Has examples of EDTF codes
https://www.wikibase.consulting/wikibase-edtf/ Wikibase implemented
EDTF in 2021
https://github.com/ProfessionalWiki/WikibaseEdtf#wikibase-edtf
https://github.com/corylown/edtf-humanize Transform EDTF strings into
human friendly display
https://github.com/unt-libraries/edtf-validate Validate EDTF strings
https://github.com/plk/biblatex/issues/656 Discussion of Biblatex's
implementation of EDTF
https://www.npmjs.com/package/edtf Parser for EDTF
https://github.com/inukshuk/edtf.js/tree/main Parser for EDTF
Implementation details
https://www.w3.org/TR/international-specs/#loc_time
https://dev.mysql.com/doc/refman/5.7/en/date-and-time-type-syntax.html
Time zones
https://datatracker.ietf.org/doc/draft-ietf-sedate-datetime-extended/
An extension syntax for representing time zone. We should follow this.
Very helpful for implementing time zones.
https://www.w3.org/TR/timezone/#representing Very relevant
https://www.w3.org/International/core/2005/09/timezone.html#IDALFAT
Calendar and scheduling
https://www.calconnect.org/resources/glossary