23 Minutes in the Past

Exploring the history of timezones and why Python's pytz offsets Indian times by 23 minutes. I try Python's new ZoneInfo stdlib module, and describe why it sucks.


Back in 2023, Discord rolled out Events. Users could RSVP and receive notifications when an event was starting.

For years, new members had been asking when our events were scheduled, or joining a few hours too late, often due to timezone mishaps. This feature couldn't have rolled out at a better time, as Discord handles all of this.

Well before recurring events were a thing, I decided to add event scheduling to the bot: if the event for this week didn't exist, create it.

event = await guild.create_scheduled_event(
    name=f"Weekly Campfire #{vc_no}",
    scheduled_start_time=(
        datetime.combine(
            saturday,
            time(
                hour=16, minute=00
            ),  # Thanks https://stackoverflow.com/a/49828827
            tzinfo=pytz.timezone("Asia/Calcutta"),
        )
    ),
    ...
)

This worked, but take a closer look:

Do you notice it?

A user sent me this screenshot, asking why the times looked odd (3:37 PM instead of 4:00 PM).

datetime.combine(
    saturday,
    time(
        hour=16, minute=00
    ),  # Thanks https://stackoverflow.com/a/49828827
    tzinfo=pytz.timezone("Asia/Calcutta"),
)

In this innocuous snippet of code, though, lay an interesting bug and a historical tale.

During the era of the British Empire, a single colony often had multiple timezones. India was no different: currently at GMT+5:30, it used to have four different timezones.

Indian Standard Time (Wikipedia)

How does this translate into code?

Calculating Timezone Offsets

Asia/Calcutta, India's Timezone, is 5 hours and 30 minutes ahead of GMT.

How did we calculate this value?

One day is defined as the time taken for Earth to complete one rotation around its axis. At any given moment, all the latitudes (horizontal lines) are exposed to the Sun, but only some of the longitudes (vertical lines) are. As the Earth rotates, different longitudes face the Sun, making it day in that area.

For time calculation, we'll be using longitudes exclusively. There's one longitude for each degree, and Earth rotates 360° in 24 hours. To calculate how much time passes between each longitude (one degree of difference), we'll divide 24 by 360.

This gives us 0.06666666667 hours of difference between each longitude. Multiplying that by 60 (minutes in an hour), we get exactly 4 minutes for each degree travelled.
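The same arithmetic as a minimal Python sketch. The constant and function names here are my own, chosen for illustration, and the helper assumes longitudes east of Greenwich are positive:

```python
HOURS_PER_DAY = 24
DEGREES_PER_ROTATION = 360
MINUTES_PER_HOUR = 60

# Multiply before dividing so the result stays exact in floating point
minutes_per_degree = HOURS_PER_DAY * MINUTES_PER_HOUR / DEGREES_PER_ROTATION
print(minutes_per_degree)  # 4.0


def longitude_to_minutes(degrees_east: float) -> float:
    """Solar-time difference from Greenwich in minutes (east is positive)."""
    return degrees_east * minutes_per_degree


# 15 degrees of longitude corresponds to one hour
print(longitude_to_minutes(15))  # 60.0
```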

A Google Search reveals that Calcutta's coordinates are 22.5744° N, 88.3629° E.

We're 88.3629° east of Greenwich, the reference meridian for GMT. East is addition, as that longitude has already been exposed to the Sun.


Take a look at those two red circles. The long red line represents Sunset. If we're at Point 1, Point 2 is to our East. We can see that Sunset has already happened at Point 2, but not where we are (Point 1).

This means time is more advanced at Point 2. There's a 2° difference to the east, so Point 2 is 8 minutes ahead of Point 1.

This is why East is addition.


Calcutta is 88.3629° east of Greenwich, so let's convert that to minutes: 88.3629 x 4 = 353.4516 minutes. That's roughly 5 hours, 53 minutes, and 27 seconds.

Let's take an arbitrary time, 6:00 PM, or 18:00. Using this timezone of GMT+5:53, 6:00 PM would become 12:07 PM GMT.

If we convert that to India's current timezone, GMT+5:30, that time becomes 5:37 PM, or 17:37.

Exactly 23 minutes behind.
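To double-check the numbers, here is a small sketch using only the standard library. The two fixed-offset zones are stand-ins I built by hand for the old and current offsets, and the date is arbitrary; nothing here comes from a timezone database.

```python
from datetime import datetime, timedelta, timezone

CALCUTTA_LONGITUDE_EAST = 88.3629  # degrees, the figure quoted above
MINUTES_PER_DEGREE = 4

# Solar offset implied by the longitude (about 5 h 53 min 27 s)
solar_offset = timedelta(minutes=CALCUTTA_LONGITUDE_EAST * MINUTES_PER_DEGREE)
print(solar_offset)

# Hand-built fixed offsets: the old local time (rounded to minutes) and today's IST
lmt = timezone(timedelta(hours=5, minutes=53), "LMT")
ist = timezone(timedelta(hours=5, minutes=30), "IST")

# 6:00 PM expressed in the old offset, then viewed from GMT and from IST
intended = datetime(2024, 7, 20, 18, 0, tzinfo=lmt)
as_gmt = intended.astimezone(timezone.utc)
as_ist = intended.astimezone(ist)
print(as_gmt.strftime("%H:%M"))  # 12:07
print(as_ist.strftime("%H:%M"))  # 17:37

# The gap between the two offsets is the 23 minutes from the screenshot
drift = lmt.utcoffset(None) - ist.utcoffset(None)
print(drift)  # 0:23:00
```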


Why does this happen?

Sure, this is sound mathematically, but why is Python using 123-year-old timezones?

Before Python 3.9, pytz was the de-facto method of working with different timezones. It's an excellent library, allows easy conversion of timezone strings, and is usually accepted by the Standard Library's datetime module.

>>> datetime.combine(
    datetime.today(),
    time(
        hour=16, minute=00
    ),  # Thanks https://stackoverflow.com/a/49828827
    tzinfo=pytz.timezone("Asia/Calcutta"),
)

datetime.datetime(2024, 7, 17, 16, 0, tzinfo=<DstTzInfo 'Asia/Calcutta' LMT+5:53:00 STD>)

Notice the tzinfo? We're seeing a 5 hour 53 minute offset.

Meant to be a feature, pytz uses the appropriate offset based on the date it's converting to and from. Here, it's a bug.

Let's see this in action. Between 1884 and 1948, Asia/Calcutta was GMT+5:53. After 1948, it was converted to GMT+5:30, the offset at the center of India.

from datetime import datetime, timedelta
import pytz

t = datetime.now()  # 2024
t1 = t - timedelta(days=140 * 365)  # ~140 years behind, 1884

# Let's convert both to the Indian timezone

tz = pytz.timezone("Asia/Calcutta")
tz.localize(t).tzinfo  # <DstTzInfo 'Asia/Calcutta' IST+5:30:00 STD>
tz.localize(t1).tzinfo # <DstTzInfo 'Asia/Calcutta' LMT+5:53:00 STD>

This is amazing. The developer's attention to detail is remarkable.

But, there's one issue. Our scheduled event is in 2024, why is pytz using a timezone from 1884?

The answer is simple: pytz timezones don't work correctly when passed through datetime's tzinfo argument. Instead, developers are told to use tz.localize(date).

But why does this actually happen?

Pytz is unable to read the date when it's used as such:

datetime.combine(datetime.now(), time(hour=16, minute=00), tzinfo=pytz.timezone("Asia/Calcutta"))

Because the timezone object is created before it's attached to any date, pytz has no idea which offset to use, and it currently defaults to the oldest record (GMT+5:53). As we've seen in our case, that's unsuitable.
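To make the mechanism concrete, here is a simplified model written with the standard library only. It is not pytz's actual implementation: ToyOffset and ToyZone are hypothetical classes, and the two periods are just the figures quoted in this post. The point it sketches is that a timezone attached through tzinfo= is never asked to pick a period for the date, while localize() gets to look at the date first.

```python
from datetime import datetime, timedelta, tzinfo


class ToyOffset(tzinfo):
    """A fixed offset, standing in for one historical period of a zone."""

    def __init__(self, name, offset):
        self._name = name
        self._offset = offset

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return self._name


class ToyZone(ToyOffset):
    """Hypothetical zone that acts as its oldest period until localize() is used."""

    def __init__(self, periods):
        # periods: (first_year, name, offset) tuples, oldest first
        self._periods = periods
        _, name, offset = periods[0]
        super().__init__(name, offset)

    def localize(self, naive):
        # Pick the latest period that had started by the given date
        chosen = self._periods[0]
        for period in self._periods:
            if period[0] <= naive.year:
                chosen = period
        _, name, offset = chosen
        return naive.replace(tzinfo=ToyOffset(name, offset))


calcutta = ToyZone([
    (1, "LMT", timedelta(hours=5, minutes=53)),
    (1948, "IST", timedelta(hours=5, minutes=30)),
])

naive = datetime(2024, 7, 20, 16, 0)
attached = naive.replace(tzinfo=calcutta)  # zone never sees the date
localized = calcutta.localize(naive)       # zone chooses a period for the date

print(attached.utcoffset())   # 5:53:00
print(localized.utcoffset())  # 5:30:00
```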

The Solution?

We could do tz.localize(), but Python 3.9 has a new solution for us.

from zoneinfo import ZoneInfo

Python's new ZoneInfo constructor can act as a replacement for pytz. It is by no means drop-in, and it lacks some convenience features (pytz.timezone("asia/calcutta") works, while ZoneInfo("asia/calcutta") raises an error).

I don't understand its system of offsets either:

>>> asia_calcutta = ZoneInfo("Asia/Calcutta")
>>> asia_calcutta.utcoffset(t)
datetime.timedelta(seconds=19800)  # 2024 - 5 hours, 30 minutes
>>> asia_calcutta.utcoffset(t1)
datetime.timedelta(seconds=19270)  # 1884 - 5 hours, 21 minutes, 10 seconds

The same example in pytz yields the offsets I expected:

>>> tz.utcoffset(t)
datetime.timedelta(seconds=19800)  # 2024 - 5 hours, 30 minutes
>>> tz.utcoffset(t1)
datetime.timedelta(seconds=21180)  # 1884 - 5 hours, 53 minutes

I'll edit this with clarification, when I figure out why this is happening, and whether it's a bug.
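For reference, the raw second counts in the two transcripts decode as follows. This is only arithmetic on the numbers printed above; format_offset is a helper name I made up for the sketch.

```python
from datetime import timedelta


def format_offset(seconds: int) -> str:
    """Render an offset given in seconds as H:MM:SS."""
    return str(timedelta(seconds=seconds))


print(format_offset(19800))  # 5:30:00, both libraries for 2024
print(format_offset(19270))  # 5:21:10, ZoneInfo for 1884
print(format_offset(21180))  # 5:53:00, pytz for 1884

# Size of the disagreement between the two libraries for the 1884 date
print(format_offset(21180 - 19270))  # 0:31:50
```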

Still interested, I gave ZoneInfo a spin:

Roadblocks

  1. tzname Inconsistency:
# ZoneInfo
>>> asia_calcutta.tzname(t)
'IST'
>>> asia_calcutta.tzname(t1)
'MMT'  # Madras Mean Time (GMT+5:21:10) in the tz database, not a Myanmar zone

# pytz
>>> tz.tzname(t)
'IST'
>>> tz.tzname(t1)
'LMT'  # Local Mean Time: solar time at that longitude, used before a standard zone existed
  2. Capitalization: I could pass user-provided timezones to pytz directly. Now, I need to add a parsing step:
from zoneinfo import available_timezones, ZoneInfoNotFoundError

def clean_timezone(timezone: str) -> str:
    """Convert timezone to proper capitalization. asia/calcutta -> Asia/Calcutta."""
    # Map each lowercased key to its canonical spelling for a case-insensitive lookup
    canonical = {tz.lower(): tz for tz in available_timezones()}

    try:
        return canonical[timezone.lower()]
    except KeyError:
        raise ZoneInfoNotFoundError(timezone) from None
    

I'm not a big fan of ZoneInfo so far. I know it's going to break something when I deal with old dates and timezones, but for now things have been (somewhat) smooth sailing.

As long as time doesn't rewind, I suppose!

What did I use?

ZoneInfo:

datetime.combine(
    datetime.today(),
    time(
        hour=16, minute=00
    ),  # Thanks https://stackoverflow.com/a/49828827
    tzinfo=ZoneInfo("Asia/Calcutta"),
)

This is what I settled on, and it works, though I'd rather have used tz.localize.

Thanks for the read!

If you have advice, questions, ideas, or corrections - please reach out at [email protected].


Deployment Data

{
  "LiveURL": "https://www.joinfreedomacademy.com/",
  "Repository": "https://github.com/Freedom-Academy/LatestFreedomBot/"
}