Notes

Coordinate to Timezone Lookup Library

April 5, 2026

Looking up the timezone for a given latitude/longitude coordinate is a common problem in some software domains. Several databases, like timezone-boundary-builder, provide mappings of geographic polygons to timezones and are updated as political boundaries change. However, the full GeoJSON file is large and complex: naive “is coordinate-in-polygon” lookups can take milliseconds and require hundreds of megabytes of memory. I created the github.com/albertyw/localtimezone project with the goal of providing accurate lat/lng->timezone lookups with microsecond-level responses.

Data Generation

Stage 1: Converting Polygons to H3 Cells

H3 is Uber’s hierarchical hexagonal geospatial indexing system. It divides the world into hexagonal cells at multiple resolutions; at resolution 7, each cell covers about 5.16 square km. Resolution 7 was chosen as a tradeoff: higher resolutions would slow lookups and grow the dataset, while lower resolutions would sacrifice accuracy.

With H3, instead of storing polygon geometries and computing point-in-polygon tests, a lat/lng coordinate can be efficiently converted into an H3 cell, and the cell can be mapped directly to a timezone.

GeoJSON Polygons
      │
      ▼
  h3.PolygonToCells(polygon, resolution=7)
      │
      ▼
  Set of H3 cell IDs for each timezone

The conversion of polygons to H3 cells is done in a go generate function, and the resulting H3->timezone mapping is committed into the library. The generator (tzshapefilegen/main.go) downloads the latest timezone GeoJSON release, processes each feature in parallel via goroutines, and collects all the H3 cell IDs for each timezone.

Stage 2: H3 Cell Compaction

A resolution-7 grid of the entire land surface of Earth produces millions of cells. Storing all of them is wasteful, because many neighboring cells belong to the same timezone. H3 provides a compaction operation that exploits its hierarchical structure: when all 7 children of a parent cell map to the same timezone, they can be replaced by their single parent cell at the next coarser resolution.

Before compaction:        After compaction:
  7 children cells          1 parent cell (resolution 6)
  at resolution 7
  ┌───┬───┬───┐             ┌─────────────┐
  │ A │ A │ A │             │      A      │
  ├───┼───┼───┤      →      │   (parent)  │
  │ A │ A │ A │             └─────────────┘
  ├───┼───┤
  │ A │
  └───┘

(Imagine these are hexagons, not rectangles)

This recursively compresses large homogeneous regions like Asia/Shanghai and Europe/Moscow into much smaller representations. The result is a mixed-resolution set of cells: fine-grained cells near timezone borders, coarser cells across wide uniform regions. In localtimezone, H3 compaction yields a 97% reduction in cell count (from 32,886K cells to 896K).

Stage 3: The Binary Format

To let the client load the H3 data quickly, the H3->timezone mapping is serialized into a compact custom binary format called H3TZ:

Offset  Size   Field
──────────────────────────────────────────
0       4      Magic: "H3TZ"
4       1      Version (1)
5       1      H3 resolution used
6       2      Number of timezone names (uint16)
8       ...    String table: [uint16 len][bytes] per timezone name
...     4      Cell count (uint32)
...     N×10   Cell entries: [int64 cell ID][uint16 tz index]

The string table stores timezone names once, and each cell entry references a name by its uint16 index. Each cell entry is a fixed 10 bytes.

The cells array is sorted by cell ID before writing. This is critical for the binary search used at query time.

Stage 4: S2 Compression

The raw binary data is then compressed using S2, a high-speed compression algorithm derived from Snappy. S2 is optimized for decompression throughput rather than maximum compression ratio — the entire dataset is decompressed once at client initialization, so decompression speed matters more than final file size. See the previous post about Go Compression Benchmark Results for more information.

Stage 5: Embedding in the Binary

The compressed data file is embedded at compile time:

//go:embed data.h3.s2
var TZData []byte

There is no file I/O at runtime. The timezone data ships inside the Go binary itself. The only cost is binary size and the one-time decompression at startup.

Library Initialization and Lookup

Initialization: Decompression and Indexing

When NewLocalTimeZone() is called, it:

  1. Decompresses the S2-compressed blob
  2. Parses the H3TZ binary header and string table
  3. Bulk-reads all cell entries into two parallel slices: cells []int64 (sorted cell IDs) and tzIdx []uint16 (timezone name indices)
  4. Stores the result in an atomic.Pointer[immutableCache]

S2-compressed bytes  →  decompress  →  parse H3TZ  →  immutableCache
                                                         ├── tzNames []string
                                                         ├── cells   []int64   (sorted)
                                                         └── tzIdx   []uint16

The immutable cache makes the client safe for concurrent access without locks. The atomic.Pointer allows the cache to be replaced atomically if new data is loaded, though in practice the embedded data never changes at runtime.

Client startup takes about 5ms and uses ~17MB of RAM.

Lookup: Binary Search with Multi-Resolution Fallback

A timezone lookup for a (lat, lon) point works like this:

  1. Convert point to H3 cell at resolution 7
  2. For each resolution from 7 down to 0:
     a. Compute the cell at this resolution (the cell itself or an ancestor)
     b. Binary search in the sorted cells array
     c. Scan forward to collect all entries with that cell ID (for overlapping zones)
  3. Fallback: Compute timezone from neighboring H3 cells or based on longitude

Step 2 handles the compacted cells. If a point’s exact resolution-7 cell isn’t in the table, its resolution-6 parent might be (because that whole region was compacted). The lookup scans all resolutions, collecting matches at each level to handle overlapping zones.

The binary search is O(log N) over the sorted cells array. With ~900K cells, this is still only about 20 comparisons. The actual measured response time is around 1 microsecond per lookup.

Benchmark Numbers

BenchmarkGetZone/GetZone_on_large_cities       989595     1205 ns/op
BenchmarkGetZone/GetOneZone_on_large_cities    1000000    1006 ns/op
BenchmarkClientInit/main_client                247        4772720 ns/op   (~5ms)

About a million lookups per second per core, with a ~5ms cold start.

Architecture Summary

The localtimezone library uses these techniques to speed up initialization and lookup:

  1. Preprocessing polygon geojsons into H3 for O(log N) cell-based lookups
  2. Compacting H3 cells to minimize memory usage and reduce binary search space
  3. Serializing H3 data into a binary format for faster loading
  4. Compression with S2 for fast decompression
  5. Binary search over a sorted cell array for fast lookups

By combining these techniques, the library maintains acceptable accuracy (down to a 5 square kilometer resolution) while returning timezone values for anywhere in the world in microseconds, using ~17MB of memory.