@Ayush10 Ayush10 commented Feb 1, 2026

Summary

  • Normalizes all numpy.datetime64 values to nanosecond (ns) precision in Index.__init__, so that dict lookups and set operations hash them consistently
  • Fixes KeyError when multiple Index objects with different datetime64 precisions interact in concat, sum_by_index, __or__, and _align_indices
  • Adds comprehensive test covering cross-precision lookups, arithmetic, concat, sum_by_index, to_dict, and reindex

Root Cause

numpy.datetime64 values with different precisions (e.g. 'ns' vs 's') compare as equal (== returns True) but produce different hashes. Since Index uses a dict for index_map and several operations use set(), mismatched precisions cause KeyError failures even though the datetime values are logically identical.
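The mismatch can be reproduced outside the library with a short script. This is only an illustrative sketch: the variable names are made up, `index_map` here is a plain dict standing in for the one held by Index, and the hash and membership results depend on the installed NumPy version, so they are printed rather than asserted.

```python
import numpy as np

# The same instant expressed at two different precisions.
a = np.datetime64("2020-01-01T00:00:00", "s")
b = np.datetime64("2020-01-01T00:00:00", "ns")

# Equality ignores the unit, so the two values compare as equal.
print("equal:", a == b)

# The hashes may or may not agree, depending on the NumPy version.
print("same hash:", hash(a) == hash(b))

# A plain dict keyed by the second-precision value (stand-in for index_map).
index_map = {a: 0}
print("ns key found:", b in index_map)

# After casting both sides to a common unit the values are the same
# dtype and the same underlying integer, so they always hash alike.
a_ns = a.astype("datetime64[ns]")
print("normalized key found:", a_ns in {b: 0})
```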

Fix

A single normalization line in Index.__init__ converts datetime64 arrays to datetime64[ns], the default precision used by pandas. This fixes all downstream operations at the source rather than patching each one individually.
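The normalization step might look roughly like the sketch below. It is not the code from this PR: `normalize_datetime64`, `build_index_map`, and `lookup` are hypothetical helpers written for illustration, and normalizing the lookup key as well is an assumption made here so the example is self-contained.

```python
import numpy as np


def normalize_datetime64(values):
    """Hypothetical helper: cast datetime64 data to ns, leave other dtypes alone."""
    arr = np.asarray(values)
    if np.issubdtype(arr.dtype, np.datetime64):
        return arr.astype("datetime64[ns]")
    return arr


def build_index_map(values):
    """Hypothetical stand-in for the dict that Index keeps as index_map."""
    return {v: i for i, v in enumerate(normalize_datetime64(values))}


def lookup(index_map, key):
    """Assumption: datetime64 lookup keys are normalized the same way."""
    if isinstance(key, np.datetime64):
        key = key.astype("datetime64[ns]")
    return index_map[key]


# Index built from second-precision values, queried with a day-precision key.
secs = np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[s]")
index_map = build_index_map(secs)
position = lookup(index_map, np.datetime64("2020-01-02", "D"))
print(position)
```

Because every key is stored at the same unit, the dict never has to rely on hashes agreeing across units, which is why a fix at construction time covers concat, sum_by_index, and the other operations listed above.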

Test plan

  • New test_datetime64_precision_normalization test covering all affected code paths
  • All existing test_index_data.py tests pass
  • CI pipeline passes

Fixes #1806

…tent hashing

numpy.datetime64 values with different precisions (e.g. 'ns' vs 's') compare
as equal but produce different hashes, breaking dict lookups and set operations
in Index, concat, sum_by_index, and _align_indices.

Fixes microsoft#1806

Development

Successfully merging this pull request may close these issues.

numpy.datetime64 precision cause dict indexing failure in index_data.py
