@Ayush10 commented Jan 31, 2026

Summary

  • Fix ValueError on mixed date formats (YYYY-MM-DD vs YYYY-MM-DD HH:MM:SS) by using pd.to_datetime(utc=True), which handles mixed formats across all pandas versions
  • Fix AttributeError: 'Index' object has no attribute 'tz_localize' by switching to tz_convert(None) after the utc=True conversion
  • Fix TypeError: unsupported operand type(s) for /: 'str' and 'float' by applying pd.to_numeric(errors="coerce") before arithmetic on columns that may contain string data read from CSV files
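
For illustration, here is a minimal sketch of the timezone and numeric-coercion patterns described above. It is not the code from collector.py: the frame, its column names (close, adjclose, factor), and the sample values are hypothetical stand-ins for data read from a CSV file.

```python
import pandas as pd

# Hypothetical data standing in for a CSV read: the index holds date
# strings with a UTC offset, and the price columns arrive as strings.
df = pd.DataFrame(
    {
        "close": ["10.0", "11.0", "bad"],
        "adjclose": ["5.0", "5.5", "6.0"],
    },
    index=[
        "2024-01-02 00:00:00+08:00",
        "2024-01-03 00:00:00+08:00",
        "2024-01-04 00:00:00+08:00",
    ],
)

# A plain Index of strings has no tz_localize. Parsing with utc=True
# returns a tz-aware DatetimeIndex, and tz_convert(None) then drops the
# timezone, leaving naive timestamps expressed in UTC.
df.index = pd.to_datetime(df.index, utc=True).tz_convert(None)

# Coerce string columns to numbers before any arithmetic; values that
# cannot be parsed become NaN instead of raising a TypeError later.
for col in ("close", "adjclose"):
    df[col] = pd.to_numeric(df[col], errors="coerce")

# Division now operates on floats; the unparseable row yields NaN.
df["factor"] = df["adjclose"] / df["close"]
```

Note that errors="coerce" silently turns unparseable values into NaN, so downstream code may need to handle or drop those rows explicitly.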

Changes

  • scripts/data_collector/yahoo/collector.py: Fix normalize_yahoo date handling (lines 395-396), add numeric coercion in adjusted_price and _manual_adj_data
  • scripts/data_collector/base.py: Fix Normalize._executor date filtering (line 308)

Test plan

  • The fillna(method="ffill") deprecation warning mentioned in the issue is already fixed in the current codebase (.ffill() is used)
  • Performance improvements and --skip_download are feature requests beyond the scope of this bug fix PR

Fixes #1981

- Use pd.to_datetime(utc=True) + tz_convert(None) to handle mixed date
  formats and timezone-aware/naive inputs in pandas >= 2.0
- Add pd.to_numeric(errors="coerce") before arithmetic on columns that
  may contain string data from CSV reads
- Apply same utc=True fix in base.py Normalize._executor date filtering

Fixes microsoft#1981

Development

Successfully merging this pull request may close these issues.

Multiple Issues in Yahoo Data Collector Causing Failures