Support engine-aware metric collection in EPP #2096

@bongwoobak

What would you like to be added:
Support for engine-aware metric collection in the EPP, allowing a single InferencePool to include Pods running different inference engines (e.g., vLLM, SGLang) while still collecting the correct metrics per Pod.

This would enable use cases such as:

  • A/B testing between inference engines
  • Gradual engine migration within the same pool
  • Mixed-engine pools with a unified routing layer

One possible design direction could include:

  • A Mapping Registry that defines metric mappings per engine type
  • Runtime selection of metric mappings based on Pod labels (e.g., inference.k8s.io/engine-type)
  • Optional configuration flags to enable multi-engine support
  • Backward compatibility where existing single-engine setups continue to work as a default mapping

(This is just one possible approach, and open to discussion.)
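The design direction above can be sketched roughly as follows. This is a minimal illustration in Python, not the EPP's actual code: `MetricMapping`, `MappingRegistry`, and `for_pod` are hypothetical names, and only the label key and the two metric names come from this issue. Falling back to the default mapping for unlabeled or unknown engines is one way to keep existing single-engine pools working unchanged.

```python
from dataclasses import dataclass
from typing import Dict, Mapping

# Label key taken from the proposal; everything else here is hypothetical.
ENGINE_LABEL = "inference.k8s.io/engine-type"


@dataclass(frozen=True)
class MetricMapping:
    """Hypothetical: maps a logical metric to an engine-specific metric name."""
    queued_requests: str


class MappingRegistry:
    """Hypothetical registry of metric mappings keyed by engine type."""

    def __init__(self, default_engine: str, mappings: Dict[str, MetricMapping]):
        if default_engine not in mappings:
            raise ValueError(f"no mapping registered for default engine {default_engine!r}")
        self._default_engine = default_engine
        self._mappings = dict(mappings)

    def for_pod(self, pod_labels: Mapping[str, str]) -> MetricMapping:
        """Select the mapping from the Pod's engine-type label.

        Pods without the label, or with an unregistered engine type,
        get the default mapping (assumed behavior for backward compatibility).
        """
        engine = pod_labels.get(ENGINE_LABEL, self._default_engine)
        return self._mappings.get(engine, self._mappings[self._default_engine])


# Example registry: vLLM as the default, SGLang as an additional engine.
registry = MappingRegistry(
    default_engine="vllm",
    mappings={
        "vllm": MetricMapping(queued_requests="vllm:num_requests_waiting"),
        "sglang": MetricMapping(queued_requests="sglang:num_queue_reqs"),
    },
)

if __name__ == "__main__":
    print(registry.for_pod({ENGINE_LABEL: "sglang"}).queued_requests)
    print(registry.for_pod({}).queued_requests)
```

Whether an unknown engine type should fall back to the default or be reported as an error is an open design question; the sketch picks the fallback only to show the backward-compatible path.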

Why is this needed:
Currently, the EPP assumes that all Pods in an InferencePool expose identical metric names. This assumption breaks down in real-world scenarios where:

  • Different inference engines expose different metric schemas
  • Users want to compare or migrate engines without splitting pools

For example, the number of queued requests is exposed under a different name by each engine:

  • vLLM: vllm:num_requests_waiting
  • SGLang: sglang:num_queue_reqs

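When such Pods are mixed in one pool, a scraper that expects a single set of metric names cannot find the metrics on Pods running a different engine. Metric collection then fails or becomes unreliable, which prevents effective routing decisions.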
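To make the failure mode concrete, here is a small Python sketch that reads a gauge from Prometheus text-format output. The `read_gauge` helper and the two sample payloads are made up for illustration (the sample values and the `model_name` label are assumptions, not real engine output); only the metric names come from this issue.

```python
from typing import Optional


def read_gauge(exposition: str, metric_name: str) -> Optional[float]:
    """Return the first sample value for metric_name, or None if it is absent."""
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like: name{labels} value [timestamp]
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name != metric_name:
            continue
        # Take the text after the label set (if any), then the first field.
        rest = line.rsplit("}", 1)[-1] if "{" in line else line[len(name):]
        fields = rest.split()
        if fields:
            return float(fields[0])
    return None


# Illustrative payloads only; values and labels are invented for this example.
VLLM_SAMPLE = (
    "# TYPE vllm:num_requests_waiting gauge\n"
    'vllm:num_requests_waiting{model_name="demo"} 3.0\n'
)
SGLANG_SAMPLE = (
    "# TYPE sglang:num_queue_reqs gauge\n"
    'sglang:num_queue_reqs{model_name="demo"} 5.0\n'
)

if __name__ == "__main__":
    # A scraper hard-coded to the vLLM name finds nothing on an SGLang Pod.
    print(read_gauge(SGLANG_SAMPLE, "vllm:num_requests_waiting"))
    # Using the engine-specific name recovers the value.
    print(read_gauge(SGLANG_SAMPLE, "sglang:num_queue_reqs"))
```

With a single hard-coded name the lookup returns `None` for the other engine's Pods, which is the gap an engine-aware mapping would close.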

Supporting engine-aware metric collection would make InferencePool more flexible and better aligned with real deployment patterns.

If this direction aligns with the project goals, I’m happy to follow up with a PR or contribute an initial implementation for discussion.

Labels: needs-triage
