Python: Add typical rate limiting handling for model clients #1362

@santiagxf

Description

Every model inference API is rate limited these days, so any practical use of AzureOpenAIChatClient.create_agent needs retry logic, which leads to boilerplate code.

I would like Agent Framework to solve this common pattern so the developer can focus on the rest. A simple implementation using the tenacity library would look as follows (although streaming needs more delicate handling):

import logging

from openai import RateLimitError
from tenacity import (
    RetryCallState,
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
)
from typing_extensions import override

# Import path may differ depending on the Agent Framework package layout.
from agent_framework.azure import AzureOpenAIChatClient

logger = logging.getLogger(__name__)


class AzureOpenAIChatClientWithRetry(AzureOpenAIChatClient):
    """Azure OpenAI Chat Client with built-in retry logic for handling rate limits."""

    retry_attempts = 3
    """Number of retry attempts for rate limit errors."""

    @staticmethod
    def _before_sleep_log(retry_state: RetryCallState) -> None:
        """Log when rate limiting is reached and retry is about to sleep."""
        attempt_number = retry_state.attempt_number
        wait_time = retry_state.next_action.sleep if retry_state.next_action else 0
        logger.warning(
            "Rate limiting reached. Attempt %d failed. Retrying in %.2f seconds...",
            attempt_number,
            wait_time,
        )

    @override
    @retry(
        stop=stop_after_attempt(retry_attempts),
        wait=wait_exponential(multiplier=1, min=4, max=10),
        retry=retry_if_exception_type(RateLimitError),
        reraise=True,
        before_sleep=_before_sleep_log
    )
    def get_response(self, *args, **kwargs):
        """Get response with retry on rate limit errors (429 status code only)."""
        return super().get_response(*args, **kwargs)

Metadata

Labels

agents (Issues related to single agents), model clients (Issues related to the model client implementations), python

Projects

Status

Done
