Releases · mlflow/mlflow

06 Apr 05:49

TomeHirata

model-catalog/latest

257aa57

Model Catalog

Per-provider model catalog files. Updated weekly by CI.

13 Apr 09:42

B-Step62

ts/v0.2.0-rc.1

db9142d

TypeScript SDK 0.2.0 RC1 Pre-release

Release candidate for @mlflow/vercel TypeScript package with version 0.2.0: #22105

08 Apr 05:29

WeichenXu123

v3.11.1

09179c6

v3.11.1 Latest

MLflow 3.11.1 includes several major features and improvements.

Major New Features:

🔍 Automatic Issue Identification: Automatically identify quality issues in your agent with AI! Use the new "Detect Issues" button in the traces table to analyze selected traces and surface potential problems across categories like correctness, safety, and performance. Issues are linked directly to traces for easy investigation and debugging. Docs (#21431, #21204, #21165, #21163, #21161, @smoorjani, @serena-ruan)
💰 Gateway Budget Alerts & Limits: Control your AI Gateway spending with configurable budget policies! Set spending limits by time window (daily, weekly, or monthly), receive alerts before hitting limits, and prevent runaway costs with automatic request blocking. The new budget management UI lets you track spending, configure webhooks for notifications, and monitor violations across all your gateway endpoints. Docs (#21116, #21534, #21569, #21473, #21108, @TomeHirata, @copilot-swe-agent)
📊 Trace Graph View: Visualize complex trace hierarchies with an interactive graph view! Navigate multi-level trace structures, understand parent-child relationships at a glance, and debug complex systems more effectively with a visual representation of your trace topology. Docs (#20607, @joelrobin18)
🌐 Native OpenTelemetry GenAI Convention Support: MLflow now natively supports the OpenTelemetry GenAI Semantic Conventions for trace export! When exporting traces via OTLP with MLFLOW_ENABLE_OTEL_GENAI_SEMCONV enabled, MLflow automatically translates them to follow the OTel GenAI semantic conventions, enabling seamless integration with OTel-compatible observability platforms while preserving GenAI-specific metadata. Docs (#21494, #21495, @B-Step62)
🔧 OpenCode Tracing Integration: Debug smarter with OpenCode CLI integration! Track and analyze code execution flows directly from your development workflow, making it easier to identify performance bottlenecks and trace issues back to specific code paths. Docs (#20133, @joelrobin18)
⚡ Native UV Support for Model Dependencies: Automatic dependency inference now supports UV! MLflow automatically detects UV projects and captures exact, locked dependencies from your lockfile when logging models, ensuring reproducible environments. Docs (#20344, #20935, @debu-sinha)
🔒 Pickle-Free Model Serialization: Enhance security with pickle-free model formats! MLflow now supports safer model serialization using torch.export and skops formats, with improved controls when MLFLOW_ALLOW_PICKLE_DESERIALIZATION=False. Comprehensive documentation guides you through migrating existing models to pickle-free formats for production deployments. Docs (#21404, #21188, #20774, @WeichenXu123)
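The two environment variables named in the list above can be set before MLflow is imported. The following is a minimal sketch, not MLflow's own configuration code. It assumes the flags accept the string values "true" and "false": the notes show MLFLOW_ALLOW_PICKLE_DESERIALIZATION=False but do not state the accepted values for the OTel flag. `apply_flags` and `RELEASE_FLAGS` are hypothetical names introduced for illustration.

```python
import os

# Flag names come from the release notes; the string values are assumptions.
RELEASE_FLAGS = {
    "MLFLOW_ENABLE_OTEL_GENAI_SEMCONV": "true",
    "MLFLOW_ALLOW_PICKLE_DESERIALIZATION": "false",
}


def apply_flags(flags, environ=None):
    """Copy flags into the environment without overwriting existing values."""
    environ = os.environ if environ is None else environ
    for name, value in flags.items():
        # setdefault keeps any value the operator has already exported.
        environ.setdefault(name, value)
    return {name: environ[name] for name in flags}


if __name__ == "__main__":
    print(apply_flags(RELEASE_FLAGS))
```

Using `setdefault` means a value already exported in the shell takes precedence over the defaults in the script.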

Breaking Changes:

⚠️ TypeScript SDK Package Renaming: The MLflow TypeScript SDK packages have been renamed to use npm organization scoping. If you're using the TypeScript SDK, update your package.json dependencies and import statements: mlflow-tracing → @mlflow/core, mlflow-openai → @mlflow/openai, mlflow-anthropic → @mlflow/anthropic, mlflow-gemini → @mlflow/gemini. All packages are now at version 0.2.0. (#20792, @B-Step62)
Remove MLFLOW_ENABLE_INCREMENTAL_SPAN_EXPORT environment variable (#22182, @PattaraS)
Remove litellm and gepa from genai extras (#22059, @TomeHirata)
Block / and : in Registered Model names (#21458, @Bhuvan-08)
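The TypeScript SDK rename above amounts to a fixed old-to-new mapping, so the package.json side of the migration can be scripted. This is a rough sketch under stated assumptions: `rename_dependencies` is a hypothetical helper, and pinning the exact version "0.2.0" is an assumption (a range may suit some projects better). Import statements in source files still need to be updated separately.

```python
import json

# Old-to-new package names, as listed in the breaking-change note.
RENAMES = {
    "mlflow-tracing": "@mlflow/core",
    "mlflow-openai": "@mlflow/openai",
    "mlflow-anthropic": "@mlflow/anthropic",
    "mlflow-gemini": "@mlflow/gemini",
}


def rename_dependencies(package_json, new_version="0.2.0"):
    """Return a copy of a parsed package.json with the SDK packages renamed."""
    updated = dict(package_json)
    for section in ("dependencies", "devDependencies"):
        deps = package_json.get(section)
        if deps is None:
            continue
        # Renamed packages get the new version; all others keep their spec.
        updated[section] = {
            RENAMES.get(name, name): (new_version if name in RENAMES else spec)
            for name, spec in deps.items()
        }
    return updated


if __name__ == "__main__":
    sample = {"dependencies": {"mlflow-tracing": "0.1.0", "left-pad": "1.3.0"}}
    print(json.dumps(rename_dependencies(sample), indent=2))
```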

Features:

[Evaluation] Allow MetaPromptOptimizer to work without litellm (#22233, @TomeHirata)
[Tracking] Update Databricks API calls to use new gRPC APIs instead of py4j APIs (#22205, @WeichenXu123)
[Build] Add aiohttp as a core dependency of mlflow (#22189, @TomeHirata)
[Evaluation] Extend _get_provider_instance with groq, deepseek, xai, openrouter, ollama, databricks, vertex_ai (#22148, @kriscon-db)
[UI] Move native providers to non-LiteLLM in gateway UI (#22203, @TomeHirata)
[Tracing / Tracking] Add trace_location parameter to create_experiment (#22075, @dbrx-euirim)
[Gateway] Complete Bedrock provider with Converse API support (#21999, @TomeHirata)
[Gateway] Add native Vertex AI gateway provider (#21998, @TomeHirata)
[Gateway] Add native Databricks gateway provider (#21997, @TomeHirata)
[Gateway] Add native Ollama gateway provider (#21995, @TomeHirata)
[Gateway] Add native xAI (Grok) gateway provider (#21993, @TomeHirata)
[Tracing] Use bulk upsert in log_spans() to eliminate per-span ORM overhead (#21954, @harupy)
[Tracing] Add builtin cost_per_token to remove litellm dependency for cost tracking (#22046, @TomeHirata)
[Evaluation] Remove LiteLLM hard dependency from the discovery pipeline and judge adapters (#21739, @harupy)
[Evaluation] Add pipelined predict-score execution for mlflow.genai.evaluate (#20940, @alkispoly-db)
[Tracing / Tracking] Default trace location table_prefix to experiment ID in set_experiment (#21815, @danielseong1)
[Tracking] Add default uvicorn log config with timestamps (#21838, @harupy)
[Tracing / UI] Add Session ID filter to GenAI traces table filter dropdown (#21794, @daniellok-db)
[Evaluation / UI] Add Default Credential Chain auth mode for Bedrock/SageMaker in AI Gateway (#21061, @timsolovev)
[UI] Add multi metric bar chart support (#21258, @RenzoMXD)
[Tracking] Add TCP keepalive to HTTP sessions to detect stale connections and reduce timeout hangs (#21514, @mobaniha)
[Evaluation] Add proxy URL support for make_judge (#21185, @yukimori)
[UI] Improve run group filter to use grouping criteria instead of run IDs (#21072, @daniellok-db)
[UI] Add tool selector to Tool Calls charts and fix dark mode/sizing (#20865, @B-Step62)
[UI] Graph View Traces + OpenAI (#20607, @joelrobin18)
[UI] Show run description in chart tooltip (#21580, @KaushalVachhani)
[Evaluation / Tracing / UI] Add bulk judge execution from traces table toolbar with status feedback (#21270, @PattaraS)
[Gateway] Add Redis-backed BudgetTracker for distributed gateway deployments (#21504, @TomeHirata)
[Tracing / Tracking] Add trace location param to set_experiment (#21385, @danielseong1)
[Build / Tracking] Add azure extra for Azure Blob Storage support in full Docker image (#21582, @harupy)
[UI] Add budget violation indicator to gateway budget list page (#21569, @copilot-swe-agent)
[Evaluation] [5/5] Add discover_issues() pipeline and public API (#21431, @smoorjani)
[UI] Add Structured Output (JSON Schema) Support to the MLflow Prompts UI (#21394, @kennyvoo)
[Tracing] Auto-inject tracing context headers in autologging (#21490, @TomeHirata)
[UI] Add budget alert webhooks UI and fix budgets table borders (#21534, @TomeHirata)
[Model Registry / Prompts / UI] Add webhooks management UI to settings page (#21483, @TomeHirata)
[Tracing] Opencode CLI (#20133, @joelrobin18)
[Models] Add uv_groups and uv_extras params for uv dependency group control (#20935, @debu-sinha)
[Tracing] Add GenAI Semantic Convention translator for OTLP trace export (#21494, @B-Step62)
[Tracking] Add polars dataset support to autologging (#21507, @harupy)
[Tracing] Add mlflow.tracing.context() API for injecting metadata/tags without wrapper spans (#21318, @B-Step62)
[UI] Add budget dates and current spending for gateway budgets (#21473, @TomeHirata)
[Tracing / UI] Improve DSPy trace chat view readability (#21296, @B-Step62)
[UI] Add Kubernetes request auth provider plugin (#21176, @HumairAK)
[Tracking] Add IS NULL/IS NOT NULL support for tags and params in search_runs (#21283, @TomeHirata)
[Tracing / UI] Display clickable gateway trace link in trace explorer (#21316, @TomeHirata)
[UI] Add session selection support with checkbox, actions, and row alignment (#21324, @B-Step62)
[Models] Add UV package manager support for automatic dependency inference (#20344, @debu-sinha)
[Evaluation / UI] Add feature flag to control evaluation runs issues panel visibility (#21406, @serena-ruan)
[Tracing / UI] Add cached tokens display to Token Usage chart (#21295, @TomeHirata)
[UI] Add budget policies management UI for AI Gateway (#21116, @TomeHirata)
[UI] Allow multiple judge selection in Run judge on trace modal (#21322, @B-Step62)
[Docs / Tracking] Add admin-only authorization to webhook CRUD operations (#21271, @TomeHirata)
[Evaluation / Tracking] Add SqlIssue database table for storing experiment issues (#21165, @serena-ruan)
[Model Registry / Prompts] Support search_prompt_versions in OSS SQLAlchemy store (#21315, @TomeHirata)
[Evaluation / Tracing / UI] Add issue detection button to traces table toolbar with feature flag (#21204, @serena-ruan)
[Docs / Tracing / UI] Add inline audio player for input_audio content parts in trace UI (#21302, @TomeHirata)
[Evaluation / Tracing] Add IssueReference assessment type to store issue links with traces (#21163, @serena-ruan)
[Evaluation / Tracing] Add issue management protos with create, update, get, and search APIs (#21161, @serena-ruan)
[UI] Add IS NULL/IS NOT NULL operators for trace tags in search UI (#21280, @TomeHirata)
[Docs / Tracing] Add IS NULL/IS NOT NULL support for trace tags in search_traces (#21277, @TomeHirata)
[Tracing] Add steer message tracing support for Claude Code (#21265, @harupy)
[Models / Tracking] Add support for transformers 5.x (#20728, @KUrushi)
[Gateway] Add WEE...

Contributors

MarkVasile, amotl, and 61 other contributors

01 Apr 09:14

B-Step62

v3.11.0rc1

2627607

v3.11.0rc1 Pre-release

Stripped third-party dependencies from evaluation and AI Gateway features, replacing external provider routing with built-in implementations.

16 Mar 14:02

serena-ruan

v3.11.0rc0

cf3bbbf

v3.11.0rc0 Pre-release

We're excited to announce MLflow 3.11.0rc0, which includes several notable updates:

Major New Features:

🔍 Automatic Issue Identification: Automatically identify quality issues in your agent with AI! Use the new "Detect Issues" button in the traces table to analyze selected traces and surface potential problems across categories like correctness, safety, and performance. Issues are linked directly to traces for easy investigation and debugging. (#21431, #21204, #21165, #21163, #21161, @smoorjani, @serena-ruan)
💰 Gateway Budget Alerts & Limits: Control your AI Gateway spending with configurable budget policies! Set spending limits by time window (daily, weekly, or monthly), receive alerts before hitting limits, and prevent runaway costs with automatic request blocking. The new budget management UI lets you track spending, configure webhooks for notifications, and monitor violations across all your gateway endpoints. (#21116, #21534, #21569, #21473, #21108, @TomeHirata, @copilot-swe-agent)
📊 Trace Graph View: Visualize complex trace hierarchies with an interactive graph view! Navigate multi-level trace structures, understand parent-child relationships at a glance, and debug complex systems more effectively with a visual representation of your trace topology. (#20607, @joelrobin18)
🌐 Native OpenTelemetry GenAI Convention Support: MLflow now natively supports the OpenTelemetry GenAI Semantic Conventions for trace export! When exporting traces via OTLP with MLFLOW_ENABLE_OTEL_GENAI_SEMCONV enabled, MLflow automatically translates them to follow the OTel GenAI semantic conventions, enabling seamless integration with OTel-compatible observability platforms while preserving GenAI-specific metadata. (#21494, #21495, @B-Step62)
🔧 Opencode Tracing Integration: Debug smarter with Opencode CLI integration! Track and analyze code execution flows directly from your development workflow, making it easier to identify performance bottlenecks and trace issues back to specific code paths. (#20133, @joelrobin18)
⚡ UV Package Manager Support: Automatic dependency inference now supports UV! MLflow automatically detects UV projects and captures exact, locked dependencies from your lockfile when logging models, ensuring reproducible environments. (#20344, #20935, @debu-sinha)
🔒 Pickle-Free Model Serialization: Enhance security with pickle-free model formats! MLflow now supports safer model serialization using torch.export and skops formats, with improved controls when MLFLOW_ALLOW_PICKLE_DESERIALIZATION=False. Comprehensive documentation guides you through migrating existing models to pickle-free formats for production deployments. (#21404, #21188, #20774, @WeichenXu123)
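The budget behavior described in the list above (a limit per time window, an alert before the limit, and blocking once it is reached) can be pictured with a small model. This is an illustrative sketch only, not MLflow's implementation or API. `BudgetPolicy`, `check_budget`, and the 80% alert threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BudgetPolicy:
    """Hypothetical policy: a spend limit for one time window."""
    window: str               # "daily", "weekly", or "monthly"
    limit_usd: float          # hard limit; requests are blocked at or above it
    alert_ratio: float = 0.8  # assumed alert threshold, as a fraction of the limit


def check_budget(policy, spent_usd):
    """Classify current spend in the window as 'ok', 'alert', or 'blocked'."""
    if spent_usd >= policy.limit_usd:
        return "blocked"
    if spent_usd >= policy.limit_usd * policy.alert_ratio:
        return "alert"
    return "ok"


if __name__ == "__main__":
    policy = BudgetPolicy(window="daily", limit_usd=100.0)
    for spent in (10.0, 85.0, 120.0):
        print(spent, check_budget(policy, spent))
```

In a real deployment the "alert" state would presumably trigger a webhook notification and the "blocked" state would reject the request; the sketch only shows the classification step.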

Breaking Changes:

⚠️ TypeScript SDK Package Renaming: The MLflow TypeScript SDK packages have been renamed to use npm organization scoping. If you're using the TypeScript SDK, update your package.json dependencies and import statements: mlflow-tracing → @mlflow/core, mlflow-openai → @mlflow/openai, mlflow-anthropic → @mlflow/anthropic, mlflow-gemini → @mlflow/gemini. All packages are now at version 0.2.0. (#20792, @B-Step62)

Stay tuned for the full release, which will be packed with even more features and bugfixes.

To try out this release candidate, please run:

pip install mlflow==3.11.0rc0

Contributors

debu-sinha, smoorjani, and 6 other contributors

05 Mar 14:47

daniellok-db

v3.10.1

cadc323

v3.10.1

MLflow 3.10.1 is a patch release that contains some minor feature enhancements, bug fixes, and documentation updates.

Features:

[UI] Add try-it page on Gateway usage example modal (#21077, @PattaraS)
[UI] Filter gateway experiments from the experiment list page (#21130, @copilot-swe-agent)

Bug fixes:

[UI] Fix "View full dashboard" link in gateway usage tab when workspace is enabled (#21191, @copilot-swe-agent)
[UI] Persist AI Gateway default passphrase security banner dismissal to localStorage (#21292, @copilot-swe-agent)
[Evaluation] Demote unused parameters log message from WARNING to DEBUG in instructions judge (#21294, @copilot-swe-agent)
[UI] Clear "All" time selector when switching to overview tab (#21371, @daniellok-db)
[Prompts / UI] Fix Traces view in Prompts tab not being scrollable (#21282, @TomeHirata)
[UI] Fix judge builder instruction textarea (#21299, @daniellok-db)
[UI] Fix group mode to aggregate "Additional runs" as "Unassigned" group in charts (#21155, @copilot-swe-agent)
[UI] Fix artifact download when workspaces are enabled (#21074, @timsolovev)
[Tracing] Fix NOT NULL constraint on assessments.trace_id during trace export (#21348, @dbczumar)
[Tracking] Fix 403 Forbidden for artifact list via query param when default_permission=NO_PERMISSIONS (#21220, @copilot-swe-agent)
[UI] [ML-63097] Fix broken LLM judge documentation links (#21347, @smoorjani)
[Tracing] Fix Run Judge failed with litellm.InternalServerError: Invalid response object. (#21262, @PattaraS)
[Tracing / UI] Update Action menu: indentation to avoid confusion (#21266, @PattaraS)
[Model Registry] Fix MlflowClient.copy_model_version when copying a UC model across workspaces (#21212, @WeichenXu123)
[UI] Fix empty description box rendering for sanitized-empty experiment descriptions (#21223, @copilot-swe-agent)
[Artifacts] Fix single artifact downloading through HttpArtifactRepository (#12955, @Koenkk)
[Tracing] Fix find_last_user_message_index skipping skill content injections (#21119, @alkispoly-db)
[Tracing] Fix retrieval context extraction when span outputs are stored as strings (#21213, @smoorjani)
[UI] Fix visibility toggle button in chart tooltip not working (#21071, @daniellok-db)
[UI] Move gateway experiment filtering to server-side query to fix inconsistent page sizes (#21138, @copilot-swe-agent)
[Gateway] Downgrade spurious warning to debug log for gateway endpoints with fallback_config but no FALLBACK models (#21123, @copilot-swe-agent)
[Tracing] Fix MCP fn_wrapper to pass None for optional params with UNSET defaults (#21051, @yangbaechu)
[Tracking] Add CASCADE to logged_model tables experiment_id foreign keys (#20185, @harupy)
[Tracing] Fix MCP fn_wrapper handling of Click UNSET defaults (#20953) (#20962, @yangbaechu)

Documentation updates:

[Docs] Update SSO oidc plugin doc: add google identity platform / AWS cognito / Azure Entra ID configuration guide (#20591, @WeichenXu123)
[Docs / Tracing] Fix distributed tracing rendering and improve doc (#21070, @B-Step62)
[Docs] docs: Add single quotes to install commands with extras to prevent zsh errors (#21227, @mshavliuk)
[Docs / Model Registry] Fix outdated docstring claiming models:/ URIs are unsupported in register_model (#21197, @copilot-swe-agent)
[Docs] Replace MinIO with RustFS in docker-compose setup (#21099, @jmaggesi)

Small bug fixes and documentation updates:

#20740, #21148, #21149, #21096, @TomeHirata; #21368, #21118, @B-Step62; #21384, #21345, #21236, #21106, #21033, #21115, #21034, @smoorjani; #21326, #21133, #21036, @copilot-swe-agent; #21293, @daniellok-db; #21175, @caponetto; #21305, #21264, @serena-ruan; #21216, @justinwei-db; #21038, #21082, @bbqiu; #21143, #20733, @mprahl; #20488, @mdalvz0000; #21142, @EPgg92; #21094, @PattaraS

Contributors

caponetto, jmaggesi, and 20 other contributors

20 Feb 16:05

daniellok-db

v3.10.0

d0b9741

v3.10.0

We're excited to announce MLflow 3.10.0, which includes several notable updates:

Major New Features:

🏢 Organization Support in MLflow Tracking Server: MLflow now supports multi-workspace environments. Users can group experiments, models, and prompts into workspaces, a coarser-grained unit of organization, and logically isolate them within a single tracking server. (#20702, #20657, @mprahl, @Gkrumbach07, @B-Step62)

💬 Multi-turn Evaluation & Conversation Simulation: MLflow now supports multi-turn evaluation, including evaluating existing conversations with session-level scorers and simulating conversations to test new versions of your agent, without the toil of regenerating conversations. Use the session-level scorers introduced in MLflow 3.8.0 and the brand new session UIs to evaluate the quality of your conversational agents and enable automatic scoring to monitor quality as traces are ingested. (#20243, #20377, #20289, @smoorjani)

💰 Trace Cost Tracking: Gain visibility into your LLM spending! MLflow now automatically extracts model information from LLM spans and calculates costs, with a new UI that renders model and cost data directly in your trace views. (#20327, #20330, @serena-ruan)

🎯 Navigation bar redesign: We've redesigned the navigation to provide a frictionless experience. A new workflow type selector in the top-level navbar lets you quickly switch between GenAI and Classical ML contexts, with streamlined sidebars that reduce visual clutter. (#20158, #20160, #20161, #20699, @ispoljari, @daniellok-db)

🎮 MLflow Demo Experiment: New to MLflow GenAI? With one click, launch a pre-populated demo and explore tracing, evaluation, and prompt management in action. No configuration, no code required. (#19994, #19995, #20046, #20047, #20048, #20162, @BenWilson2)

📊 Gateway Usage Tracking: Monitor your AI Gateway endpoints with detailed usage analytics. A new usage page shows request patterns and metrics, with trace ingestion that links gateway calls back to your experiments for end-to-end observability. (#20357, #20358, #20642, @TomeHirata)

⚡ In-UI Trace Evaluation: Users can now run custom or pre-built LLM judges directly from the traces and sessions UI. This enables quick evaluation of individual traces and sessions without context switching to the Python SDK. (#20360, @hubertzub-db, @danielseong1)
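The Trace Cost Tracking item above describes deriving a cost from the model name and token counts found on an LLM span. The arithmetic can be sketched as follows. This is not MLflow's implementation: `TokenUsage`, `span_cost`, the model name, and the prices are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    input_tokens: int
    output_tokens: int


# Made-up prices in USD per one million tokens, keyed by a made-up model name.
PRICES_PER_MILLION = {"example-model": {"input": 2.0, "output": 8.0}}


def span_cost(model, usage, prices=PRICES_PER_MILLION):
    """Return the USD cost of one LLM span, or None if the model is unpriced."""
    price = prices.get(model)
    if price is None:
        return None
    total = (usage.input_tokens * price["input"]
             + usage.output_tokens * price["output"])
    return total / 1_000_000


if __name__ == "__main__":
    print(span_cost("example-model", TokenUsage(1_000_000, 500_000)))
```

Returning None for an unknown model keeps missing price data distinguishable from a genuine zero cost.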

Features:

[UI] Add sliding animation to workflow switch component (#20831, @daniellok-db)
[Tracing] Display cached tokens in trace UI (#20957, @TomeHirata)
[Evaluation] Move select traces button to be next to Run judge (#20992, @PattaraS)
[Gateway] Distributed tracing for gateway endpoints (#20864, @TomeHirata)
[Gateway] Add user selector in the gateway usage page (#20944, @TomeHirata)
[Docs] [MLflow Demo] Docs for GenAI Demo (#20240, @BenWilson2)
[UI] Move Getting Started above experiments list and make collapsible (#20691, @B-Step62)
[Model Registry / Tracking] Add mlflow migrate-filestore command (#20615, @harupy)
[UI] Add visual indicator for demo experiment in experiment list (#20787, @B-Step62)
[Scoring] Enable parquet content_type in the scoring server input for pyfunc (#20630, @TFK1410)
[UI] feat(ui): Add workspace landing page, multi-workspace support, and qu… (#20702, @Gkrumbach07)
[Tracking] Merge workspace feature branch into master (#20657, @B-Step62)
[Gateway] Add Gateway Usage Page (#20642, @TomeHirata)
[Gateway] Add usage section in endpoint page (#20357, @TomeHirata)
[UI] [ MLflow Demo ] UI updates for MLflow Demo interfaces (#20162, @BenWilson2)
[Build] Support comma-separated rules in # clint: disable= comments (#20651, @copilot-swe-agent)
[Build / Docs / Models / Projects / Scoring] Replace virtualenv with python -m venv in virtualenv env_manager path (#20640, @copilot-swe-agent)
[Tracing] Add per-decorator sampling_ratio_override parameter to @mlflow.trace (#19784, @harupy)
[Evaluation / Tracking] Add mlflow datasets list CLI command (#20167, @alkispoly-db)
[Gateway] Add trace ingestion for Gateway endpoints (#20358, @TomeHirata)
[Tracing] feat(typescript-anthropic): add streaming support (#20384, @rollyjoel)
[Evaluation] Add delete dataset records API (#19690, @joelrobin18)
[UI] Add tooltip link to navigate to traces tab with time range filter (#20466, @serena-ruan)
[Tracking] [MLflow Demo] Add mlflow demo cli command (#20048, @BenWilson2)
[Evaluation] Add an SDK for distillation from conversation to goal/persona (#20289, @smoorjani)
[Tracing] Livekit Agents Integration in MLflow (#20439, @joelrobin18)
[Tracing / UI] Enable running scorers/judges from trace details drawer in UI (#20518, @danielseong1)
[Gateway] link gateway and experiment (#20356, @TomeHirata)
[Prompts] Add optimization backend APIs to auth control (#20392, @chenmoneygithub)
[Tracing] Add an SDK for search sessions to get complete sessions (#20288, @smoorjani)
[Tracing] Reasoning in Chat UI Mistral + Chat UI (#19636, @joelrobin18)
[Evaluation] Add TruLens third-party scorer integration (#19492, @debu-sinha)
[Evaluation / Tracing] Add Guardrails AI scorer integration (#20038, @debu-sinha)
[Tracking] [MLflow Demo] Add Prompt demo data (#20047, @BenWilson2)
[Tracking] [MLflow Demo] Add Eval simulation data (#20046, @BenWilson2)
[Tracking] [MLflow Demo] Add trace data for demo (#19995, @BenWilson2)
[Tracking] Support get_dataset(name=...) in OSS environments (#20423, @alkispoly-db)
[UI] Add session comparison UI with goal/persona matching (#20377, @smoorjani)
[UI] Model and cost rendering for spans (#20330, @serena-ruan)
[UI] [1/x] Support span model extraction and cost calculation (#20327, @serena-ruan)
[Evaluation] Make conversation simulator public and easily subclassable (#20243, @smoorjani)
[Prompts] Add progress tracking for prompt optimization job (#20374, @chenmoneygithub)
[Prompts] Prompt Optimization backend PR 3: Add Get, Search, and Delete prompt optimization job APIs (#20197, @chenmoneygithub)
[Prompts] Track intermediate candidates and evaluation scores in gepa optimizer (#20198, @chenmoneygithub)
[Tracking] [MLflow Demo] Base implementation for demo framework (#19994, @BenWilson2)
[Prompts] Prompt Optimization backend PR 2: Add CreatePromptOptimizationJob and CancelPromptOptimizationJob (#20115, @chenmoneygithub)
[Tracing] Support shift+select for Traces (#20125, @B-Step62)
[UI] Ml61127/remove experiment type selector inside experiment page (#20161, @ispoljari)
[UI] Ml61126/remove nested sidebars within gateway and experiments tab (#20160, @ispoljari)
[UI] [ML-61124]: add selector for workflow type in top level navbar (#20158, @ispoljari)
[Prompts / UI] Feat/render md in prompt registry (#19615, @iyashk)
[Prompts] [Prompt Optimization Backend PR #1] Wrap prompt optimize in mlflow job (#20001, @chenmoneygithub)
[Tracking] Add --experiment-name option to mlflow experiments get command (#19929, @alkispoly-db)

Bug fixes:

[Tracing / UI] Fix infinite fetch loop in trace detail view when num_spans metadata mismatches (#20596, @coldzero94)
[UI] fix:implement dark mode in experiment correctly (#20974, @intelliking)
[Evaluation] Fix 'Select traces' not showing new traces in Judge UI (#20991, @PattaraS)
[Tracing / Tracking] Fix RecursionError in strands, semantic_kernel, and haystack autologgers with shared tracer provider (#20809, @cgrierson-smartsheet)
[Tracking] fix(tracking): Fix IntegrityError in log_batch when duplicate metrics span multiple key batches (#20807, @aws-khatria)
[Tracing] Support native tool calls in CrewAI 1.9.0+ autolog tests (#20742, @TomeHirata)
[Evaluation] Fix retrieval_relevance assessments logged to wrong span with missing chunk index (#20998, @smoorjani)
[Evaluation] Fix missing session metadata on failed session-level scorer assessments (#20988, @smoorjani)
[Tracking] Enhance path validation in check_tarfile_security for windows (#20924, @TomeHirata)
[Docs] Fix admonition link underlines not rendering (#20990, @copilot-swe-agent)
[Tracking] Rebuild SearchTraces V2 request body on ENDPOINT_NOT_FOUND fallback (#20963, @brendanmaguire)
[Build] Add model version search filtering based on user permissions (#20964, @TomeHirata)
[Tracing] Display notebook trace viewer when workspace is on (#20947, @TomeHirata)
[Tracing] Add MLFLOW_GATEWAY_RESOLVE_API_KEY_FROM_FILE flag to prevent local file inclusion in API gateway (#20965, @TomeHirata)
[Tracking] Fix Claude Agent SDK tracing by capturing messages from receive_messages (#20778, @smoorjani)
[Build / Tracking] Add missing authentication for fastapi routes (#20920, @TomeHirata)
[Evaluation] Fix guardrails scorer compatibility with guardrails-ai 0.9.0 (#20934, @smoorjani)
[UI] Fix duplicated title and add icons to Experiments/Prompts page headers (#20813, @B-Step62)
[Tracing] Trace UI papercut: highlight searched text and change search box hint's wording. (#20841, @PattaraS)
[Prompts] Fix arbitrary file read via prompt tag validation bypass in Model Registry (#20833, @TomeHirata)
[Tracking] Fix RestException crash on null error_code and incorrect except clause (#20903, @copilot-swe-agent)
[UI] Fix Disable action button in Traces Tab (#20883, @joelrobin18)
[UI] Fix experiment rename modal not refreshing experiment details (#20882, @joelrobin18)
[Build] Skip workspace header when workspace is disabled (#20904, @TomeHirata)
[UI] Block CORS for ajax paths (#20832, @TomeHirata)
[UI] Improve empty states across Experiments, Models, Prompts, and Gateway pages (#20044, @ridgupta26)
[UI] UI: Improve empty states for Traces and Sessions tabs (#20034, @ridgupta26)
[Build] Validate webhook url to fix SSRF vulnerability (#20747, @TomeHirata)
[Scoring / Tracing] Fix TypeError in online scoring config endpoint when basic-auth is enabled (#20783, @copilot-swe-agent)
...

Contributors

etirelli, brendanmaguire, and 50 other contributors

12 Feb 05:01

serena-ruan

v3.10.0rc0

9b0f106

v3.10.0rc0 Pre-release

We're excited to announce MLflow 3.10.0rc0, which includes several notable updates:

Major New Features:

🏢 Organization Support in MLflow Tracking Server: MLflow now supports multi-workspace environments! You can organize your experiments and resources across different workspaces with a new landing page that lets you navigate between them seamlessly. (#20702, #20657, @mprahl, @Gkrumbach07, @B-Step62)
💬 Multi-turn Conversation Simulation: Building on the conversation simulator introduced in 3.9, we've made it fully public and easily subclassable. You can now create custom simulation scenarios, compare sessions with goal/persona matching, and distill conversations into reusable goal/persona pairs for comprehensive agent testing. (#20243, #20377, #20289, @smoorjani)
💰 Trace Cost Tracking: Gain visibility into your LLM spending! MLflow now automatically extracts model information from LLM spans and calculates costs, with a new UI that renders model and cost data directly in your trace views. (#20327, #20330, @serena-ruan)
🎯 Top-level GenAI/Classical ML Split: We've redesigned the navigation to provide a frictionless experience. A new workflow type selector in the top-level navbar lets you quickly switch between GenAI and Classical ML contexts, with streamlined sidebars that reduce visual clutter. (#20158, #20160, #20161, #20699, @ispoljari, @daniellok-db)
🎮 MLflow Demo Experiment: Get started with MLflow faster than ever! The new mlflow demo CLI command generates a fully-populated demo environment with sample traces, prompts, and evaluation data so you can explore MLflow's features hands-on without any setup. (#19994, #19995, #20046, #20047, #20048, #20162, @BenWilson2)
📊 Gateway Usage Tracking: Monitor your AI Gateway endpoints with detailed usage analytics. A new usage page shows request patterns and metrics, with trace ingestion that links gateway calls back to your experiments for end-to-end observability. (#20357, #20358, #20642, @TomeHirata)

Stay tuned for the full release, which will be packed with even more features and bugfixes.

To try out this release candidate, please run:

pip install mlflow==3.10.0rc0

Contributors

mprahl, smoorjani, and 7 other contributors

29 Jan 08:49

harupy

v3.9.0

cf3d582

v3.9.0

We're excited to announce MLflow 3.9.0, which includes several notable updates:

Major New Features:

🔮 MLflow Assistant: Figuring out the next steps to debug your apps and agents can be challenging. We're excited to introduce the MLflow Assistant, an in-product chatbot that can help you identify, diagnose, and fix issues. The assistant is backed by Claude Code, and directly passes context from the MLflow UI to Claude. Click on the floating "Assistant" button in the bottom right of the MLflow UI to get started!
📈 Trace Overview Dashboard: You can now get insights into your agent's performance at a glance with the new "Overview" tab in GenAI experiments. Many pre-built statistics are available out of the box, including performance metrics (e.g. latency, request count), quality metrics (based on assessments), and tool call summaries. If there are any additional charts you'd like to see, please feel free to raise an issue in the MLflow repository!
✨ AI Gateway: We're revamping our AI Gateway feature! AI Gateway provides a unified interface for your API requests, allowing you to route queries to your LLM provider(s) of choice. In MLflow 3.9.0, the Gateway server is now located directly in the tracking server, so you don't need to spin up a new process. Additional features such as passthrough endpoints, traffic splits, and fallback models are also available, with more to come soon! For more detailed information, please take a look at the docs.
🔎 Online Monitoring with LLM Judges: Configure LLM judges to automatically run on your traces, without having to write a line of code! You can either use one of our pre-defined judges, or provide your own prompt and instructions to create custom metrics. Head to the new "Judges" tab within the GenAI Experiment UI to get started.
🤖 Judge Builder UI: Define and iterate on custom LLM judge prompts directly from the UI! Within the new "Judges" tab, you can create your own prompt for an LLM judge, and test-run it on your traces to see what the output would be. Once you're happy with it, you can either use it for online monitoring (as mentioned above), or use it via the Python SDK for your evals.
🔗 Distributed Tracing: Trace context can now be propagated across different services and processes, allowing you to truly track request lifecycles from end to end. The related APIs are defined in the mlflow.tracing.distributed module (with more documentation to come soon).
📚 MemAlign - a new judge optimizer algorithm: We're excited to introduce MemAlignOptimizer, a new algorithm that makes your judges smarter over time. It learns general guidelines from past feedback while dynamically retrieving relevant examples at runtime, giving you more accurate evaluations.
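The Distributed Tracing item above is about carrying trace context from one service to the next. These notes do not show the mlflow.tracing.distributed APIs, so the sketch below does not use them. It illustrates the general idea with the W3C Trace Context `traceparent` header format, which is an assumption about the wire format rather than something these notes confirm. `inject` and `extract` are hypothetical helper names.

```python
import re
import secrets

# W3C Trace Context: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(headers, trace_id, span_id, sampled=True):
    """Caller side: add a traceparent header describing the current span."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers


def extract(headers):
    """Callee side: return (trace_id, parent_span_id, sampled) or None."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # The lowest bit of the flags byte is the "sampled" flag.
    return trace_id, parent_id, bool(int(flags, 16) & 0x01)


if __name__ == "__main__":
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    outgoing = inject({}, trace_id, span_id)
    print(extract(outgoing))
```

The calling service attaches the header to its outgoing request, and the receiving service parses it so that its spans join the same trace.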

Features:

[Gateway] Add LiteLLM provider to support many other providers (#19394, @TomeHirata)
[Gateway] Add passthrough support for Anthropic Messages API (#19423, @TomeHirata)
[Gateway] Add passthrough support for Gemini generateContent and streamGenerateContent APIs (#19425, @TomeHirata)
[Gateway] Add routing strategy and fallback configuration support for gateway endpoints (#19483, @TomeHirata)
[Gateway] Deprecate Unity Catalog function integration in AI Gateway (#19457, @harupy)
[Gateway / UI] Create List API Keys landing page (#19441, @BenWilson2)
[Gateway / UI] Add Create API Keys functionality (#19442, @BenWilson2)
[Gateway / UI] Add delete and update capabilities for API Keys (#19446, @BenWilson2)
[Gateway / UI] Add endpoint listing page and tab layout (#19474, @BenWilson2)
[Gateway / UI] Add Create endpoint page and enhance provider select (#19475, @BenWilson2)
[Gateway / UI] Add Model select functionality for endpoint creation (#19477, @BenWilson2)
[Gateway / UI] Add Auth config to endpoint creation (#19494, @BenWilson2)
[Gateway / UI] Add the Endpoint Edit Page (#19502, @BenWilson2)
[Gateway / UI] Refactor the provider display for better UX (#19503, @BenWilson2)
[Gateway / UI] Create Endpoint details page (#19537, @BenWilson2)
[Gateway / UI] Add security notice banner (#19538, @BenWilson2)
[Gateway / UI] Create common editable combo box with extra modal select (#19546, @BenWilson2)
[Evaluation] Introduce MemAlign as a new optimizer for judge alignment (#19598, @smoorjani)
[Evaluation] Parallelize LLM calls in MemAlign guideline distillation (#20291, @veronicalyu320)
[Evaluation] Add GePaAlignmentOptimizer for judge instruction optimization (#19882, @alkispoly-db)
[Evaluation] Add Fluency scorer for evaluating text quality (#19414, @alkispoly-db)
[Evaluation] Add KnowledgeRetention built-in scorer (#19436, @alkispoly-db)
[Evaluation] Implement automatic discovery for builtin scorers (#19443, @alkispoly-db)
[Evaluation] Add Phoenix (Arize) third-party scorer integration (#19473, @debu-sinha)
[Evaluation] Add gateway provider support for scorers (#19470, @danielseong1)
[Evaluation] Introduce a conversation simulator into mlflow.genai (#19614, @smoorjani)
[Evaluation] Integrate conversation simulation into mlflow.genai.evaluate (#19760, @smoorjani)
[Evaluation] Make conversation simulator work with datasets (#19845, @SomtochiUmeh)
[Evaluation] Support for conversational datasets with persona, goal, and context (#19686, @SomtochiUmeh)
[Evaluation] Introduce conversational guidelines scorer (#19729, @smoorjani)
[Evaluation] Update tool call correctness judge to accept expected tool calls (#19613, @smoorjani)
[Evaluation] Support trace parsing fallback using Databricks model (#19654, @AveshCSingh)
[Evaluation] Documentation for online evaluation / scoring (#20103, @dbczumar)
[Evaluation] Job backend: Update job backend to use static names rather than function full names (#19430, @WeichenXu123)
[Evaluation] Job backend: support job cancellation (#19565, @WeichenXu123)
[Tracing] Support distributed tracing (#19920, @WeichenXu123)
[Tracing] Trace Metrics backend (#19271, @serena-ruan)
[Tracing] Add IS NULL / IS NOT NULL comparator support for trace metadata filtering (#19720, @dbczumar)
[Tracing] Auto-navigate to Events tab when clicking error spans (#20188, @anshuman-sahu)
[Tracing] Support shift+select for Traces (#20125, @B-Step62)
[Tracing] SpringAI Integration (#19949, @joelrobin18)
[Tracing] Reasoning in Chat UI for OpenAI, Anthropic, Gemini, LangChain, and PydanticAI (#19535, #19541, #19627, #19651, #19657, @joelrobin18)
[UI] Current Page context to assistant (#20139, @joelrobin18)
[UI] Assistant regenerate button (#20066, @joelrobin18)
[UI] Copy button Assistant (#20063, @joelrobin18)
[UI] Overview tab for GenAI experiments (#19521, @serena-ruan)
[UI] Enable Scorers UI feature flags (#19842, @danielseong1)
[UI] Improve LLM judge creation modal UX and variable ordering (#19963, @danielseong1)
[UI] Hide instructions section for built-in LLM judges (#19883, @danielseong1)
[UI] Change model provider and name to dropdown list (#19653, @chenmoneygithub)
[Prompts] Support Jinja2 template in prompt registry (#19772, @B-Step62)
[Prompts] Support metaprompting in mlflow.genai.optimize_prompts() (#19762, @chenmoneygithub)
[Prompts] Add option to delegate saving dspy model to dspy.module.save API (#19704, @WeichenXu123)
[Prompts / UI] Add traces mode to prompts details page and implement filtered traces (#19599, @TomeHirata)
[Tracking] Support mlflow.genai.to_predict_fn for app invocation endpoints (#19779, @jennsun)
[Tracking] Add log_stream API for logging binary streams as artifacts (#19104, @harupy)
[Tracking] Add import_checkpoints API for Databricks SGC Checkpointing with MLflow (#19839, @WeichenXu123)
[Tracking] Support GC clean up for Historical Jobs (#19626, @joelrobin18)
[Tracking] Add JupyterNotebookRunContext for Tracking local Jupyter notebook as the source (#19162, @iyashk)
[Tracking] Full docker image support with db (#19979, @serena-ruan)
[Tracking] Add react route handling to communicate with the tracking server (#19010, @BenWilson2)
[Tracking] [TypeScript SDK] Simplify Databricks auth by delegating to Databricks SDK (#19434, @simonfaltum)
[Models] Safe model serialization: Support saving PyTorch models via torch.export.save, add skops serialization format, and deprecate unsafe pickle/cloudpickle formats (#18759, #18832, #19692, #20151, @WeichenXu123)

Bug fixes:

[Gateway] Fix Anthropic and Gemini streaming for LiteLLM providers (#20398, @TomeHirata)
[Build] Include git submodule contents in Python package build (#20394, @copilot-swe-agent)
[Tracing] Fix duplicate traces in semantic kernel autolog (#20206, @harupy)
[Tracing] Fix Claude autolog to prioritize settings.json over OS environment variables (#20376, @alkispoly-db)
[Evaluation] Fix temperature/json issues with ConversationSimulator on managed (#20236, @xsh310)
[Tracing / UI] Add support for OpenAI function calling inputs in chat UI parsing (#20058, @daniellok-db)
[Tracking] Update checking code for pickle deserialization (#20267, @WeichenXu123)
[Gateway] Fix Vertex AI model configuration (#20242, @TomeHirata)
[UI] Store gateway<>scorer binding correctly (#20176, @TomeHirata)
[Evaluation] Support SparkDF trace handling in eval (#20207, @BenWilson2)
[Evaluation] Fix tool name extraction for tool call correctness (#20201, @smoorjani)
[Prompts] Fix scorers issue in metaprompting (#20173, @chenmoneygithub)
[UI] Propagate Run id context to Assistant (#20138, @joelrobin18)
[Model Registry] Allow for model registration to use KMS auth from different workspace (#20156, @BenWilson2)
[UI] Improve scorer trace picker UX and validation (#20178, @danielseong1)
[Evaluation] Improve `Me...

Contributors

jaceklaskowski, zjffdu, and 35 other contributors

Assets 2

16 Jan 04:48

daniellok-db

v3.9.0rc0

d8c9d8c

v3.9.0rc0 Pre-release

Pre-release

We're excited to announce MLflow 3.9.0rc0, a pre-release including several notable updates:

Major New Features:

🔮 MLflow Assistant: Figuring out the next steps to debug your apps and agents can be challenging. We're excited to introduce the MLflow Assistant, an in-product chatbot that can help you identify, diagnose, and fix issues. The assistant is backed by Claude Code, and directly passes context from the MLflow UI to Claude. Click on the floating "Assistant" button in the bottom right of the MLflow UI to get started!
📈 Trace Overview Dashboard: You can now get insights into your agent's performance at a glance with the new "Overview" tab in GenAI experiments. Many pre-built statistics are available out of the box, including performance metrics (e.g. latency, request count), quality metrics (based on assessments), and tool call summaries. If there are any additional charts you'd like to see, please feel free to raise an issue in the MLflow repository!
✨ AI Gateway: We're revamping our AI Gateway feature! AI Gateway provides a unified interface for your API requests, allowing you to route queries to your LLM provider(s) of choice. In MLflow 3.9.0rc0, the Gateway server runs inside the tracking server, so you don't need to spin up a separate process. Additional features such as passthrough endpoints, traffic splits, and fallback models are also available, with more to come soon! For more detailed information, please take a look at the docs.
🔎 Online Monitoring with LLM Judges: Configure LLM judges to automatically run on your traces, without having to write a line of code! You can either use one of our pre-defined judges, or provide your own prompt and instructions to create custom metrics. Head to the new "Judges" tab within the GenAI Experiment UI to get started.
🤖 Judge Builder UI: Define and iterate on custom LLM judge prompts directly from the UI! Within the new "Judges" tab, you can create your own prompt for an LLM judge, and test-run it on your traces to see what the output would be. Once you're happy with it, you can either use it for online monitoring (as mentioned above), or use it via the Python SDK for your evals.
🔗 Distributed Tracing: Trace context can now be propagated across different services and processes, allowing you to truly track request lifecycles from end to end. The related APIs are defined in the mlflow.tracing.distributed module (with more documentation to come soon).
📚 MemAlign - a new judge optimizer algorithm: We're excited to introduce MemAlignOptimizer, a new algorithm that makes your judges smarter over time. It learns general guidelines from past feedback while dynamically retrieving relevant examples at runtime, giving you more accurate evaluations.
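To make the traffic split and fallback ideas from the AI Gateway item above more concrete, here is a small conceptual sketch in plain Python. It is not the Gateway's implementation or configuration format. The function names (pick_weighted, route) and the model names are hypothetical, and models are stood in for by simple callables.

```python
import random
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]  # hypothetical stand-in for a provider call


def pick_weighted(weights: Dict[str, float], rng: random.Random) -> str:
    """Traffic split: choose one model name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def route(
    prompt: str,
    models: Dict[str, Model],
    weights: Dict[str, float],
    fallbacks: List[str],
    rng: random.Random,
) -> Tuple[str, str]:
    """Try the weighted pick first, then each fallback in order until one succeeds."""
    primary = pick_weighted(weights, rng)
    last_error = None
    for name in [primary] + [f for f in fallbacks if f != primary]:
        try:
            return name, models[name](prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def flaky(prompt: str) -> str:
    raise TimeoutError("provider unavailable")


def stable(prompt: str) -> str:
    return f"stable: {prompt}"


models = {"model-a": flaky, "model-b": stable}
rng = random.Random(0)

# All traffic goes to model-a, which fails, so the request falls back to model-b.
served_by, answer = route("hi", models, {"model-a": 1.0}, ["model-b"], rng)
```

In this sketch the split decides which model is tried first, while the fallback list only matters when that first attempt raises.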

Stay tuned for the full release, which will be packed with even more features and bugfixes.

To try out this release candidate, please run:

pip install mlflow==3.9.0rc0

Please try it out and report any issues on the issue tracker.

Assets 2
