[fix](s3) Add limit-aware brace expansion and fix misleading glob metrics#61843

Open
dataroaring wants to merge 2 commits into master from fix/s3-glob-expansion-safety

Conversation

@dataroaring
Contributor

Summary

Follow-up fixes from review of #61775 (cherry-pick of #60414):

  • Unbounded brace expansion (OOM risk): expandBracePatterns() fully materializes all paths before checking s3_head_request_max_paths. Patterns like {1..100000} or multi-brace cartesian products could cause high CPU/memory usage. Added a limit-aware expandBracePatterns(pattern, maxPaths) that stops expansion early via BraceExpansionTooLargeException, avoiding large allocations.

  • Misleading glob metrics logs: When the HEAD/getProperties optimization succeeds and returns early, the finally block still logs LIST-path counters (elementCnt/matchCnt) as 0. Added usedHeadPath flag to skip the LIST metrics log when the HEAD optimization was used. This is especially important for Azure where the log is at INFO level (always visible in production).

  • Unit tests: Added 6 new test cases covering within-limit, exactly-at-limit, one-over-limit, exceeds-limit, cartesian-exceeds, and zero-means-unlimited scenarios.
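
The early-stop idea in the first bullet can be sketched as follows. This is a minimal illustration, not the actual S3Util code: the class name `LimitAwareBraceExpansion`, the nested `TooLargeException`, and the method `expand` are hypothetical, and the sketch assumes non-nested, comma-separated brace groups. It checks the limit before each concrete path is stored, so at most `maxPaths` paths are ever held in memory.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of limit-aware brace expansion; not the real S3Util. */
final class LimitAwareBraceExpansion {

    /** Thrown as soon as one more path would push the result past the limit. */
    static final class TooLargeException extends RuntimeException {
        TooLargeException(int maxPaths) {
            super("brace expansion exceeds " + maxPaths + " paths");
        }
    }

    /** Expands brace groups depth-first; maxPaths <= 0 is treated as unlimited. */
    static List<String> expand(String pattern, int maxPaths) {
        List<String> out = new ArrayList<>();
        expandInto(pattern, maxPaths, out);
        return out;
    }

    private static void expandInto(String pattern, int maxPaths, List<String> out) {
        int open = pattern.indexOf('{');
        int close = open < 0 ? -1 : pattern.indexOf('}', open);
        if (close < 0) {
            // No complete brace group left: this is a concrete path.
            // Check the limit before storing it, so we never over-allocate.
            if (maxPaths > 0 && out.size() >= maxPaths) {
                throw new TooLargeException(maxPaths);
            }
            out.add(pattern);
            return;
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        // Assumption: alternatives are comma-separated and not nested.
        for (String alt : pattern.substring(open + 1, close).split(",", -1)) {
            expandInto(prefix + alt + suffix, maxPaths, out);
        }
    }

    public static void main(String[] args) {
        System.out.println(expand("data/{a,b}/part{1,2,3}.csv", 10));
        try {
            expand("data/{a,b}/part{1,2,3}.csv", 5); // 6 paths, limit 5
        } catch (TooLargeException e) {
            System.out.println("fallback to LIST: " + e.getMessage());
        }
    }
}
```

A caller would catch the exception and fall back to the LIST path, which matches the fallback behavior described in the summary.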

Note: The timestamp issues (toEpochSecond/getSecond) are addressed separately in #61790.

Test plan

  • Existing S3UtilTest passes
  • New limit-aware expansion tests pass (boundary cases: exactly at limit, one over limit, cartesian product)
  • Verify that the S3 HEAD-path fallback logs an INFO message when the limit is exceeded
  • Verify Azure getProperties path fallback works correctly

🤖 Generated with Claude Code

…rics

Prevent OOM from unbounded brace expansion by adding early-stop to
expandBracePatterns when the expansion exceeds s3_head_request_max_paths.
Also skip LIST-path metrics logging when the HEAD/getProperties
optimization was used, avoiding misleading "process 0 elements" logs.
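
The metrics fix described above can be illustrated with a small sketch. It assumes a method shaped roughly like the glob-listing code (a try block with an early return for the HEAD path and a finally block that logs LIST counters). The class `GlobMetricsSketch`, the method `globList`, the `headSucceeds` parameter, and the in-memory `LOG` list are all hypothetical stand-ins, not names from the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of skipping LIST metrics when the HEAD path was used. */
final class GlobMetricsSketch {
    /** Stand-in for a real logger so the behavior is easy to inspect. */
    static final List<String> LOG = new ArrayList<>();

    static List<String> globList(List<String> expandedPaths, boolean headSucceeds) {
        long elementCnt = 0;
        long matchCnt = 0;
        boolean usedHeadPath = false;
        try {
            if (headSucceeds) {
                // HEAD/getProperties optimization: return early, LIST counters stay 0.
                usedHeadPath = true;
                return new ArrayList<>(expandedPaths);
            }
            // LIST fallback: counters are meaningful only on this path.
            List<String> result = new ArrayList<>();
            for (String path : expandedPaths) {
                elementCnt++;
                matchCnt++;
                result.add(path);
            }
            return result;
        } finally {
            // Without this guard, the HEAD path would log "process 0 elements".
            if (!usedHeadPath) {
                LOG.add("process " + elementCnt + " elements, match " + matchCnt + " elements");
            }
        }
    }

    public static void main(String[] args) {
        globList(List.of("a.csv", "b.csv"), true);
        globList(List.of("a.csv", "b.csv"), false);
        System.out.println(LOG);
    }
}
```

The flag is set before the early return because the finally block runs on every exit from the try block, including that return.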

Found via review of #61775.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dataroaring dataroaring requested a review from CalvinKirs as a code owner March 28, 2026 07:11
Copilot AI review requested due to automatic review settings March 28, 2026 07:11
@Thearas
Contributor

Thearas commented Mar 28, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copilot AI left a comment

Pull request overview

This PR hardens the deterministic-path (HEAD/getProperties) optimization for S3/Azure by making brace expansion limit-aware to avoid unbounded expansion, and fixes misleading glob LIST metrics logs when the HEAD/getProperties path returns early.

Changes:

  • Add a limit-aware S3Util.expandBracePatterns(pattern, maxPaths) with an early-abort exception.
  • Update S3/Azure deterministic-path globbing to use the limit-aware expansion and skip LIST metrics logging when the HEAD/getProperties path is used.
  • Add unit tests covering limit boundary conditions for brace expansion.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| fe/fe-core/src/main/java/org/apache/doris/common/util/S3Util.java | Introduces limit-aware brace expansion and a dedicated exception for exceeding limits. |
| fe/fe-core/src/main/java/org/apache/doris/fs/obj/S3ObjStorage.java | Uses limit-aware brace expansion for HEAD optimization and avoids logging LIST metrics when HEAD path is used. |
| fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java | Uses limit-aware brace expansion for getProperties optimization and skips misleading LIST metrics logs; adjusts finally-block behavior. |
| fe/fe-core/src/test/java/org/apache/doris/common/util/S3UtilTest.java | Adds test coverage for limit-aware brace expansion scenarios. |

Comment on lines +620 to +622
* @throws BraceExpansionTooLargeException if expansion exceeds maxPaths
*/
public static List<String> expandBracePatterns(String pathPattern, int maxPaths) {

Copilot AI Mar 28, 2026

expandBracePatterns(String, int) Javadoc says maxPaths must be > 0, but the implementation treats 0/negative as “unlimited”. Since callers pass Config.s3_head_request_max_paths, setting that config to 0 (which previously effectively disabled the HEAD optimization) would now remove the limit and could trigger unbounded expansion / excessive HEAD requests. Please clarify and enforce the contract (e.g., reject maxPaths <= 0 in this overload and keep “unlimited” only for the 1-arg method, or update the config semantics/docs accordingly).

Suggested change
- * @throws BraceExpansionTooLargeException if expansion exceeds maxPaths
- */
- public static List<String> expandBracePatterns(String pathPattern, int maxPaths) {
+ * @throws BraceExpansionTooLargeException if expansion exceeds maxPaths
+ * @throws IllegalArgumentException if {@code maxPaths} is not greater than zero
+ */
+ public static List<String> expandBracePatterns(String pathPattern, int maxPaths) {
+     if (maxPaths <= 0) {
+         throw new IllegalArgumentException("maxPaths must be > 0");
+     }
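
To make the two contract options in this comment concrete, here is a small sketch. It is only an illustration of the alternatives being discussed, not what the PR adopted: the class `MaxPathsContractSketch` and both method names are hypothetical. One method enforces a strictly positive limit, and the other shows a caller-side policy in which a configured value of 0 or less disables the HEAD path instead of removing the cap.

```java
/** Hypothetical sketch of two ways to handle maxPaths <= 0. */
final class MaxPathsContractSketch {

    /** Strict variant: rejects non-positive limits instead of treating them as unlimited. */
    static int requirePositiveLimit(int maxPaths) {
        if (maxPaths <= 0) {
            throw new IllegalArgumentException("maxPaths must be > 0, got " + maxPaths);
        }
        return maxPaths;
    }

    /** Caller-side policy: a configured value <= 0 means "skip the HEAD path entirely". */
    static boolean headPathEnabled(int configuredMaxPaths) {
        return configuredMaxPaths > 0;
    }

    public static void main(String[] args) {
        System.out.println(headPathEnabled(100));
        System.out.println(headPathEnabled(0));
    }
}
```

With the caller-side check in place, the expansion method never sees a non-positive limit from configuration, so the "unlimited" meaning could stay reserved for internal callers.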

Comment on lines +494 to +498
@Test
public void testExpandBracePatterns_zeroLimitMeansUnlimited() {
// maxPaths=0 means no limit (backward compatibility)
List<String> result = S3Util.expandBracePatterns("file{1,2,3,4,5}.csv", 0);
Assert.assertEquals(5, result.size());

Copilot AI Mar 28, 2026

This test asserts maxPaths=0 means unlimited, but the new expandBracePatterns(String, int) Javadoc says maxPaths must be > 0 and s3_head_request_max_paths is documented as a hard cap for HEAD-path expansion. Please align the test with the intended semantics (either treat 0 as “disable/fallback” like the prior behavior, or update the API/config documentation and add safeguards if 0 truly means “unlimited”).

Suggested change
- @Test
- public void testExpandBracePatterns_zeroLimitMeansUnlimited() {
-     // maxPaths=0 means no limit (backward compatibility)
-     List<String> result = S3Util.expandBracePatterns("file{1,2,3,4,5}.csv", 0);
-     Assert.assertEquals(5, result.size());
+ @Test(expected = IllegalArgumentException.class)
+ public void testExpandBracePatterns_zeroLimitIsInvalid() {
+     // maxPaths must be > 0; zero is an invalid argument
+     S3Util.expandBracePatterns("file{1,2,3,4,5}.csv", 0);

Comment on lines 760 to +766
// First expand [...] brackets to {...} braces, then expand {..} ranges, then expand braces
// Use limit-aware expansion to avoid large allocations before checking the limit
String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
expandedPattern = S3Util.extendGlobs(expandedPattern);
List<String> expandedPaths = S3Util.expandBracePatterns(expandedPattern);

// Fall back to listing if too many paths to avoid overwhelming S3 with HEAD requests
// Controlled by config: s3_head_request_max_paths
if (expandedPaths.size() > Config.s3_head_request_max_paths) {
LOG.info("Expanded path count {} exceeds limit {}, falling back to LIST",
expandedPaths.size(), Config.s3_head_request_max_paths);
List<String> expandedPaths;
try {
expandedPaths = S3Util.expandBracePatterns(expandedPattern, Config.s3_head_request_max_paths);

Copilot AI Mar 28, 2026

The new comment says this uses “limit-aware expansion to avoid large allocations before checking the limit”, but this code still calls S3Util.extendGlobs() first, which eagerly enumerates numeric ranges (e.g. {1..100000}) into a large in-memory string/list. If the goal is to mitigate OOM/CPU risks, consider making extendGlobs/range expansion limit-aware too (or short-circuit when the range cardinality would exceed s3_head_request_max_paths).

Comment on lines 474 to +480
// First expand [...] brackets to {...} braces, then expand {..} ranges, then expand braces
// Use limit-aware expansion to avoid large allocations before checking the limit
String expandedPattern = S3Util.expandBracketPatterns(keyPattern);
expandedPattern = S3Util.extendGlobs(expandedPattern);
List<String> expandedPaths = S3Util.expandBracePatterns(expandedPattern);

// Fall back to listing if too many paths to avoid overwhelming Azure with requests
// Controlled by config: s3_head_request_max_paths
if (expandedPaths.size() > Config.s3_head_request_max_paths) {
LOG.info("Expanded path count {} exceeds limit {}, falling back to LIST",
expandedPaths.size(), Config.s3_head_request_max_paths);
List<String> expandedPaths;
try {
expandedPaths = S3Util.expandBracePatterns(expandedPattern, Config.s3_head_request_max_paths);

Copilot AI Mar 28, 2026

The new comment says this uses “limit-aware expansion to avoid large allocations before checking the limit”, but S3Util.extendGlobs() runs first and eagerly enumerates numeric ranges into a potentially huge brace string. To fully address the OOM/CPU risk for patterns like {1..100000}, extendGlobs (range expansion) likely also needs a limit-aware/short-circuit path keyed off s3_head_request_max_paths.

long endTime = System.nanoTime();
long duration = endTime - startTime;
LOG.info("process {} elements under prefix {} for {} round, match {} elements, take {} micro second",
remotePath, elementCnt, roundCnt, matchCnt,

Copilot AI Mar 28, 2026

The log format placeholders don’t match the arguments: the message starts with “process {} elements under prefix {}…”, but the first argument passed is remotePath (a String) and elementCnt is second. Swap the first two arguments so the element count is logged in the first placeholder and the prefix/path in the second.

Suggested change
- remotePath, elementCnt, roundCnt, matchCnt,
+ elementCnt, remotePath, roundCnt, matchCnt,
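
The effect of the swapped arguments can be shown with a tiny stand-in formatter. This is a hypothetical helper that only mimics sequential `{}` substitution for illustration; it is not the logging library's implementation, and the example path and count are made up.

```java
/** Hypothetical sketch: sequential "{}" substitution, to show why argument order matters. */
final class PlaceholderOrderSketch {

    /** Replaces each "{}" in order with the next argument; extra arguments are ignored. */
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // no placeholder left for this argument
            }
            sb.append(template, from, at).append(arg);
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    public static void main(String[] args) {
        String template = "process {} elements under prefix {}";
        // Swapped order: the path lands where the count belongs.
        System.out.println(format(template, "s3://bucket/dir/", 42));
        // Corrected order: count first, then prefix.
        System.out.println(format(template, 42, "s3://bucket/dir/"));
    }
}
```

Because placeholders bind purely by position, the compiler cannot catch this kind of mismatch, which is why it tends to survive until someone reads the log output.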

…log order

- Add MAX_RANGE_EXPANSION_SIZE (10000) hard cap in extendGlobNumberRange
  to prevent OOM from patterns like {1..100000000} before the limit-aware
  brace expansion even runs.
- Fix Javadoc: maxPaths=0 means unlimited (not "must be > 0").
- Fix pre-existing Azure glob metrics log argument order (remotePath and
  elementCnt were swapped).
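
The hard cap on range expansion might look roughly like the sketch below. The constant name and value (MAX_RANGE_EXPANSION_SIZE, 10000) come from the commit message. Everything else is an assumption: the class `RangeCapSketch`, the method `expandNumberRange`, the use of an empty `Optional` to signal "too large, leave unexpanded", and the handling of descending ranges are illustrative choices, not the actual extendGlobNumberRange behavior.

```java
import java.util.Optional;

/** Hypothetical sketch of capping numeric range expansion before allocating. */
final class RangeCapSketch {
    /** Cap taken from the commit message. */
    static final int MAX_RANGE_EXPANSION_SIZE = 10_000;

    /**
     * Expands start..end into "{start,...,end}", or returns empty when the
     * range holds more than MAX_RANGE_EXPANSION_SIZE values.
     */
    static Optional<String> expandNumberRange(long start, long end) {
        long lo = Math.min(start, end);
        long hi = Math.max(start, end);
        // Cardinality is computed arithmetically, before any string is built.
        long span = hi - lo; // negative only if the subtraction overflowed
        if (span < 0 || span >= MAX_RANGE_EXPANSION_SIZE) {
            return Optional.empty();
        }
        StringBuilder sb = new StringBuilder("{");
        for (long i = 0; i <= span; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(lo + i);
        }
        return Optional.of(sb.append('}').toString());
    }

    public static void main(String[] args) {
        System.out.println(expandNumberRange(1, 5));
        System.out.println(expandNumberRange(1, 100_000_000L).isPresent());
    }
}
```

Checking the size up front keeps the cost of rejecting a pattern like {1..100000000} constant, regardless of how large the range is.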

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dataroaring
Contributor Author

run buildall

dataroaring added a commit to dataroaring/claude that referenced this pull request Mar 28, 2026
Adds a 6th review agent that checks for:
- Javadoc/contract consistency (param docs vs actual behavior)
- Upstream data flow tracing (incomplete OOM/DoS fixes)
- Boundary/off-by-one in limit checks (N+1 exact boundary)
- Log format argument order mismatches
- Pre-existing bugs in surrounding touched code

Also adds plugin.json manifest for proper plugin discovery.

Lessons learned from PR apache/doris#61843 code review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@doris-robot

TPC-H: Total hot run time: 26834 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3832d79d648691667b398ee7a6180c1dc8c7d6ca, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17662	4554	4391	4391
q2	q3	10701	788	532	532
q4	4724	364	253	253
q5	8140	1242	1021	1021
q6	227	178	147	147
q7	814	855	677	677
q8	10605	1507	1387	1387
q9	6862	5228	4690	4690
q10	6322	1943	1677	1677
q11	477	243	251	243
q12	764	609	476	476
q13	18034	2730	1942	1942
q14	230	244	214	214
q15	q16	730	756	674	674
q17	744	862	459	459
q18	6022	5376	5370	5370
q19	1104	996	612	612
q20	542	495	374	374
q21	4474	1893	1459	1459
q22	348	291	236	236
Total cold run time: 99526 ms
Total hot run time: 26834 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4435	4357	4376	4357
q2	q3	3873	4324	3769	3769
q4	873	1187	817	817
q5	4071	4374	4336	4336
q6	187	174	142	142
q7	1759	1641	1493	1493
q8	2479	2688	2549	2549
q9	7310	7201	7187	7187
q10	3681	3914	3609	3609
q11	502	434	412	412
q12	492	591	449	449
q13	2319	2767	1956	1956
q14	268	283	262	262
q15	q16	689	741	686	686
q17	1125	1248	1351	1248
q18	7071	6692	6594	6594
q19	895	883	933	883
q20	2034	2110	1943	1943
q21	3947	3510	3359	3359
q22	445	415	367	367
Total cold run time: 48455 ms
Total hot run time: 46418 ms

@doris-robot

TPC-DS: Total hot run time: 167052 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3832d79d648691667b398ee7a6180c1dc8c7d6ca, data reload: false

query5	4339	635	496	496
query6	321	222	210	210
query7	4018	477	259	259
query8	332	244	227	227
query9	8067	2684	2659	2659
query10	530	384	354	354
query11	6955	5102	4832	4832
query12	180	126	125	125
query13	931	472	380	380
query14	3845	3868	3416	3416
query14_1	2892	2806	2797	2797
query15	203	192	176	176
query16	1010	463	450	450
query17	935	709	605	605
query18	775	443	347	347
query19	200	209	177	177
query20	143	125	125	125
query21	138	137	113	113
query22	13032	13333	13060	13060
query23	16119	15925	15556	15556
query23_1	15695	15658	15486	15486
query24	6885	1649	1242	1242
query24_1	1221	1218	1210	1210
query25	545	479	422	422
query26	952	268	155	155
query27	2755	499	302	302
query28	4472	1833	1801	1801
query29	782	578	478	478
query30	304	243	203	203
query31	1046	959	868	868
query32	91	72	71	71
query33	494	338	288	288
query34	909	865	528	528
query35	635	683	605	605
query36	1097	1160	988	988
query37	144	98	87	87
query38	2968	2884	2889	2884
query39	864	829	810	810
query39_1	795	812	786	786
query40	236	163	140	140
query41	65	63	66	63
query42	265	264	259	259
query43	248	248	230	230
query44	
query45	194	192	183	183
query46	885	993	611	611
query47	2083	2135	2080	2080
query48	298	311	243	243
query49	631	465	392	392
query50	704	289	214	214
query51	4076	4183	3987	3987
query52	268	267	262	262
query53	288	338	286	286
query54	313	272	290	272
query55	92	92	82	82
query56	316	309	331	309
query57	1945	1722	1681	1681
query58	312	288	284	284
query59	2762	2974	2747	2747
query60	367	361	337	337
query61	187	184	186	184
query62	631	589	539	539
query63	312	284	279	279
query64	5073	1400	1124	1124
query65	
query66	1471	470	390	390
query67	24591	24161	24108	24108
query68	
query69	421	324	295	295
query70	1006	938	950	938
query71	344	309	303	303
query72	3089	2725	2475	2475
query73	540	548	313	313
query74	9653	9568	9369	9369
query75	2848	2724	2420	2420
query76	2294	1046	672	672
query77	384	385	311	311
query78	11116	11200	10437	10437
query79	1097	755	574	574
query80	715	632	536	536
query81	515	259	226	226
query82	776	152	119	119
query83	352	266	245	245
query84	300	155	100	100
query85	878	494	457	457
query86	393	306	291	291
query87	3180	3101	2983	2983
query88	3552	2683	2655	2655
query89	437	385	340	340
query90	1820	176	171	171
query91	169	165	135	135
query92	83	78	74	74
query93	933	864	505	505
query94	465	335	302	302
query95	579	345	392	345
query96	636	515	227	227
query97	2494	2485	2380	2380
query98	256	225	229	225
query99	1013	1000	899	899
Total cold run time: 238546 ms
Total hot run time: 167052 ms

4 participants