@liaoxin01 liaoxin01 commented Jan 29, 2026

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

When enable_packed_file is enabled, the warm-up functions fail to download cache data because they use storage_resource->fs (the raw FileSystem) instead of RowsetMeta::fs(), which wraps that FileSystem in a PackedFileSystem.

PackedFileSystem maps each segment file path to its actual location inside a packed file. Without this mapping the download fails: the segment files do not exist on S3 as separate objects because they have been packed into larger files.
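The path mapping described above can be pictured with a small sketch. Everything in it is hypothetical and for illustration only: `PackedSlice`, `PackedIndex`, `resolve`, and `to_packed_offset` are invented names, not the Doris PackedFileSystem API, and the real lookup may be organized quite differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical: where one segment file lives inside a packed object.
struct PackedSlice {
    std::string packed_path;  // object that physically exists in storage
    std::uint64_t offset;     // byte offset of the segment inside that object
    std::uint64_t size;       // length of the segment in bytes
};

// Hypothetical stand-in for the index a packed-file wrapper would consult.
class PackedIndex {
public:
    void add(const std::string& logical_path, PackedSlice slice) {
        slices_[logical_path] = std::move(slice);
    }

    // Returns the slice for a logical segment path, or nullopt when the
    // path was never packed.
    std::optional<PackedSlice> resolve(const std::string& logical_path) const {
        auto it = slices_.find(logical_path);
        if (it == slices_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::string, PackedSlice> slices_;
};

// Translates a read position within the logical segment into a position
// within the packed object.
std::uint64_t to_packed_offset(const PackedSlice& slice, std::uint64_t pos) {
    assert(pos <= slice.size);
    return slice.offset + pos;
}
```

A reader that skips this lookup and asks storage for the logical segment path directly will not find an object there, which is the failure mode the problem summary describes.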

This fix changes the file_system used in:

  1. download_file_cache_block (block_file_cache_downloader.cpp)
  2. _submit_segment_download_task (cloud_tablet.cpp)
  3. _submit_inverted_index_download_task (cloud_tablet.cpp)

All three now use RowsetMeta::fs() instead of storage_resource->fs, so packed files are handled properly.
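As a rough illustration of the call-site change, the sketch below resolves the file system from the rowset metadata and checks it for null before use (the review summary in this PR mentions null checks). The types `RawFs`, `PackedFs`, `RowsetMetaSketch`, and `warm_up_segment` are simplified assumptions, not the actual Doris classes or signatures.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical minimal interface; the real FileSystem API is much larger.
struct FileSystem {
    virtual ~FileSystem() = default;
    virtual bool exists(const std::string& path) const = 0;
};
using FileSystemSPtr = std::shared_ptr<FileSystem>;

// Raw storage: only knows about objects that physically exist.
struct RawFs : FileSystem {
    std::set<std::string> objects;
    bool exists(const std::string& path) const override {
        return objects.count(path) > 0;
    }
};

// Wrapper: also recognizes logical segment paths that were packed.
struct PackedFs : FileSystem {
    FileSystemSPtr inner;
    std::set<std::string> packed_segments;
    bool exists(const std::string& path) const override {
        assert(inner != nullptr);
        return packed_segments.count(path) > 0 || inner->exists(path);
    }
};

// Hypothetical stand-in for RowsetMeta: fs() returns the wrapped file
// system, or nullptr when none is available.
struct RowsetMetaSketch {
    FileSystemSPtr wrapped;
    FileSystemSPtr fs() const { return wrapped; }
};

enum class WarmUpResult { kOk, kNoFileSystem, kNotFound };

// Resolve the file system through the rowset metadata, never through the
// raw storage resource, and bail out early if it is missing.
WarmUpResult warm_up_segment(const RowsetMetaSketch& meta,
                             const std::string& segment_path) {
    FileSystemSPtr fs = meta.fs();
    if (fs == nullptr) return WarmUpResult::kNoFileSystem;
    return fs->exists(segment_path) ? WarmUpResult::kOk
                                    : WarmUpResult::kNotFound;
}
```

In this toy model the raw file system reports a packed segment as missing, while the same path looked up through the rowset-level file system is found.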

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Copilot AI review requested due to automatic review settings January 29, 2026 16:14
@liaoxin01
Contributor Author

run buildall

Copilot AI left a comment

Pull request overview

This PR fixes a bug where cache warm-up fails when enable_packed_file is enabled. The code used the raw FileSystem from storage_resource->fs instead of the FileSystem returned by RowsetMeta::fs(), which is wrapped in PackedFileSystem and therefore handles packed files correctly.

Changes:

  • Updated three warm-up download functions to use RowsetMeta::fs() instead of storage_resource->fs for proper packed file support
  • Modified 17 regression tests to randomly enable/disable enable_packed_file configuration to ensure both scenarios are tested

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| be/src/io/cache/block_file_cache_downloader.cpp | Updated download_file_cache_block to use RowsetMeta::fs() with null check for packed file support |
| be/src/cloud/cloud_tablet.cpp | Updated _submit_segment_download_task and _submit_inverted_index_download_task to use rowset_meta->fs() with null checks |
| regression-test/suites/cloud_p0/tablets/test_clean_tablet_when_rebalance.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/tablets/test_clean_tablet_when_drop_force_table.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/tablets/test_clean_stale_rs_index_file_cache.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/tablets/test_clean_stale_rs_file_cache.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/cache/test_topn_broadcast.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/cache/test_load_cache.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_warmup_rebalance.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_peer_read_async_warmup.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_expanding_node_balance.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_warm_up_with_compaction_use_peer_cache.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_warm_up_use_peer_cache.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_warm_up_task_abnormal.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_warm_up_sync_global_config.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_warm_up.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_use_compute_group_properties.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_balance_metrics.groovy | Added random enable_packed_file configuration to test both scenarios |
| regression-test/suites/cloud_p0/balance/test_alter_compute_group_properties.groovy | Added random enable_packed_file configuration to test both scenarios |

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/21) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.75% (19276/36542)
Line Coverage 36.13% (179126/495797)
Region Coverage 32.58% (138919/426445)
Branch Coverage 33.52% (60110/179351)

@doris-robot

TPC-H: Total hot run time: 31842 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8fc190e78429083b3feffe4535eb831a65ab6265, data reload: false

------ Round 1 ----------------------------------
q1	17635	5381	5134	5134
q2	2051	316	230	230
q3	10164	1316	749	749
q4	10204	841	327	327
q5	7544	2111	1922	1922
q6	194	180	148	148
q7	867	742	603	603
q8	9263	1398	1078	1078
q9	5201	4812	4813	4812
q10	6820	1935	1549	1549
q11	529	301	266	266
q12	335	373	224	224
q13	17788	4049	3209	3209
q14	239	237	217	217
q15	896	822	806	806
q16	677	666	625	625
q17	650	790	503	503
q18	6781	6370	6491	6370
q19	1288	1003	627	627
q20	401	350	238	238
q21	2676	2122	1927	1927
q22	350	316	278	278
Total cold run time: 102553 ms
Total hot run time: 31842 ms
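The per-query rows in these reports carry no column headers. Judging only from the numbers, the columns appear to be the cold run, two hot runs, and the faster of the two hot runs, with the totals summing the first and last columns. That reading is an inference from the data, not something the bot states. A minimal sketch of the arithmetic, with hypothetical names `QueryRun`, `Totals`, and `sum_runs`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One benchmark row under the assumed column reading: cold, hot1, hot2.
struct QueryRun {
    std::int64_t cold_ms;
    std::int64_t hot1_ms;
    std::int64_t hot2_ms;

    // Faster of the two hot runs (the assumed fourth column).
    std::int64_t best_hot_ms() const { return std::min(hot1_ms, hot2_ms); }
};

struct Totals {
    std::int64_t cold_ms;
    std::int64_t hot_ms;
};

// Sums cold runs and best hot runs over all queries.
Totals sum_runs(const std::vector<QueryRun>& runs) {
    Totals totals{0, 0};
    for (const QueryRun& run : runs) {
        assert(run.cold_ms >= 0);
        totals.cold_ms += run.cold_ms;
        totals.hot_ms += run.best_hot_ms();
    }
    return totals;
}
```

Feeding in the q1 to q3 rows of this round gives 29850 ms cold and 6113 ms hot; extending the same sums over all 22 rows reproduces the reported totals of 102553 ms and 31842 ms.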

----- Round 2, with runtime_filter_mode=off -----
q1	5362	5335	5334	5334
q2	248	331	256	256
q3	2154	2704	2244	2244
q4	1362	1719	1293	1293
q5	4331	4221	4267	4221
q6	214	181	139	139
q7	2245	2121	1842	1842
q8	2544	2485	2359	2359
q9	7661	7570	7472	7472
q10	2897	3059	2641	2641
q11	550	465	452	452
q12	666	723	627	627
q13	3966	4371	3752	3752
q14	300	310	282	282
q15	906	816	837	816
q16	670	728	678	678
q17	1161	1315	1320	1315
q18	8534	7686	7837	7686
q19	895	807	810	807
q20	2088	2160	2107	2107
q21	4655	4153	4043	4043
q22	567	549	513	513
Total cold run time: 53976 ms
Total hot run time: 50879 ms

@doris-robot

ClickBench: Total hot run time: 28.28 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 8fc190e78429083b3feffe4535eb831a65ab6265, data reload: false

query1	0.05	0.04	0.04
query2	0.10	0.04	0.04
query3	0.26	0.08	0.08
query4	1.61	0.11	0.11
query5	0.28	0.24	0.24
query6	1.15	0.69	0.67
query7	0.03	0.02	0.02
query8	0.05	0.04	0.04
query9	0.56	0.51	0.50
query10	0.54	0.55	0.55
query11	0.14	0.09	0.09
query12	0.14	0.10	0.11
query13	0.63	0.62	0.62
query14	1.06	1.06	1.08
query15	0.87	0.87	0.87
query16	0.38	0.38	0.38
query17	1.15	1.11	1.11
query18	0.23	0.21	0.21
query19	2.06	2.01	2.02
query20	0.02	0.02	0.02
query21	15.42	0.25	0.15
query22	5.05	0.06	0.04
query23	16.02	0.28	0.10
query24	1.46	0.37	0.19
query25	0.09	0.09	0.07
query26	0.15	0.13	0.13
query27	0.06	0.06	0.06
query28	3.13	1.16	0.98
query29	12.59	3.90	3.19
query30	0.28	0.14	0.11
query31	2.82	0.64	0.41
query32	3.25	0.59	0.50
query33	3.27	3.24	3.33
query34	16.05	5.43	4.66
query35	4.83	4.80	4.80
query36	0.65	0.51	0.49
query37	0.11	0.07	0.07
query38	0.08	0.04	0.04
query39	0.05	0.03	0.03
query40	0.18	0.16	0.17
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 97.02 s
Total hot run time: 28.28 s

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 0.00% (0/21) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.43% (25588/35820)
Line Coverage 54.03% (267230/494564)
Region Coverage 51.55% (222048/430763)
Branch Coverage 53.03% (95485/180067)

@liaoxin01 force-pushed the fix/warm-up-cache-async-support-packed-file-master branch from 8fc190e to 78aa79c January 30, 2026 02:39
@liaoxin01
Contributor Author

run buildall

@liaoxin01 force-pushed the fix/warm-up-cache-async-support-packed-file-master branch from 78aa79c to 467d9ac January 30, 2026 02:42
@liaoxin01
Contributor Author

run buildall

luwei16 previously approved these changes Jan 30, 2026
@github-actions bot added the approved label Jan 30, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 31738 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 467d9acfadeb281b4d770a78406795e0a7858096, data reload: false

------ Round 1 ----------------------------------
q1	17674	5172	5058	5058
q2	2025	335	188	188
q3	10196	1375	754	754
q4	10221	902	315	315
q5	7549	2234	1896	1896
q6	208	180	145	145
q7	887	729	614	614
q8	9274	1409	1032	1032
q9	5608	4897	4836	4836
q10	6847	1925	1546	1546
q11	515	310	274	274
q12	377	377	225	225
q13	17767	4032	3237	3237
q14	229	240	212	212
q15	913	836	808	808
q16	686	675	622	622
q17	646	800	515	515
q18	7060	6657	6416	6416
q19	1539	1001	601	601
q20	388	354	237	237
q21	2659	2019	1935	1935
q22	355	311	272	272
Total cold run time: 103623 ms
Total hot run time: 31738 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5406	5284	5257	5257
q2	261	334	261	261
q3	2158	2670	2258	2258
q4	1360	1758	1336	1336
q5	4247	4158	4226	4158
q6	227	189	138	138
q7	2319	2143	1878	1878
q8	2667	2512	2431	2431
q9	7506	7478	7472	7472
q10	2815	3105	2565	2565
q11	579	489	450	450
q12	697	801	618	618
q13	3846	4457	3545	3545
q14	408	299	309	299
q15	866	825	840	825
q16	673	743	709	709
q17	1157	1349	1394	1349
q18	8538	8015	7876	7876
q19	909	851	833	833
q20	2106	2163	2101	2101
q21	4919	4343	4179	4179
q22	579	549	517	517
Total cold run time: 54243 ms
Total hot run time: 51055 ms

@doris-robot

ClickBench: Total hot run time: 28.75 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 467d9acfadeb281b4d770a78406795e0a7858096, data reload: false

query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.09	0.08
query4	1.61	0.11	0.11
query5	0.26	0.27	0.25
query6	1.16	0.69	0.67
query7	0.03	0.02	0.02
query8	0.05	0.03	0.04
query9	0.57	0.50	0.48
query10	0.54	0.54	0.54
query11	0.13	0.09	0.10
query12	0.14	0.10	0.10
query13	0.62	0.61	0.62
query14	1.06	1.07	1.06
query15	0.88	0.86	0.87
query16	0.39	0.38	0.40
query17	1.08	1.09	1.12
query18	0.23	0.22	0.21
query19	2.09	1.93	2.02
query20	0.02	0.02	0.01
query21	15.42	0.26	0.15
query22	5.18	0.05	0.05
query23	16.05	0.28	0.10
query24	1.63	0.85	0.82
query25	0.16	0.13	0.07
query26	0.13	0.14	0.13
query27	0.06	0.05	0.04
query28	4.72	1.12	0.97
query29	12.54	3.96	3.19
query30	0.28	0.13	0.11
query31	2.82	0.63	0.40
query32	3.24	0.60	0.50
query33	3.20	3.32	3.25
query34	16.22	5.39	4.66
query35	4.82	4.84	4.83
query36	0.65	0.50	0.48
query37	0.12	0.07	0.07
query38	0.07	0.04	0.03
query39	0.05	0.03	0.03
query40	0.18	0.16	0.16
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 98.99 s
Total hot run time: 28.75 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/21) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.49% (19272/36716)
Line Coverage 35.97% (179064/497805)
Region Coverage 32.40% (138946/428890)
Branch Coverage 33.33% (60089/180277)

@liaoxin01 force-pushed the fix/warm-up-cache-async-support-packed-file-master branch from 467d9ac to 703d692 January 30, 2026 06:45
@liaoxin01
Contributor Author

run buildall

@github-actions bot removed the approved label Jan 30, 2026
@doris-robot

TPC-H: Total hot run time: 32221 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 703d692464b3c702cc68772d77d41510dfc169d4, data reload: false

------ Round 1 ----------------------------------
q1	17616	5277	5065	5065
q2	2061	341	189	189
q3	10154	1290	728	728
q4	10221	847	314	314
q5	7485	2181	1929	1929
q6	190	177	149	149
q7	885	761	601	601
q8	9199	1366	1139	1139
q9	5181	4775	4857	4775
q10	6813	1936	1568	1568
q11	528	294	296	294
q12	337	377	220	220
q13	17783	4043	3197	3197
q14	231	233	218	218
q15	893	813	822	813
q16	684	674	628	628
q17	650	830	468	468
q18	6707	6653	7401	6653
q19	1523	1098	661	661
q20	390	359	232	232
q21	2897	2436	2095	2095
q22	359	330	285	285
Total cold run time: 102787 ms
Total hot run time: 32221 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5474	5522	5592	5522
q2	287	390	247	247
q3	2502	2877	2489	2489
q4	1465	1803	1428	1428
q5	4812	4584	4592	4584
q6	225	178	143	143
q7	2051	1926	1751	1751
q8	2515	2389	2351	2351
q9	7639	7348	7264	7264
q10	3013	2964	2606	2606
q11	570	524	440	440
q12	649	669	556	556
q13	3537	4007	3225	3225
q14	280	276	263	263
q15	830	798	780	780
q16	629	665	630	630
q17	1072	1229	1246	1229
q18	7538	7433	7355	7355
q19	804	771	785	771
q20	1943	2031	1886	1886
q21	4488	4118	4065	4065
q22	555	560	503	503
Total cold run time: 52878 ms
Total hot run time: 50088 ms

@doris-robot

ClickBench: Total hot run time: 28.35 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 703d692464b3c702cc68772d77d41510dfc169d4, data reload: false

query1	0.05	0.05	0.06
query2	0.10	0.05	0.05
query3	0.26	0.08	0.09
query4	1.61	0.11	0.11
query5	0.28	0.25	0.24
query6	1.19	0.66	0.66
query7	0.04	0.03	0.03
query8	0.06	0.04	0.04
query9	0.57	0.49	0.49
query10	0.56	0.54	0.55
query11	0.15	0.10	0.10
query12	0.14	0.11	0.12
query13	0.63	0.61	0.62
query14	1.06	1.05	1.05
query15	0.88	0.87	0.88
query16	0.42	0.38	0.40
query17	1.16	1.13	1.15
query18	0.23	0.21	0.21
query19	2.09	2.03	2.05
query20	0.02	0.01	0.02
query21	15.38	0.28	0.15
query22	5.24	0.06	0.05
query23	16.19	0.28	0.11
query24	1.36	0.34	0.36
query25	0.08	0.08	0.07
query26	0.14	0.12	0.12
query27	0.09	0.07	0.04
query28	3.43	1.14	0.97
query29	12.54	3.84	3.11
query30	0.30	0.16	0.14
query31	2.81	0.63	0.40
query32	3.24	0.59	0.50
query33	3.29	3.24	3.21
query34	16.02	5.40	4.76
query35	4.72	4.73	4.71
query36	0.67	0.50	0.49
query37	0.12	0.07	0.07
query38	0.07	0.04	0.04
query39	0.04	0.03	0.04
query40	0.18	0.17	0.15
query41	0.09	0.04	0.03
query42	0.05	0.04	0.03
query43	0.05	0.04	0.04
Total cold run time: 97.6 s
Total hot run time: 28.35 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/24) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.49% (19272/36716)
Line Coverage 35.97% (179062/497807)
Region Coverage 32.39% (138909/428895)
Branch Coverage 33.33% (60086/180281)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 12.50% (3/24) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.61% (25772/35991)
Line Coverage 54.27% (269538/496630)
Region Coverage 52.00% (225334/433305)
Branch Coverage 53.34% (96554/181013)

@github-actions bot added the approved label Jan 30, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@gavinchou gavinchou merged commit 894293c into apache:master Jan 30, 2026
29 of 31 checks passed
Labels

approved, dev/4.0.x, dev/4.0.x-conflict, reviewed