@nagisa-kunhah nagisa-kunhah commented Dec 17, 2025

Fix #51706

What problem does this PR solve?

Issue Number: close #xxx

Fix issue: #51706

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zclllyybb zclllyybb self-assigned this Dec 17, 2025
@nagisa-kunhah nagisa-kunhah changed the title from "[feature] Impl the function map_concat like spark #51706" to "[feature](function) Impl the function map_concat like spark #51706" Dec 18, 2025
mrhhsg previously approved these changes Jan 9, 2026
Member

@mrhhsg mrhhsg left a comment

lgtm

Comment on lines 710 to 712
if (boundFunction instanceof MapConcat) {
    return processMapConcat((MapConcat) boundFunction);
}
Contributor

The function itself should generate the correct signature; this file should not be changed.

@nagisa-kunhah nagisa-kunhah dismissed stale reviews from mrhhsg and zclllyybb via 105c5a6 January 10, 2026 21:27
@nagisa-kunhah
Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Jan 10, 2026
@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 2.38% (1/42) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 32263 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 105c5a68861fe61ad3301db938b324c86b52b575, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17622	4337	4073	4073
q2	2038	355	248	248
q3	10172	1338	746	746
q4	10221	886	322	322
q5	7524	2164	1926	1926
q6	198	169	145	145
q7	944	805	652	652
q8	9264	1442	1155	1155
q9	5078	4600	4578	4578
q10	6848	1823	1415	1415
q11	516	303	287	287
q12	735	757	581	581
q13	17813	3870	3171	3171
q14	288	297	283	283
q15	590	502	513	502
q16	712	688	627	627
q17	682	855	521	521
q18	7275	6495	7120	6495
q19	1249	1045	652	652
q20	428	400	274	274
q21	3306	2822	2599	2599
q22	1134	1108	1011	1011
Total cold run time: 104637 ms
Total hot run time: 32263 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4349	4248	4331	4248
q2	346	412	343	343
q3	2291	2832	2466	2466
q4	1486	1875	1402	1402
q5	4708	4486	4300	4300
q6	214	174	133	133
q7	2096	1907	1751	1751
q8	2524	2400	2376	2376
q9	7272	7124	7357	7124
q10	2409	2721	2226	2226
q11	571	479	462	462
q12	693	783	597	597
q13	3446	3921	3120	3120
q14	271	285	260	260
q15	527	495	483	483
q16	619	649	620	620
q17	1108	1292	1310	1292
q18	7471	7201	7199	7199
q19	819	819	790	790
q20	1941	1992	1789	1789
q21	4627	4203	4060	4060
q22	1062	1060	970	970
Total cold run time: 50850 ms
Total hot run time: 48011 ms

@doris-robot

TPC-DS: Total hot run time: 172927 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 105c5a68861fe61ad3301db938b324c86b52b575, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4788	607	453	453
query6	327	234	215	215
query7	4212	467	257	257
query8	327	235	231	231
query9	8794	2631	2646	2631
query10	509	371	323	323
query11	15252	15152	14867	14867
query12	175	112	113	112
query13	1243	492	393	393
query14	6286	3017	2846	2846
query14_1	2763	2691	2738	2691
query15	203	188	174	174
query16	980	514	493	493
query17	1079	686	584	584
query18	2606	436	348	348
query19	233	227	200	200
query20	130	125	117	117
query21	217	139	126	126
query22	4092	3922	3831	3831
query23	15917	15517	15337	15337
query23_1	15410	15577	15339	15339
query24	7505	1592	1212	1212
query24_1	1170	1189	1204	1189
query25	575	479	428	428
query26	1003	265	164	164
query27	2729	451	295	295
query28	4546	2146	2137	2137
query29	787	569	477	477
query30	313	248	215	215
query31	772	625	567	567
query32	85	76	69	69
query33	546	361	305	305
query34	890	874	557	557
query35	718	764	747	747
query36	876	895	732	732
query37	125	91	75	75
query38	2784	2739	2654	2654
query39	758	747	743	743
query39_1	724	703	717	703
query40	209	130	117	117
query41	67	61	62	61
query42	107	104	102	102
query43	436	471	418	418
query44	1329	731	715	715
query45	187	185	175	175
query46	849	957	602	602
query47	1335	1478	1326	1326
query48	310	316	243	243
query49	602	417	320	320
query50	644	275	201	201
query51	3752	3775	3791	3775
query52	108	110	106	106
query53	297	329	279	279
query54	300	284	270	270
query55	78	78	74	74
query56	283	293	292	292
query57	975	966	938	938
query58	268	262	249	249
query59	1964	2144	2004	2004
query60	318	314	295	295
query61	191	162	164	162
query62	379	350	343	343
query63	294	263	270	263
query64	4478	1279	988	988
query65	3827	3737	3782	3737
query66	1342	428	298	298
query67	14989	15536	14779	14779
query68	7746	1000	698	698
query69	494	348	316	316
query70	1066	961	951	951
query71	371	301	284	284
query72	5797	3473	3437	3437
query73	749	721	302	302
query74	8751	8798	8598	8598
query75	2830	2811	2409	2409
query76	3356	1067	650	650
query77	537	388	277	277
query78	9667	9980	9130	9130
query79	1281	909	587	587
query80	621	582	495	495
query81	533	261	228	228
query82	199	149	112	112
query83	263	260	245	245
query84	265	124	101	101
query85	924	512	450	450
query86	386	322	317	317
query87	2877	2895	2751	2751
query88	3079	2223	2210	2210
query89	396	351	334	334
query90	2020	149	148	148
query91	175	166	138	138
query92	82	71	65	65
query93	998	900	546	546
query94	566	334	291	291
query95	574	387	299	299
query96	588	466	209	209
query97	2367	2406	2333	2333
query98	231	202	203	202
query99	574	552	532	532
Total cold run time: 252098 ms
Total hot run time: 172927 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 76.92% (50/65) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.03% (18982/35795)
Line Coverage 39.11% (175951/449941)
Region Coverage 33.67% (136242/404601)
Branch Coverage 34.70% (58893/169705)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 86.15% (56/65) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.86% (25844/34990)
Line Coverage 61.32% (275168/448708)
Region Coverage 56.41% (230537/408717)
Branch Coverage 58.20% (99080/170254)

@zclllyybb
Contributor

Please link the docs PR here; then we can merge it.


@Override
public List<FunctionSignature> getSignatures() {
    if (arity() == 0) {
Contributor

How should the arity = 0 case be understood? What does the function do in that case?

Author

@nagisa-kunhah nagisa-kunhah Jan 16, 2026

arity = 0 is the case where map_concat() is called with no arguments. By default it directly returns a MapType whose key and value types are both NullType:

private MapType() {
    keyType = NullType.INSTANCE;
    valueType = NullType.INSTANCE;
}

It does not have to be split out separately; going through the compute logic gives the same result. But create_map lists this case out on its own, so I wrote it the same way here.

Contributor

Does Spark also support zero arguments?

Author

Yes, it does:

spark-sql (default)> select map_concat() as m;
{}
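
For readers unfamiliar with the function, the behavior under discussion can be sketched in plain Java. This is only an illustration, not the Doris or Spark implementation: `MapConcatSketch` and `mapConcat` are hypothetical names, and the last-value-wins handling of duplicate keys is an assumption that this thread does not confirm. The point is that merging zero maps falls out of the same loop and yields an empty map, consistent with the `{}` shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapConcatSketch {
    /**
     * Hypothetical helper: merges the given maps left to right.
     * Assumption: on a duplicate key, the value from the later map wins.
     */
    @SafeVarargs
    public static <K, V> Map<K, V> mapConcat(Map<K, V>... maps) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map<K, V> m : maps) {
            result.putAll(m);
        }
        return result;
    }

    public static void main(String[] args) {
        // Zero arguments: the loop never runs, so the result is an empty map.
        Map<String, Integer> empty = MapConcatSketch.<String, Integer>mapConcat();
        System.out.println(empty);

        // Two arguments: entries from both maps end up in one result.
        Map<String, Integer> left = new LinkedHashMap<>();
        left.put("a", 1);
        Map<String, Integer> right = new LinkedHashMap<>();
        right.put("b", 2);
        System.out.println(mapConcat(left, right));
    }
}
```

Because the empty case needs no special branch in this sketch, it mirrors the author's remark that the zero-argument path could also go through the general logic.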

return processCreateMap((CreateMap) boundFunction);
}

// type coercion
Contributor

Don't modify this.

* Compute signatures when arity > 0.
* Extract key and value types, find common type, and construct signatures.
*/
private List<FunctionSignature> computeNonEmptySignatures() {
Contributor

There is no need to extract this into a separate function.
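
The doc comment quoted above describes the non-empty path as: extract key and value types, find a common type, construct signatures. A minimal sketch of that idea follows, using a toy type lattice rather than Doris's real type system. Every name here (`CommonMapTypeSketch`, `ToyType`, `ToyMapType`, `wider`, `commonMapType`) is hypothetical, and "the wider type wins" is a simplifying assumption, not the coercion rule Doris applies.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CommonMapTypeSketch {
    /** Toy scalar types, ordered narrowest to widest. Not Doris's real type system. */
    public enum ToyType { NULL, INT, BIGINT, DOUBLE, STRING }

    /** Toy stand-in for a map type: one key type and one value type. */
    public static final class ToyMapType {
        public final ToyType key;
        public final ToyType value;

        public ToyMapType(ToyType key, ToyType value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return "MAP<" + key + ", " + value + ">";
        }
    }

    /** Assumption: the common type of two toy types is simply the wider one. */
    public static ToyType wider(ToyType a, ToyType b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    /** Folds all argument types into one map type; an empty list gives NULL key and value types. */
    public static ToyMapType commonMapType(List<ToyMapType> argTypes) {
        ToyType key = ToyType.NULL;
        ToyType value = ToyType.NULL;
        for (ToyMapType t : argTypes) {
            key = wider(key, t.key);
            value = wider(value, t.value);
        }
        return new ToyMapType(key, value);
    }

    public static void main(String[] args) {
        // Two map arguments with different key and value types.
        System.out.println(commonMapType(Arrays.asList(
                new ToyMapType(ToyType.INT, ToyType.INT),
                new ToyMapType(ToyType.BIGINT, ToyType.DOUBLE))));
        // No arguments: the fold starts and ends at NULL for both key and value.
        System.out.println(commonMapType(Collections.<ToyMapType>emptyList()));
    }
}
```

In this sketch a single fold covers both the empty and the non-empty case, which is one way to read the reviewer's suggestion that a separate helper is unnecessary.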

@nagisa-kunhah
Author

run buildall

@nagisa-kunhah
Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32526 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c35513dc7cf7c3f9f9f4a369f632740eb3bb8f7d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17659	4242	4049	4049
q2	2065	355	278	278
q3	10101	1307	738	738
q4	10212	849	315	315
q5	7511	2066	1994	1994
q6	200	174	143	143
q7	929	829	643	643
q8	9273	1432	1246	1246
q9	5050	4613	4680	4613
q10	6812	1814	1402	1402
q11	560	310	283	283
q12	738	766	583	583
q13	17768	3883	3153	3153
q14	312	290	277	277
q15	584	517	515	515
q16	687	687	625	625
q17	672	785	531	531
q18	6740	6463	6912	6463
q19	1445	1035	712	712
q20	435	392	288	288
q21	3431	2777	2573	2573
q22	1172	1164	1102	1102
Total cold run time: 104356 ms
Total hot run time: 32526 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4334	4223	4245	4223
q2	358	443	327	327
q3	2241	2791	2446	2446
q4	1519	1914	1631	1631
q5	4534	4315	4274	4274
q6	229	172	129	129
q7	2015	1984	1855	1855
q8	2558	2438	2429	2429
q9	7045	7066	7481	7066
q10	2439	2753	2116	2116
q11	552	465	434	434
q12	658	688	575	575
q13	3407	3895	3169	3169
q14	278	282	255	255
q15	526	498	477	477
q16	633	651	614	614
q17	1106	1252	1321	1252
q18	7389	7344	7248	7248
q19	844	810	801	801
q20	1875	1961	1854	1854
q21	4595	4323	4104	4104
q22	1088	1034	950	950
Total cold run time: 50223 ms
Total hot run time: 48229 ms

@doris-robot

TPC-DS: Total hot run time: 174162 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c35513dc7cf7c3f9f9f4a369f632740eb3bb8f7d, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4440	639	478	478
query6	326	230	202	202
query7	4219	468	267	267
query8	333	241	233	233
query9	8753	2901	2897	2897
query10	514	395	328	328
query11	15221	15093	14864	14864
query12	177	113	111	111
query13	1236	492	377	377
query14	6018	3127	2866	2866
query14_1	2700	2709	2694	2694
query15	194	188	169	169
query16	991	486	475	475
query17	1090	688	576	576
query18	2472	436	339	339
query19	224	222	198	198
query20	121	119	119	119
query21	229	141	119	119
query22	4153	3870	4052	3870
query23	16002	15556	15214	15214
query23_1	15390	15568	15451	15451
query24	7106	1560	1180	1180
query24_1	1191	1195	1184	1184
query25	547	457	402	402
query26	1244	271	153	153
query27	2774	445	286	286
query28	4590	2187	2166	2166
query29	768	544	446	446
query30	318	247	210	210
query31	786	635	563	563
query32	86	77	75	75
query33	535	375	321	321
query34	918	880	536	536
query35	718	778	684	684
query36	883	910	802	802
query37	143	97	89	89
query38	2765	2787	2760	2760
query39	773	746	721	721
query39_1	718	720	716	716
query40	228	143	124	124
query41	73	70	67	67
query42	110	105	107	105
query43	492	437	426	426
query44	1374	766	760	760
query45	187	185	178	178
query46	866	963	595	595
query47	1364	1502	1359	1359
query48	324	336	245	245
query49	622	439	375	375
query50	638	284	218	218
query51	3795	3793	3798	3793
query52	107	109	100	100
query53	294	340	275	275
query54	303	287	271	271
query55	88	89	83	83
query56	330	317	317	317
query57	1024	995	890	890
query58	282	265	270	265
query59	2158	2143	2110	2110
query60	342	350	327	327
query61	214	145	143	143
query62	386	371	309	309
query63	306	267	268	267
query64	4881	1233	962	962
query65	3859	3698	3806	3698
query66	1450	419	311	311
query67	15579	15546	15454	15454
query68	2508	1106	782	782
query69	459	373	337	337
query70	975	943	996	943
query71	343	325	297	297
query72	5287	3205	3196	3196
query73	618	737	326	326
query74	8726	8744	8570	8570
query75	2746	2829	2480	2480
query76	2274	1066	698	698
query77	357	390	307	307
query78	9803	9828	9074	9074
query79	2356	923	591	591
query80	1763	573	494	494
query81	552	268	237	237
query82	1015	149	116	116
query83	361	263	238	238
query84	246	117	97	97
query85	875	486	423	423
query86	403	327	285	285
query87	2974	2857	2766	2766
query88	3506	2599	2567	2567
query89	396	362	325	325
query90	1965	182	178	178
query91	170	162	135	135
query92	78	75	68	68
query93	1170	962	543	543
query94	636	306	280	280
query95	600	405	310	310
query96	636	526	236	236
query97	2346	2338	2312	2312
query98	216	201	202	201
query99	602	575	493	493
Total cold run time: 249574 ms
Total hot run time: 174162 ms

@doris-robot

ClickBench: Total hot run time: 26.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c35513dc7cf7c3f9f9f4a369f632740eb3bb8f7d, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.04	0.04
query3	0.25	0.09	0.08
query4	1.60	0.11	0.11
query5	0.27	0.25	0.25
query6	1.14	0.65	0.66
query7	0.03	0.02	0.02
query8	0.05	0.04	0.05
query9	0.58	0.50	0.49
query10	0.55	0.55	0.55
query11	0.15	0.10	0.11
query12	0.16	0.10	0.11
query13	0.60	0.59	0.59
query14	0.95	0.96	0.94
query15	0.80	0.78	0.78
query16	0.40	0.41	0.37
query17	1.03	1.03	1.06
query18	0.24	0.23	0.22
query19	1.88	1.82	1.87
query20	0.02	0.02	0.01
query21	15.44	0.29	0.15
query22	5.13	0.05	0.05
query23	15.83	0.29	0.10
query24	2.30	0.35	0.66
query25	0.08	0.11	0.06
query26	0.14	0.12	0.12
query27	0.06	0.05	0.06
query28	4.35	1.06	0.88
query29	12.54	3.91	3.12
query30	0.28	0.14	0.13
query31	2.82	0.65	0.40
query32	3.23	0.56	0.46
query33	3.06	3.03	3.02
query34	16.32	5.00	4.40
query35	4.41	4.49	4.43
query36	0.64	0.50	0.50
query37	0.11	0.06	0.07
query38	0.07	0.04	0.03
query39	0.05	0.03	0.03
query40	0.16	0.13	0.14
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 98.05 s
Total hot run time: 26.8 s

Development

Successfully merging this pull request may close these issues.

[Feature] Impl the function map_concat like spark

6 participants