Journal of Computer and Communications 2025

Comparing Large Language Models for Generating Complex Queries

DOI: 10.4236/jcc.2025.132015, pp. 236-249

Limin Ma, Ken Pu, Ying Zhu, Wesley Taylor

Keywords: Text-to-SQL, Evaluation, LLM, Generative AI

Abstract:

This study compares a complex SQL benchmark, TPC-DS, with two existing text-to-SQL benchmarks, BIRD and Spider. To make the comparison, we devised several measures of structural complexity and applied them to all three benchmarks. We find that TPC-DS queries are significantly more structurally complex than the queries in the other two benchmarks, which underscores the need for more intricate benchmarks that simulate realistic scenarios. We then used 11 distinct large language models (LLMs) to generate SQL queries from the query descriptions provided by the TPC-DS benchmark. Each prompt combined the query description, as given in the TPC-DS specification, with the TPC-DS database schema. We compared the generated queries with the TPC-DS gold-standard queries using a series of fuzzy structure-matching techniques based on query features. The results show that current state-of-the-art generative AI models fall short of generating accurate decision-making queries, and that the accuracy of the generated queries is insufficient for practical real-world use. These findings can guide future research on more sophisticated text-to-SQL benchmarks.
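As a rough illustration of what feature-based structural comparison can look like, the Python sketch below counts a handful of SQL constructs with regular expressions and scores two queries by a weighted Jaccard similarity over those counts. The abstract does not specify the paper's actual complexity measures or matching techniques, so the feature list, the function names (`structural_features`, `feature_similarity`), the similarity formula, and the sample queries are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical feature set: the abstract does not list the paper's measures.
FEATURE_PATTERNS = {
    "select": r"\bselect\b",          # more than one hints at subqueries or CTEs
    "join": r"\bjoin\b",
    "cte": r"\bwith\b",
    "group_by": r"\bgroup\s+by\b",
    "order_by": r"\border\s+by\b",
    "set_op": r"\b(?:union|intersect|except)\b",
    "aggregate": r"\b(?:sum|avg|count|min|max)\s*\(",
    "window": r"\bover\s*\(",
}


def structural_features(sql: str) -> Counter:
    """Count occurrences of each structural construct in a SQL string."""
    # Lowercase and blank out string literals so their contents are not counted.
    text = re.sub(r"'[^']*'", "''", sql.lower())
    return Counter(
        {name: len(re.findall(pattern, text))
         for name, pattern in FEATURE_PATTERNS.items()}
    )


def feature_similarity(a: Counter, b: Counter) -> float:
    """Weighted Jaccard similarity: sum of min counts over sum of max counts."""
    keys = set(a) | set(b)
    numerator = sum(min(a[k], b[k]) for k in keys)
    denominator = sum(max(a[k], b[k]) for k in keys)
    return numerator / denominator if denominator else 1.0


# Illustrative queries only; they are not taken from TPC-DS.
gold = (
    "WITH t AS (SELECT store_id, SUM(price) AS total FROM sales "
    "GROUP BY store_id) "
    "SELECT s.name, t.total FROM t JOIN store s ON s.id = t.store_id "
    "ORDER BY t.total"
)
generated = (
    "SELECT s.name, SUM(price) FROM sales "
    "JOIN store s ON s.id = sales.store_id GROUP BY s.name"
)

score = feature_similarity(structural_features(gold),
                           structural_features(generated))
print(f"structural similarity: {score:.3f}")
```

A score of 1.0 means the two queries use the same constructs the same number of times, which says nothing about whether they return the same rows. Regex counting is also blind to nesting depth and to keywords inside comments or quoted identifiers, so a real evaluation would more plausibly work from a parsed query tree.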
