An Early-Warning Method for Resource Monitoring in Edge Service Systems
Abstract:
An edge service system contains many computing nodes whose resources change over time; these resources fall into two classes, static and dynamic. To get the best performance out of an edge service cluster, make full use of each node's resources, and improve the reliability and stability of the system, a monitoring and early-warning system for cluster resources is needed. Many existing cluster monitoring systems have distinctive features, but newer low-level monitoring tools offer great convenience. This paper draws on the respective strengths of three existing tools, Prometheus, Grafana, and Cloud Alert (睿象云), and uses an integration middleware component designed in this paper as a regulator among them to build an integrated middleware monitoring and early-warning system that supports resource monitoring for edge service systems. Experimental results show that the system accurately detects anomalies and raises alarms in a timely manner, which safeguards the security of the service system and keeps it running stably.
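The kind of check such a monitoring and early-warning layer performs can be illustrated with a minimal sketch. It is not the middleware described in this paper: the `NodeSample` type, the `check_sample` function, the metric names, and the threshold values are all hypothetical, and the sample data is hard-coded instead of being collected by Prometheus or forwarded to Cloud Alert.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One snapshot of a node's dynamic resources (hypothetical schema)."""
    node: str
    cpu_percent: float
    mem_percent: float
    disk_percent: float


# Assumed alert thresholds in percent; real values would be tuned per deployment.
THRESHOLDS = {"cpu_percent": 90.0, "mem_percent": 85.0, "disk_percent": 95.0}


def check_sample(sample, thresholds=THRESHOLDS):
    """Return one alert record for every metric that exceeds its threshold."""
    alerts = []
    for metric, limit in thresholds.items():
        value = getattr(sample, metric)
        if value > limit:
            alerts.append(
                {"node": sample.node, "metric": metric, "value": value, "limit": limit}
            )
    return alerts


if __name__ == "__main__":
    samples = [
        NodeSample("edge-1", cpu_percent=42.0, mem_percent=60.5, disk_percent=70.0),
        NodeSample("edge-2", cpu_percent=97.3, mem_percent=88.0, disk_percent=50.0),
    ]
    for sample in samples:
        for alert in check_sample(sample):
            print(
                f"ALERT {alert['node']}: {alert['metric']}="
                f"{alert['value']} exceeds {alert['limit']}"
            )
```

In a deployment like the one the abstract outlines, the samples would presumably come from the metrics collector and the alert records would be handed to the notification service; the sketch only shows the threshold comparison in between.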