This paper describes a prototype that achieves a high degree of transaction resilience in distributed software systems using a non-von Neumann computing model that exploits parallelism in computing nodes. The prototype incorporates fault, configuration, accounting, performance, and security (FCAPS) management using a signaling network overlay and allows dynamic control of a set of distributed computing elements in a network. Each node is a computing entity endowed with self-management and signaling capabilities that let it collaborate with similar nodes in the network. Separating the parallel computing and management channels allows end-to-end transaction management of computing tasks (provided by the autonomous distributed computing elements) to be implemented as network-level FCAPS management. While the new computing model is operating-system agnostic, a Linux, Apache, MySQL, PHP/Perl/Python (LAMP) based services architecture is implemented in the prototype to demonstrate end-to-end transaction management with auto-scaling, self-repair, dynamic performance management, and distributed transaction security assurance. The implementation is made possible by a non-von Neumann middleware library that provides Linux process management through multi-threaded parallel execution of self-management and signaling abstractions. We did not use hypervisors, virtual machines, or layers of complex virtualization management systems in implementing this prototype.

1. Introduction

The advent of many-core servers, with tens and even hundreds of computing cores and high-bandwidth communication among them, makes the current generation of server, networking, and storage equipment and their management systems, which evolved from server-centric and bandwidth-limited architectures, completely unsuited to efficient use in the next-generation computing infrastructure.
It is hard to imagine replicating current TCP/IP-based socket communication, “isolate and fix” diagnostic procedures, and the multiple operating systems (which do not have end-to-end visibility or control of business transactions that span multiple cores, multiple chips, multiple servers, and multiple geographies) inside the next-generation many-core servers without addressing their shortcomings. The many-core servers and processors constitute a network in which each node is itself a subnetwork with different bandwidths and protocols (socket-based low-bandwidth communication between servers, InfiniBand or PCI Express bus-based communication across processors in the same server, and shared memory-based low latency