oalib
APEnet+: a 3D toroidal network enabling Petaflops scale Lattice QCD simulations on commodity clusters  [PDF]
Roberto Ammendola,Andrea Biagioni,Ottorino Frezza,Francesca Lo Cicero,Alessandro Lonardo,Pier Paolucci,Roberto Petronzio,Davide Rossetti,Andrea Salamon,Gaetano Salina,Francesco Simula,Nazario Tantalo,Laura Tosoratto,Piero Vicini
Computer Science , 2010,
Abstract: Many scientific computations need multi-node parallelism to meet ever-increasing requirements in both space (memory) and time (speed). The use of GPUs as accelerators introduces yet another level of complexity for the programmer and may potentially result in large overheads due to the complex memory hierarchy. Additionally, top-notch problems may easily employ more than a Petaflops of sustained computing power, requiring thousands of GPUs orchestrated with some parallel programming model. Here we describe APEnet+, the new generation of our interconnect, which scales up to tens of thousands of nodes with linear cost, thus improving the price/performance ratio on large clusters. The project target is the development of the Apelink+ host adapter featuring a low latency, high bandwidth direct network, state-of-the-art wire speeds on the links and a PCIe X8 gen2 host interface. It features hardware support for the RDMA programming model and experimental acceleration of GPU networking. A Linux kernel driver, a set of low-level RDMA APIs and an OpenMPI library driver are available, allowing for painless porting of standard applications. Finally, we give an insight into future work and intended developments.
Architectural improvements and 28 nm FPGA implementation of the APEnet+ 3D Torus network for hybrid HPC systems  [PDF]
Roberto Ammendola,Andrea Biagioni,Ottorino Frezza,Francesca Lo Cicero,Pier Stanislao Paolucci,Alessandro Lonardo,Davide Rossetti,Francesco Simula,Laura Tosoratto,Piero Vicini
Physics , 2013, DOI: 10.1088/1742-6596/513/5/052002
Abstract: Modern Graphics Processing Units (GPUs) are now considered accelerators for general purpose computation. A tight interaction between the GPU and the interconnection network is the strategy to express the full potential on capability computing of a multi-GPU system on large HPC clusters; that is the reason why an efficient and scalable interconnect is a key technology to finally deliver GPUs for scientific HPC. In this paper we show the latest architectural and performance improvements of the APEnet+ network fabric, an FPGA-based PCIe board with 6 fully bidirectional off-board links with 34 Gbps of raw bandwidth per direction, and X8 Gen2 bandwidth towards the host PC. The board implements a Remote Direct Memory Access (RDMA) protocol that leverages the peer-to-peer (P2P) capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain real zero-copy, low-latency GPU-to-GPU transfers. Finally, we report on the development activities for 2013 focusing on the adoption of the latest generation 28 nm FPGAs and the preliminary tests performed on this new platform.
Status of the APENet project  [PDF]
R. Ammendola,R. Petronzio,D. Rossetti,A. Salamon,N. Tantalo,P. Vicini
Physics , 2005,
Abstract: We present the current status of APENet, our custom 3-dimensional interconnect architecture for PC cluster environments. We report some micro-benchmarks on our recent large installation as well as new developments on the software and hardware side. The low level device driver has been reworked by following a custom hardware RDMA architecture, and MPICH-VMI, an implementation of the MPI library, has been ported to APENet.
APENet: LQCD clusters a la APE  [PDF]
R. Ammendola,M. Guagnelli,G. Mazza,F. Palombi,R. Petronzio,D. Rossetti,A. Salamon,P. Vicini
Physics , 2004, DOI: 10.1016/j.nuclphysbps.2004.11.373
Abstract: Developed by the APE group, APENet is a new high speed, low latency, 3-dimensional interconnect architecture optimized for PC clusters running LQCD-like numerical applications. The hardware implementation is based on a single PCI-X 133MHz network interface card hosting six independent bi-directional channels with a peak bandwidth of 676 MB/s in each direction. We discuss preliminary benchmark results showing exciting performance similar to or better than that found in high-end commercial network systems.
GRAPE-6: A Petaflops Prototype  [PDF]
Piet Hut,Jeffrey M. Arnold,Junichiro Makino,Stephen L. W. McMillan,Thomas L. Sterling
Physics , 1997,
Abstract: We present the outline of a research project aimed at designing and constructing a hybrid computing system that can be easily scaled up to petaflops speeds. As a first step, we envision building a prototype which will consist of three main components: a general-purpose, programmable front end, a special-purpose, fully hardwired computing engine, and a multi-purpose, reconfigurable system. The driving application will be a suite of particle-based large-scale simulations in various areas of physics. The prototype system will achieve performance in the $\sim 50 - 100$ teraflops range for a broad class of applications in this area. The combination of a hardwired petaflops-class computational engine and a front end with sustained speed on the order of 10 gigaflops can produce extremely high performance, but only for the limited class of problems in which there exists a single bottleneck with computing cost dominating the total. While the calculation for which the Grape-4 (our system's immediate predecessor) was designed is a prime example of such a problem, in many other applications the primary computational bottleneck, while still related to an inverse-square (gravitational, Coulomb, etc.) force, requires less than 99% of the computing power. Although the remainder of the CPU time is typically dominated by just one secondary bottleneck, its nature varies greatly from problem to problem. It is not cost-effective to attempt to design custom chips for each new problem that arises. FPGA-based systems can restore the balance, guaranteeing scalability from the teraflops to the petaflops domain, while still retaining significant flexibility. (abbreviated abstract)
Towards Petaflops Capability of the VERTEX Supernova Code  [PDF]
Andreas Marek,Markus Rampp,Florian Hanke,Hans-Thomas Janka
Physics , 2014, DOI: 10.3233/978-1-61499-381-0-712
Abstract: The VERTEX code is employed for multi-dimensional neutrino-radiation hydrodynamics simulations of core-collapse supernova explosions from first principles. The code is considered state-of-the-art in supernova research and it has been used for modeling for more than a decade, resulting in numerous scientific publications. The computational performance of the code, which is currently deployed on several high-performance computing (HPC) systems up to the Tier-0 class (e.g. in the framework of the European PRACE initiative and the German GAUSS program), however, has so far not been extensively documented. This paper presents a high-level overview of the relevant algorithms and parallelization strategies and outlines the technical challenges and achievements encountered along the evolution of the code from the gigaflops scale with the first, serial simulations in 2000, up to almost petaflops capabilities, as demonstrated lately on the SuperMUC system of the Leibniz Supercomputing Centre (LRZ). In particular, we shall document the parallel scalability and computational efficiency of VERTEX at the large scale and on the major, contemporary HPC platforms. We will outline upcoming scientific requirements and discuss the resulting challenges for the future development and operation of the code.
Correlations in commodity markets  [PDF]
Paweł Sieczka,Janusz A. Hołyst
Physics , 2008, DOI: 10.1016/j.physa.2009.01.004
Abstract: In this paper we analyzed dependencies in commodity markets by investigating correlations of futures contracts for commodities over the period 1998.09.01 - 2007.12.14. We constructed a minimal spanning tree based on the correlation matrix. The tree provides evidence for sector clusterization of the investigated contracts. We also studied dynamical properties of commodity dependencies. It turned out that the market was constantly getting more correlated within the investigated period, although the increase of correlation was distributed nonuniformly among the contracts and depended on the contracts' branches.
Excess Liquidity and Commodity Boom
S. Ohno
Journal of Economics, Business and Management , 2014, DOI: 10.7763/joebm.2014.v2.106
Abstract: This paper presents an investigation of whether excess liquidity has been serving as a driving force for the increase in international commodity prices. This study uses a structural VAR model including two global liquidity indicators and the world production index to examine the determinants of international commodity prices. Lenient lending by international banks may have promoted commodity price increases before the global financial crisis, while the international liquidity squeeze brought about their decline after the Lehman shock. Among commodities, the prices of industrial metals are more attributable to funding liquidity, and the price of crude oil, with a market believed to be more vulnerable to speculative money inflows, has been less dependent on liquidity. Gold is exceptional. It acted as a safe haven during the period of international financial dysfunction.
Founding Digital Currency on Imprecise Commodity  [PDF]
Zimu Yuan,Zhiwei Xu
Computer Science , 2015,
Abstract: Current digital currency schemes provide instantaneous exchange on precise commodities, where "precise" means that a buyer can verify the function of the commodity without error. However, imprecise commodities, e.g. statistical data containing errors, are abundant in the digital world. Existing digital currency schemes do not offer a mechanism to help the buyer make a payment decision based on the precision of a commodity, which may leave the buyer in a dilemma between having to buy and lacking confidence. In this paper, we design a currency scheme, IDCS, for imprecise digital commodities. IDCS completes a trade in three stages of handshake between a buyer and providers. We present an IDCS prototype implementation that assigns weights to the trustworthiness of the providers and calculates a confidence level with which the buyer can judge the quality of an imprecise commodity. In experiments, we characterize the performance of the IDCS prototype under varying impact factors.
The Commodity Price and Exchange Rate Dynamics  [PDF]
Liping Zou, Boliang Zheng, Xiaoming Li
Theoretical Economics Letters (TEL) , 2017, DOI: 10.4236/tel.2017.76120
Abstract: This paper investigates the dynamic relationship between the commodity price and the exchange rate in Australia and New Zealand. We focus on these two countries because their primary commodities account for significant shares of their exports and their currencies share some characteristics that distinguish them from other commodity currencies. Using country-specific commodity price indices, we examine the relationship between the departure of currency value from its fair value and fundamental macroeconomic variables. Evidence of a strong and robust relationship between the exchange rate and the commodity price has been found. Results indicate that the commodity price can be used to improve forecasts of the future exchange rate. Our commodity-price-augmented exchange rate forecasting model consistently outperforms the random-walk model, for both in-sample and out-of-sample forecasting. These results shed additional light on policymaking for countries that rely on primary commodity production and are attempting to move towards floating exchange rate regimes as part of their global market liberalization process.