
OALib Journal (ISSN: 2333-9721)


Automated Enhanced Parallelization of Sequential C to Parallel OpenMP

Keywords: OpenMP, Par4All, PIPS, PoCC, Polyhedral Model, Cache-Line Size, On-Chip Cache Memory


Abstract:

This paper presents a technique to enhance the parallel execution of auto-generated OpenMP programs by taking the architecture of the on-chip cache memory into account, thereby achieving higher performance. The technique avoids false sharing in for-loops by generating OpenMP code that dynamically schedules chunks spaced one data-cache-line size apart per core. Most parallelization tools do not address significant multicore issues such as false sharing, which can degrade performance. An open-source parallelization tool, Par4All (Parallel for All), which internally uses the PIPS (Parallelization Infrastructure for Parallel Systems) and PoCC (Polyhedral Compiler Collection) integration, has been analyzed and exploited to maximize hardware utilization. The work focuses only on optimizing the parallelization of for-loops, since loops are the most time-consuming parts of code. The performance of the generated OpenMP programs has been analyzed on different architectures using the Intel VTune Performance Analyzer. Several computationally intensive programs from PolyBench have been tested with different data sets, and the results show that the OpenMP code generated by the enhanced technique achieves considerable speedup. The deliverables include the automation tool, test cases, the corresponding OpenMP programs, and performance analysis reports.
