REPLICA Language System
The REPLICA language system consists of an easy-to-use high-level parallel
programming language, REPLICA; OS and library support, including dynamic
thread management for OS tasks, skeletons, and OS/REPLICA language support
libraries; a low-level baseline language with C syntax but an e/Fork-style
parallelism concept; C libraries; an unoptimized MBTAC assembler for the
minimal REPLICA CMP configuration; and an optimized MBTAC assembler for the
configuration at hand (see Figure 1).
Fig. 1. The Replica language system.
At the high level, REPLICA supports three major forms of parallelism common
in parallel computing platforms: data, synchronous subgroup, and task
parallelism. At the low level, virtual instruction-level parallelism is
provided as a compiler optimization regardless of the dependencies of the code.
REPLICA DESIGN GOALS
The main design goals of the Replica language are ease of programmability, safety, potential for automatic optimizations, and scalability of the parallel computation, ranging from simple instruction-level operations to task-level parallelism and high-level parallel patterns (skeletons).
The core set of low-level parallel primitives in Replica resembles those in the e [Forsell04] and Fork [Keller01] languages. The main mechanisms are attributes for specifying the storage type of data (private or shared), thread group concepts for controlling thread-level parallelism, switching between the NUMA and PRAM modes, and instructions that expose the parallel operations supported by the hardware.
To share data between threads, the language provides an explicit attribute, shared, for tagging the data. At the programming language level, this makes the variable name refer to the same location in all threads of the group. Shared data adheres to synchronous CRCW PRAM semantics, which means that, unlike contemporary architectures, Replica can guarantee a deterministic memory model even when threads access the same location concurrently. Low-level synchronization constructs, such as basic barriers for the threads in a group, are still provided by the language to prevent erroneous use of shared data when independent thread groups or processes run concurrently.
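The synchronous CRCW PRAM semantics described above can be illustrated with a small sequential simulation in plain C. This is only a sketch, not REPLICA code: the function name pram_step, the MAX_THREADS bound, and the write-conflict rule (the lowest thread id wins) are assumptions made for illustration, since the text does not state which resolution rule REPLICA applies. The point it shows is that within one step all reads happen before any write, so the result does not depend on the order in which threads are processed.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS 64  /* hypothetical bound on the simulated group size */

/* Simulates one synchronous step in which every thread i executes
 *     shared[dst[i]] = shared[src[i]];
 * All reads complete before any write, as in a synchronous PRAM step.
 * Concurrent writes to one location are resolved by an ASSUMED priority
 * rule: the thread with the lowest id wins. */
void pram_step(int *shared, size_t nthreads,
               const size_t *src, const size_t *dst)
{
    int read_val[MAX_THREADS];

    assert(nthreads <= MAX_THREADS);

    /* Read phase: every thread samples memory as it was before the step. */
    for (size_t i = 0; i < nthreads; i++) {
        read_val[i] = shared[src[i]];
    }

    /* Write phase: iterate from the highest id down to 0, so the write of
     * the lowest-id thread lands last and therefore wins a conflict. */
    for (size_t i = nthreads; i-- > 0; ) {
        shared[dst[i]] = read_val[i];
    }
}
```

With four simulated threads, src = {1, 2, 3, 0} and dst = {0, 1, 2, 3}, the step rotates the array {1, 2, 3, 4} into {2, 3, 4, 1}. A naive in-place sequential loop would instead produce {2, 3, 4, 2}, which is the kind of order dependence that the synchronous model rules out.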
Control flow inside a thread group is managed in Replica with the standard control constructs. The language provides both synchronous and asynchronous versions of all constructs, which automatically manage the synchronization and the splitting and joining of thread groups when the control flow diverges. The thread and group ids can be read via machine intrinsics. Along with these thread intrinsics, the language also maps directly to the architecture-specific set of arithmetic multioperations, which can provide significant speed improvements in data-parallel and synchronous code.
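To make the idea of an arithmetic multioperation more concrete, the following plain C function simulates a multi-prefix addition sequentially. It is a sketch under stated assumptions rather than the REPLICA intrinsic itself: the name multi_prefix_add, its signature, and the ordering of contributions by thread id are hypothetical, because the text does not list the actual intrinsics. On the target architecture such an operation would conceptually complete in a single step for the whole thread group, whereas the simulation needs a loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated multi-prefix add over one shared location.
 * Conceptually, every thread i adds contrib[i] to *target in the same step.
 * prefix_out[i] receives the value of *target plus the contributions of all
 * threads with a lower id (ordering by thread id is an ASSUMPTION).
 * On return, *target holds its old value plus all contributions. */
void multi_prefix_add(int *target, const int *contrib,
                      int *prefix_out, size_t nthreads)
{
    assert(target != NULL && contrib != NULL && prefix_out != NULL);

    int running = *target;
    for (size_t i = 0; i < nthreads; i++) {
        prefix_out[i] = running;  /* value seen by simulated thread i */
        running += contrib[i];    /* fold in thread i's contribution */
    }
    *target = running;
}
```

A typical use is array compaction: each thread that wants to keep its element contributes 1, any other thread contributes 0, and the returned prefix is the output slot for that thread's element.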
The core feature set is adopted from the C language, but simplified to make the
language easier to parse and to analyze.
[Forsell04] M. Forsell. E – A Language for Thread-Level Parallel Programming on Synchronous Shared Memory NOCs. WSEAS Trans. on Computers, 3(3):807–812, 2004.
[Keller01] J. Keller, C. Keßler, and J. Träff. Practical PRAM Programming. Wiley, New York, 2001.