A. Scientific monographs
B. International scientific journal papers
M. Forsell, Performance comparison of some shared memory organizations for 2D
mesh-like NOCs, Microprocessors and Microsystems 35, 2 (March 2011), 274-284.
M. Forsell, A PRAM-NUMA Model of Computation for Addressing Low-TLP Workloads,
International Journal of Networking and Computing 1, 1 (January 2011), 21-35.
M. Forsell and V. Leppänen, A moving threads processor architecture MTPA,
Journal of Supercomputing 57, 1 (2011), 5-19.
C. International scientific conference papers
V. Leppänen, M. Forsell and J-M. Mäkelä, Thick Control Flows: Introduction and
Prospects, In the Proceedings of the 2011 International Conference on Parallel
and Distributed Processing Techniques and Applications (PDPTA’11), July 18-21,
2011, Las Vegas, USA.
V. Leppänen, J-M. Mäkelä and M. Forsell, A RISC-Based Moving Tiny Threads
Architecture, In the Proceedings of the 2011 International Conference on
Parallel and Distributed Processing Techniques and Applications (PDPTA’11),
July 18-21, 2011, Las Vegas, USA.
M. Forsell and J. Roivainen, Supporting Ordered Multiprefix Operations in
Emulated Shared Memory CMPs, In the Proceedings of the 2011 International
Conference on Parallel and Distributed Processing Techniques and Applications
(PDPTA’11), July 18-21, 2011, Las Vegas, USA.
M. Forsell, M. Penttonen and V. Leppänen, Cost of Sparse Mesh Layouts
Supporting Throughput Computing, In the Proceedings of EuroMicro Digital
Systems Design 2011 (DSD’11), August 31-September 2, 2011, Oulu, Finland.
V. Leppänen, M. Penttonen and M. Forsell, A Layout for Sparse
Cube-Connected-Cycles Network, In the Proceedings of the CompSysTech'11.
J-M. Mäkelä, V. Leppänen and M. Forsell, RISC-based moving threads multicore
architecture, In the Proceedings of the CompSysTech'11.
J-M. Mäkelä, E. Hansson, M. Forsell, C. Kessler and V. Leppänen, Design
Principles of the Programming Language Replica for Hybrid PRAM-NUMA Many-Core
Architectures, poster, to appear in the Proceedings of the fourth Swedish
Workshop on Multicore Computing (MCC’11), November 23-25, 2011, Linköping,
Sweden.
E. Hansson and C. Kessler, Flexible Scheduling and Thread Allocation for
Synchronous Parallel Tasks, to appear in the Proceedings of the PASA 2012
Workshop/ARCS 2012, February 29, 2012, Munich, Germany.
D. National scientific conference papers
M. Forsell, Computer Architecture Re(de)fined—the Era of Parallel Computing,
to appear in the Proceedings of the Computer Science I Like Conference,
November 4, 2011, University of Eastern Finland, Kuopio.
E. Other papers
M. Forsell, MCPA—MultiCore Portability Abstraction, the 11th International
Forum on Embedded MPSoC and Multicore, July 4-8, 2011, Beaune, France.
M. Forsell and J. Träff, HPPC 2010: Fourth Workshop on Highly Parallel
Processing on a Chip, In Lecture Notes in Computer Science 6586 (2011), 73-77.
(Hand-out) Proceedings of the 5th Workshop on Highly Parallel Processing on a
Chip, edited by M. Forsell and J. Larsson Träff, August 30, 2011, Bordeaux,
France.
F. Theses
G. Technical reports
H. Other reports
I. Lectures
M. Forsell, Parallelism, programmability and architectural support for them on
multi-core machines, Invited lecture at Artist Summer School Europe 2011,
September 4-9, 2011, Aix-Les-Bains, France.
J. Instructions
K. Presentations
M. Forsell, Removing performance and programmability limitations of chip
multiprocessors, Presentation, Frontier Workshop, June 7, 2011, Kirkkonummi,
Finland.
M. Forsell, Solutions for general purpose and application-specific computing,
Presentation, Steve Furber’s visit to VTT, June 14, 2011, Espoo, Finland.
M. Forsell, Computer Architecture Re(de)fined—the Era of Parallel Computing,
presentation at the Computer Science I Like Conference, November 4, 2011,
University of Eastern Finland, Kuopio.
L. Other
M. Forsell and J. Träff, the 5th Workshop on Highly Parallel Processing on a
Chip (HPPC’11), August 30, 2011, Bordeaux, France.
M. In preparation
M. Forsell, M. Penttonen and V. Leppänen, Cost of Sparse Mesh Layouts
Supporting Throughput Computing, submitted to Microprocessors and
Microsystems, September 31, 2011.
J-M. Mäkelä, E. Hansson, M. Forsell, C. Kessler and V. Leppänen, REPLICA
Language Specification, in preparation, 2011.
M. Forsell and J. Träff, Modeling the relative Efficiency of certain Parallel
Computing Architectures, in preparation, 2012.
M. Forsell, Removing performance and programmability limitations of chip
multiprocessors, in preparation, Frontier Workshop, December 9, 2011, Espoo,
Finland.
____________________________________________________________________________________________________________________________________________
RELATED WORK FROM OTHER PROJECTS
[Forsell94] M. Forsell, Are Multiport Memories Physically Feasible?, Computer
Architecture News 22, 4 (September 1994), 47-54.
[Forsell97] M. Forsell, Implementation of Instruction-Level and Thread-Level
Parallelism in Computers, Dissertations 2, Department of Computer Science,
University of Joensuu, Joensuu, 1997.
[Forsell02a] M. Forsell, Architectural differences of efficient sequential
and parallel computers, Journal of Systems Architecture 47, 13 (July 2002),
1017-1041.
[Forsell02b] M. Forsell, A Scalable High-Performance Computing Solution for
Network on Chips, IEEE Micro 22, 5 (September-October 2002), 46-55.
[Forsell04a] M. Forsell, E—A Language for Thread-Level Parallel Programming
on Synchronous Shared Memory NOCs, WSEAS Transactions on Computers 3, 3 (July
2004), 807-812.
[Forsell04b] M. Forsell, Compiling Thread-Level Parallel Programs with a
C-Compiler, In the Proceedings of the IV Jornadas sobre Programacion y
Lenguajes (PROLE’04), November 11-12, 2004, Malaga, Spain, 215-226.
[Forsell06] M. Forsell, Realizing Multioperations for Step Cached MP-SOCs, In
the Proceedings of the International Symposium on System-on-Chip 2006
(SOC’06), November 14-16, 2006, Tampere, Finland, 77-82.
[Forsell08] M. Forsell and J. Roivainen, Performance, Area and Power
Trade-Offs in Mesh-Based Emulated Shared Memory CMP Architectures, In the
Proceedings of the 2008 International Conference on Parallel and Distributed
Processing Techniques and Applications (PDPTA’08), July 14-17, 2008, Las
Vegas, USA, 471-477.
[Forsell09] M. Forsell, Configurable Emulated Shared Memory Architecture for
general purpose MP-SOCs and NOC regions, In the Proceedings of the 3rd
ACM/IEEE International Symposium on Networks-on-Chip, May 10-13, 2009, San
Diego, USA, 163-172.
[Forsell10a] M. Forsell, A PRAM-NUMA Model of Computation for Addressing
Low-TLP Workloads, In the Proceedings of the 12th Workshop on Advances in
Parallel and Distributed Computational Models (in conjunction with the 24th
IEEE International Parallel and Distributed Processing Symposium, IPDPS’10),
April 19, 2010, Atlanta, USA, 1-8.
[Forsell10b] M. Forsell, TOTAL ECLIPSE—An Efficient Architectural Realization
of the Parallel Random Access Machine, In Parallel and Distributed Computing,
edited by Alberto Ros, IN-TECH, Vienna, 2010, 39-64. (ISBN 978-953-307-057-5)
[Forsell10c] M. Forsell, On the performance and cost of some PRAM models on
CMP hardware, International Journal of Foundations of Computer Science 21, 3
(2010), 387-404.
[Forsell10f] M. Forsell, P. Hofstee, A. Jerraya, C. Jesshope, U. Vishkin and
J. Träff, HPPC 2009 Panel: Are Many-Core Computer Vendors on Track?, Lecture
Notes in Computer Science 6043 (2010), 9-15.
[Forsell10g] M. Forsell, Strong Programming Models for MP-SOCs, In the
Proceedings of the 10th International Forum on Embedded MPSoC and Multicore,
June 28 - July 2, 2010, Gifu, Japan.
[Forsell10h] M. Forsell and V. Leppänen, Supporting Concurrent Memory Access
and Multioperations in Moving Threads CMPs, In the Proceedings of the 2010
International Conference on Parallel and Distributed Processing Techniques and
Applications (PDPTA’10), July 12-15, 2010, Las Vegas, USA, 377-383.
[Fortune78] S. Fortune and J. Wyllie, Parallelism in Random Access Machines,
Proceedings of 10th ACM STOC, Association for Computing Machinery, New York,
1978, 114-118.
[Jaja92] J. Jaja, Introduction to Parallel Algorithms, Addison-Wesley,
Reading, 1992.
[Keller01] J. Keller, C. Keßler, and J. Träff, Practical PRAM Programming,
Wiley, New York, 2001.
[Leppänen10] V. Leppänen, M. Penttonen and M. Forsell, Layouts for Sparse
Networks Supporting Throughput Computing, In the Proceedings of the 2010
International Conference on Parallel and Distributed Processing Techniques and
Applications (PDPTA’10), July 12-15, 2010, Las Vegas, USA, 443-449.
[Ranade91] A. Ranade, How to Emulate Shared Memory, Journal of Computer and
System Sciences 42 (1991), 307-326.
[Vihkin08] U. Vishkin, G. Caragea, and B. Lee, Models for Advancing PRAM and
Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform, In
Handbook of Parallel Computing—Models, Algorithms and Applications (editors S.
Rajasekaran and J. Reif), Chapman & Hall/CRC, Boca Raton, 2008, 5-1—5-60.