Exploració per autor "Moretó Planas, Miquel"
Ara es mostren els items 61-80 de 129
-
libPRISM: an intelligent adaptation of prefetch and SMT levels
Ortega, Cristobal; Moretó Planas, Miquel; Casas, Marc; Bertran, Ramon; Buyuktosunoglu, Alper; Eichenberger, Alexandre; Bose, Pradip (Association for Computing Machinery (ACM), 2017)
Text en actes de congrés
Accés obertCurrent microprocessors include several knobs to modify the hardware behavior in order to improve performance under different workload demands. An impractical and time consuming offline profiling is needed to evaluate the ... -
Microarchitectural design-space exploration of an in-order RISC-V processor in a 22nm CMOS technology
Doblas Font, Max; Wright, Andrew; Sonmez, Nehir; Moretó Planas, Miquel; Arvind (European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC), 2021)
Comunicació de congrés
Accés obertThe purpose of this paper is to explore the trade-offs between IPC and maximum clock frequency in an in-order processor design. This work evaluates the impact on the performance and frequency of different pipeline ... -
Mix-GEMM: An efficient HW-SW architecture for mixed-precision quantized deep neural networks inference on edge devices
Reggiani, Enrico; Pappalardo, Alessandro; Doblas Font, Max; Moretó Planas, Miquel; Olivieri, Mauro; Unsal, Osman Sabri; Cristal Kestelman, Adrián; Barcelona Supercomputing Center (Institute of Electrical and Electronics Engineers (IEEE), 2023)
Text en actes de congrés
Accés obertDeep Neural Network (DNN) inference based on quantized narrow-precision integer data represents a promising research direction toward efficient deep learning computations on edge and mobile devices. On one side, recent ... -
MLP-aware dynamic cache partitioning
Moretó Planas, Miquel; Cazorla Almeida, Francisco Javier; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
Comunicació de congrés
Accés obertThe limitation imposed by instruction-level parallelism (ILP) has motivated the use of thread-level parallelism (TLP) as a common strategy for improving processor performance. TLP paradigms such as simultaneous multithreading ... -
Modeling and optimizing NUMA effects and prefetching with machine learning
Sánchez Barrera, Isaac; Black-Schaffer, David; Casas, Marc; Moretó Planas, Miquel; Stupnikova, Anastasiia; Popov, Mihail (Association for Computing Machinery (ACM), 2020)
Text en actes de congrés
Accés obertBoth NUMA thread/data placement and hardware prefetcher configuration have significant impacts on HPC performance. Optimizing both together leads to a large and complex design space that has previously been impractical to ... -
Mont-Blanc 2020: Towards scalable and power efficient European HPC processors
Armejach Sanosa, Adrià; Brank, Bine; Cortina Guardia, Jordi; Dolique, François; Hayes, Timothy; Ho, Nam; Lagadec, Pierre-Axel; Lemaire, Romain; López Paradís, Guillem; Marliac, Laurent; Moretó Planas, Miquel; Marcuello Pascual, Pedro; Pleiter, Dirk; Tan, Xubin; Derradji, Said (Institute of Electrical and Electronics Engineers (IEEE), 2021)
Text en actes de congrés
Accés obertThe Mont-Blanc 2020 (MB2020) project has triggered the development of the next generation industrial processor for Big Data and High Performance Computing (HPC). MB2020 is paving the way to the future low-power European ... -
Multicore resource management
Nesbit, Kyle J.; Smith, James E.; Moretó Planas, Miquel; Cazorla, Francisco; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (2008-06)
Article
Accés obertCurrent resource management mechanisms and policies are inadequate for future multicore systems. Instead, a hardware/software interface based on the virtual private machine abstraction would allow software policies to ... -
MUSA: a multi-level simulation approach for next-generation HPC machines
Grass, Thomas; Allande, César; Armejach, Adrià; Rico, Alejandro; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo; Casas, Marc; Moretó Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2016)
Text en actes de congrés
Accés obertThe complexity of High Performance Computing (HPC) systems is increasing in the number of components and their heterogeneity. Interactions between software and hardware involve many different aspects which are typically ... -
On the benefits of tasking with OpenMP
Rico, Alejandro; Sánchez Barrera, Isaac; Joao, Jose A.; Randall, Joshua; Casas, Marc; Moretó Planas, Miquel (Springer, 2019)
Text en actes de congrés
Accés obertTasking promises a model to program parallel applications that provides intuitive semantics. In the case of tasks with dependences, it also promises better load balancing by removing global synchronizations (barriers), and ... -
On the convergence of mainstream and mission-critical markets
Girbal, Sylvain; Moretó Planas, Miquel; Grasset, Arnaud; Abella Ferrer, Jaume; Quiñones, Eduardo; Cazorla Almeida, Francisco Javier; Yehia, Sami (Institute of Electrical and Electronics Engineers (IEEE), 2013)
Text en actes de congrés
Accés restringit per política de l'editorialThe computing market has been dominated during the last two decades by the well-known convergence of the highperformance computing market and the mobile market. In this paper we witness a new type of convergence between ... -
On the maturity of parallel applications for asymmetric multi-core processors
Chronaki, Kallia; Moretó Planas, Miquel; Casas, Marc; Rico, Alejandro; Badia Sala, Rosa Maria; Ayguadé Parra, Eduard; Valero Cortés, Mateo (Elsevier, 2019-05-01)
Article
Accés obertAsymmetric multi-cores (AMCs) are a successful architectural solution for both mobile devices and supercomputers. By maintaining two types of cores (fast and slow) AMCs are able to provide high performance under the facility ... -
On the use of many-core Marvell ThunderX2 processor for HPC workloads
Soria Pardos, Víctor; Armejach Sanosa, Adrià; Suárez Gracía, Dario; Moretó Planas, Miquel (2021)
Article
Accés obertMarvell’s ThunderX2 has been the first Arm-based processor with deployments in large-scale HPC production systems, challenging the dominance that x86 processors had in the last decades. While x86 processors and its software ... -
Online prediction of applications cache utility
Moretó Planas, Miquel; Cazorla, Francisco; Ramírez Bellido, Alejandro; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2007)
Text en actes de congrés
Accés obertGeneral purpose architectures are designed to offer average high performance regardless of the particular application that is being run. Performance and power inefficiencies appear as a consequence for some programs. ... -
OpenCL-based FPGA accelerator for semi-global approximate string matching using diagonal bit-vectors
Castells Rufas, David; Marco-Sola, Santiago; Aguado Puig, Quim; Espinosa Morales, Antonio; Moure López, Juan Carlos; Alvarez Martí, Lluc; Moretó Planas, Miquel (Institute of Electrical and Electronics Engineers (IEEE), 2021)
Text en actes de congrés
Accés obertAn FPGA accelerator for the computation of the semi-global Levenshtein distance between a pattern and a reference text is presented. The accelerator provides an important benefit to reduce the execution time of read-mappers ... -
OpenPiton optimizations towards high performance manycores
Leyva Santes, Neiel Israel; Monemi, Alireza; Oliete Escuín, Noelia; López Paradís, Guillem; Abancens Calvo, Xabier; Balkind, Jonathan; Vallejo Gutiérrez, Enrique; Moretó Planas, Miquel; Álvarez Martí, Lluc (Association for Computing Machinery (ACM), 2023)
Text en actes de congrés
Accés obertIn recent years, numerous multicore RISC-V platforms have emerged. Within the RISC-V ecosystem, Networks-on-Chip (NoCs) such as OpenPiton are employed in designs that aim to scale to a large number of cores. This paper ... -
Optimizing computation-communication overlap in asynchronous task-based programs
Castillo, Emilio; Jain, Nikhil; Casas, Marc; Moretó Planas, Miquel; Schulz, Martin; Beivide Palacio, Julio Ramon; Valero Cortés, Mateo; Bhatele, Abhinav (Association for Computing Machinery (ACM), 2019)
Text en actes de congrés
Accés obertAsynchronous task-based programming models are gaining popularity to address the programmability and performance challenges in high performance computing. One of the main attractions of these models and runtimes is their ... -
PARSECSs: Evaluating the impact of task parallelism in the PARSEC benchmark suite
Chasapis, Dimitrios; Casas, Marc; Moretó Planas, Miquel; Vidal Ortiz, Raul; Ayguadé Parra, Eduard; Labarta Mancho, Jesús José; Valero Cortés, Mateo (2015-12-01)
Article
Accés obertIn this work, we show how parallel applications can be implemented efficiently using task parallelism. We also evaluate the benefits of such parallel paradigm with respect to other approaches. We use the PARSEC benchmark ... -
Per-task energy accounting in computing systems
Liu, Qixiao; Jiménez, Víctor; Moretó Planas, Miquel; Abella Ferrer, Jaume; Cazorla, Francisco; Valero Cortés, Mateo (2013)
Report de recerca
Accés obertWe present for the first time the concept of per-task energy accounting (PTEA) and relate it to per-task energy metering (PTEM). We show the benefits of supporting both in future computing systems. Using the shared last-level ... -
Per-task energy metering and accounting in the multicore era
Liu, Qixiao; Moretó Planas, Miquel; Abell, Jaume; Cazorla Almeida, Francisco Javier; Valero Cortés, Mateo (Barcelona Supercomputing Center, 2015-05-05)
Text en actes de congrés
Accés obertEnergy has become arguably the most expensive resource in a computing system. As multi-core processors are the preferred processing platform across different computing domains, measuring the energy usage draws vast attention. ... -
Performance and energy effects on task-based parallelized applications: User-directed versus manual vectorization
Caminal Pallarés, Helena; Caballero de Gea, Diego; Cebrián González, Juan Manuel; Ferrer, Roger; Casas, Marc; Moretó Planas, Miquel; Martorell Bofill, Xavier; Valero Cortés, Mateo (2018-06)
Article
Accés obertHeterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures poses a ...