    • Beyond the socket: NUMA-aware GPUs 

      Milic, Ugljesa; Villa, Oreste; Bolotin, Evgeny; Arunkumar, Akhil; Ebrahimi, Eiman; Jaleel, Aamer; Ramirez, Alex; Nellans, David (Association for Computing Machinery, 2017-10)
      Conference lecture
      Open Access
      GPUs achieve high throughput and power efficiency by employing many small single instruction multiple thread (SIMT) cores. To minimize scheduling logic and performance variance they utilize a uniform memory system and ...
    • GMT: Enabling easy development and efficient execution of irregular applications on commodity clusters 

      Morari, Alessandro; Villa, Oreste; Tumeo, Antonino; Chavarria Miranda, Daniel; Valero Cortés, Mateo (Association for Computing Machinery (ACM), 2013)
      Conference lecture
      Open Access
      In this poster we introduce GMT (Global Memory and Threading library), a custom runtime library that enables efficient execution of irregular applications on commodity clusters. GMT only requires a cluster with x86 nodes ...
    • Scaling irregular applications through data aggregation and software multithreading 

      Morari, Alessandro; Tumeo, Antonino; Chavarria Miranda, Daniel; Villa, Oreste; Valero Cortés, Mateo (Institute of Electrical and Electronics Engineers (IEEE), 2014)
      Conference report
      Restricted access - publisher's policy
      Emerging applications in areas such as bioinformatics, data analytics, semantic databases and knowledge discovery employ datasets from tens to hundreds of terabytes. Currently, only distributed memory clusters have enough ...