<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>DSpace Collection:</title>
    <link>http://hdl.handle.net/2117/3113</link>
    <description />
    <pubDate>Sun, 19 May 2013 18:43:57 GMT</pubDate>
    <dc:date>2013-05-19T18:43:57Z</dc:date>
    <itunes:owner>
      <itunes:email>webmaster.bupc@upc.edu</itunes:email>
      <itunes:name>Universitat Politècnica de Catalunya. Servei de Biblioteques i Documentació</itunes:name>
    </itunes:owner>
    <itunes:explicit>no</itunes:explicit>
    <itunes:keywords />
    <item>
      <title>Process variability in sub-16nm bulk CMOS technology</title>
      <link>http://hdl.handle.net/2117/15667</link>
      <description>Title: Process variability in sub-16nm bulk CMOS technology
Authors: Rubio Sola, Jose Antonio; Figueras Pàmies, Joan; Vatajelu, Elena Ioana; Canal Corretger, Ramon
Abstract: The document is part of deliverable D3.6 of the TRAMS Project (EU FP7 248789), of public nature, and shows and justifies the levels of variability used in the research project for sub-18nm bulk CMOS technologies.</description>
      <pubDate>Mon, 26 Mar 2012 18:45:53 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/15667</guid>
      <dc:date>2012-03-26T18:45:53Z</dc:date>
      <itunes:author>Rubio Sola, Jose Antonio; Figueras Pàmies, Joan; Vatajelu, Elena Ioana; Canal Corretger, Ramon</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>The document is part of deliverable D3.6 of the TRAMS Project (EU FP7 248789), of public nature, and shows and justifies the levels of variability used in the research project for sub-18nm bulk CMOS technologies.</itunes:summary>
    </item>
    <item>
      <title>Dynamic fine-grain body biasing of caches with latency and leakage 3T1D-based monitors</title>
      <link>http://hdl.handle.net/2117/15019</link>
      <description>Title: Dynamic fine-grain body biasing of caches with latency and leakage 3T1D-based monitors
Authors: Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio
Abstract: In this paper, we propose a dynamically tunable fine-grain body biasing mechanism to reduce active &amp; standby leakage power in caches under process variations.</description>
      <pubDate>Wed, 08 Feb 2012 12:50:44 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/15019</guid>
      <dc:date>2012-02-08T12:50:44Z</dc:date>
      <itunes:author>Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>In this paper, we propose a dynamically tunable fine-grain body biasing mechanism to reduce active &amp; standby leakage power in caches under process variations.</itunes:summary>
    </item>
    <item>
      <title>A selective logging mechanism for hardware transactional memory systems</title>
      <link>http://hdl.handle.net/2117/15009</link>
      <description>Title: A selective logging mechanism for hardware transactional memory systems
Authors: Lupon Navazo, Marc; Magklis, Grigorios; González Colás, Antonio María
Abstract: Log-based Hardware Transactional Memory (HTM) systems offer an elegant solution to handle speculative data that overflow transactional L1 caches. By keeping the pre-transactional values on a software-resident log, speculative values can be safely moved across the memory hierarchy, without requiring expensive searches on L1 misses or commits.</description>
      <pubDate>Wed, 08 Feb 2012 11:38:25 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/15009</guid>
      <dc:date>2012-02-08T11:38:25Z</dc:date>
      <itunes:author>Lupon Navazo, Marc; Magklis, Grigorios; González Colás, Antonio María</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>Log-based Hardware Transactional Memory (HTM) systems offer an elegant solution to handle speculative data that overflow transactional L1 caches. By keeping the pre-transactional values on a software-resident log, speculative values can be safely moved across the memory hierarchy, without requiring expensive searches on L1 misses or commits.</itunes:summary>
    </item>
    <item>
      <title>On the effectiveness of hybrid mechanisms on reduction of parametric failures in caches</title>
      <link>http://hdl.handle.net/2117/15007</link>
      <description>Title: On the effectiveness of hybrid mechanisms on reduction of parametric failures in caches
Authors: Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio
Abstract: In this paper, we provide insight into the different proactive read/write assist methods (wordline boosting &amp; adaptive body biasing) that help prevent (and reduce) parametric failures when coupled with reactive techniques, such as ECC and redundancy, which cope with existing failures. While proactive and reactive techniques have previously been viewed as complementary, we show that this is not necessarily the case when considering the benefits of such hybrid schemes.</description>
      <pubDate>Wed, 08 Feb 2012 11:07:29 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/15007</guid>
      <dc:date>2012-02-08T11:07:29Z</dc:date>
      <itunes:author>Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>In this paper, we provide insight into the different proactive read/write assist methods (wordline boosting &amp; adaptive body biasing) that help prevent (and reduce) parametric failures when coupled with reactive techniques, such as ECC and redundancy, which cope with existing failures. While proactive and reactive techniques have previously been viewed as complementary, we show that this is not necessarily the case when considering the benefits of such hybrid schemes.</itunes:summary>
    </item>
    <item>
      <title>Implementing a hybrid SRAM / eDRAM NUCA architecture</title>
      <link>http://hdl.handle.net/2117/13932</link>
      <description>Title: Implementing a hybrid SRAM / eDRAM NUCA architecture
Authors: Lira Rueda, Javier; Molina Clemente, Carlos; Brooks, David; González Colás, Antonio María
Abstract: In this paper, we propose a hybrid cache architecture that exploits the main features of both memory technologies: the speed of SRAM and the high density of eDRAM. We demonstrate that, due to the high locality found in emerging applications, a high percentage of the data that enters the on-chip last-level cache is not accessed again before it is replaced.</description>
      <pubDate>Wed, 16 Nov 2011 11:21:21 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/13932</guid>
      <dc:date>2011-11-16T11:21:21Z</dc:date>
      <itunes:author>Lira Rueda, Javier; Molina Clemente, Carlos; Brooks, David; González Colás, Antonio María</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>In this paper, we propose a hybrid cache architecture that exploits the main features of both memory technologies: the speed of SRAM and the high density of eDRAM. We demonstrate that, due to the high locality found in emerging applications, a high percentage of the data that enters the on-chip last-level cache is not accessed again before it is replaced.</itunes:summary>
    </item>
    <item>
      <title>vPROBE: Variation aware post-silicon power/performance binning using embedded 3T1D cells</title>
      <link>http://hdl.handle.net/2117/13911</link>
      <description>Title: vPROBE: Variation aware post-silicon power/performance binning using embedded 3T1D cells
Authors: Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio
Abstract: In this paper, we present an on-die post-silicon binning methodology that takes into account the effect of static and dynamic variations and categorizes every processor based on power/performance. The proposed scheme is composed of discretization hardware that exploits the delay/leakage dependence on variability sources characteristic for categorization</description>
      <pubDate>Tue, 15 Nov 2011 14:28:57 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/13911</guid>
      <dc:date>2011-11-15T14:28:57Z</dc:date>
      <itunes:author>Ganapathy, Shrikanth; Canal Corretger, Ramon; González Colás, Antonio María; Rubio Sola, Jose Antonio</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>In this paper, we present an on-die post-silicon binning methodology that takes into account the effect of static and dynamic variations and categorizes every processor based on power/performance. The proposed scheme is composed of discretization hardware that exploits the delay/leakage dependence on variability sources characteristic for categorization</itunes:summary>
    </item>
    <item>
      <title>FOCSI: A new layout regularity metric</title>
      <link>http://hdl.handle.net/2117/13385</link>
      <description>Title: FOCSI: A new layout regularity metric
Authors: Pons Solé, Marc; Moll Echeto, Francisco de Borja; Rubio Sola, Jose Antonio; Abella Ferrer, Jaume; Vera Rivera, Francisco Javier; González Colás, Antonio María
Abstract: Digital CMOS Integrated Circuits (ICs) suffer from serious layout feature printability issues associated with the lithography manufacturing process. Regular layout designs are emerging as alternative solutions to reduce these ICs' systematic subwavelength lithography failures. However, there is no metric to evaluate and compare the layout regularity of those regular designs. In this paper we propose a new layout regularity metric called Fixed Origin Corner Square Inspection (FOCSI). FOCSI allows the comparison and quantification of designs in terms of regularity and for any given degree of granularity. When FOCSI is oriented to the evaluation of regularity while applying Lithography Enhancement Techniques, it comprises layout layer measurements considering the optical interaction length and combines them to obtain the complete layout regularity measure. Examples are provided for 32-bit adders in the 90 nm technology node for the Standard Cell approach and for Via-Configurable Transistor Array regular designs. We show how layouts can be sorted accurately even if their degree of regularity is similar.
Description: Technical Report</description>
      <pubDate>Thu, 29 Sep 2011 08:11:25 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/13385</guid>
      <dc:date>2011-09-29T08:11:25Z</dc:date>
      <itunes:author>Pons Solé, Marc; Moll Echeto, Francisco de Borja; Rubio Sola, Jose Antonio; Abella Ferrer, Jaume; Vera Rivera, Francisco Javier; González Colás, Antonio María</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>Digital CMOS Integrated Circuits (ICs) suffer from serious layout feature printability issues associated with the lithography manufacturing process. Regular layout designs are emerging as alternative solutions to reduce these ICs' systematic subwavelength lithography failures. However, there is no metric to evaluate and compare the layout regularity of those regular designs. In this paper we propose a new layout regularity metric called Fixed Origin Corner Square Inspection (FOCSI). FOCSI allows the comparison and quantification of designs in terms of regularity and for any given degree of granularity. When FOCSI is oriented to the evaluation of regularity while applying Lithography Enhancement Techniques, it comprises layout layer measurements considering the optical interaction length and combines them to obtain the complete layout regularity measure. Examples are provided for 32-bit adders in the 90 nm technology node for the Standard Cell approach and for Via-Configurable Transistor Array regular designs. We show how layouts can be sorted accurately even if their degree of regularity is similar.</itunes:summary>
    </item>
    <item>
      <title>Last Bank: dealing with address reuse in non-uniform cache architecture for CMPs</title>
      <link>http://hdl.handle.net/2117/8400</link>
      <description>Title: Last Bank: dealing with address reuse in non-uniform cache architecture for CMPs
Authors: Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María
Abstract: In response to the constant increase in wire delays, Non-Uniform Cache Architecture (NUCA) has been introduced as an effective memory model for dealing with growing memory latencies. This architecture divides a large memory cache into smaller banks that can be accessed independently. Banks close to the cache controller therefore have a faster response time than banks located farther away from it. In this paper, we propose and analyse the insertion of an additional bank into the NUCA cache. This is called Last Bank. This extra bank deals with data blocks that have been evicted from the other banks in the NUCA cache. Furthermore, we analyse the behaviour of the cache line replacements done in the NUCA cache and propose two optimisations of Last Bank that provide significant performance benefits without incurring unaffordable implementation costs.</description>
      <pubDate>Mon, 26 Jul 2010 11:58:27 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/8400</guid>
      <dc:date>2010-07-26T11:58:27Z</dc:date>
      <itunes:author>Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>In response to the constant increase in wire delays, Non-Uniform Cache Architecture (NUCA) has been introduced as an effective memory model for dealing with growing memory latencies. This architecture divides a large memory cache into smaller banks that can be accessed independently. Banks close to the cache controller therefore have a faster response time than banks located farther away from it. In this paper, we propose and analyse the insertion of an additional bank into the NUCA cache. This is called Last Bank. This extra bank deals with data blocks that have been evicted from the other banks in the NUCA cache. Furthermore, we analyse the behaviour of the cache line replacements done in the NUCA cache and propose two optimisations of Last Bank that provide significant performance benefits without incurring unaffordable implementation costs.</itunes:summary>
    </item>
    <item>
      <title>LRU-PEA: A smart replacement policy for non-uniform cache architectures on chip multiprocessors</title>
      <link>http://hdl.handle.net/2117/8399</link>
      <description>Title: LRU-PEA: A smart replacement policy for non-uniform cache architectures on chip multiprocessors
Authors: Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María
Abstract: The increasing speed gap between processor and memory and the limited memory bandwidth make last-level cache performance crucial for CMP architectures. Non-Uniform Cache Architecture (NUCA) has been introduced to deal with this problem. This memory organization divides the whole memory space into smaller pieces, or banks, allowing nearer banks to have better access latencies than farther banks. Moreover, an adaptive replacement policy that efficiently reduces misses in the last-level cache could boost performance, particularly if set associativity is assumed. Unfortunately, traditional replacement policies do not behave properly, as they were designed for single processors. This paper focuses on Bank Replacement. This policy involves three key decisions when there is a miss: where to place data within the cache set, which data to evict from the cache set and, finally, where to place the evicted data. We propose a novel replacement technique that enables more intelligent replacement decisions to be taken, based on the observation that some types of data are less commonly accessed depending on the bank where they reside. We call this technique LRU-PEA (Least Recently Used with a Priority Eviction Approach). We show that the proposed technique significantly reduces the requests to the off-chip memory by increasing the hit ratio in the NUCA cache. This translates into an average IPC improvement of 8% and an Energy per Instruction (EPI) reduction of 5%.</description>
      <pubDate>Mon, 26 Jul 2010 11:50:26 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/2117/8399</guid>
      <dc:date>2010-07-26T11:50:26Z</dc:date>
      <itunes:author>Lira Rueda, Javier; Molina Clemente, Carlos; González Colás, Antonio María</itunes:author>
      <itunes:explicit>no</itunes:explicit>
      <itunes:keywords />
      <itunes:summary>The increasing speed gap between processor and memory and the limited memory bandwidth make last-level cache performance crucial for CMP architectures. Non-Uniform Cache Architecture (NUCA) has been introduced to deal with this problem. This memory organization divides the whole memory space into smaller pieces, or banks, allowing nearer banks to have better access latencies than farther banks. Moreover, an adaptive replacement policy that efficiently reduces misses in the last-level cache could boost performance, particularly if set associativity is assumed. Unfortunately, traditional replacement policies do not behave properly, as they were designed for single processors. This paper focuses on Bank Replacement. This policy involves three key decisions when there is a miss: where to place data within the cache set, which data to evict from the cache set and, finally, where to place the evicted data. We propose a novel replacement technique that enables more intelligent replacement decisions to be taken, based on the observation that some types of data are less commonly accessed depending on the bank where they reside. We call this technique LRU-PEA (Least Recently Used with a Priority Eviction Approach). We show that the proposed technique significantly reduces the requests to the off-chip memory by increasing the hit ratio in the NUCA cache. This translates into an average IPC improvement of 8% and an Energy per Instruction (EPI) reduction of 5%.</itunes:summary>
    </item>
  </channel>
</rss>

