Cache

This article describes the computer memory cache; for other meanings see Caché (disambiguation).

A cache (/kæʃ/; also: shadow memory) is a fast buffer memory in computing that is used in various devices such as CPUs or hard disks. A cache holds copies of the contents of another (background) memory (see: memory hierarchy) and thus speeds up access to it. Since a cache is usually several orders of magnitude smaller than the background memory, the locality properties of the access patterns are exploited in the design and organization of a cache in order to maximize its benefit.

Literally translated from English, cache (pronounced like cash, derived from the French cacher, "to hide") means "secret store". The name reflects the fact that a cache mostly performs its work in secret: to the programmer it is largely transparent. Its existence becomes apparent only in performance-relevant optimizations or in rarely needed operations such as a cache flush.


Use

The goals of employing a cache are to reduce the access time and/or to reduce the number of accesses to the memory being cached. This means in particular that using a cache is only worthwhile where the access time has a significant influence on overall performance. While this is the case for most (scalar) microprocessors, it does not apply, for example, to array processors, where the access time plays a far less important role. There, caches are usually omitted, because they bring little or no benefit.

A further, less obvious effect of using a cache is the reduced bandwidth requirement toward the next higher level of the memory hierarchy. Because the majority of requests can often be answered by the cache (cache hit, see below), the number of accesses, and thus the bandwidth demand on the memory being cached, decreases. A modern microprocessor without a cache would be slowed down even with an infinitely small main-memory access time, because sufficient memory bandwidth would not be available: omitting the cache would strongly increase the number of accesses to main memory and thus the demand on memory bandwidth. A cache can therefore also be used to reduce the bandwidth requirements on the memory being cached, which can pay off, for example, in lower costs for that memory.

In CPUs, the use of caches can thus help to reduce the Von Neumann bottleneck of the Von Neumann architecture. The execution speed of programs can thereby be increased enormously.

Cache hierarchy

Since it is technically impossible, or only possible with great difficulty, to build a cache that is both large and fast at the same time, several caches can be used, e.g. a small fast one and a large slower one (which is, however, still orders of magnitude faster than the memory being cached). In this way the competing goals of low access time and large cache size (important for the hit rate, see below) can be realized together.

If several caches exist, they form a cache hierarchy, which is part of the memory hierarchy. The individual caches are numbered from level 1 to level n (briefly: L1, L2, etc.). The lowest number here denotes the cache with the smallest access time, which is searched first. If the L1 cache does not contain the required data, the next (usually somewhat slower, but larger) cache, i.e. the L2, is searched, and so on. This continues until the data have either been found in some cache level or all caches have been searched without success (cache miss, see below). In that case the relatively slow memory itself must be accessed.

Modern CPUs usually have two or three cache levels; more than three is rather uncommon. Hard disks have only a single cache.

Processor cache

In CPUs, the cache can be integrated directly into the processor or placed externally on the motherboard. Depending on its location, the cache runs at different clock frequencies: the L1 is almost always integrated directly into the processor and therefore runs at the full processor clock, i.e. possibly several gigahertz, whereas an external cache is often clocked at only a few hundred megahertz.

Usual sizes are 4 to 256 KiB for L1 caches and 256 to 2048 KiB for the L2.

Modern processors have separate L1 caches for instructions and data; in some cases this is also true for the L2 (Montecito). One speaks here of a Harvard cache architecture. This has the advantage that the cache designs can be tailored to the different access patterns of program code and data. In addition, separate caches can be placed spatially closer to their respective units on the processor, shortening the critical paths in the processor layout. Furthermore, instructions and data can be read or written at the same time. The disadvantage is that self-modifying code does not run very well on modern processors; however, such code occurs only very rarely, so this is acceptable.

Hard disk cache

In hard disks, the cache is located on the controller board and is 1 to 32 MiB in size. For more information see also hard disk cache.

Exploiting locality

Caches should be fast. To achieve this, a different (faster) storage technology is usually used for the cache than for the memory being cached (SRAM versus DRAM, DRAM versus magnetic disk, etc.). Caches are therefore usually considerably more expensive in terms of price per bit, which is why they are laid out much smaller. As a result, a cache cannot hold all the data at the same time. To solve the problem of which data should be kept in the cache, the locality properties of the accesses are exploited:

  • Temporal locality: Since accesses to data repeat (e.g. while processing a program loop), it is rather likely that data that have already been accessed once will be accessed again. Such data should therefore preferably be kept in the cache. This also creates the need to remove old data that have not been used for a long time from the cache, to make room for newer data. This procedure is called "displacement".
  • Spatial locality: Since program code and data do not lie wildly scattered across the address space, but "one after another" and partly also only within certain address ranges (code, data, stack segment, heap, etc.), it is very likely after an access to a certain address that an access to a "nearby" address follows (i.e. the absolute difference between the two addresses is very small). When a program is executed, for example, one instruction after another is processed, and these lie "consecutively" in memory (unless there is a jump). Many data structures such as arrays likewise lie "one after another" in memory.

Spatial locality is the reason why a cache stores not individual bytes but the data of whole address ranges (called a "cache block" or sometimes a "cache line"). In addition, this simplifies the implementation, because an address need not be recorded for every byte of data in memory, but only for every cache block (which consists of many bytes). The choice of the block size is an important design parameter for a cache and can strongly affect performance (both positively and negatively).
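The effect of storing whole blocks can be sketched briefly. The following Python snippet (the 64-byte block size and the function name are illustrative assumptions, not taken from the article) splits a byte address into a block number and an offset within the block:

```python
BLOCK_SIZE = 64  # assumed bytes per cache block; must be a power of two

def split_address(addr: int) -> tuple[int, int]:
    """Split a byte address into (cache block number, offset within the block)."""
    return addr // BLOCK_SIZE, addr % BLOCK_SIZE

# Two nearby addresses land in the same block, so once the first access
# has loaded the block, the second one is a cache hit (spatial locality):
print(split_address(1000))  # (15, 40)
print(split_address(1010))  # (15, 50)
```

A larger block size lets more nearby accesses hit the same block, but fewer distinct blocks fit into the cache, which is one reason the block size can affect performance in both directions.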

Organization

There are three possibilities for organizing a cache: direct mapped (abbreviated DM), fully associative (FA), and set-associative (SA). The first two are special cases of the set-associative cache.

  • DM: For a given address there is only one possible cache block in which the data can reside (each address is mapped to exactly one cache block, hence the name "direct"). Thus, for a request to the cache, only a single cache block has to be checked (i.e. its associated tag examined, see below). This minimizes the hardware cost for the tag comparators.
  • FA: Here the data of an address can reside in any block of the cache. For a request to the cache, all tags of the cache must therefore be examined. Since caches must be as fast as possible, this is done in parallel, which increases the hardware cost due to the number of tag comparators needed.
  • SA: For a given address there are exactly n possibilities; n lies in the range from 1 to the number of cache blocks and should be a divisor of the latter. Such a cache is then called n-way set-associative.

A small overview of the three types, where m is the number of cache blocks and n the associativity of the cache:

Type  Number of sets  Associativity
DM    m               1
FA    1               m
SA    m/n             n

A DM cache is thus an SA cache with an associativity of 1 and as many sets as there are cache blocks. An FA cache is an SA cache with an associativity equal to the number of cache blocks and only a single set.
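This relationship can be illustrated in a short Python sketch (all names are chosen here for illustration): with m blocks and associativity n, a block address maps to set (block number mod m/n), so n = 1 yields direct mapping and n = m a fully associative cache.

```python
def set_index(block_number: int, num_blocks: int, associativity: int) -> int:
    """Return the set a block maps to in an n-way set-associative cache of m blocks."""
    num_sets = num_blocks // associativity  # associativity should divide num_blocks
    return block_number % num_sets

# A cache with m = 8 blocks, looking up block number 13:
print(set_index(13, 8, 1))  # DM: 8 sets, maps to set 5
print(set_index(13, 8, 2))  # 2-way SA: 4 sets, maps to set 1
print(set_index(13, 8, 8))  # FA: a single set, always set 0
```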

Cache hits and misses

The event that the data of a request are present in the cache is called a "cache hit"; the opposite case is called a "cache miss".

To obtain quantitative measures for evaluating the efficiency of a cache, two quantities are defined:

  • Hit rate: the number of requests for which a cache hit occurred, divided by the total number of requests made to this cache. As is easy to see from the definition, this quantity lies between zero and one. A hit rate of e.g. 0.7 (or 70%) means that for 70% of all requests to the cache, the cache could supply the data immediately, while for 30% of all requests it missed.
  • The miss rate is defined analogously: the number of requests for which the data were not present in the cache, divided by the total number of requests.

It holds that: miss rate = 1 - hit rate.
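The two quantities and the identity above can be written down directly (a trivial sketch; the example numbers correspond to the 70% case from the text):

```python
def hit_rate(hits: int, total: int) -> float:
    """Fraction of requests answered directly by the cache."""
    return hits / total

def miss_rate(hits: int, total: int) -> float:
    """Fraction of requests that had to go to the backing memory."""
    return (total - hits) / total  # equal to 1 - hit rate

# 700 of 1000 requests were cache hits:
print(hit_rate(700, 1000))   # 0.7
print(miss_rate(700, 1000))  # 0.3
```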

Three kinds of cache misses are distinguished:

  • Capacity: The cache is too small. The data were present in the cache but were removed from it. If the address is then accessed again, this miss is called a "capacity miss". The only remedy is a larger cache.
  • Conflict: Due to the set-associative organization (this therefore also applies to DM caches), it is possible that there is no longer enough room in one set while other sets still have free cache blocks. In that case a block must be removed from the overcrowded set, although the cache actually still has room. If this removed block is accessed again, the cache miss is called a "conflict miss". The remedy is to increase the number of cache blocks per set, i.e. to increase the associativity. Fully associative caches (which have only one set) therefore have no conflict misses.
  • Compulsory: On the first access to an address, its data are normally not yet in the cache; such a miss is called a "compulsory miss". It is difficult or impossible to prevent. Modern processors have a unit called a "prefetcher" that independently and speculatively loads data into the caches if there is still room, thereby trying to reduce the number of compulsory misses.

These three types are also briefly called the "three Cs". In multiprocessor systems, when a cache coherency protocol of the write-invalidate type is employed, a fourth "C" can be added: the coherency miss. If a write by one processor to a cache block forces the same block to be ejected from the cache of a second processor, then an access by the second processor to an address covered by that removed cache block leads to a miss, which is called a coherency miss.

Function

When managing the cache, it makes sense to always keep in the cache those blocks that are accessed frequently. For this purpose there are various displacement strategies. A frequently used variant is the LRU strategy (least recently used), which always exchanges the block that has not been accessed for the longest time. Modern processors (AMD Athlon and many others) usually implement a pseudo-LRU replacement strategy that works almost like genuine LRU but is easier to implement in hardware.

Displacement strategies

  • FIFO (first in first out): The oldest entry is displaced.
  • LRU (least recently used): The entry that has not been accessed for the longest time is displaced.
  • LFU (least frequently used): The entry that is read least frequently is displaced. However, no complete timestamps are stored, as these would require a relatively long integer. Instead, only a few bits are used (two are common, but even a single bit is possible) to mark a cache entry as more or less frequently used. The bits are updated in parallel with a displacement.
  • Random: A random entry is displaced.
  • CLOCK: When a datum is accessed, a bit is set. The data are stored in the order of access. On a miss, this sequence is searched for a datum without a set bit, which is then replaced; for all data examined along the way, the bit is cleared. It is likewise marked which datum was loaded into the cache last, and the search for a datum that can be replaced begins from there.
  • Optimal: Belady's algorithm, which displaces the storage area that will not be accessed for the longest time, is optimal. However, it is applicable only if the complete program sequence is known in advance (i.e. it is a so-called off-line procedure, in contrast to FIFO and LRU, which are on-line procedures). The program sequence is almost never known in advance, so the optimal procedure is not used in practice. However, the optimal algorithm can serve as a comparison baseline for other procedures.
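The LRU strategy from the list above can be modeled compactly in Python with an OrderedDict (a toy model of the bookkeeping, not of a hardware implementation; the class and method names are chosen here for illustration):

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: on overflow, the least recently used entry is displaced."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered from least to most recently used

    def access(self, key, load_value):
        if key in self.entries:
            self.entries.move_to_end(key)  # hit: mark as most recently used
            return self.entries[key]
        value = load_value(key)            # miss: fetch from the backing store
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # displace the LRU entry
        return value

cache = LRUCache(2)
for block in [1, 2, 1, 3]:
    cache.access(block, lambda b: b * 10)
# Accessing 3 displaced block 2, because block 1 was used more recently:
print(list(cache.entries))  # [1, 3]
```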

Write strategy

For a write access to a block that is present in the cache, there are in principle two possibilities:

  • The write access is not passed on immediately to the next higher level. An inconsistency thus arises between the cache and the memory being cached: the latter contains outdated information. Only when the block is displaced from the cache is it also written to the next higher level. This behavior is called write back. It leads to problems with memory accesses by other processors or DMA (direct memory access) devices, because they would read outdated information. Cache coherency protocols, such as MESI for UMA systems, provide a remedy here.
  • The write access is passed on immediately to the next higher level as well. This secures consistency, but may take longer than a write back and uses more bandwidth. This behavior is called write through.

Analogously, there are likewise in principle two possibilities for a write access to a block that is not present in the cache:

  • Write allocate: As with a normal cache miss, the block is fetched from the next higher memory level. The bytes changed by the write access then overwrite the corresponding bytes in the freshly arrived block.
  • Non-write allocate: The write bypasses the cache and goes to the next higher level without the corresponding block being loaded into the cache. For some applications, in which many written data are never read again, this can bring advantages: using non-write allocate prevents the displacement of other, possibly important blocks and thus reduces the miss rate.
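The difference between the two write policies for a block that is already in the cache can be sketched with two dictionaries standing in for the cache and the next higher level (a toy model; all names are chosen here for illustration):

```python
cache = {}        # block number -> (data, dirty flag)
main_memory = {}  # the next higher memory level

def write_through(block, data):
    """Write through: update the cache and main memory immediately."""
    cache[block] = (data, False)  # never dirty: both levels stay consistent
    main_memory[block] = data

def write_back(block, data):
    """Write back: update only the cache and mark the block dirty."""
    cache[block] = (data, True)   # main memory now holds outdated data

def displace(block):
    """On displacement, a dirty block must first be written back."""
    data, dirty = cache.pop(block)
    if dirty:
        main_memory[block] = data

write_back(7, "new")
print(main_memory.get(7))  # None: main memory is stale until displacement
displace(7)
print(main_memory[7])      # new
```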

Some instruction sets also offer special instructions for writing past the cache.

Cache Flush

A cache flush causes the complete contents of the cache to be written back to main memory. The cache contents usually remain untouched. Such a procedure is necessary when consistency between the cache and main memory must be re-established.

This is necessary whenever the data are needed by external hardware. Examples: multiprocessor communication; handing over a part of main memory used as an output buffer to a DMA (direct memory access) controller; hardware registers (the so-called in/output area or I/O area). The latter, incidentally, are normally not classified as "cacheable" at all, i.e. the cache is bypassed when they are accessed.

Miscellaneous

Entries in the cache

For each cache block, the following is stored in the cache:

  • the actual data
  • the tag (the remainder of the address)
  • several status bits, such as:
    • modified (sometimes also called "dirty"): indicates whether this cache block has been changed (only with a write-back cache)
    • various status bits depending on the cache coherency protocol, e.g. one bit each for:
      • owner: equivalent to "modified & shared". Indicates that the block has been changed and is present in other caches. The owner is responsible for updating main memory when it removes the block from its cache. The processor that last writes to the cache block becomes the new owner.
      • exclusive: indicates that the block has not been changed and is present in no other cache.
      • shared: has partly different meanings. For MESI it indicates that the block has not been changed but is also present in the caches of other processors (and is likewise unchanged there). For MOESI it means only that the block is present in other processor caches. There the block may also have been changed, i.e. be inconsistent with main memory; in this case, however, there is an "owner" (see above) that is responsible for updating main memory.
      • and many more.
    • invalid: indicates whether this block is free or occupied.
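The bookkeeping listed above can be summarized as a small illustrative data structure (the field names are chosen here; real layouts and status bits depend on the hardware and the coherency protocol):

```python
from dataclasses import dataclass

@dataclass
class CacheLine:
    tag: int = 0            # the remainder of the address
    data: bytes = b""       # the actual data of the block
    valid: bool = False     # inverse of the "invalid" bit: is the block occupied?
    modified: bool = False  # "dirty": changed relative to main memory
    # further bits (owner, exclusive, shared, ...) depend on the coherency protocol

# A freshly loaded, unmodified 64-byte block with tag 0x2A:
line = CacheLine(tag=0x2A, data=bytes(64), valid=True)
print(line.valid, line.modified)  # True False
```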

Warm and cold caches

A cache is "warm" if it is working optimally (i.e. it is filled and has only few cache misses), and "cold" if it is not. After being started, a cache is at first cold, since it does not yet contain any data and frequently has to reload data, which is time-consuming; it then warms up increasingly, since the cached data correspond more and more to the requested data and less reloading is necessary. In the ideal state, data accesses are served almost exclusively from the cache and the remaining reloading can be neglected.

Software caches

The word "cache" is also found in software. There it describes the same principle as in the hardware implementation: data are buffered on a faster medium for faster access.

Examples:

  • Browser cache: (network -> hard disk)
  • Applications: (hard disk -> main memory)
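In software, the same idea appears, for example, as memoization; Python's standard library already provides an LRU cache as a decorator (the function below is a stand-in for an access to a slow medium):

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def slow_lookup(key: str) -> str:
    # stands in for an access to a slow medium (network, disk, ...)
    return key.upper()

slow_lookup("cache")  # miss: computed and stored
slow_lookup("cache")  # hit: served from the cache
info = slow_lookup.cache_info()
print(info.hits, info.misses)  # 1 1
```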

See also

Wiktionary: Cache - word origin, synonyms and translations

Web links

 

  > German to English > de.wikipedia.org (Machine translated into English)