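# Learn

**Repository Path**: cc1009/Learn

## Basic Information

- **Project Name**: Learn
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2018-09-11
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# What is a cache?

A cache is a structure that sits between two pieces of hardware with very different speeds and smooths out the difference in their data transfer rates.

# Why use a cache?

Caches are usually deployed to improve read/write performance. "Performance" here can mean latency, throughput, resource utilization, and so on. The most common use is to cache a database in order to improve its performance.

# Different kinds of cache

Several ways of combining a cache with a data store.

## Look-aside / demand-fill cache

The client queries the cache first, before querying the data store. On a HIT, it returns the value in the cache. On a MISS, it returns the value from the data store.

This describes only how reads work; it does not say how the cache is filled. The usual answer is demand-fill: on a MISS, the client not only uses the value from the data store but also puts it into the cache. Most look-aside caches are also demand-fill caches, but a look-aside cache can be filled independently. For example, the cache and the data store can each subscribe to the same log (topic) in a message queue. In that setup the cache is look-aside but not demand-fill, and it may even hold newer data than the data store.

### How inconsistency happens

The client puts a value into the cache, but that value may already be stale:

- The client gets a MISS.
- The client reads the DB and gets value `A`.
- Someone updates the DB to value `B` and invalidates the cache entry.
- The client writes value `A` into the cache.

The cache then holds the stale value `A` until the entry is invalidated again or its TTL (time to live) expires.

### Solution

Attach a lease (a marker) to each cache entry, and update the lease whenever someone performs a write.

- The client gets a MISS with lease `L0`.
- The client reads the DB and gets value `A`.
- Someone updates the DB to value `B` and invalidates the cache entry, which sets the lease to `L1`.
- The client tries to put value `A` into the cache and fails because of the lease mismatch, so the stale value is never cached.

## Write-through / read-through cache

Write-through cache: the client writes data directly to the cache, and the cache is responsible for synchronizing the data to the store.

Read-through cache: the client reads data directly from the cache. On a MISS, the cache is responsible for fetching the data from the store, filling itself, and replying to the client.

Non-distributed: In the single-box case, as long as the update lock (for writes) and the fill lock (for reads) are acquired properly, reads and writes to the same key are serialized, and it is not hard to see that cache consistency is maintained.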
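The single-box case can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name `ThroughCache`, the dict used as the backing store, and the use of one lock per key for both the update-lock and fill-lock roles are all assumptions made for the example.

```python
import threading
from collections import defaultdict


class ThroughCache:
    """Single-process read-through / write-through cache (illustrative only)."""

    def __init__(self, store):
        self._store = store      # hypothetical backing store: any dict-like object
        self._cache = {}
        self._locks = defaultdict(threading.Lock)  # one lock per key
        self._guard = threading.Lock()             # protects the lock table

    def _lock_for(self, key):
        with self._guard:
            return self._locks[key]

    def get(self, key):
        # Read-through: on a MISS the cache itself fills from the store.
        with self._lock_for(key):
            if key in self._cache:
                return self._cache[key]
            value = self._store.get(key)
            if value is not None:
                self._cache[key] = value
            return value

    def put(self, key, value):
        # Write-through: the cache writes to the store, then updates itself.
        with self._lock_for(key):
            self._store[key] = value
            self._cache[key] = value


store = {"k": "A"}
cache = ThroughCache(store)
cache.get("k")       # MISS, filled from the store
cache.put("k", "B")  # store and cache are updated under the same lock
```

Because a fill and an update on the same key hold the same lock, a reader cannot insert an old value after a writer has already replaced it, which is the serialization the paragraph above relies on.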
Distributed: If there are many replicas of the cache, this becomes a distributed systems problem, for which a few potential solutions exist. The most straightforward way to keep multiple cache replicas consistent is to keep a log of mutations/events and update the caches from that log. The log serves as a single point of serialization; it can be Kafka or even the MySQL binlog. As long as mutations are globally totally ordered in a way that makes the events easy to replay, eventual cache consistency can be maintained. The reasoning is the same as for synchronization in a distributed system.

In short: record every data-changing event in a log, and apply that log to the other replicas to keep them consistent. All cache replicas subscribe to the same log topic, and every change is applied to all replicas.
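The log-based approach can be sketched with an in-memory stand-in for the log. The names `MutationLog` and `CacheReplica` are hypothetical, and the list-backed log only imitates the ordering guarantee that a Kafka topic partition or the MySQL binlog would provide; no real client API is used here.

```python
class MutationLog:
    """Append-only, totally ordered sequence of (key, value) mutations."""

    def __init__(self):
        self._entries = []

    def append(self, key, value):
        self._entries.append((key, value))

    def read_from(self, offset):
        # Return every mutation at or after the given offset, in log order.
        return self._entries[offset:]


class CacheReplica:
    """A cache replica that is updated only by replaying the log."""

    def __init__(self, log):
        self._log = log
        self._offset = 0   # position of the next log entry to apply
        self.data = {}

    def catch_up(self):
        for key, value in self._log.read_from(self._offset):
            self.data[key] = value
            self._offset += 1


log = MutationLog()
r1, r2 = CacheReplica(log), CacheReplica(log)

log.append("k", "A")
r1.catch_up()          # r1 sees A; r2 is lagging behind
log.append("k", "B")
r1.catch_up()
r2.catch_up()          # both replicas have now applied A then B
```

Replicas may lag by different amounts, but since each one applies the same mutations in the same order, they reach the same state once they have read to the same offset.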