BranchCache is a generalized content-caching mechanism designed to reduce network bandwidth consumption, especially over WANs.
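BranchCache differs from Offline Files: Offline Files caches only files, whereas BranchCache caches content, meaning anything that can be identified by a URL: files, web pages, HTTP video streams, and blobs accessed through a database or cloud service. BranchCache does not access the files in the CSC cache, because CSC (client-side caching) is a consumer of BranchCache: Offline Files uses BranchCache to populate its own cache.
The following protocols use BranchCache: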
- Server Message Block (SMB): used to access files on file servers
- HTTP(S): web pages, video streams, and other content identified by a URL
- Background Intelligent Transfer Service (BITS): used to transfer files; runs over HTTP/TLS 1.1

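BranchCache operation is transparent to the applications that access cached content. Once BranchCache is enabled on a client:
1. Requests sent from the client to a content server carry headers/metadata that tell the content server that BranchCache is enabled on the client.
2. The server first returns content information (CI) describing the content, rather than the requested content itself. The CI contains the hashes of the content blocks.
This is also the most important difference between BranchCache and Offline Files: the content in BranchCache is not available if the WAN is down.
3. The client uses the CI to look up some or all of the content in the local cache.
4. When the client cannot obtain the content locally, it retrieves the content from the content server and stores that content in the local cache.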
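The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not BranchCache's actual implementation: the 64 KB block size, the use of SHA-256, the `Client` class, and the `fetch_block_from_server` callback are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumed block size for this sketch


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut content into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_information(data: bytes) -> list[str]:
    """Server side (step 2): describe content as a list of per-block hashes."""
    return [hashlib.sha256(block).hexdigest() for block in split_blocks(data)]


class Client:
    """Hypothetical client holding a local cache keyed by block hash."""

    def __init__(self, fetch_block_from_server):
        self.cache: dict[str, bytes] = {}
        self.fetch_block_from_server = fetch_block_from_server
        self.wan_fetches = 0  # counts blocks that had to cross the WAN

    def read(self, ci: list[str]) -> bytes:
        parts = []
        for index, digest in enumerate(ci):
            block = self.cache.get(digest)          # step 3: local lookup
            if block is None:                       # step 4: miss, go to server
                block = self.fetch_block_from_server(index)
                self.wan_fetches += 1
                self.cache[digest] = block
            parts.append(block)
        return b"".join(parts)


# Four distinct blocks: three full blocks and a short tail.
content = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3)) + b"tail"
server_blocks = split_blocks(content)
client = Client(lambda index: server_blocks[index])
ci = content_information(content)
first = client.read(ci)
fetches_after_first_read = client.wan_fetches
second = client.read(ci)
```

In this sketch the first read fetches every block from the server, while the second read is served entirely from the local cache. Note that the CI itself still has to come from the server each time, which is why cached content is unavailable when the WAN is down.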
BranchCache operates in one of two caching modes:

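Hosted Cache: the branch office runs a BranchCache-enabled server (Windows Server 2008 R2 or later) that caches all content for every BranchCache-enabled system in that branch office.
Distributed Cache: the clients in the branch office cache content themselves, and Windows makes no attempt to spread the cached content evenly across clients. Each client typically keeps a copy of the content it has accessed, which adds redundancy and some resiliency, and suits environments where clients frequently join and leave the branch subnet. Distributed caching is implemented with peer-to-peer networking and uses the Web Services Discovery (WS-D) multicast protocol to locate which client has cached a given piece of content (with a 300 ms timeout).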
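To make the distributed-cache lookup concrete, here is a small simulation of discovery with a timeout and a fallback to the content server. No real multicast or WS-Discovery traffic is involved: the `Peer` class, its `response_delay_s` field, and the `retrieve` helper are hypothetical names invented for this sketch, and only the 300 ms value comes from the text above.

```python
from dataclasses import dataclass, field

DISCOVERY_TIMEOUT_S = 0.3  # the 300 ms discovery timeout mentioned above


@dataclass
class Peer:
    """Hypothetical branch-office client with its own block cache."""
    name: str
    blocks: dict[str, bytes] = field(default_factory=dict)
    response_delay_s: float = 0.0  # simulated time needed to answer a probe


def discover(peers: list[Peer], block_id: str,
             timeout_s: float = DISCOVERY_TIMEOUT_S) -> list[Peer]:
    """Return the peers that hold block_id and answer within the timeout."""
    return [p for p in peers
            if block_id in p.blocks and p.response_delay_s <= timeout_s]


def retrieve(block_id: str, peers: list[Peer], fetch_from_server):
    """Prefer a peer on the LAN; fall back to the content server over the WAN."""
    holders = discover(peers, block_id)
    if holders:
        return holders[0].blocks[block_id], holders[0].name
    return fetch_from_server(block_id), "server"


peers = [
    Peer("pc-1", {"h1": b"alpha"}),
    Peer("pc-2", {"h2": b"beta"}, response_delay_s=0.5),  # answers too late
]
origin = {"h1": b"alpha", "h2": b"beta", "h3": b"gamma"}

from_peer = retrieve("h1", peers, origin.__getitem__)
too_slow = retrieve("h2", peers, origin.__getitem__)
not_cached = retrieve("h3", peers, origin.__getitem__)
```

Because any client may hold any block, a peer that leaves the subnet or answers too slowly simply causes a fallback to the server; nothing depends on one particular machine being present.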
Caching Modes
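A system with BranchCache enabled (BranchCache content servers, BranchCache clients, and BranchCache hosted cache servers) maintains two different local caches.
■ The publication cache uses the BranchCache Server APIs (PeerDistServerXxx) to store content information (CI) metadata for published content. The CI structure contains the hashes of the content blocks, plus the secret used to generate the public/private content identifiers and the encryption key.
Publishing is a server-side operation. BranchCache gives applications and protocols two ways to generate, manage, and store CI metadata:
1. An application or protocol that uses BranchCache acceleration can ask BranchCache to store the CI metadata on its behalf (in the BranchCache publication cache), in which case BranchCache manages the lifetime of the metadata. This is achieved by publishing via the PeerDistServerXxx APIs, and it is what the HTTP-BranchCache and BITS-BranchCache integrations do.
2. An application or protocol that uses BranchCache acceleration can also ask BranchCache to generate the CI metadata without storing it. The CI metadata is returned to the application or protocol, which must then store and manage it itself. This is what the SMB-BranchCache integration does.
In both cases, the protocol integrated with BranchCache (or the application using BranchCache) transports the CI metadata itself. BranchCache does not control the data transfer, so that the metadata can be transported with the same level of security, authentication, and authorization that would have been used for retrieving the actual content when BranchCache is not used.
The publication cache does not store the published content itself, only the CI metadata. By default, the publication cache is limited to 1 percent of the volume on which it is located, at %SystemRoot%\ServiceProfiles\NetworkService\AppData\Local\PeerDistPub. The size and location of the publication cache can be changed with Netsh: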
netsh branchcache set publicationcache directory=C:\PublicationCacheFolder
netsh branchcache set publicationcachesize size=20 percent=TRUE
■ The republication cache contains metadata (but no secrets) and the actual data (chunked in segments and blocks). Its purpose is to make the cached blocks available to other BranchCache clients. Republished content is kept for at most 28 days, but it can be evicted earlier if the cache reaches its size limit. By default, the republication cache is constrained to consume no more than 5 percent of the volume on which it is located, at %SystemRoot%\ServiceProfiles\NetworkService\AppData\Local\PeerDistRepub. The location and the size of the republication cache can be changed using Netsh:
netsh branchcache set localcache directory=C:\BranchCache\Localcache
netsh branchcache set cachesize size=20 percent=TRUE
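The difference between the two caches can be illustrated with a short Python sketch. It is only loosely inspired by the description above: the actual CI layout and key derivation are defined in [MS-PCCRC] and are not reproduced here. The 64 KB block size, the HMAC-based derivation, the `b"public-id"` label, and the function names `publish` and `republish` are assumptions made for illustration.

```python
import hashlib
import hmac

BLOCK_SIZE = 64 * 1024  # assumed block size for this sketch


def split_blocks(content: bytes) -> list[bytes]:
    return [content[i:i + BLOCK_SIZE] for i in range(0, len(content), BLOCK_SIZE)]


def publish(content: bytes, server_secret: bytes) -> dict:
    """Publication-cache entry: CI metadata only (hashes and secret), no data."""
    block_hashes = [hashlib.sha256(b).digest() for b in split_blocks(content)]
    hash_of_hashes = hashlib.sha256(b"".join(block_hashes)).digest()
    # Hypothetical derivation: the per-content secret is keyed on a server
    # secret, and the public identifier is derived from that secret so that
    # an entry shared with peers can omit the secret itself.
    secret = hmac.new(server_secret, hash_of_hashes, hashlib.sha256).digest()
    public_id = hashlib.sha256(secret + b"public-id").digest()
    return {"block_hashes": block_hashes, "secret": secret, "public_id": public_id}


def republish(ci: dict, content: bytes) -> dict:
    """Republication-cache entry: metadata without the secret, plus the blocks."""
    return {
        "public_id": ci["public_id"],
        "block_hashes": ci["block_hashes"],
        "blocks": split_blocks(content),
    }


content = b"A" * BLOCK_SIZE + b"B" * 100
ci = publish(content, server_secret=b"server-key-1")
entry = republish(ci, content)
other = publish(content, server_secret=b"server-key-2")
```

Here `ci` holds no content bytes at all, while `entry` holds the blocks but not the secret, mirroring the split between the publication and republication caches.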
Configuration
BranchCache can be configured through local Group Policy and the netsh command.


■ The BranchCache implementation service (PeerDistSvc.dll) starts automatically when BranchCache is enabled and is responsible for interacting with the kernel-mode driver.
■ HTTP extension driver(PeerDistKM.sys)registers with the Network Module Registrar (NMR) as a client of the http.sys driver and examines all HTTP packets going into and out of the system. It adds files to the cache and retrieves cached content information for published content from the BranchCache service, rather than sending the request to the web server.
■ The BranchCache APIs (PeerDistXxx) are exported by PeerDist.dll, which communicates with the BranchCache service over LRPC/ALPC.
■ BranchCache HTTP transport(PeerDistHttpTrans.dll)implements the transport on top of which the Peer Content Caching and Retrieval: Retrieval Protocol [MS-PCCRR] exchanges data between BranchCache clients and/or hosted cache servers. Each MS-PCCRR message is encapsulated in a simple transport message, which in turn, is sent over an HTTP request.
■ Web Services Discovery Provider(PeerDistWSDDiscoProv.dll)implements the WS-D protocol to discover which clients on the LAN are caching a particular file (or part of a file).
■ The BranchCache Network Shell Helper (PeerDistSh.dll) is an extension to Netsh.exe that gives users a way to monitor and configure the BranchCache service. Network Shell helper DLLs are installed by adding a string value to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NetSh, which provides the Network Shell with the path to the helper DLL.
■ A standalone variant of the BranchCache APIs is implemented in PeerDistHash.dll (available only on server operating systems), which contains all of the BranchCache APIs and functionality and does not require the use of the BranchCache service. This component is designed for use by other Windows features that are tightly integrated with BranchCache, such as the SMB Groveler, which generates the hashes on the server.
■ The hash groveler service in smbhash.exe (runs on file servers or web servers). The groveler is the SMB Hash Generation Service, which generates, updates, and deletes content hashes. All groveler I/O runs at low I/O priority so as not to interfere with the normal operation of the system.
BranchCache uses the following protocols:
■ Peer Content Caching and Retrieval: Content Identification, as defined in [MS-PCCRC], defines the content information structures.
■ Peer Content Caching and Retrieval: Discovery Protocol, as defined in [MS-PCCRD], specifies a multicast to discover and locate services based on the Web Services Dynamic Discovery (WS-Discovery) protocol [WS-Discovery]. There are two modes of operations in WS-Discovery: client-initiated probes and service-initiated announcements. Both are sent through IP multicast to a predefined group. The primary role in the Content Caching and Retrieval System is Content Discovery.
■ Peer Content Caching and Retrieval: Retrieval Protocol, as defined in [MS-PCCRR], specifies the messages that are necessary for querying peer-role servers or a hosted cache server for the availability of certain content, and for retrieving the content. The primary role in the Content Caching and Retrieval System is Content Retrieval.
■ Peer Content Caching and Retrieval: Hosted Cache Protocol, as defined in [MS-PCHC], specifies an HTTPS-based mechanism for clients to notify a hosted cache server regarding the availability of content and for a hosted cache server to indicate interest in the content. The primary role in the Content Caching and Retrieval System is Content Notification.
■ Peer Content Caching and Retrieval: Hypertext Transfer Protocol (HTTP) Extensions, as defined in [MS-PCCRTP], specifies a content encoding known as PeerDist that is used by an HTTP/1.1 client and an HTTP/1.1 server to communicate content to each other. The primary role in the Content Caching and Retrieval System is Metadata (Hash) Retrieval.
■ Server Message Block (SMB) Version 2.1 Protocol, as defined in [MS-SMB2]. Version 2.1 of this protocol has enhancements for the detection of content caching-enabled shares and retrieval of metadata related to content caching. The primary role in the Content Caching and Retrieval System is Metadata (Hash) Retrieval.
BranchCache Optimized Application Retrieval: SMB Sequence
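This section describes the sequence in which content cached by BranchCache is delivered to an application, with no changes required to the application. The sequence applies to applications that use the SMB protocol: the application calls CreateFile (or fopen) to open a remote file for reading.
BranchCache and SMB are integrated through the Offline Files component. The Offline Files service opportunistically tries to prefetch files accessed via SMB to optimize network usage and user experience on the client side.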
When both offline files and BranchCache are enabled on the client, the following steps occur:

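1. The offline files driver intercepts the file read I/O request and checks whether the following five conditions are met in order to start prefetching the file:
a. The data is not already in the offline files cache. If the data is already cached, the cached data is returned to the application and the read request is not sent to the file server.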
b. The latency to the server (as observed by the client so far) is above the configured threshold.
c. BranchCache hash generation is enabled on the file share.
d. The target file size is at least 64 KB.
e. The read is beyond the first 64 KB of the file.
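2. If all five conditions are met, the offline files driver notifies the offline files service to start prefetching the file.
3. The offline files service then retrieves the CI from the file server. If the file server has up-to-date CI, it returns that CI to the client. If the CI does not exist or is out of date, the SMB hash-generation service generates new CI for the requested file; in that case no CI is returned to the client, causing offline files to skip BranchCache retrieval for this file.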
4. If content information is retrieved from the file server, the offline files service then uses that information to attempt to retrieve data from BranchCache.
5. BranchCache attempts to retrieve the data from peers or from the hosted cache. If the data is found, it is returned to the offline files service; otherwise, an error is returned.
6. If data is found in BranchCache, the data is written to the offline files cache and the prefetch thread continues to attempt to retrieve data from BranchCache until it has retrieved up to 8 MB of data or it fails to retrieve data.
7. When the application’s read operation is allowed to proceed, it attempts to read the data from the offline files cache, which is prepopulated by data from BranchCache if the prefetch thread successfully retrieved data. Otherwise, the application’s read is allowed to flow to the server to retrieve data. Data retrieved from the file server is then cached in the offline files cache for later publication to BranchCache.
8. When the Offline Files Service is requested to prefetch data from BranchCache, it also attempts to publish any data to BranchCache for the file from the offline files cache. File data is stored in the offline files cache until the offline files cache needs to reclaim space for newer files. The same data is also stored in BranchCache’s republication cache so that it can be shared with other BranchCache clients and across different protocols/applications integrated with BranchCache.
If the client accesses the same content again (after closing the file and opening it again) and the content has not been changed on the server, the application will be able to retrieve the data from the Offline Files cache without doing the BranchCache lookup. This is called transparent caching.
If the requested data cannot be found through BranchCache, once it is retrieved from the SMB server it will be republished to the BranchCache for access by other clients.
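The prefetch decision in step 1 and the bounded prefetch loop in step 6 can be sketched as follows. This is an illustrative model only: the `ReadRequest` fields, the `latency_threshold_ms` parameter (the text does not state the configured value), and the `retrieve_next` callback are hypothetical names, and "beyond the first 64 KB" is interpreted here as a read offset of at least 64 KB.

```python
from dataclasses import dataclass

MIN_FILE_SIZE = 64 * 1024          # condition d: file is at least 64 KB
PREFETCH_LIMIT = 8 * 1024 * 1024   # step 6: stop after 8 MB


@dataclass
class ReadRequest:
    in_offline_cache: bool       # condition a
    observed_latency_ms: float   # condition b
    share_has_hashes: bool       # condition c
    file_size: int               # condition d
    offset: int                  # condition e


def should_prefetch(req: ReadRequest, latency_threshold_ms: float) -> bool:
    """All five conditions from step 1 must hold."""
    return (not req.in_offline_cache
            and req.observed_latency_ms > latency_threshold_ms
            and req.share_has_hashes
            and req.file_size >= MIN_FILE_SIZE
            and req.offset >= MIN_FILE_SIZE)


def prefetch(retrieve_next, limit: int = PREFETCH_LIMIT) -> bytes:
    """Keep pulling data from BranchCache until the limit or the first failure."""
    data = bytearray()
    while len(data) < limit:
        chunk = retrieve_next()          # returns None when retrieval fails
        if not chunk:
            break
        data += chunk[: limit - len(data)]
    return bytes(data)


MB = 1024 * 1024
slow_read = ReadRequest(False, 120.0, True, 10 * MB, 128 * 1024)
header_read = ReadRequest(False, 120.0, True, 10 * MB, 0)

plenty = iter([b"x" * MB] * 10)
capped = prefetch(lambda: next(plenty, None))

few = iter([b"x" * MB] * 3)
short = prefetch(lambda: next(few, None))
```

With ten 1 MB chunks available the loop stops at 8 MB; with only three it stops at the first failed retrieval and the remaining reads flow to the file server.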
BranchCache Optimized Application Retrieval: HTTP Sequence
The sequence described in this section applies to applications that use the HTTP protocol.
BranchCache and HTTP are tightly integrated, both in terms of HTTP.sys on the server side and WinInet and WinHTTP on the client side.
When BranchCache is enabled on both client and server, an application’s HTTP requests are always stalled, waiting for BranchCache retrievals. The HTTP-BranchCache integration focuses on minimizing the usage of the WAN’s bandwidth (even when the WAN happens to be very fast and has very low latency), and all the data that can be retrieved via BranchCache will be transferred via BranchCache.
1. Data retrieval begins with an application issuing an HTTP Request.
2. When BranchCache is enabled on the client, the HTTP client stack (either WinInet or WinHTTP) adds headers to the request indicating that the client is capable of understanding the PeerDist HTTP encoding (as defined in [MS-PCCRTP]).
3. The HTTP client stack sends the request to the remote content server, typically across the WAN link.
4. The kernel-mode HTTP driver (HTTP.sys) receives the request on the content server. If BranchCache is enabled on the server, HTTP.sys forwards a copy of the request to the BranchCache HTTP extension driver (PeerDistKM.sys), which keeps track of the request and retrieves content information for that content (identified by its URL and content tags) from the BranchCache service.
5. HTTP.sys delivers the HTTP request to the associated web server in user mode (typically, IIS or a web service) and waits for a response.
6. The HTTP server authenticates and authorizes the client, generates the response accordingly, and starts streaming the response down to HTTP.sys.
7. Because BranchCache is enabled, HTTP.sys redirects the response through PeerDistKM.sys.
8. If the content information for that HTTP content is not available (or not yet available) or if the content tags do not match, the following steps occur:
a. PeerDistKM.sys sends a copy of the response stream to the BranchCache service for publication so that the next request for the same URL will find the content information.
b. It allows the response stream to go back to HTTP.sys unchanged.
c. HTTP.sys sends out the response with actual data in it and no BranchCache metadata.
9. If, instead, the content information for that HTTP content is available and, based on content tags, it is found to be up to date with the content returned, the following steps occur:
a. PeerDistKM.sys replaces the body of the response with the content information describing it in BranchCache terms.
b. It modifies the response headers adding that the response is now PeerDist-encoded.
c. It returns the modified (and, in general, much shorter) response stream to HTTP.sys.
d. HTTP.sys sends out the modified response, containing only BranchCache content information metadata, but not any actual content data.
10. The client receives the response. If the response contains BranchCache CI, the HTTP client stack passes the metadata to the BranchCache service, and it starts serving the first application read for the actual contents of that response by asking BranchCache to retrieve the content data associated with the size of that first read.
11. BranchCache retrieves the data from the local republication cache (if available), or retrieves a superset containing the requested data from other BranchCache clients on the LAN or from the hosted cache server.
12. If any of the requested data is missing, BranchCache signals to the HTTP stack the range of missing data, and the HTTP stack issues a range request back to the remote server for the missing data (or a superset including it).
13. Once all the data is reassembled for the specific application read, it is returned to the application in a way completely transparent to the application.
14. The last three steps are repeated until all the application’s reads on the HTTP response in question are completed.
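Two parts of this sequence lend themselves to a short sketch: the server-side choice between steps 8 and 9, and the client-side computation of the missing byte ranges in step 12. The code below is a simplified model, not the behavior of HTTP.sys or PeerDistKM.sys. The `ci_store` dictionary, the 64 KB block size, and the function names are assumptions; `Content-Encoding: peerdist` is used to mark a PeerDist-encoded response in the spirit of [MS-PCCRTP].

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumed block size for this sketch


def make_ci(body: bytes) -> list[str]:
    """Describe a response body as a list of per-block hashes."""
    return [hashlib.sha256(body[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(body), BLOCK_SIZE)]


def server_response(url: str, body: bytes, content_tag: str, ci_store: dict) -> dict:
    """ci_store maps url -> (content_tag, content_information)."""
    known = ci_store.get(url)
    if known is None or known[0] != content_tag:
        # Step 8: no usable CI yet. Publish it for the next request and
        # send the actual data unchanged.
        ci_store[url] = (content_tag, make_ci(body))
        return {"headers": {}, "body": body}
    # Step 9: CI is up to date. Replace the body with the CI and mark the
    # response as PeerDist-encoded.
    return {"headers": {"Content-Encoding": "peerdist"}, "body": known[1]}


def missing_ranges(have: list[bool], block_size: int,
                   total_size: int) -> list[tuple[int, int]]:
    """Step 12: coalesce missing blocks into inclusive byte ranges."""
    ranges = []
    start = None
    for index, present in enumerate(have):
        if not present and start is None:
            start = index * block_size
        if present and start is not None:
            ranges.append((start, index * block_size - 1))
            start = None
    if start is not None:
        ranges.append((start, total_size - 1))
    return ranges


store: dict = {}
page = b"v1" * BLOCK_SIZE  # two blocks of content
first = server_response("/video", page, "etag-1", store)
second = server_response("/video", page, "etag-1", store)
changed = server_response("/video", page + b"!", "etag-2", store)
gaps = missing_ranges([True, False, False, True, False], 10, 45)
```

Each tuple in `gaps` corresponds to one range request the HTTP stack would send back to the remote server; everything else is served from the republication cache, peers, or the hosted cache.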