The point of the caching system is to make sure that not every request from another node hits the CPU of the node responding to it. It pre-compiles the data in the backend into static JSON files on the hard drive. When a request arrives from the network and hits a cache (which is determined by the time range it asks for), the cache can be served directly from disk, instead of the CPU having to make a database call, compute the response on the fly, and serve it.
When an Aether node makes a request to another node, it uses HTTP. If the requesting node makes an HTTP GET request, that request will hit a cache, if one is available. If it makes a POST request, that is a request to bypass the cache: it asks the remote node to serve live data instead of serving from cache. The remote can still choose to send a link to a cache as a partial response.
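To make the GET versus POST distinction concrete, here is a minimal Java sketch using the JDK's `java.net.http` API (Java 11+). The class name `SyncRequests`, the example URL, and the empty JSON body are all assumptions for illustration; the real endpoint paths and POST body format are not described here, so check them against the traffic of an actual node.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class SyncRequests {
    // Builds a GET request (may be answered from a cache) or a POST request
    // (asks the remote node for live data instead of a cache).
    // The "{}" body is a placeholder, NOT the real Aether request body.
    public static HttpRequest build(URI endpoint, boolean bypassCache) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(endpoint);
        if (bypassCache) {
            return builder
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString("{}"))
                    .build();
        }
        return builder.GET().build();
    }

    public static void main(String[] args) {
        // Hypothetical address of a node on the local network.
        URI endpoint = URI.create("http://127.0.0.1:8080/example");
        System.out.println(build(endpoint, false).method());
        System.out.println(build(endpoint, true).method());
    }
}
```

The sketch only builds the requests; sending them would go through `java.net.http.HttpClient`. Since the remote may answer a POST with a link to a cache, a client should be prepared to follow up with a GET even after asking for live data.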
In essence, when a sync happens between two nodes, 99% of that sync is the requesting node just reading directly from the cache of the responding node. You can see these cache files in your app folder; they are just JSON files. You will see that they are all timestamp-ranged, and they all have headers and manifests, which allow the requesting node to search the caches efficiently without using any of the responding node's resources except bandwidth.
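As a rough illustration of why timestamp ranges make this cheap, the sketch below picks the caches worth downloading from a manifest that has already been fetched and parsed. The `CacheEntry` fields (`start`, `end`, `file`), the file names, and the use of Unix seconds are hypothetical; the real manifest layout is whatever you find in the JSON files in your app folder.

```java
import java.util.ArrayList;
import java.util.List;

public class CacheSelector {
    // One manifest entry: a cache file covering a time range.
    // Field names and units (Unix seconds) are assumptions for illustration.
    public static final class CacheEntry {
        public final long start;
        public final long end;
        public final String file;

        public CacheEntry(long start, long end, String file) {
            this.start = start;
            this.end = end;
            this.file = file;
        }
    }

    // Returns the entries whose range ends after `since`, in manifest order.
    // Everything older is assumed to be already synced and is skipped.
    public static List<CacheEntry> select(List<CacheEntry> manifest, long since) {
        List<CacheEntry> wanted = new ArrayList<>();
        for (CacheEntry entry : manifest) {
            if (entry.end > since) {
                wanted.add(entry);
            }
        }
        return wanted;
    }

    public static void main(String[] args) {
        List<CacheEntry> manifest = List.of(
                new CacheEntry(1000, 2000, "cache_a.json"),
                new CacheEntry(2000, 3000, "cache_b.json"),
                new CacheEntry(3000, 4000, "cache_c.json"));
        // Print the files a node last synced at t=2500 would still need.
        for (CacheEntry entry : select(manifest, 2500)) {
            System.out.println(entry.file);
        }
    }
}
```

All of the filtering happens on the requesting side. The responder only ever serves static files, which is the whole point of the design.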
I'd recommend looking at the caches generated on your own computer first. Those should be helpful in figuring out exactly what kind of data is being cached and how. Once you understand the general format, you might want to attempt a full sync with a node you're running on a different computer on your local network. Listening to the HTTP requests made to that computer will show you the exact order of requests that one node makes to another. That order is how the requesting node traverses the caches. It is a very predictable pattern, so it should be fairly easy to reproduce in Java.
There isn't much magic happening there. It's relatively detailed, but low in complexity in what it does and why it does it. Fundamentally, it's there because the computers running Aether aren't servers and they want to stay idle as much as possible, so we 'pre-bake' the data the node has in its database into pre-made JSON files. That way, when a node syncs with another, the sync is mostly free for the responder.