This is not a cache miss, but a waste of the cache line, i.e. cache pollution. 12B are already in the cache line, and one of the 4B words will be hit. The other three are transferred, but not used. So in fact there are zero cache misses.
In the worst case scenario, where data is read with a 4B interleaving (every 4B has to be fetched with a new cache line), you will miss twice, as the third 4B word is the one you need and results in a hit. Three consecutive reads -> two misses, one hit.
Sure, but this is about utilization. If 12B of the cache line go unused and you only need 4B, that's 25% usage.
If you access the second 4B word later, it could still be in the cache line and you don't get a miss.
Yeah, I bet on that!
The basic "Hello, World!"?
Cache line block sizes are 4-64 bytes; of course, new architectures all use 64. E.g., the cache line of a Cortex-A8 is 16 words (64 bytes) wide.
You will only bring a new cache line on a miss.
It also depends on the type of the cache: you may get misses on type A but not on type B, and vice versa.
In fact not misses, but cache waste. Whatever, I know what you mean.
You probably get a miss on the next data, as the cache is polluted with cold data.
Sure, I agree with that. You should only read data that you need right now. If you keep data hot that might only be needed in the future, you are hampering the cache, and chances are high that you get a lot of misses for other data.
My example assumed that no other part of the system needs the data from that component.
Sure. It's not a problem with modern architectures.
Cache lines: 16 bytes (Intel 80486) and 128 bytes (Sandy Bridge).
It's hard to tell which program will have more misses than the other, as there are also cache hierarchies with inclusive or exclusive caches.
Sorry for going slightly OT.