No, it doesn't work like that. The memory doesn't add up; each GPU uses its own VRAM exclusively. While running in CrossFire or SLI mode the memory is effectively mirrored, so that each GPU can render the screen independently of the other one.
The VRAM is more or less mirrored. This is a consequence of how the GPUs are used, not a deliberate design choice.
Most multi-GPU rendering methods use the GPUs to render alternate frames, alternate halves of the same frame, or alternating lines. This requires each GPU to hold more or less the same data in its VRAM, because their outputs are more or less the same.
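A rough sketch of the work-division schemes mentioned above (the function and mode names are illustrative, not a real graphics API). Note that in both modes each GPU ends up touching every part of the scene over time, which is why each needs a full copy of the scene data:

```python
# Hypothetical illustration of how common multi-GPU modes divide work.
def assign_work(num_frames, num_gpus, mode="AFR"):
    """Return {gpu_index: list of work items} for a given mode."""
    work = {g: [] for g in range(num_gpus)}
    if mode == "AFR":
        # Alternate Frame Rendering: whole frames round-robin between GPUs.
        for f in range(num_frames):
            work[f % num_gpus].append(("frame", f))
    elif mode == "SFR":
        # Split Frame Rendering: every GPU renders a slice of every frame.
        for f in range(num_frames):
            for g in range(num_gpus):
                work[g].append(("frame", f, "slice", g))
    return work

print(assign_work(4, 2, "AFR"))
# GPU 0 gets frames 0 and 2, GPU 1 gets frames 1 and 3 -- complete frames,
# so each GPU must hold all the textures, models, and shaders for the scene.
```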
When the GPUs are used for computational tasks, this is not necessarily the case.
The reason it works that way is that current multi-GPU rendering methods generally use Alternate Frame Rendering (or some variant of it). The GPUs do not work in tandem on the same frame; they work independently on alternate frames. Because they work independently, each GPU needs all the data required to render its current frame without relying on the other GPU.
If you want a solid 60 FPS with a single GPU, that GPU must render each frame within 16.67 milliseconds (1000 ms / 60). With two GPUs the frames alternate between them, so each GPU has 33.33 milliseconds to do its work. However, each frame is not significantly different from the previous frame. The environment doesn't change, the textures don't change, the shaders don't change, the lighting doesn't change, etc. None of the static data required to render the scene changes. What does change is the time at which that frame is rendered, and as such the motion and positions of geometric objects will be slightly different.
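The frame-budget arithmetic above can be written out explicitly (this is just the arithmetic from the paragraph, not a real timing API):

```python
def frame_budget_ms(target_fps, num_gpus):
    """Per-GPU time budget for one frame under Alternate Frame Rendering.

    With AFR, consecutive frames alternate between GPUs, so each GPU
    effectively gets num_gpus times the single-GPU frame budget.
    """
    return 1000.0 / target_fps * num_gpus

print(round(frame_budget_ms(60, 1), 2))  # 16.67 ms with one GPU
print(round(frame_budget_ms(60, 2), 2))  # 33.33 ms per GPU with two
```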
All the static rendering data (textures, models, shaders, lighting, untransformed geometry, etc.) has to be present on both GPUs. There is simply not enough bandwidth between the cards for one GPU and the other to pool their memory.