Sakkura pretty much got the gist of it from a high-level perspective.
Video games use multiple procedures which are tightly synchronized to keep one part from getting ahead of the others.
Unless a game is purely software driven, almost all graphical tasks are delegated to the GPU(s). Key to your question is the raster process. Rasterization is the process of converting scene geometry, shadows, lighting, overlays, etc. into a bitmap that is then written into a frame buffer for eventual transmission to the display.
The entire render pipeline is far too complex to regurgitate here, so I'll focus on the pixel shader only. Pixel shaders are small programs that are run on the GPU at least once per pixel (multiple passes may sometimes be necessary) in order to create the final bitmap. The number of pixels is the product of the horizontal and vertical resolution:
1280x720 = 921,600 pixels
1920x1080 = 2,073,600 pixels
2560x1600 = 4,096,000 pixels
and so on.
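To make the scaling concrete, here is a small sketch that reproduces the numbers above and shows how each resolution compares to 720p. The function name `pixel_count` is just something I made up for illustration, and the ratio is only a lower bound on the extra pixel shader work, since it assumes exactly one pass per pixel.

```python
def pixel_count(width, height):
    """Number of pixels in one frame at the given resolution."""
    return width * height

resolutions = [(1280, 720), (1920, 1080), (2560, 1600)]
base = pixel_count(*resolutions[0])  # use 720p as the reference workload

for width, height in resolutions:
    n = pixel_count(width, height)
    # Assumes one pixel shader invocation per pixel; real games often need more.
    print(f"{width}x{height} = {n:,} pixels ({n / base:.2f}x the 720p count)")
```

So going from 720p to 1080p more than doubles the minimum number of pixel shader invocations per frame, and 2560x1600 more than quadruples it.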
Older graphics cards had pixel shaders as a distinct hardware feature, and their programmability was limited. Modern graphics cards use unified shaders that are highly programmable; the same hardware handles all shader types.
Cranking up the resolution increases the minimum number of pixel shaders that need to be run, which increases the total amount of time that the GPU must spend on each frame before that frame is ready. Since the core components of a game engine are tightly synchronized, the CPU -- which is processing IO, physics, and game logic -- cannot be permitted to get too far ahead of the GPU. Often, the GPU is rendering a scene that is 1-2 frames behind that which the CPU is working on, and the GPU is displaying a bitmap that is 1-2 frames behind that which is being rendered.
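A toy model may help show why a slower GPU ends up holding back the whole game. This is only a sketch under simplifying assumptions: every frame costs a fixed amount of CPU and GPU time, and the CPU is not allowed to start a frame until the GPU has finished the frame `max_frames_ahead` before it. The names `simulate_frames`, `average_frame_interval`, and `max_frames_ahead` are hypothetical and do not come from any real engine.

```python
def simulate_frames(num_frames, cpu_ms, gpu_ms, max_frames_ahead=2):
    """Return the time (in ms) at which the GPU finishes each frame."""
    cpu_done = []
    gpu_done = []
    for i in range(num_frames):
        # The CPU starts frame i once it has finished frame i-1 ...
        cpu_start = cpu_done[i - 1] if i > 0 else 0
        # ... and once the GPU is no more than max_frames_ahead frames behind.
        if i >= max_frames_ahead:
            cpu_start = max(cpu_start, gpu_done[i - max_frames_ahead])
        cpu_done.append(cpu_start + cpu_ms)

        # The GPU needs the CPU's output for frame i and must be done with frame i-1.
        gpu_start = cpu_done[i]
        if i > 0:
            gpu_start = max(gpu_start, gpu_done[i - 1])
        gpu_done.append(gpu_start + gpu_ms)
    return gpu_done

def average_frame_interval(gpu_done):
    """Average time between consecutive finished frames, in ms."""
    return (gpu_done[-1] - gpu_done[0]) / (len(gpu_done) - 1)

# Same CPU cost per frame, but the GPU cost doubles (e.g. a higher resolution).
for gpu_ms in (10, 20):
    finished = simulate_frames(100, cpu_ms=5, gpu_ms=gpu_ms)
    interval = average_frame_interval(finished)
    print(f"GPU {gpu_ms} ms/frame -> one frame every {interval:.1f} ms")
```

In this model the CPU only needs 5 ms per frame, but because it is throttled to stay close to the GPU, the frame rate is set entirely by the GPU time.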
Disabling some timing constraints can improve realtime responsiveness, but can cause undesirable side effects. For example, when Vertical Synchronization is enabled, the GPU will only copy a completed bitmap from the render pipe to the frame buffer during the vertical blanking interval (the period after one frame has finished being sent to the monitor and before the next frame begins). Disabling Vertical Synchronization allows the GPU to copy the frame from the end of the render pipe to the frame buffer as soon as it is complete, overwriting the contents at the same time that the output driver is reading them. This can cause what is known as "screen tearing".
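Tearing is easy to picture with a very simplified model of scan-out. In the sketch below the display reads the frame buffer one row at a time, and a new frame becomes ready partway through. The function `scan_out` and its parameters are hypothetical, and the copy is treated as instantaneous, which real hardware does not guarantee.

```python
def scan_out(rows, new_frame_ready_at_row, vsync):
    """Return which frame (0 = old, 1 = new) each displayed row came from."""
    frame_in_buffer = 0
    displayed = []
    for row in range(rows):
        # Without vsync, the new frame overwrites the buffer as soon as it is ready,
        # even though the display is in the middle of reading it.
        if not vsync and row == new_frame_ready_at_row:
            frame_in_buffer = 1
        displayed.append(frame_in_buffer)
    # With vsync, the copy waits for the vertical blanking interval,
    # i.e. after the last row, so this refresh shows only the old frame.
    return displayed

print("vsync off:", scan_out(6, 3, vsync=False))
print("vsync on: ", scan_out(6, 3, vsync=True))
```

With vsync off, the top rows come from the old frame and the bottom rows from the new one, and the visible seam between them is the tear. With vsync on, the whole refresh comes from a single frame, at the cost of the new frame waiting until the next refresh.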
Doing the same on the CPU side may result in game logic executing so quickly that the player is unable to respond due to slow visual feedback. Alternatively, it may result in the game discarding work that it had just completed because the GPU cannot start the associated render work until the CPU has the next batch ready.