Exactly. This is called "V-Sync", since the buffer swap happens at the vertical synchronization pulse rather than during transmission. When V-Sync is active, tearing is eliminated. It is not always active, though: V-Sync can be turned on or off.
"Tearing" is what we call it when a screen renders from two different images, within a single frame. If, for example, I've drawn two screen images which are intended to be displayed one after the other, but the monitor has instead displayed the top half of frame one, and the bottom half of frame two, that's "tearing". This happens due to changing the data the monitor is reading from while the monitor is drawing, instead of during vblank. (In modern programs, this typically happens because the user disabled waiting for vsync in their driver settings)
Yes, he does go on near the end to say that this is normally handled behind the scenes, and it is. V-Sync is implemented at the driver level and works with all games unless the developers really mess something up. However, many people disable it for fear of input lag, and it's hard to know how things are implemented on consoles.
Tearing is not caused by data from the GPU arriving and overwriting the display's buffer while it's being drawn. If that were the case, tearing would occur in every single frame at the same position, because the data stream from the GPU is synchronized with the display's refreshes: the time the GPU takes to send one frame is always the same as the time the display takes to draw one frame. If the display refreshes every 20 ms (50 Hz), the source device sets its output rate to send exactly one frame of data, evenly distributed over those 20 ms. Frames from the GPU don't arrive at the display at random times and occasionally happen to overwrite the display's buffer mid-draw. If they did, the overwrite would land at the same position on every refresh cycle, since the timing of the refresh is always the same relative to the transmission (setting aside G-Sync/FreeSync).
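The timing argument can be checked with some quick arithmetic. Below is a small Python sketch, again purely illustrative: `tear_rows` is a hypothetical helper, the 1080-row screen and the 13 ms render time are numbers I picked, and vblank duration is ignored so that a full 20 ms maps onto the visible rows. It asks which row the display is drawing at the moment some periodic event happens.

```python
def tear_rows(event_period_us, refresh_period_us, rows, count, first_event_us=0):
    """Row being drawn at each of `count` periodic events.

    Simplified model: scanout of `rows` rows is spread evenly over the
    whole refresh period (vblank time is ignored).
    """
    positions = []
    for i in range(count):
        t = first_event_us + i * event_period_us
        phase_us = t % refresh_period_us              # time into the current refresh
        positions.append(phase_us * rows // refresh_period_us)
    return positions


REFRESH_US = 20_000  # 20 ms per refresh, i.e. 50 Hz

# An event locked to the refresh (like the cable transmission) always
# lands on the same row.
print(tear_rows(20_000, REFRESH_US, rows=1080, count=5, first_event_us=5_000))
# [270, 270, 270, 270, 270]

# A buffer swap every 13 ms (a game rendering unsynchronized at ~77 fps)
# lands on a different row each time.
print(tear_rows(13_000, REFRESH_US, rows=1080, count=5))
# [0, 702, 324, 1026, 648]
```

Anything running in lockstep with the refresh hits the same row every cycle, which is not what tearing looks like in practice. A swap that is not locked to the refresh wanders up and down the screen, which is.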