Analog is the real world. Things in our world move smoothly from one place to another along a continuous path, whether it is straight or wiggly. Take as an example a toy car tied to a string that you pull around the room. If you used a video camera to record its movements from overhead, you'd see the car move around exactly as you had pulled it. But the car never suddenly stops being in one spot and magically appears at another.
Digital signals are ways to record information as numbers representing small samples of the real world. All such systems have to do several steps to achieve this. First, they need information from the real world to be measured and converted into numbers. In the case of our video of the toy car, this might mean taking one frozen frame of the video and converting the position of the car on the floor into a numeric representation. This is like expressing its position on graph paper that serves as a map of the floor. This is the first stage of "Analog-to-Digital" or "A to D" data conversion. Then the digital system must store those numbers in some kind of code - most commonly, binary - on some recording medium. If you were simply to do your own measurements with a ruler and write down the car's position in the frozen frame image as "x = 297, y = 23", that would be a digital record. If you had this process automated somehow so that the data were recorded in a computer file, that's also a digital data record.
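To make that concrete, here is a tiny Python sketch of turning one measured position into a digital record. The 1 cm grid size and the car's exact position are made-up assumptions for illustration, not part of any real system:

```python
def digitize_position(x_meters, y_meters, grid_size_meters=0.01):
    """Convert a continuous (analog) position into whole-number grid coordinates."""
    return round(x_meters / grid_size_meters), round(y_meters / grid_size_meters)

# Suppose the car actually sits at x = 2.973 m, y = 0.234 m on the floor.
x, y = digitize_position(2.973, 0.234)
print(f"x = {x}, y = {y}")                   # x = 297, y = 23, like the hand-written record
print(f"in binary: x = {x:b}, y = {y:b}")    # the same numbers stored as binary code
```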
Now, that one digital sample of a frozen frame of the video does not tell you any story, really. There was a whole video of the car's movement. To convert that whole video into a digital version of what it shows, you need a system that examines every single frozen frame of the original video and does the same task. To make the job much easier, and to make it easier to reconstruct the original data later, we normally arrange that the times between frame samples are all the same. If we do this, whether by hand (a long job that leaves you with sheets of paper covered with hand-written car positions) or with some automated system that stores everything in a computer file, we end up with a series of data records, each representing a brief "snapshot" of reality converted into numeric data. The snapshots are uniformly spaced in time.
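Here is a small sketch of that idea, again with made-up numbers: an assumed frame rate of 30 frames per second and an invented path for the car stand in for the real video:

```python
import math

FRAME_INTERVAL = 1.0 / 30.0      # an assumed 30 frames per second: uniform spacing

def car_position_at(t):
    """Stand-in for 'measure where the car is in the frame taken at time t'."""
    return 2.0 + math.cos(t), 1.5 + math.sin(t)     # some smooth, wiggly path

records = []
for frame_number in range(300):                     # 10 seconds of video
    t = frame_number * FRAME_INTERVAL               # every sample the same time apart
    x, y = car_position_at(t)
    records.append((frame_number, round(x * 100), round(y * 100)))   # positions on a 1 cm grid

print(records[:3])       # the first few digital records
```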
Now, what about "playback"? If you replay the original video on a screen, you'll see the car moving smoothly around the floor. Replaying the digital version, though, requires that we read the data for each frozen frame, draw what it says, then arrange all the resulting images in sequence and view them one after another. What we actually see is the car at one position, then jumped to a new position in the next drawing, then to another in the next, and so on. But if we do this relatively quickly, our minds interpret things according to what we have already learned about reality. Our minds assume that the car was not merely jumping from spot to spot. It was moving smoothly through all the positions of the sequence of drawings, and we "see" what looks like the original reality. Of course, how "real" it looks depends hugely on the amount of detail we chose to record in our original process of sampling and digitizing the individual frames of the video. I wrote above that all we were doing was converting the position of the toy car on the floor to x and y co-ordinates. But a complete video digitization system, such as is really used for making such recordings, captures vastly more data from every frame, so that each final re-created drawing looks exactly like the original frozen frame in every detail.
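A rough sketch of that playback idea might look like the following, with printing standing in for actually drawing a frame on the screen. The records and the frame rate are the made-up ones from the sampling sketch above:

```python
import time

FRAME_INTERVAL = 1.0 / 30.0       # must match the spacing used when the frames were sampled

# A few digital records of the form (frame number, x, y), as produced above.
records = [(0, 300, 150), (1, 300, 153), (2, 299, 157), (3, 299, 160)]

for frame_number, x, y in records:
    print(f"frame {frame_number}: draw the car at ({x}, {y})")   # one still drawing
    time.sleep(FRAME_INTERVAL)    # hold it for exactly one frame interval, then move on
```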
The same process in concept applies to all digital records of reality. For another example, take CDs of music. The original music signal is recorded in a studio by microphones converting the sound waves into electrical signals. The signal is really an analog (continuously varying) value of voltage against time. When it comes time to convert that information to digital form, the system breaks that long analog record into tiny time slices. For each time slice it takes the voltage at that microsecond and converts it to a single digital representation, then stores it. "Microsecond"? Well, yes. CD audio uses a sampling rate of 44.1 kHz. That means the analog signal is broken into 44,100 time slices for every second, so one time slice is about 22.7 microseconds long; higher-quality recordings use even higher rates and therefore even shorter slices. So the result of the process is that the analog signal of voltage versus time becomes a long sequence of numbers, each representing the voltage in digital form at a tiny time slice along the way. That's the "A to D" phase of the process.
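As a sketch of that A-to-D step, here is one second of a made-up 440 Hz tone being sliced up and stored, CD style, as 16-bit numbers. The tone is just an assumption standing in for a real microphone signal:

```python
import math

SAMPLE_RATE = 44_100                # CD audio: 44,100 time slices per second
SLICE_SECONDS = 1.0 / SAMPLE_RATE   # roughly 22.7 microseconds per slice

def microphone_voltage(t):
    """Stand-in for the analog signal: a 440 Hz tone as a voltage between -1 and +1."""
    return math.sin(2 * math.pi * 440 * t)

digital_record = []
for n in range(SAMPLE_RATE):                         # one second of sound
    t = n * SLICE_SECONDS                            # the instant this slice is taken
    voltage = microphone_voltage(t)                  # read the analog voltage at that instant
    digital_record.append(round(voltage * 32767))    # store it as a 16-bit integer, CD style

print(len(digital_record), digital_record[:5])
```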
When it's time to play back the recording, we need a system that will go through the entire file - a sequence of numbers taken at fixed time spacings. For each entry it uses a "Digital-to-Analog" converter to create an output voltage exactly matching what the digital record says for that time slice, feeds it out to an amplifier, then proceeds to the next time sample record. It must do this at exactly the same rate as the original sampling was done, so that the playback timing exactly matches the original analog record. This is the "D to A" phase of the process, and it reconstructs an analog signal from the digital records of all those time slices. Because of the limits of our own ears and of the analog amplifier equipment, we do not notice at all that the signal was built from tiny fixed slices of sound. We hear continuous music just like the original. A CD is simply the medium on which we store and retrieve the digital data. The CD player reads off that data, performs the D to A conversion, and feeds the resulting analog signals to the audio amplifier / speaker system so we can listen.
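And here is a matching sketch of the D-to-A step, where send_to_amplifier is a hypothetical placeholder for the hardware that actually produces the output voltage:

```python
SAMPLE_RATE = 44_100              # must equal the rate used in the A-to-D step

def send_to_amplifier(voltage):
    """Placeholder for the hardware that actually drives the analog output."""
    pass

def play_back(digital_record):
    for sample in digital_record:      # step through the file, one slice at a time
        voltage = sample / 32767       # undo the 16-bit integer scaling
        send_to_amplifier(voltage)     # hold this voltage for 1/44,100 of a second
```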