Bit depth is basically how many 0's and 1's (bits) are used to store each sample. If you sample at 44.1 kHz you are measuring the waveform 44,100 times per second, and that does not change with bit depth. At 20-bit depth there is less information about each individual sample than at 24-bit depth, because each extra bit doubles the number of amplitude values a sample can take.
Each zero below stands for one bit of information about a single sample. At 44.1 kHz there are 44,100 of these sets every second.
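To put rough numbers on "less information per sample", here is a small Python sketch. The function name amplitude_levels is just something I made up for illustration. It only counts how many distinct values fit in a given number of bits.

```python
def amplitude_levels(bit_depth: int) -> int:
    """Number of distinct values one sample can take at this bit depth."""
    # Each bit is a 0 or a 1, so n bits give 2**n combinations.
    return 2 ** bit_depth


for bits in (20, 24):
    print(f"{bits}-bit: {amplitude_levels(bits):,} possible values per sample")
```

So 24-bit has 16 times as many steps as 20-bit to describe the level of each sample.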
20-bit: 00000000000000000000
24-bit: 000000000000000000000000
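If you want to see what that does to storage, here is a quick Python sketch. It assumes raw, uncompressed audio with the bits packed tightly and no file header, and pcm_bytes is a made-up helper name, so treat the results as estimates. Real files can come out bigger (20-bit audio is often stored padded to 24 bits, for example).

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels=1, seconds=1.0):
    """Approximate size in bytes of raw audio, assuming tightly packed samples."""
    # samples per second * bits per sample * channels * duration = total bits
    total_bits = sample_rate_hz * bit_depth * channels * seconds
    return total_bits / 8  # 8 bits per byte


# One second of mono audio at 44.1 kHz
print(pcm_bytes(44_100, 20))  # 20-bit
print(pcm_bytes(44_100, 24))  # 24-bit

# One minute of stereo audio at 44.1 kHz, 24-bit
print(pcm_bytes(44_100, 24, channels=2, seconds=60))
```

Under those assumptions, one second of mono at 44.1 kHz works out to 110,250 bytes at 20-bit and 132,300 bytes at 24-bit.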
From the example above you can see why a higher sample rate or bit depth means a bigger file: there are more samples per second, more bits in each sample, or both. Let me know if that clears anything up. Hope I could help.