Video Lectures - not only the content but also the form of presentation matters



Why, especially in the time of the coronavirus pandemic, not only the content but also the form of presentation matters

I am not an expert in online and video lectures. I am still learning, but I think it will be interesting to write about my progress and experiences. As you can see from the video samples (at the end of this page), not only the technical quality but also the composition and flow of my lectures are gradually improving over time.

Audio

Audio settings. For recording audio I use a good dynamic microphone with a cardioid pattern (an Audio-Technica ATR2100 and currently also a borrowed Shure SM7B) and an audio interface (Steinberg UR242). I connect the microphone via an XLR cable to the audio interface and the audio interface via a USB cable to the computer. The ATR2100 also has a USB connection in addition to XLR. While it sounds great when connected by XLR to the audio interface, there is a little hiss and other distortion when it is connected by USB directly to the computer. But even via USB this microphone is incomparably better than any microphone integrated into a cheap headset. I suspect the designers wanted to save costs on the analog-to-digital converter built into the microphone. I also tried one (not cheap) USB-only microphone, and again there was a little hiss and distortion. So only the XLR connection with an audio interface makes sense. The Shure SM7B is generally better than the ATR2100, but with a high level of ambient noise the ATR2100 is better. As my experience shows, the next must-have (and must-use) is a pop filter. Even though the Shure SM7B has an integrated pop filter, it sounds better when this filter is removed and a standard external pop filter is used instead. Although the Shure SM7B is a great microphone, I will return it when the pandemic is over, because it is very expensive and I will no longer need to talk to students online.

I always record audio in 24-bit. I use Audacity when I record one or two microphones; Audacity allows recording only two tracks, unless you modify it somehow. If I want to record three or four sound sources, I use Reaper (because it is cheap and lightweight, while the software that comes with the Steinberg audio interface, although free, is very heavyweight). Reaper, on the other hand, can record practically as many tracks as the audio interface has inputs. There were two problems when I tried to record audio in 16-bit. First, it was necessary to set the input level very precisely in order not to lose any bits and not to allow clipping. Second, when processing audio in 16-bit, after several operations such as amplitude changes, compression, de-noising, etc., the rounding errors tended to accumulate, and that produced audible artifacts. The limit accuracy of the best DACs is 21 bits, and of the human ear in ideal conditions no more than 20 bits. So recording in 24-bit gives me four additional bits I don't have to care about: I can set the recording level low, safely far from clipping, and still use the four lowest bits to absorb accumulated rounding errors. For this reason, recording sound in Camtasia (which I use for video production) is not a good option, as it records audio in 16-bit. I also export the final version of the soundtrack in 24-bit. Keeping it in 24-bit uses only 50% more space for the soundtrack (in the days of large and cheap storage) and allows me to edit it further, if needed, without having access to the project created in Audacity. I also import the 24-bit tracks into Camtasia and sometimes make small volume adjustments to parts of a track there, but nothing more. I reduce the bit depth (and possibly later also the bitrate) only in the final video production.
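
To see why the rounding errors matter, here is a minimal sketch (Python/NumPy, with synthetic data; this is my own illustration, not my actual processing chain). Each edit re-quantizes the samples, and at 16-bit the accumulated error ends up much closer to audibility than at 24-bit:

    import numpy as np

    def quantize(x, bits):
        """Round samples in [-1, 1) to a signed grid of the given bit depth."""
        scale = 2 ** (bits - 1)
        return np.round(x * scale) / scale

    rng = np.random.default_rng(0)
    signal = rng.uniform(-0.5, 0.5, 44_100)   # one second of test "audio"

    for bits in (16, 24):
        x = quantize(signal, bits)
        gains = (0.7, 1.3, 0.9, 1.1, 0.95)
        for g in gains:                        # each edit is re-quantized
            x = quantize(x * g, bits)
        x = x / np.prod(gains)                 # undo the net gain change
        err_db = 20 * np.log10(np.max(np.abs(x - signal)))
        print(f"{bits}-bit chain: peak error {err_db:.1f} dBFS")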

The best sampling frequency for recording speech is the same as the sampling rate of the final production, so I use 44.1 kHz. For recording music it is four times the sampling rate of the final production, that is, 176.4 or 192 kHz. This additional oversampling serves two purposes. First, it reduces random errors, because each final sample averages four samples from the original recording. Second, the denser data allows more advanced and aggressive audio postprocessing (for example, speed or tempo changes) before audible distortions appear. I don't use such complex postprocessing for speech (does anyone?), so 44.1 kHz is enough.
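
The averaging argument is easy to check numerically (a sketch with synthetic noise, not a real recording; a proper resampler would also apply an anti-aliasing filter, which this naive decimation omits). Averaging each group of four samples halves the standard deviation of the random noise:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 44_100                                   # one second at 44.1 kHz
    t = np.arange(4 * n) / (4 * 44_100)          # "recorded" at 176.4 kHz
    clean = np.sin(2 * np.pi * 440 * t)
    noisy = clean + rng.normal(0, 0.01, 4 * n)   # add random noise

    # naive 4:1 decimation by averaging groups of four samples
    decimated = noisy.reshape(n, 4).mean(axis=1)
    residual = decimated - clean.reshape(n, 4).mean(axis=1)

    print("noise std before:", np.std(noisy - clean))   # ~0.010
    print("noise std after :", np.std(residual))        # ~0.005, i.e. halved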

Audio processing. First of all, to provide good results to my audience, I avoid compressors unless they are definitely necessary in some very rare cases, and I never ever use limiters. I use amplification to adjust volume levels and try to keep the natural dynamic range, which for my voice is about DR18 (this is the optimal desired result, measured after correcting for recording imperfections; in the raw audio the DR is usually higher). When recording both a piano concert and my little Christmas bell I obtained DR23. (By the way, I don't understand why YouTube normalizes at -14 LUFS instead of -18 LUFS, as it should. Most music compressed to DR14 has no power, no breathing, no life. Even DR18 is very frequently not enough (the piano and the bell gave DR23), but it is much more acceptable. Of course, no one is forced to use the maximum loudness of -14 LUFS, but many producers are not capable of determining the optimal dynamic range by themselves; if the limit is -14 LUFS, they will use it.)
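
For the curious, a DR number can be crudely approximated as the gap between peak level and average loudness (this is my own simplification for illustration; the official DR meter uses a more elaborate measurement over the loudest sections of the material, and the file name below is hypothetical):

    import numpy as np

    def crude_dr(samples):
        """Peak-to-RMS gap in dB for a float audio array scaled to [-1, 1]."""
        peak = np.max(np.abs(samples))
        rms = np.sqrt(np.mean(samples ** 2))
        return 20 * np.log10(peak / rms)

    # real use would load a file, e.g.: data, sr = soundfile.read("take.wav")
    rng = np.random.default_rng(2)
    t = np.arange(44_100)
    speech_like = rng.normal(0, 0.05, 44_100) * np.abs(np.sin(t / 3000))
    print(f"crude DR estimate: {crude_dr(speech_like):.1f} dB")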

I noticed that it is impossible to correct the audio track when the headphones are connected to the audio output of a notebook, because the sound cards in all notebooks that I have tried always generated noise (hiss) and had poor resolution. It was impossible to distinguish whether the noise came from the recording itself or from the sound card circuitry, and thus whether I should do something about it or not. For that reason I use a good external DAC, whose self-noise is well below the hearing threshold, and adequate high-impedance open headphones with good resolution. In this way I can hear all of the problems on the audio track and correct most of them. But I always check the final production also on poor-quality equipment outside of my silent room, to make sure that everything can also be heard on cheap headphones connected to a smartphone in a somewhat noisy environment. If I produced music, I would prepare two versions to make everyone happy, because making sure that all important effects are noticeable when listening on a smartphone at the university obviously requires some additional amplification, which degrades the production quality from the viewpoint of listeners with good equipment. However, in the case of video lectures it is enough to make one version that sounds acceptable on both good and poor equipment. The only (very slight) equalization I perform is compensation for the proximity effect and the variable distance to the microphone, combined with correction for the microphone's frequency response.
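
For illustration, one common way to tame the bass build-up from the proximity effect is a gentle high-pass filter; a sketch with SciPy follows (the cutoff and slope here are arbitrary examples, not my actual correction curve, which also depends on the microphone's frequency response):

    import numpy as np
    from scipy.signal import butter, sosfilt

    def proximity_correction(samples, sr, cutoff_hz=100.0):
        """Gentle second-order Butterworth high-pass to tame excess low end."""
        sos = butter(2, cutoff_hz, btype="highpass", fs=sr, output="sos")
        return sosfilt(sos, samples)

    sr = 44_100
    t = np.arange(sr) / sr
    voice = np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 50 * t)
    corrected = proximity_correction(voice, sr)   # the 50 Hz component is attenuated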

When I record in my room, I really don't need any de-noising, as the background noise from the microphone stays below -70 dB. Several factors contribute to this success: good triple-glazed windows and living far enough from the road, a door with an additional sound-insulating layer, the notebook with its fans in another room with only the monitor and other peripherals in the recording room, a dynamic microphone with a cardioid pattern, and a good audio interface. But lectures recorded at the university do require noise removal. In that case I set the noise reduction of the noise filters to about -6 dB (more for a higher noise level, less for a lower one) and then use a noise gate with the level reduction also set to about -6 dB (again depending on the noise level). Then I manually remove the most annoying of the remaining clicks, cracks and other unwanted effects. It is also a good idea, where possible, to replace them with more pleasant noise copied from other parts of the recording.
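
A minimal sketch of a noise gate of this kind (my own simplified NumPy version, not Audacity's implementation, which additionally smooths the gain transitions to avoid pumping): frames whose level falls below the threshold are attenuated by a fixed amount, here the -6 dB mentioned above.

    import numpy as np

    def noise_gate(samples, sr, threshold_db=-50.0, reduction_db=-6.0, frame_ms=10):
        """Attenuate frames whose RMS level falls below the threshold."""
        frame = int(sr * frame_ms / 1000)
        gain = 10 ** (reduction_db / 20)          # -6 dB -> ~0.5
        out = samples.astype(float).copy()
        for start in range(0, len(out) - frame + 1, frame):
            chunk = out[start:start + frame]
            rms_db = 20 * np.log10(np.sqrt(np.mean(chunk ** 2)) + 1e-12)
            if rms_db < threshold_db:
                out[start:start + frame] = chunk * gain
        return out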

In many YouTube tutorials you can find a simple recipe for disaster (they call it "voice recording enhancement"). Just apply the following sequence with default parameters: 1. noise reduction, 2. equalization with bass and treble boost, 3. normalization, 4. compression, 5. normalization again. Done. Without paying attention to anything, just automatic settings, one size fits all. Now you sound like a robot, no longer like a human.

Video

Video settings. I noticed that the difference in video quality between a true camera (DSLR or mirrorless) connected via a cam link and a webcam is exactly as big as the difference between their prices. For this reason, for recording and streaming lectures I use a Sony A6000 camera with a SEL35F1.8 lens. For recording lectures a prime lens is very good, because there is usually no need for zooming. So in this case it makes sense to lose the versatility of a zoom in order to gain the quality of a prime. I always record with the highest bitrate possible, which for this camera is 50 Mbps. I do it for the same reason I record audio in 24-bit: recording with a higher bitrate than human beings are able to distinguish guarantees that the rounding introduced by editing in postproduction will not produce visible or audible artifacts. In theory, the higher the fps, the better the video (120 fps better than 60 fps, 60 fps better than 30 fps), but in practice only on the condition that there is enough light and the ISO does not have to go higher. When light is scarce, I must find an optimal balance between the choppy, blurry video obtained at lower fps and the noisier video obtained at higher fps. So I make a quick test, or sometimes I don't even need to, as I can guess the best settings from my experience with similar situations. Other factors that sometimes help are the optimal aperture setting and a rearrangement of the lights in the room. I also noticed the preference of the Sony engineers: when recording in limited light with automatic settings, the aperture quickly goes to F/1.8 to keep the ISO low.

I always use 30, 60 or 120 fps. I never use 25 or 50 fps. That made sense a long time ago, when people lived side by side with dinosaurs and used incandescent bulbs or old-type 50 Hz fluorescent lamps. Modern light sources use frequencies over 10 kHz and phosphors, and there is no longer any flickering when recording at 60 fps. But the refresh rate of monitors is 60 Hz (or 120 Hz on better ones, because it makes a noticeable difference for fast action), and when I tried to record at 50 fps, I got jagged artifacts when playing the movie on a computer. Well, the artifacts were much less severe than the flickering in movies recorded at 60 fps in Europe or at 50 fps in America with old-type lamps in prehistoric times, but they were still present. I record in full HD, as my camera can't record in 4K. Of course, recording in 4K requires more light, as on average a four times smaller area of the sensor is used to produce one pixel of the video. But what is more important here: unless the action is very static, like a conference or a lecture, a 1080p@60fps video looks better than 4K@30fps. I am not going to buy a new camera with 4K@30fps support only to record lectures and conferences. I want it to be universal, that is, with 4K@60fps; however, as of April 2020 the few cameras that can record really high-quality 4K@60fps video (full frame, 200 Mbps) are so expensive that to buy one I would need to sell half of my house.
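
The arithmetic behind those jagged artifacts is easy to check (a back-of-the-envelope sketch, assuming the player simply shows each source frame on the nearest display refresh):

    # why 50 fps judders on a 60 Hz monitor while 30 fps does not
    refresh_hz = 60
    for fps in (50, 30):
        shown = [round(i * refresh_hz / fps) for i in range(11)]
        durations = [b - a for a, b in zip(shown, shown[1:])]
        print(f"{fps} fps -> refreshes per frame: {durations}")
    # 50 fps -> [1, 1, 2, 1, 1, 1, 1, 2, 1, 1]   (uneven: periodic stutter)
    # 30 fps -> [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]   (perfectly regular)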

Several years ago, when I wanted to distribute two versions, 1080p and 720p, I sometimes recorded the screen capture twice, each time in the video's native resolution, to avoid quality degradation by rescaling. That was especially beneficial for source code (and other text and fonts), of which I used a lot. For photos or most charts it would be a waste of time. However, as of April 2020 I produce the videos only in 1080p, because I assume that these days everyone has a full HD display. Even with a 1080p movie from the camera, the final production can be in 4K, no problem: all the screen captures can be in 4K, and the video from the camera can be embedded, taking no more than 1/4 of the screen area. However, I can't do it, because I don't have a 4K monitor, and currently good 4K IPS 120 Hz monitors are quite expensive. Well, not as expensive as good 4K@60fps cameras, so maybe it would be enough to sell one room to buy that monitor, but will anyone buy a single room in my house?

Storage media. I record to an SD card in the camera when I only record the lecture, and via a Cam Link to a computer drive when I also transmit online in real time. There are two problems with storing the video on the SD card. The first: unless you remove it, there is a 30-minute limit on movie recording to an SD card in these cameras. Once I forgot about restarting the recording, and only the first 30 minutes of my presentation got recorded. As this took place at a university, in front of an audience, there was no way to repeat it. Luckily, the screen capture and the microphone audio of the entire presentation were successfully recorded. The second: this particular camera (and maybe some others too) may overheat and shut down after long recording at high temperatures. To prevent this, I position the LCD panel away from the body, which enables a better flow of air, and I set the LCD brightness to the minimum.

Photos. Sometimes there is a need to take a photo, either of some object or of some page in a book which for various reasons cannot be scanned. Of course, I need technical photos, not artistic ones.

Room lighting. Good light for lectures is soft and diffuse, more or less uniform depending on the desired effect, but without unwanted shadows and with symmetric exposure of the face. The pictures below show how I have solved the problem in my room, where I give the online and video lectures these days. Why a silver reflector and not a lamp? For three reasons: 1. zero energy consumption, 2. the large area of the light source makes the light soft and reduces sharp shadows, 3. homogeneous light color: the spectrum of the reflected light is almost identical to that of the sunlight coming through the window.

reflector


Presentation

Lecture preparation. A lecture on the same topic given in a classroom on university premises, presented online via an Internet communicator, or recorded on video will have (and should have) three different forms of presentation. I have noticed that in all three cases it is important to be prepared, to know what I am going to say, but also not to read from a script and not to learn it by heart. I obtain the best results when I know what I want to say, but I don't know in advance in which words I will say it. In this way the audience can see and hear that I am a real expert who has this knowledge in his head, and not just someone who has memorized a script. Moreover, in this way the message is more vivid and more involving for the audience, as I can react to the audience's questions and comments, and to the remaining time if there is a time limit. It is also more vivid and convincing to those watching it later, including the case when there is no online presentation, only the prepared video. So I don't agree with the Camtasia tutorial that says to carefully practice the audio first and record the screen later. No, it's not true. Audio and screen must be recorded at the same time; otherwise it will not be a live performance and will not look good. Some parts should be re-recorded if the recording is not satisfactory. But if something is re-recorded too many times, the effect will be even worse, as you will stop thinking about what you are saying and will only recite it from memory, and this won't make a good impression.

Internet communicators. I have experience with only three: Skype, Zoom and MS Teams. The red border in Skype while sharing screens really made me nervous, because it was visible in the Camtasia screen capture, so I use the Skype No Border program to remove it. For me, Skype is better than Zoom: when I used Skype, the audio and video were better, i.e. less compressed. Skype has two poor features. First, the shared screen cannot be viewed at 1:1 scale (or can it?), and the rescaling degrades the quality a lot. Second, a noise filter is permanently on and there is no way to turn it off. For people who have a good microphone and a silent room, the noise filter does only harm (degrades the audio quality) with no benefits at all. If I ever need a noise filter or noise gate, I will use my own, which I can tune precisely; I don't need and don't want the one in Skype. Zoom allows turning off the noise filter. That's great. But when I tried it, the audio was so compressed that, although turning off the noise filter improved the quality, the "raw" audio was in reality far from the original raw audio. The call time in Skype is unlimited; in the free version of Zoom it is 40 minutes. The paid version of Zoom has some additional options which Skype is missing; however, the paid Zoom is rather a competitor of MS Teams. MS Teams would be the best of the three, with the most options, but it is not available for free to private persons, only to schools and other organizations. So I use Skype or MS Teams when I initiate the meetings, and Zoom only if someone invites me to a Zoom meeting.

Tips to optimize video calls. I would add something to the Ten Tips to Optimize Skype Call Quality. First, an addition to tip No. 6: I wear a single-color shirt, not one with stripes or other tiny, complex patterns. It solves two problems: it allows stronger video compression, which saves bandwidth (and reduces the size of the final production), and the video quality is better, because fine stripes often cannot be precisely mapped onto the pixels of the camera sensor, so artifacts appear. Then a comment on tip No. 3: I cannot close other applications. To conduct lectures and presentations I frequently need many applications open at the same time: PowerPoint, Visual Studio, RapidMiner, Firefox, Paint or Gimp, Acrobat Reader, sometimes also PyCharm, Excel, Word and other programs specific to a given lecture or presentation, plus Audacity and Camtasia for recording and Skype or MS Teams for communication. So what can I do? I can inform the students at the beginning of the meeting that, because I have so many applications open, my computer may unexpectedly restart and the connection may be lost; in that case they should wait until my computer finishes restarting, and then we will continue.

Recording. I obviously never ever record the lectures using the built-in recording function of Skype, Zoom or MS Teams, because all the Internet communicators drastically compress audio and video and add unwanted noise filtering. I always use dedicated audio and video production software (Audacity, Reaper, Camtasia, the SD card of my camera). In this way the recording quality is ten times better, and at the same time the video file is half the size (the latter is always true where there is a lot of screen capturing). Thus the recorded version of a lecture always has much higher audio and video quality than what the students can hear and see during the online lecture, not only because of the compression but also because of the further postproduction. I use Camtasia for the video production, as I haven't found better software for this purpose in this price range. The price of the full version of Camtasia is $269; the education version is $182. I purchased the full version, as it does not impose any licensing limits on what I can use it for. Two good points of Camtasia: 1. If the computer crashes during recording, everything recorded up to the moment of the crash is saved. 2. If Camtasia crashes during postproduction, all the modifications made to the project up to the crash are saved.

In Camtasia I record the screen, the video from the camera connected via a cam link (only if it is also streamed online; otherwise I record it to the SD card in the camera), and the system audio (interactions with students). In Audacity I record the audio from the microphone. So two audio and two video tracks will be used in the production. A fifth track, the audio from the camera, is useful for synchronizing the video from the camera with the audio from the microphone. In Camtasia, although I take care to cut and move the video from the camera and the (good) audio track from the standalone microphone together, I remove the (poor) audio layer of the camera recording only as the very last step of the production, because I never know how many more times it may be useful for synchronization; at earlier stages I only set its volume to zero. Likewise, to produce a high-quality recording of a video conference, each participant should record themselves locally in the same way, and additionally we need the system audio and one person's full-length screen capture for synchronization purposes.
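
If I ever wanted to automate that synchronization, a cross-correlation of the two audio tracks would find the offset. Here is a sketch (SciPy plus the soundfile package, assumed installed; the file names are hypothetical), not what I actually do in Camtasia, where I align the tracks by hand:

    import numpy as np
    import soundfile as sf
    from scipy.signal import correlate

    cam, sr = sf.read("camera_audio.wav")    # hypothetical file names
    mic, sr2 = sf.read("microphone.wav")
    assert sr == sr2, "resample one of the tracks first"

    def mono_head(x, n):
        """First n samples, mixed down to mono."""
        x = np.asarray(x)[:n]
        return x.mean(axis=1) if x.ndim > 1 else x

    n = 30 * sr                              # ~30 s is enough to find the offset
    a, b = mono_head(cam, n), mono_head(mic, n)
    lag = np.argmax(correlate(a, b, mode="full")) - (len(b) - 1)
    print(f"shift the microphone track by {lag / sr:+.3f} s to align it")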

How to do it in Camtasia and Audacity. Cutting, pasting, zooming, transitions, layers and all the other operations on the timeline in Camtasia give me a lot of possibilities. Likewise, all the operations I can perform on the soundtrack in Audacity allow me to improve the raw audio a lot. I will not write anything more here, because I have already presented most of my original thoughts, and all the how-tos can easily be found on the Internet. But we must be watchful to distinguish valuable information from things that should never have been published, such as the aforementioned audio "enhancement" with default parameters.

Yes, it takes some time to prepare a lecture well. But it takes much more time to master the topic of the lecture (sometimes years) and to prepare the content. By adding relatively little effort to the form of presentation, we can get much better final results. So why not take advantage of that?

Sample video lectures




Creative Commons License. You are free to copy, share and adapt all articles and software from my web page, provided that you attribute the work to me and place a link to my home page. What you build upon my works may be distributed only under the same or similar license and you may not distort the meaning of my original texts.