Video Lectures - not only the content but also the form of presentation matters



Why, especially in the time of the Covid pandemic, not only the content but also the form of presentation matters

I am not an expert in online and video lectures. I am still learning, but I think it will be interesting to write about my progress and experiences. As you can see from the video samples (at the end of this page), not only the technical quality but also the composition and flow of my lectures are gradually improving over time.

Audio

Audio settings. For recording audio I use a good dynamic microphone with cardioid characteristics (an Audio-Technica ATR2100 and currently also a borrowed Shure SM7B) and an audio interface (Steinberg UR242). I connect the microphone via an XLR cable to the audio interface and the audio interface via a USB cable to the computer. The ATR2100 also has a USB connection in addition to XLR. While it sounds great when connected by XLR to the audio interface, there is a little hiss and other distortion when it is connected by USB directly to the computer. But even via USB this microphone is incomparably better than any microphone integrated with a cheap headset. I suppose the designers wanted to save costs on the analog-to-digital converter built into the microphone. I also tried one not-so-cheap USB-only microphone and again heard a little hiss and distortion. So only the XLR connection with an audio interface makes sense. The Shure SM7B is generally better than the ATR2100, but with a high level of ambient noise the ATR2100 is better. As my experience shows, the next must-have (and must-use) is a pop filter. Even though the Shure SM7B has an integrated pop filter, it sounds better when that pop filter is removed and a standard external one is used instead. Although the Shure SM7B is a great microphone, I will return it when the pandemic is over, because it is very expensive and I will no longer need to talk to students online.

I always record audio in 24-bit, using Audacity when I record one or two microphones. Audacity can record only two tracks, unless you modify it somehow. If I want to record three or four sound sources, I use Reaper (because it is cheap and lightweight, while the software that comes with the Steinberg audio interface, although free, is very heavyweight). Reaper, on the other hand, can record practically as many tracks as the audio interface has inputs. There were two problems when I tried to record audio in 16-bit. First, it was necessary to set the input level very precisely in order not to lose any bits and not to allow clipping. Second, when processing audio in 16-bit, after several operations such as amplitude changes, compression, de-noising, etc., the rounding errors tended to accumulate, and that produced audible artifacts. The accuracy limit of the best DACs is 21 bits and the resolution of the human ear in ideal conditions is 20 bits. So recording in 24-bit gives me four additional bits I don't have to care about: I can set the recording level low, to stay safely away from clipping, and still use the four lowest bits to absorb accumulated rounding errors. For this reason, recording sound in Camtasia (which I use for video production) is not a good option, as it records audio in 16-bit. I also export the final version of the soundtrack in 24-bit. Keeping it in 24-bit uses only 50% more space for the soundtrack (negligible in the days of large and cheap storage) and allows further editing, if needed, without access to the project created in Audacity. I also import the 24-bit tracks into Camtasia and sometimes make small volume adjustments to parts of a track there, but nothing more. I reduce the bit depth (and possibly, later, the bitrate) only in the final video production.
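The accumulation of rounding errors can be illustrated with a short NumPy simulation. This is only a sketch: the gain values below are arbitrary, chosen just to force a re-quantization at every editing step, but the effect they show is the one described above - the error grid at 24-bit is 256 times finer than at 16-bit, so the accumulated error stays correspondingly smaller.

```python
import numpy as np

def quantize(x, bits):
    """Round a [-1, 1] float signal onto the grid of a signed integer format."""
    scale = 2 ** (bits - 1)
    return np.round(x * scale) / scale

rng = np.random.default_rng(0)
signal = rng.uniform(-0.5, 0.5, 100_000)      # stand-in for recorded audio

# A chain of typical edits: gain changes that each force re-quantization.
gains = [0.7, 1.31, 0.82, 1.15, 0.93]

def process(x, bits):
    for g in gains:
        x = quantize(x * g, bits)             # each edit rounds again
    return x

reference = signal * np.prod(gains)           # ideal result, no quantization
err16 = np.max(np.abs(process(signal, 16) - reference))
err24 = np.max(np.abs(process(signal, 24) - reference))
print(f"worst-case error at 16-bit: {err16:.2e}")
print(f"worst-case error at 24-bit: {err24:.2e}")
```

With five edits the worst-case 16-bit error already spans a few least-significant bits, while the 24-bit error remains far below what a 16-bit export can even represent.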

The best sampling frequency for recording anything is as high as possible. This oversampling serves two purposes. First, it reduces random errors, as one final sample averages several samples from the original recording. Second (this applies rather to music), denser data allows more advanced and aggressive audio post-processing (for example, speed or tempo changes) before audible distortions appear. It works similarly to image averaging or median blending, which perfectly removes noise from photos and at the same time makes them even sharper, so when I take pictures of a mostly static scene, I always try to use it.
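The first benefit - averaging reducing random errors - can be sketched numerically. This is an illustrative NumPy simulation, not a model of any particular recorder: averaging N noisy readings shrinks the random error by roughly the square root of N.

```python
import numpy as np

rng = np.random.default_rng(1)
true_value = 0.25                        # the "clean" sample we try to capture
noise_std = 0.01

# Simulate an 8x-oversampled capture: eight noisy readings per final sample.
readings = true_value + rng.normal(0.0, noise_std, size=(100_000, 8))

single = readings[:, 0]                  # no oversampling
averaged = readings.mean(axis=1)         # 8x oversampling, then decimation

print(f"noise without averaging: {single.std():.5f}")
print(f"noise with 8x averaging: {averaged.std():.5f}")   # ~ noise_std / sqrt(8)
```

The same square-root law is what makes image averaging and median blending work for photos of static scenes.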

Audio processing. I have started writing an article about the application of artificial intelligence to audio and music processing. Here I write only about the traditional approach (without AI). To provide good results to my audience, I avoid compressors unless they are definitely necessary in some very rare cases, and I never ever use limiters. I use amplification to adjust volume levels and try to keep the natural dynamic range, which for my voice is about DR18 (this is the optimal desired result, measured after correcting for recording imperfections; in the raw audio the DR is usually higher). While recording both a piano concert and my little Christmas bell I obtained DR23. (By the way, I don't understand why YouTube normalizes at -14 LUFS instead of -18 LUFS, as it should. Most music compressed to DR14 has no power, no breathing, no life. Even DR18 is very frequently not enough (the piano and the bell gave DR23), but it is much more acceptable. Of course no one is forced to use the maximal loudness of -14 LUFS, but many producers are not capable of determining the optimal dynamic range by themselves, so if the limit is -14 LUFS, they will use it.)
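As a rough illustration of what compression destroys, here is a simplified peak-to-RMS (crest factor) estimate. To be clear, this is only a toy metric, not the official DR meter algorithm, and the two test signals below are synthetic: one with quiet passages and sparse transients, one squashed to a constant level.

```python
import numpy as np

def crest_factor_db(x):
    """Simplified dynamics estimate: peak-to-RMS ratio in dB.
    (The real DR meter is more elaborate; this only illustrates the idea.)"""
    peak = np.max(np.abs(x))
    rms = np.sqrt(np.mean(x ** 2))
    return 20 * np.log10(peak / rms)

t = np.linspace(0, 1, 48_000, endpoint=False)

# A quiet passage with occasional loud transients -- large dynamic range.
dynamic = 0.05 * np.sin(2 * np.pi * 220 * t)
dynamic[::4800] = 0.9                        # sparse peaks

# The same idea after brutal compression -- everything near one level.
compressed = 0.8 * np.sign(np.sin(2 * np.pi * 220 * t))

print(f"dynamic take:    {crest_factor_db(dynamic):.1f} dB")
print(f"compressed take: {crest_factor_db(compressed):.1f} dB")
```

The compressed signal's crest factor collapses toward 0 dB - the "no power, no breathing, no life" effect in numbers.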

I noticed that it is impossible to correct the audio track when the headphones are connected to the audio output of a notebook, because the sound cards in all the notebooks I have tried always generated noise (hiss) and had poor resolution. It was impossible to distinguish whether the noise came from the recording itself or from the sound card circuitry, and thus whether I should do something about it or not. For that reason I use an external DAC, whose self-noise is well below the hearing threshold, together with adequate headphones with good resolution. Currently I use one of the 300-ohm Sennheiser models and a Marantz HD-DAC1, because most of the small DACs, especially the portable ones, are too weak to properly drive 300-ohm headphones, even if it is written on them: headphone impedance: 16-300 Ohm. In this way I can hear all of the problems on the audio track and correct most of them. But I always check the final production also on poor-quality equipment outside of my silent room, to make sure that everything can also be heard on cheap headphones connected to a smartphone in a noisy environment. In some rare cases I must decide which listening conditions to optimize the recording for, because obviously making sure that all the important effects are noticeable while listening on a smartphone at the university or on the bus requires applying some additional amplification, which degrades the production quality from the viewpoint of those who listen in good conditions. Does it really matter? A little bit, because these special funny audio effects are one of the many means designed to stimulate students to learn. The only (very slight) equalization I perform is compensation for the proximity effect and the variable distance to the microphone, combined with a correction of the microphone/room frequency response.
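Why are portable DACs too weak for 300-ohm headphones? A quick back-of-the-envelope calculation shows it. The sensitivity figure below is an assumption, typical for 300-ohm Sennheiser models - check the data sheet of your own headphones:

```python
import math

def drive_voltage(target_spl_db, sensitivity_db_mw, impedance_ohm):
    """RMS voltage needed to reach a target SPL, given sensitivity in dB SPL/mW."""
    power_mw = 10 ** ((target_spl_db - sensitivity_db_mw) / 10)
    return math.sqrt(power_mw / 1000 * impedance_ohm)

# Assumed numbers: 97 dB SPL/mW sensitivity, 110 dB SPL peaks, 300 ohms.
v = drive_voltage(target_spl_db=110, sensitivity_db_mw=97, impedance_ohm=300)
print(f"{v:.2f} V RMS")
```

Under these assumptions the amplifier must swing about 2.4 V RMS for clean peaks, while many portable DACs deliver around 1 V at most - hence the "16-300 Ohm" label on the box does not tell the whole story.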

When I record in my room, I really don't need any de-noising, as I naturally get a S/N ratio well over 70 dB. Several factors contribute to this success: good triple-glazed windows and living far enough from the road, a door with an additional sound-insulating layer, the computer with its fans in another room (only the monitor and other peripherals are in the recording room), a dynamic microphone with cardioid characteristics connected to a good audio interface, and recording at 192 kHz. But the lectures recorded at the university do require de-noising. In that case I set the noise reduction of the noise filter to about -6 dB (more for higher noise levels, less for lower) and then I use a noise gate with the level reduction also set to about -6 dB (depending on the noise level). Then I manually remove the most annoying of the remaining clicks, cracks and other unwanted effects. A good idea is also to copy some more pleasant noise from other parts of the recording to replace them, if possible.
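A gentle noise gate of the kind described above can be sketched in a few lines of NumPy. This is a minimal illustration with arbitrary threshold and frame size, not the gate I actually use; real gates also smooth the gain changes over time to avoid clicks at frame boundaries.

```python
import numpy as np

def noise_gate(x, threshold_db=-50.0, reduction_db=-6.0, frame=1024):
    """Gentle gate: frames whose RMS is below the threshold are attenuated
    (not muted) by reduction_db, like the -6 dB level reduction above."""
    gain = 10 ** (reduction_db / 20)          # -6 dB -> ~0.5
    thr = 10 ** (threshold_db / 20)
    y = x.copy()
    for start in range(0, len(x), frame):
        seg = y[start:start + frame]
        if np.sqrt(np.mean(seg ** 2)) < thr:
            seg *= gain                       # attenuate quiet, noise-only frames
    return y

rng = np.random.default_rng(3)
noise = rng.normal(0, 1e-4, 48_000)           # low-level hiss ("silence")
speech = 0.3 * np.sin(2 * np.pi * 200 * np.linspace(0, 1, 48_000))
track = np.concatenate([noise, speech + noise])

gated = noise_gate(track)
# The silent part gets ~6 dB quieter; the speech part passes through unchanged.
```

Attenuating rather than muting is the point: a hard gate (full mute) makes the background noise pump audibly in and out, which is more distracting than the noise itself.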

In many YouTube tutorials you can find a simple recipe for disaster (they call it "voice recording enhancement"). Just apply the following sequence with default parameters: 1. noise reduction, 2. equalization with bass and treble boost, 3. normalization, 4. compression, 5. normalization again. Done. Without paying attention to anything, just automatic settings - one size fits all. Now you sound like a robot, no longer like a human.

Video

Video settings. I noticed that the difference in video quality between a true camera (DSLR or mirrorless) and a webcam is exactly as big as the difference between their prices. For this reason, for recording and streaming lectures I use a Sony A6000 camera with a Sony SEL35F18 lens connected via a Cam Link 4K. It works perfectly on my Windows 8.1, even though the specification says that the Cam Link requires Windows 10. However, for recording social and cultural events I use one of the Sony full-frame cameras, record in 10-bit with different picture profiles if possible, and use mostly DaVinci Resolve for the final movie production, but that is quite a different story. For recording lectures, the Sony A6000 with a good prime lens is more than good enough, and if it accidentally falls off a tripod, it will be a much smaller disaster (this is the only, but very important, reason I prefer this camera for this purpose). Good prime lenses for Sony A6xxx cameras are much better than zooms, and there is no need for smooth dynamic zooming while recording lectures. So in this case it makes sense to lose the versatility of a zoom in order to gain the quality of a prime. It is not only the sharpness but even more the fact that with a prime I can use f/1.8 (and not f/3.5), which in my room's conditions allows lowering the ISO from 400 to 100. (Update from November 2021: Sigma has just released three exceptional APS-C lenses for Sony E: 30mm F1.4 DC DN, 56mm F1.4 DC DN and 18-50mm F2.8 DC DN - so if I were buying the lens today, it would definitely be one of them instead of the Sony SEL35F18. I haven't tried the new lenses, but I have used the Sigma 16mm F1.4 DC DN on my Sony A6000 and the Sigma Art 35mm F1.2 and Sigma Art 24-70mm F2.8 on my full-frame Sony camera, so I know that Sigma's lenses are exceptional. Knowing how good the Sigma Art 24-70mm F2.8 is, I would even give the new zoom a try.) I always record at the highest bitrate possible, which for the Sony A6000 is 50 Mbps.
I do it for the same reason I record audio in 24-bit: recording at a higher bitrate than human beings are able to distinguish guarantees that rounding during editing in post-production will not produce visible or audible artifacts. In theory, the higher the fps, the better the video (120 fps better than 60 fps, 60 fps better than 30 fps), but in practice the difference between 30 and 60 fps is always evident, while the difference between 60 and 120 fps shows only if you have a 120 Hz monitor and the action is really very fast. So I make a quick test, or sometimes I don't even need to, as I can guess the best settings from my experience with similar situations. Other factors that sometimes help are the optimal aperture setting and a re-arrangement of the lights in the room. I also noticed the preferences of Sony's engineers: when recording in limited light with automatic settings, the aperture quickly goes to f/1.8 to keep the ISO low.

Just as the human ear can hear with about 20-bit resolution, the human eye can see up to about 250 fps, although this varies a lot depending on the situation. I always use 30 fps only for static screen presentations, and 60 or 120 fps for filming real-life situations (30 fps looks choppy in these cases - totally unacceptable as of 2020). I never use 25, 50 or 100 fps. That used to make sense in Europe several decades ago, when people used incandescent bulbs or the old 50 Hz fluorescent lamps. Modern light sources use frequencies over 10 kHz and luminophores, so there is no longer any flickering when recording at 60 fps. But the refresh rate of all monitors is 60 Hz (and of better ones also 120 Hz or more), and when I tried to record at 50 fps, I got jagged artifacts while playing the movie on a computer. Well, the artifacts were much less severe than the flickering in movies recorded at 60 fps in Europe or at 50 fps in America with the old type of lamps in prehistoric times, but they were still present. I record the lectures only in fullHD to keep the files small. Moreover, recording with an effective resolution of 4K requires more light, as on average a four times smaller area of the sensor is used to produce one pixel of the video. But what is more important here: unless the action is very static, like a conference or a lecture, a 1080p@60fps video looks better than 4K@30fps. 4K@60fps is the best for events, but in poor light it is not better than 1080p@60fps. Moreover, only the Sony A7S III and A1 currently allow oversampling from the full sensor at 4K@60fps (correct me if I am wrong); the other models do something like line skipping or an APS-C read-out, which in poor light makes 4K@60fps look even worse than 1080p@60fps.

Several years ago, when I wanted to distribute two versions, 1080p and 720p, I sometimes recorded the screen capture twice, each time in the video's native resolution, to avoid quality degradation by rescaling. That was especially beneficial for source code (and other texts and fonts), and I used it a lot. For photos or most charts it would be a waste of time. However, as of April 2020 I produce the instructional videos only in 1080p, because I assume that these days everyone has a fullHD display. Even with a 1080p movie from the camera, the final production can be in 4K, using for example Topaz Video Enhance AI and recording all the screen captures directly in 4K. Moreover, if the video from the camera is embedded taking no more than 1/4 of the screen area, it is just fine in fullHD. But currently I don't bother with 4K production of video lectures, as it would require much additional work and very few students, if any, would notice the difference.

Storage media. There are two problems with storing the video on the SD card. The first: unless you are able to remove it, many cameras have a 30-minute limit on the length of a movie recorded to an SD card. Once I forgot to restart the recording and only the first 30 minutes of my presentation were recorded. As this took place at a university, in front of an audience, there was no way to repeat it. Luckily, the screen capture and the microphone audio of the entire presentation were successfully recorded. The second: in this particular camera (and also in some others), after long recording at high ambient temperatures, the camera can overheat and shut down. I use three solutions to prevent these problems. First, I position the LCD panel far from the body, which enables a better flow of air, and I set the LCD brightness to the minimum. Second, I use an external power bank from Newell instead of the in-body battery. This not only prevents overheating, as the heat is generated in the power bank outside the camera, but also extends the recording time from about 1 hour on the internal battery to about 3 hours. Third, if the situation allows, I save the recorded video to a computer drive via the Cam Link. This not only reduces the heating of the SD card (and thus of the camera), but also avoids the 30-minute limit on video length, if there is one.

Room lighting. A good light for lectures is a soft and diffuse light, more or less uniform, depending on the desired effect, but without unwanted shadows and with the face lit symmetrically. The pictures below show how I have solved the problem in my room, where I give the online and video lectures these days. Why a silver reflector and not a lamp? For three reasons: 1. zero energy consumption, 2. the large area of the light source makes the light soft and reduces sharp shadows, 3. homogeneous light color: the reflected light spectrum is almost identical to that of the sunlight coming through the window.

reflector


Presentation

Lecture preparation. A lecture on the same topic given in a classroom on university premises, presented online via an Internet communicator, or recorded on video will have (and should have) three different forms of presentation. I have noticed that in all three cases it is important to be prepared, to know what I am going to say, but also not to read from a script and not to learn it by heart. I obtain the best results when I know what I want to say, but I don't decide ahead of time which words to say it in. In this way the audience can see and hear that I am a real expert who has this knowledge in his head, and not just someone who has memorized a script. Moreover, in this way the message is more vivid and more involving for the audience, as I can react to the audience's questions and comments and to the remaining time, if there is a time limit. It is also more vivid and convincing to those watching it later, and also in the case when there is no online presentation, only the prepared video. So I don't agree with the Camtasia tutorial that says to carefully practice and record the audio first and record the screen later. No, that's not right. Audio and screen must be recorded at the same time, otherwise it will not be a live performance and will not look good. Some parts should be re-recorded if the recording is not satisfactory. But if something is re-recorded too many times, the effect will be even worse, as you will stop thinking about what you are saying, you will only recite it from memory, and this will not make a good impression.

Internet communicators. I have experience with only four: Skype, Zoom, Slack and MS Teams. I prefer Skype or Zoom, depending on the situation. First, Skype is the only fully free application of the four: the call time in Skype is unlimited, while a free Zoom connection is limited to 40-minute sessions. Second, on my computer MS Teams has much higher CPU usage than Skype, which is a big problem for me. The red border in Skype and in MS Teams while sharing the screen really made me nervous, because it was visible on the Camtasia screen capture, so I use the Skype No Border program to remove it. For me Skype is better than Zoom: when I use Skype, the audio and video are better - less compressed. Two poor features of Skype and MS Teams. The first: the shared screen cannot be viewed at 1:1 scale (or can it?), and the rescaling degrades the quality a lot. The second: a noise filter is permanently on and there is no way to turn it off (update from May 2021: it is now possible to set the noise suppression level to high, low or off). For people who have a good microphone and a silent room, the noise filter does only harm (degrades the audio quality) with no benefits at all. If I ever need a noise filter or a noise gate, I will use my own, which I can precisely tune; I don't need and don't want the one in Skype. Zoom enables turning off the noise filter. That's great. But when I tried it, the audio was so compressed that although turning off the noise filter improved the quality, the "raw" audio was in reality far from the original raw audio. The paid version of Zoom has some additional options - first of all the possibility to split a meeting into breakout rooms, which Skype and MS Teams are missing. The good side of Zoom is that the screen can be shared at 1:1 scale, without degrading the sharpness. The good side of Slack is that you can draw on a screen shared by someone else.
So I use Skype whenever I don't need the shared-screen quality and Zoom otherwise. I use MS Teams only when my university requires me to, and Slack when it is preferred by those on the other side of the connection.

Tips to optimize video calls. I would add something to the Ten Tips to Optimize Skype Call Quality. First, to tip No. 6: I wear a single-color shirt, not one with stripes or other tiny complex patterns. It solves two problems: it allows for stronger video compression, which saves bandwidth (and reduces the size of the final production), and the video quality is better, as little stripes frequently cannot be precisely mapped onto the pixels of the camera sensor and some artifacts appear, including moire. Then a comment on tip No. 3: I cannot close other applications. To conduct lectures and presentations I frequently need many applications open at the same time: PowerPoint, Visual Studio, RapidMiner, Firefox, Paint or Gimp, Acrobat Reader, sometimes also PyCharm, Excel, Word, other programs specific to a given lecture/presentation, plus Audacity and Camtasia for recording. So what can I do? There are two solutions I use: 1. A remote desktop session, to run all the programs on the remote computer and only the internet communicator, Camtasia and Audacity on my local computer. 2. I inform my students at the beginning of the meeting that my computer may unexpectedly restart and the connection may get lost, and that in that case they should wait until I reconnect.

Recording. I obviously never ever record the lectures using the Skype, Zoom or MS Teams recording function, because all internet communicators drastically compress audio and video and add unwanted noise filtering. I always use dedicated audio and video production software (Audacity, Reaper, Camtasia, the SD card of my camera); this way the recording quality is ten times better and at the same time the size of the video file is half as big (the latter is always true when there is a lot of screen capturing). Thus the recorded version of the lecture always has much higher audio and video quality than what the students can hear and see during the online lecture. This is not only because of the compression but also because of the further post-production. I use Camtasia for the video production, as I haven't found better software for this purpose in this price range. The price of the full version of Camtasia is $269, of the education version: $182. I have purchased the full version, as it does not impose any licensing limits on what I can use it for. Two good points about Camtasia: 1. If the computer crashes during recording, all the recordings up to the moment of the crash are saved. 2. If Camtasia crashes during post-production, all the modifications done to the project up to the crash are saved.

In Camtasia I record the screen, the video from the camera connected via the Cam Link (only if it is also streamed online; otherwise I record it on the SD card in the camera) and the system audio (interactions with students). In Audacity I record the audio from the microphone. So two audio and two video tracks will be used in the production. The fifth track, the audio from the camera, is useful for synchronizing the video from the camera with the audio from the microphone. In Camtasia, although I take care to cut and move the video from the camera and the (good) audio track from the standalone microphone accordingly, I remove the (poor) audio layer from the camera recording only as the very last step in the production, because I never know how many more times it may be useful for synchronization. At earlier stages I only set its volume level to zero. Likewise, to produce a high-quality recording of a video conference, each participant should record himself locally, and additionally we need the system audio and a full-length capture of someone's screen for synchronization purposes.
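Synchronizing by the camera's scratch audio track can also be automated with cross-correlation. This is only a sketch of the principle using NumPy - Camtasia itself aligns tracks manually - and the toy signals below simulate the same clap picked up by both devices, with the microphone track delayed:

```python
import numpy as np

def find_offset(camera_audio, mic_audio):
    """Estimate how many samples the microphone track lags the camera track
    by locating the peak of their circular cross-correlation (FFT-based)."""
    n = len(camera_audio) + len(mic_audio) - 1
    size = 1 << (n - 1).bit_length()              # FFT-friendly length
    corr = np.fft.irfft(np.fft.rfft(camera_audio, size) *
                        np.conj(np.fft.rfft(mic_audio, size)), size)
    lag = int(np.argmax(corr))
    if lag > size // 2:
        lag -= size                               # wrap to a negative lag
    return -lag                                   # samples the mic is delayed

# Toy check: a noise burst ("clap") recorded by both, mic delayed by 480 samples.
rng = np.random.default_rng(4)
clap = rng.normal(0, 1, 2_000)
camera = np.concatenate([clap, np.zeros(10_000)])
mic = np.concatenate([np.zeros(480), clap, np.zeros(9_520)])
print(find_offset(camera, mic))                   # -> 480
```

With a real recording, the estimated offset (divided by the sample rate) tells you how far to slide the microphone track so it lines up with the camera video.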

How to do it? To prepare the videos I use mostly Camtasia and Audacity, but sometimes also Reaper, RawTherapee and GIMP. I have presented some of my original thoughts, but remember that I am not an expert; on the contrary, I am still learning. For that reason, don't ask me about all the how-tos, as they can easily be found on the Internet, and those instructions are frequently prepared by experts. But we must be watchful to distinguish valuable information prepared by experts from something that should never have been published, such as the previously mentioned audio "enhancement" with default parameters.

Yes, it takes some time to prepare a lecture well. But it takes much more time to master the topic of the lecture (sometimes years) and to prepare the contents. By adding relatively little effort to the form of presentation, we can get much better final results. So why not take advantage of it?

Sample video lectures




Creative Commons License. You are free to copy, share and adapt all articles and software from my web page, provided that you attribute the work to me and place a link to my home page. What you build upon my works may be distributed only under the same or similar license and you may not distort the meaning of my original texts.