How to remove timestamps and extra lines from a Zoom transcript using Notepad++ or BBEdit
In case it would help other people, here’s how I did it. I would have something that looked like this:
9
00:00:36.900 --> 00:00:40.560
Kimberly Hirsh (she/her): Do you agree to participate in the study and to have the interview audio recorded?
With the help of this guide from Drexel and replies to this Stack Overflow post, I can now remove the number, the timestamp, and the two extra lines left behind when I remove those. Here’s how I do it.
1. Open the VTT file in my advanced text editor.
2. Use the find and replace feature.
3. For the thing to be replaced, I use the regular expression below. You don’t need to know what a regular expression is. Just copy and paste that little code bit into the “Find” box.
^[(\d|\n)].*$
4. Make sure either “Regular expression” or “GREP” is selected.
5. Click “Replace” to test it once and confirm that it works.
6. If it works, click “Replace all.”
For BBEdit:
7. Paste ^\s*?\r in the “Find” box.
8. Make sure the replace box is empty.
9. Repeat steps 5 and 6.
For Notepad++:
7. Then switch so that “Extended” is selected instead of “Regular expression” or “GREP.”
8. Paste \r\n\r\n in the “Find” box.
9. Put a single space in the replace box.
10. Repeat steps 5 and 6.
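If you would rather script the cleanup than click through an editor, here is a minimal Python sketch of the same idea. It is not the find-and-replace method above: instead of a regular expression, it checks each line and drops it if it is blank or starts with a digit, which covers the cue numbers and timestamps in a transcript like my sample. The function name strip_vtt_clutter is made up for illustration, and the sketch assumes that no spoken line begins with a digit.

```python
def strip_vtt_clutter(text: str) -> str:
    """Drop cue numbers, timestamp lines, and blank lines from transcript text.

    Assumption: every line that starts with a digit is a cue number or a
    timestamp. A spoken line that begins with a digit would be dropped too.
    """
    kept = []
    for line in text.splitlines():  # splitlines() handles both \n and \r\n
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped[0].isdigit():
            continue  # cue number (e.g. "9") or timestamp (e.g. "00:00:36.900 --> ...")
        kept.append(stripped)
    return "\n".join(kept)


sample = """9
00:00:36.900 --> 00:00:40.560
Kimberly Hirsh (she/her): Do you agree to participate in the study and to have the interview audio recorded?
"""

print(strip_vtt_clutter(sample))
```

Run on the sample, this prints only the line beginning with “Kimberly Hirsh (she/her):”. Like the editor steps, it leaves any other line that does not start with a digit untouched, so a header line at the top of the file would stay in.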
I hope this is helpful!