The Toolkit: Transcription Tips

The Toolkit brings together resources for creating, managing, and sharing digital collections to address common concerns we often hear, like this one:

How do I transcribe recordings?

Here are some tips and resources for transcribing (and/or indexing) oral histories and other audio, video and film recordings, compiled from field experts. 

What is transcription?

This section is adapted from the Access section of the Oral History Association’s Archiving Oral History: Manual of Best Practices.

  • Transcripts and video captions aid the discoverability and accessibility of interviews and other a/v materials. Ideally, a transcript will accompany the original recording so that a user can get the entire experience of seeing and/or hearing the material, with the added benefit of being able to do a text search. Project managers should consider encouraging users to engage with the original recording as well as the transcript. Transcripts also offer a solution for users who have trouble hearing or who cannot access a/v materials because of broadband or other technical restrictions.
  • It is possible to get quality transcriptions through various methods, including automated transcription (aka Speech-to-Text Recognition software or Voice Recognition software). These tools are ever improving and may be a good option for some projects/organizations; however, they tend to be expensive. The more expensive a service is, the more accurate you should expect the results to be, which SHOULD save some time in the review process. It is best practice to review first-draft transcripts for accuracy, regardless of how the draft was created. 
  • Organizations or project managers should create a style guide to ensure consistency across transcripts. Examples of style guides can be found here and here.
  • If transcription isn't feasible, or as a supplement to a transcript, the interview can be indexed to make it more searchable. Indexes allow researchers and users to more readily find the subjects and topics they are interested in within an interview. See more below.

Speech-to-Text Recognition Software

These are only some of the available options; the tools listed here were recommended by field experts from the Oral History Association’s Archives Interest Group in January 2021.

Meral Agish, Queens Memory: “The free basic account has various limits on imports and number of minutes per file but there are ways to work around those. I think the pro account is cheaper than the other major services too. We worked with a group of college students last semester who used the free version of Otter.ai for their transcripts. No major issues reported once we got past the sign-up process.”

James Fowler, Association for Diplomatic Studies and Training: “It doesn’t do quite as well as Temi.com and doesn’t have as slick an interface, but it works well enough for us to justify the cost differences.”

Natalie Milbrodt, Queens Memory: “At Queens Public Library we also had to move away from Trint because of cost. They structure their fees based on how many logins you need and for us with lots of volunteers, that became very expensive! We moved to using Rev because the cost is more straightforward, 25 cents/minute of interview. That allows us to quickly calculate what each interview will cost and to pay as we go.”

Rev offers both machine-generated and human-generated options (see Outsourcing to Vendors below for more).
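
The per-minute pricing described above lends itself to a quick budget estimate. The following is a minimal sketch, assuming the 25 cents/minute rate quoted in January 2021 and assuming that partial minutes are billed as whole minutes; the function name is hypothetical, and current vendor pricing and billing rules should be checked before budgeting.

```python
# Assumed rate: the 25 cents/minute quoted above (January 2021).
RATE_CENTS_PER_MINUTE = 25


def estimate_cost_cents(duration_seconds: int,
                        rate_cents: int = RATE_CENTS_PER_MINUTE) -> int:
    """Estimate the cost of one recording, in cents.

    Assumption: partial minutes are rounded up to a whole minute.
    """
    billed_minutes = (duration_seconds + 59) // 60  # integer ceiling
    return billed_minutes * rate_cents


if __name__ == "__main__":
    # A one-hour interview at the assumed rate.
    cents = estimate_cost_cents(60 * 60)
    print(f"${cents / 100:.2f}")
```

Working in whole cents avoids floating-point rounding when the costs of many interviews are summed.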

Doug Boyd, Louie B. Nunn Center for Oral History: “My tests found that the higher success rate came out of Temi.com (which is the engine behind Rev.com’s speech to text), but the difference [between Temi and Trint] was minor. We have found that, as a center with lots of interviews, it was better to go with the service that offered a monthly “unlimited” plan rather than have each upload be a separate transaction. That was going to just be too much for our purchasing procedures and red tape at the University. I prefer the Temi.com’s editing interface, it is really pretty powerful and works a lot like Google Docs, say you want to have someone else do the editing, they don’t have to have a Temi.com account to edit the transcript using their interface. Trint, however, requires that an editor be logged into their Trint account as a “team member”… which adds to the expense.”  

Doug Boyd, Louie B. Nunn Center for Oral History: “I have found that the quality bump you get from the paid services outweighs the significantly more cleanup required from the free services…which are ever improving. I have found the cleanup on something out of Trint.com will vary significantly. At best, for an hour of interview, 2-3 hours minimum, and at worst, 7-10 hours. The cleanup depends on things like recording quality, microphone placement, the presence of a heavy accent, the pace and rhythm of the speakers, do they talk over each other, and such.” 

Brendan Coates, Oral History Projects at the Academy of Motion Picture Arts and Sciences: “[For] people with some technical skills you could consider Kaldi. It’s maybe the leading free, open-source speech-to-text software, although that doesn’t mean it isn’t plagued by the same things as every other speech-recognition program (bad at proper nouns and accents, takes forever, etc.). WGBH used Kaldi as the basis for generating transcriptions for the American Archive of Public Broadcasting (AAPB) and wrote a tool to crowdsource correction, called FIXIT+.”

Recording Tools with Options to Generate Transcripts

Some virtual recording tools offer the ability to generate transcriptions from the recording files. The ones below have been used or recommended by WiLS community members or friends, but there are almost certainly other options available. If you currently use a virtual meeting tool, check the settings to see what recording and transcribing capabilities it has; often these features are only available at higher subscription tiers.

Prerequisites: Business, Education, or Enterprise license with cloud recording enabled, account owner or admin privileges

“While FaceTime, Skype, and Zoom provide the ability to have video conversations, TheirStory is a purpose-built application and dedicated online space specifically for facilitating, archiving, and making accessible meaningful conversations now, and for future generations.”

This tool generates a transcription of your video as a downloadable Word document.

People-Powered Transcription

With Volunteers, Interns, and Staff

Transcription (and/or indexing) can be a meaningful task to assign to volunteers, interns, and/or other staff members, especially in fluctuating work environments when some people want or need to work from home. Finishing a transcript and hearing the stories within a recording can be rewarding; however, transcription can be tedious, and not everyone has the necessary patience or attention to detail. Have any interested potential transcribers do one or two test runs before asking them to commit to a transcription project or to a certain number of transcription hours. To assist in the transcription process, project managers should review the organization’s style guide with all transcriptionists. Also consider creating a project- or organization-specific glossary and/or controlled vocabulary to address how to handle proper names, place names, or other terms that come up frequently in interviews.

Here are some tips to share with transcribers:

  • The average person can transcribe one audio hour in about 4 hours. It takes most people about one hour to transcribe 15 minutes of a clear, slow audio file.
  • Tools such as InqScribe and Express Scribe can expedite the transcription process, and both have free versions available.
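
The 4-to-1 rule of thumb above can be turned into a rough planning estimate. This is a minimal sketch, assuming that ratio holds for your recordings; the function names are hypothetical, and the ratio will be higher for unclear or fast-paced audio.

```python
import math

# Rule of thumb from the tips above: about 4 hours of work per hour of audio.
WORK_HOURS_PER_AUDIO_HOUR = 4.0


def estimate_work_hours(audio_minutes: float,
                        ratio: float = WORK_HOURS_PER_AUDIO_HOUR) -> float:
    """Estimate the hours of transcription work for a recording."""
    return audio_minutes / 60 * ratio


def sessions_needed(audio_minutes: float, session_hours: float) -> int:
    """Estimate how many work sessions of a given length a recording needs."""
    return math.ceil(estimate_work_hours(audio_minutes) / session_hours)


if __name__ == "__main__":
    # A 90-minute interview, transcribed in 2-hour volunteer shifts.
    print(estimate_work_hours(90))   # 6.0
    print(sessions_needed(90, 2))    # 3
```

An estimate like this can help when asking a volunteer to commit to a certain number of hours.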

Crowdsourcing

Crowdsourcing transcription projects has become a popular option for oral history and other a/v collections. Tools to look into if you are interested in crowdsourcing transcription include Amara and Scripto (which works with Omeka sites). Examples of crowdsourced transcription projects can be found here.

Outsourcing to Vendors

You may also opt to outsource your transcription projects to human-powered transcription vendors. This may be a particularly good option for old and/or hard-to-hear a/v materials.

As mentioned above, Rev.com offers an option to have your recordings transcribed by human professional transcriptionists. Other suggestions if you are looking for human-powered transcription options:

Indexing & Additional Data

As noted above, indexing a recording can be an option in addition to or instead of creating a full transcript. Some organizations also use tags and subject headings in their catalog records to help users identify the content they are looking for within a/v materials. 

For examples of indexing standards see: “Indexing Interviews in OHMS: An Overview.” Many organizations follow different sets of data entry standards for cataloging purposes.

If you are currently managing the collection of new oral history interviews or other recordings, it is critical to collect a variety of information (aka metadata) about each recording in real time. This information should include, at minimum: technical metadata (e.g., what type of equipment was used), descriptive metadata (what the recording contains, including the names of the participants, the date, and the place, as well as keywords, subjects, and/or a summary), and rights and access information (what permissions the participants have granted for use of the recording). All of this data will help the future transcriptionists, editors, and end users of transcripts. It may also serve as a helpful stand-in for a transcript while the recording is in your transcription queue.
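
For projects that track recordings in a spreadsheet or script, the minimum metadata described above could be captured in a simple record. This is a minimal sketch; the class and field names are hypothetical and do not follow any particular cataloging standard, so adapt them to the data entry standards your organization uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RecordingMetadata:
    """Minimum information to capture for a new recording (hypothetical names)."""
    # Technical metadata
    equipment: str
    # Descriptive metadata
    participants: List[str]
    date: str
    place: str
    # Rights and access information
    permissions: str
    # Descriptive metadata: at least one of these three is expected
    keywords: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)
    summary: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of minimum fields that are still empty."""
        required = ["equipment", "participants", "date", "place", "permissions"]
        missing = [name for name in required if not getattr(self, name)]
        if not (self.keywords or self.subjects or self.summary):
            missing.append("keywords/subjects/summary")
        return missing
```

Calling `missing_fields()` right after a recording session gives a quick checklist of what still needs to be filled in before the recording enters the transcription queue.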

See past Toolkits at https://recollectionwisconsin.org/category/toolkit.

Do you have questions you’d like us to answer? Let us know at info@recollectionwisconsin.org.

This Toolkit post was originally created as a tip sheet for the IMLS Accelerating Promising Practices Community Memory Cohort by WiLS Community Memory and Digital Archives Consultant Ellen Brooks.