jon

Whisper Transcription: Improving Accuracy with a Reference List

Jan 29, 2024 - 2:57pm

Summary: A feature request to supply Whisper with a list of reference terms so that nouns are transcribed with the correct spelling.

Text

Feature request: give me a list of terms to use in the Whisper transcription process to get the spelling of nouns right.
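
One way this request could be approached, sketched under assumptions: the open-source openai-whisper package accepts an `initial_prompt` string in `transcribe`, and spellings that appear in that prompt tend to bias the output toward them. The helper below turns a term list into such a prompt. The names `build_initial_prompt`, `transcribe_with_terms`, the "Glossary: " prefix, and the 600-character budget are all hypothetical; the budget is only a crude stand-in for Whisper's prompt limit of roughly 224 tokens. This is not necessarily how the feature would be built here.

```python
from typing import Iterable

PREFIX = "Glossary: "  # hypothetical framing text, not required by Whisper


def build_initial_prompt(terms: Iterable[str], max_chars: int = 600) -> str:
    """Join unique reference terms into a short prompt string.

    Terms are de-duplicated case-insensitively (first spelling wins) and
    added in order until the character budget would be exceeded.
    """
    seen = set()
    kept = []
    used = len(PREFIX)
    for term in terms:
        term = term.strip()
        key = term.casefold()
        if not term or key in seen:
            continue
        cost = len(term) + (2 if kept else 0)  # account for the ", " separator
        if used + cost > max_chars:
            break
        seen.add(key)
        kept.append(term)
        used += cost
    return PREFIX + ", ".join(kept) if kept else ""


def transcribe_with_terms(audio_path: str, terms: Iterable[str],
                          model_name: str = "base") -> str:
    """Transcribe a file, passing the term list as Whisper's initial prompt."""
    import whisper  # openai-whisper; imported lazily so the helper above works without it

    model = whisper.load_model(model_name)
    prompt = build_initial_prompt(terms) or None  # None means no prompt at all
    result = model.transcribe(audio_path, initial_prompt=prompt)
    return result["text"]


if __name__ == "__main__":
    print(build_initial_prompt(["Savannah", "Figma", "rewind.ai", "savannah"]))
```

The prompt only nudges the model, so a term list does not guarantee correct spelling; putting the most important names first means they survive if the budget cuts the list short.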

Similar Entrees

"Unlocking the Potential of Asynchronous Voice Note Conversations"

80.61% similar

A shared 'brain' is being discussed as a platform for asynchronous voice note conversations where metadata could enhance understanding and visualization of conversational threads. The speaker suggests a focus on DEMO rather than DEC as a fork in the road, believing it better suits the work they've been doing with building prototypes. A group experiment is proposed with four members to delve into how these voice notes can overlap and interconnect, with the idea of marking chapters within responses to clarify dialogue. The concept also touches on the nuances of information retrieval, preferring vector databases over direct text searches, hinting at a similarity to the speaker's initial voice note exchanges with Savannah after meeting on a dating app. Voice communication offers significant advantages as a medium, and there's an idea presented here that its power should extend beyond just live conversations. Current messaging apps are filled with voice notes that are often difficult to search, filter, or respond to, though iMessage now has transcripts, which are generally reliable and useful once you've listened to the original voice note. The ability to refer back to transcribed voice notes can aid in crafting thoughtful responses and engaging in more meaningful discussions. The sender of the message suggests that by embracing this approach to communication, we could enhance our conversations and is curious to see how it will develop.

"Embracing Socratic Search Space: A Personal Quest for Deeper Understanding"

78.78% similar

The speaker describes their experience of partially understanding a podcast, particularly a term "Socratic search space," while on a walk and expresses a desire to delve deeper into its meaning. They prefer an interactive approach where they can ask a device to provide references and contextual explanations, as opposed to receiving a summary generated by an AI model like GPT, which might lack the most recent uses of the term. They are skeptical about the capability of language models to provide a comprehensive understanding, given that they might not recognize terms with minimal occurrences in training data. The speaker envisions a system that could compile and present relevant information in a coherent way, enhancing their grasp of the podcast's content and making the learning process more meaningful.

"Categorizing Inputs for an Integrated Burrito System"

78.46% similar

The speaker is considering how to categorize inputs for a burrito-like system, focusing on what constitutes a minimum ingredient for a filling, using metadata like voice notes, images, and GPS tags. They ponder the need to explicitly connect related inputs, such as a photo and a voice note about the same subject, or whether temporal and spatial proximity should implicitly link them. The speaker also reflects on the holistic context influencing inputs, including mood and environment, questioning how far explicit bundling should go. Ultimately, they imply that inputs with similar timing and location could be considered related without the need for explicit connection, likening this to lab notes.

"Maximizing Thought Organization through Screen Recording and Visual Mapping"

78.01% similar

The text discusses the concept of using screen recording to capture and organize thoughts, particularly when mapping them out with supportive graphics or diagrams, enhancing the process with features like rich audio and linking possibilities. The author suggests that a system similar to rewind.ai's capture format could be utilized, allowing for full-text search and leveraging metadata from shared Figma files to extract links and possibly map these as concept maps. This method aims to enhance the searchability, filtering, and querying of content, integrating into a platform the author refers to as "burrito dot place." The author contemplates the addition of robust social context to screen recordings, considering them as potential raw input for content understanding, akin to the role of audio, and builds upon themes previously explored in R-Log.

"Creative Planning and Design Strategies: A Snapshot of Ideas"

77.95% similar

The photograph shows a sheet of lined paper with various hand-drawn elements and handwritten notes in different areas, using primarily red but also some black ink. On the left side, a small video camera icon appears followed by a note about an agent joining someone in a video for questioning, recording insights, and data testing. There are mentions of 'tavern curio next steps?' and 'text & interfaces w/eli'. On the upper right, there are sketches resembling abstract flowers or designs labeled 'stage & playground' and 'fractional participation'. Below these, there's a note about 'rubin...pressfield podcast ep praying to the muse' and 'Reflect .app supportswr'. A drawing of a heart-like symbol is also seen towards the bottom right.