polish transcriptions

📁 zarvent/obsidian-skills 📅 Jan 1, 1970

Install command

npx skills add https://github.com/zarvent/obsidian-skills --skill polish transcriptions

Skill documentation

Polish Transcriptions Skill

Transform raw, machine-generated transcriptions into polished, cognitively-ordered Obsidian notes that are both readable and complete.

Objective

Convert poorly transcribed audio/video content (workshops, lectures, meetings, interviews) into well-structured, publication-ready documents while preserving 100% of the original information.

Core Principles

1. Zero Information Loss

> [!danger] Critical Requirement
> Never omit, summarize, or compress information from the original. Every detail, example, tangent, question, and answer must be preserved. The output should contain MORE structure, not LESS content.

2. Cognitive Reorganization

Transform stream-of-consciousness speech into logical document sections:

| Speech Pattern            | Transforms To                  |
| ------------------------- | ------------------------------ |
| Topic jumping             | Grouped sections with headers  |
| Repetition                | Single consolidated statement  |
| Filler words/false starts | Clean prose                    |
| Tangents                  | Callouts or integrated context |
| Q&A interruptions         | Blockquote dialogues           |

3. Semantic Structure Over Chronological Order

Reorganize content by meaning, not by when things were said. A 2-hour rambling lecture about three topics becomes three clean sections, even if the speaker jumped between them.

Transformation Process

Phase 1: Analysis

Before writing anything:

1. Read the entire transcript: understand all topics covered.
2. Identify main themes: what are the 3-7 core topics?
3. Categorize content types:
   - Core instruction/information
   - Examples and anecdotes
   - Q&A interactions
   - Meta-commentary (jokes, digressions)
   - Action items or recommendations
4. Map relationships: which topics depend on others?

Phase 2: Structure Design

Create a logical outline:

## [Main Topic 1]

### [Subtopic 1.1]

### [Subtopic 1.2]

---

## [Main Topic 2]

...

Use horizontal rules (---) to separate major topic shifts.

Phase 3: Content Transformation

Apply these transformations systematically:

Headers and Hierarchy

## Main Section <!-- H2 for major topics -->

### Subsection <!-- H3 for subtopics -->

#### Point or Example <!-- H4 for specific items when needed -->

Dialogues and Q&A

Preserve speaker identities with blockquotes:

> **Participante:** ¿Cómo funciona X?
> **Instructor:** X funciona de esta manera...

For multi-turn exchanges:

> **Estudiante:** Primera pregunta
> **Profesora:** Respuesta inicial
> **Estudiante:** Pregunta de seguimiento
> **Profesora:** Respuesta expandida
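
The blockquote convention above can be produced mechanically once speaker turns have been identified. The following is a minimal sketch, not part of the skill itself: the function name `format_dialogue` and the `(speaker, text)` tuple shape are assumptions made for illustration.

```python
def format_dialogue(turns):
    """Render (speaker, text) pairs as an Obsidian blockquote dialogue.

    `turns` is a hypothetical input shape: an iterable of (speaker, text) tuples.
    """
    lines = []
    for speaker, text in turns:
        # Collapse line breaks and repeated spaces so each turn stays on one quoted line.
        clean = " ".join(text.split())
        lines.append(f"> **{speaker}:** {clean}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_dialogue([
        ("Estudiante", "¿Qué es SQL?"),
        ("Profesora", "SQL es un lenguaje\nde consulta."),
    ]))
```

Keeping one turn per quoted line preserves attribution, which is the point of the dialogue format.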

Callouts for Special Content

| Content Type          | Callout to Use               |
| --------------------- | ---------------------------- |
| Key concept/principle | `> [!important]`             |
| Practical advice      | `> [!tip] Recomendación`     |
| Warning/caution       | `> [!warning]`               |
| Interesting aside     | `> [!note]`                  |
| Real-world example    | `> [!example]`               |
| Quoted wisdom         | `> [!quote]`                 |
| Action items          | `> [!todo]`                  |
| Summary               | `> [!abstract]` or `> [!tldr]` |
| Success/conclusion    | `> [!success]`               |

Tables for Structured Data

Convert comparison discussions into tables:

| Columna 1 | Columna 2 | Columna 3 |
| --------- | --------- | --------- |
| Dato 1    | Dato 2    | Dato 3    |
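
If the comparison data has already been pulled out of the transcript, a table in the layout above can be generated rather than typed by hand. This is only a sketch under assumptions: `to_markdown_table` is a hypothetical helper, and it assumes cell text contains no `|` characters.

```python
def to_markdown_table(headers, rows):
    """Build a padded Markdown pipe table from a header list and a list of rows."""
    # Column width = widest cell in that column, with a minimum of 3 for the dash row.
    widths = [max(3, *(len(str(cell)) for cell in column))
              for column in zip(headers, *rows)]

    def fmt(cells):
        padded = (str(cell).ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    separator = "| " + " | ".join("-" * width for width in widths) + " |"
    return "\n".join([fmt(headers), separator] + [fmt(row) for row in rows])


if __name__ == "__main__":
    print(to_markdown_table(["Tipo", "Uso"], [["Relacional", "SQL"]]))
```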

Lists for Enumerated Content

When the speaker lists things (even implicitly):

- Item one
- Item two
  - Sub-item
- Item three

Mermaid Diagrams for Processes

When a process or flow is described:

```mermaid
graph LR
    A[Paso 1] --> B[Paso 2]
    B --> C[Paso 3]
    C --> D[Resultado]
```
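
For a strictly linear process, the diagram source can be derived from an ordered list of steps. The sketch below is an illustration rather than part of the skill: `steps_to_mermaid` and the `S1`, `S2`, ... node ids are assumptions, and it assumes step labels contain no square brackets.

```python
def steps_to_mermaid(steps, direction="LR"):
    """Return Mermaid `graph` source that chains the given step labels in order."""
    lines = [f"graph {direction}"]
    if len(steps) == 1:
        # A single step has no edges, so emit the lone node.
        lines.append(f"    S1[{steps[0]}]")
    for i in range(1, len(steps)):
        # The first edge declares both node labels; later edges reuse the left node's id.
        left = f"S{i}[{steps[i - 1]}]" if i == 1 else f"S{i}"
        lines.append(f"    {left} --> S{i + 1}[{steps[i]}]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(steps_to_mermaid(["Paso 1", "Paso 2", "Paso 3", "Resultado"]))
```

Branching flows still need to be written by hand; this only covers the sequential case.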

Code Blocks for Technical Content

```python
# Example code from the presentation
def example():
    return "formatted code"
```

Formatting Standards

Frontmatter

Always include appropriate YAML frontmatter:

---
date: YYYY-MM-DD
professor: "[[Speaker Name]]"
# or
speaker: "[[Speaker Name]]"
# optional
tags:
  - workshop
  - topic
---
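
The frontmatter block above is regular enough to assemble programmatically. A minimal sketch, assuming a hypothetical helper `build_frontmatter` and plain string formatting instead of a YAML library:

```python
from datetime import date


def build_frontmatter(day, speaker, role="speaker", tags=()):
    """Return a YAML frontmatter block with a date, a wikilinked speaker, and optional tags."""
    if role not in ("speaker", "professor"):
        raise ValueError("role must be 'speaker' or 'professor'")
    lines = ["---", f"date: {day.isoformat()}", f'{role}: "[[{speaker}]]"']
    if tags:
        # Tags are optional, so the key is only emitted when at least one is given.
        lines.append("tags:")
        lines.extend(f"  - {tag}" for tag in tags)
    lines.append("---")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_frontmatter(date(2025, 8, 8), "Carmen Marín",
                            role="professor", tags=["workshop"]))
```

Quoting the wikilink matters: an unquoted `[[Name]]` would be parsed by YAML as a nested list.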

Text Formatting

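| Purpose                   | Syntax       | Example               |
| ------------------------- | ------------ | --------------------- |
| Key terms (first mention) | `**bold**`   | **machine learning**  |
| Technical terms           | `` `code` `` | `SQL`                 |
| Emphasis                  | `*italic*`   | *very important*      |
| Highlighting              | `==text==`   | ==critical deadline== |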

Links

Create wikilinks for concepts that deserve their own notes:

Esto se relaciona con [[machine learning]] y [[data science]].

Anti-Patterns (What NOT To Do)

Summarizing

<!-- BAD: Lost information -->

El instructor habló sobre varios temas de datos.

<!-- GOOD: Preserves detail -->

El instructor cubrió tres áreas principales:

1. **Integración de datos**: consolidar información de múltiples fuentes
2. **Limpieza y transformación**: ordenar, depurar y preparar los datos
3. **Análisis exploratorio**: comprender patrones y comportamientos

Removing “Unimportant” Content

<!-- BAD: Removes color and context -->

(omitted anecdote about COVID impact)

<!-- GOOD: Preserves as callout -->

> [!example] Caso Real: El Impacto del COVID-19
> En un banco donde trabajé, teníamos modelos de predicción de mora...

Flattening Dialogue

<!-- BAD: Loses attribution -->

Se discutió que SQL es el lenguaje principal.

<!-- GOOD: Preserves interaction -->

> **Estudiante:** ¿Qué es SQL?
> **Profesora:** SQL es el lenguaje de programación de bases de datos.

Over-Structuring

<!-- BAD: Too many headers for simple content -->

#### Definición de Dato

##### Tipo 1

###### Subtipo A

<!-- GOOD: Appropriate nesting -->

### Tipos de Datos

- **Tipo 1:** Descripción
  - Subtipo A

Quality Checklist

Before delivering the polished document:

- Information complete: all original content is present
- Logical structure: grouped by topic, not chronology
- Frontmatter present: date, speaker/professor, optional tags
- Headers used correctly: H2 for sections, H3 for subsections
- Dialogues preserved: Q&A in blockquote format with speaker names
- Callouts appropriate: important points in `[!tip]`, `[!important]`, etc.
- Tables where helpful: comparisons and structured data formatted
- Mermaid diagrams: processes visualized when described
- Bold for key terms: first mention of important concepts
- Wikilinks created: concepts linked with `[[concept]]`
- Horizontal rules: major topic separations marked with `---`
- Clean prose: no filler words, false starts, or transcription artifacts
- No orphan headers: every header has content below it
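
Some checklist items lend themselves to an automated check. As one example, the sketch below flags orphan headers. It is an illustration, not part of the skill: `find_orphan_headers` is a hypothetical name, and it assumes a header counts as orphaned when no body text appears before the next header of the same or a higher level. Lines inside fenced code blocks are treated as content so that `# comment` lines are not mistaken for headers.

```python
import re

HEADER = re.compile(r"^(#{1,6})\s+\S")


def find_orphan_headers(markdown):
    """Return header lines that have no content before the next same-or-higher-level header."""
    items = []  # (kind, level, text) for every non-blank line
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            in_fence = not in_fence
            items.append(("content", 0, stripped))
            continue
        match = None if in_fence else HEADER.match(line)
        if match:
            items.append(("header", len(match.group(1)), stripped))
        elif stripped and stripped != "---":
            # Horizontal rules are separators, not content.
            items.append(("content", 0, stripped))

    orphans = []
    for i, (kind, level, text) in enumerate(items):
        if kind != "header":
            continue
        has_content = False
        for next_kind, next_level, _ in items[i + 1:]:
            if next_kind == "header" and next_level <= level:
                break  # reached a sibling or parent header with nothing in between
            if next_kind == "content":
                has_content = True
                break
        if not has_content:
            orphans.append(text)
    return orphans


if __name__ == "__main__":
    print(find_orphan_headers("## A\n\n## B\ntexto\n\n### C\n"))
```

A parent header whose subsections contain text is not reported, since its content lives in those subsections.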

Example Transformation

Before (Raw Transcription)

bueno entonces ehh vamos a ver lo de las bases de datos entonces
una base de datos es pues como un lugar donde guardas cosas no?
ah esperen me olvidé de decirles mi nombre soy Carmen ehh entonces
como les decía hay diferentes tipos de bases de datos algunas son
relacionales otras no relacionales las relacionales usan SQL que es
un lenguaje de programación bueno no exactamente programación pero
sirve para consultar datos entonces SQL significa structured query
language y sirve para hacer consultas a la base de datos...

After (Polished Document)

---
date: 2025-08-08
professor: "[[Carmen Marín]]"
---

## Introducción a las Bases de Datos

Una **base de datos** es un almacén centralizado donde se guardan
y organizan datos para su posterior acceso y manipulación.

### Tipos de Bases de Datos

| Tipo              | Características                 |
| ----------------- | ------------------------------- |
| **Relacional**    | Utiliza SQL, estructura tabular |
| **No relacional** | NoSQL, estructuras flexibles    |

### SQL (Structured Query Language)

**SQL** es el lenguaje estándar para interactuar con bases de datos
relacionales. Permite realizar consultas, inserciones, actualizaciones
y eliminaciones de datos.

> [!note] Aclaración
> Aunque SQL contiene elementos de programación, técnicamente es un
> lenguaje de consulta, no un lenguaje de programación de propósito general.

Workflow Integration

Suggested Process

1. Read the obsidian-markdown skill first for syntax reference
2. Analyze the complete raw transcript
3. Outline the logical structure
4. Transform section by section
5. Review against the quality checklist
6. Verify no information was lost by comparing key facts
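
The final verification step can be partly supported by a rough automated comparison. The sketch below rests on a loud assumption: it treats numbers and words containing an uppercase letter (names, acronyms) as "key facts" and reports those that appear in the raw transcript but not in the polished note. The function name `missing_key_terms` is hypothetical. Machine transcripts are often mostly lowercase, so this catches only a subset of facts and does not replace a manual read-through.

```python
import re


def missing_key_terms(raw, polished):
    """List numbers and capitalized tokens found in `raw` but absent from `polished`."""
    polished_tokens = {token.lower() for token in re.findall(r"\w+", polished)}
    missing = []
    for token in re.findall(r"\w+", raw):
        # Heuristic: digits or uppercase letters suggest a name, acronym, or figure.
        is_key = any(ch.isdigit() or ch.isupper() for ch in token)
        if is_key and token.lower() not in polished_tokens and token not in missing:
            missing.append(token)
    return missing


if __name__ == "__main__":
    raw = "soy Carmen y las relacionales usan SQL desde 1974"
    polished = "**SQL** es el lenguaje que presentó [[Carmen]]."
    print(missing_key_terms(raw, polished))
```

Each reported token is a prompt to recheck the source passage, not proof that information was lost.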

Output Location

Polished transcriptions should be saved to the appropriate location in the user’s vault, typically:

- `03 resources/` for workshops and external content
- `01 projects/.../classes/` for academic lectures
- The same directory as the source, with a new filename

Success Criteria

A successfully polished transcription:

- Reads like a well-written article, not like speech
- Contains all original information: nothing omitted
- Uses Obsidian features effectively: callouts, tables, diagrams
- Has clear cognitive structure: easy to navigate and reference
- Preserves speaker personality: quotes and dialogues maintain voice
- Is immediately usable: no further editing needed by the user
