Commit 872a202

fix(extractor): filter 'Translate' and Threads UI buttons as noise
The 'Translate' button that Jina captures from the Threads UI was slipping through as valid text at the end of the post content. Added to the isGenericThreadsText filter: translate, see translation, see original, traducir, ver original/traducción (case-insensitive).
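
To illustrate the matching behavior of the added labels, here is a minimal sketch that isolates only the translation-related alternatives from the commit. The names `TRANSLATE_LABEL` and `isTranslateLabel` are hypothetical and exist only for this example; the real check lives inside `isGenericThreadsText`. Because the pattern is anchored with `^` and `$`, it should match only when the whole trimmed value is a UI label, not when the word appears inside a sentence.

```typescript
// Hypothetical constant: only the alternatives this commit adds to the filter.
const TRANSLATE_LABEL =
  /^(?:translate|see\s+translation|see\s+original|traducir|ver\s+(?:original|traducci[oó]n))$/i

// Hypothetical helper mirroring the guard-and-trim shape of isGenericThreadsText.
function isTranslateLabel(value?: string): boolean {
  if (!value) return false
  return TRANSLATE_LABEL.test(value.trim())
}

// Whole-value labels match regardless of case or surrounding whitespace;
// ordinary sentences that merely contain the word do not.
const samples = ["Translate", "  SEE TRANSLATION ", "Ver traducción", "Translate this for me"]
for (const s of samples) {
  console.log(JSON.stringify(s), isTranslateLabel(s))
}
```

The anchoring is the design choice that keeps the filter conservative: a post whose text legitimately begins with "Translate" is not discarded unless that single word is the entire value.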
1 parent 7fc7d22 commit 872a202

1 file changed

Lines changed: 1 addition & 1 deletion

src/lib/utils/post-extractor.ts

@@ -512,7 +512,7 @@ function isGenericThreadsText(value?: string): boolean {
   if (!value) return false
   const v = value.trim()
   // Profile/post metadata labels that Jina renders
-  if (/^(?:author|follow|followers?|following|published|likes?|reposts?|replies|related\s+threads|related\s+posts|destacadas|ver\s+actividad)$/i.test(v)) return true
+  if (/^(?:author|follow|followers?|following|published|likes?|reposts?|replies|related\s+threads|related\s+posts|destacadas|ver\s+actividad|translate|see\s+translation|see\s+original|traducir|ver\s+(?:original|traducci[oó]n))$/i.test(v)) return true
   // Login/auth or thread-navigation strings
   return /log in to see more replies|sign in (?:with|to) (?:instagram|facebook|threads)|join threads to share ideas|threads\s*\s*log in|log in or sign up|log in with (?:your\s+)?(?:instagram|facebook|username)|sign up for threads|continue with (?:instagram|facebook)|create (?:a )?new account|forgot (?:your )?password|don'?t have an account|join the conversation|see what people are talking about|what'?s on your mind/i.test(v)
 }
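
The commit message says the label leaked in at the end of the post content, but the diff does not show how callers apply `isGenericThreadsText`. The following is a sketch under that assumption only: `stripTrailingNoise` is a hypothetical helper, and `isNoise` is a reduced stand-in for the real predicate that uses a handful of labels from the diff.

```typescript
type NoisePredicate = (value?: string) => boolean

// Stand-in for isGenericThreadsText, reduced to a few labels from the diff.
const isNoise: NoisePredicate = (value) =>
  !!value && /^(?:follow|likes?|translate|see\s+translation|traducir)$/i.test(value.trim())

// Hypothetical helper: drop blank lines and noise labels from the end of the
// extracted text, stopping at the first line of real content.
function stripTrailingNoise(content: string, isNoiseLine: NoisePredicate): string {
  const lines = content.split("\n")
  while (lines.length > 0) {
    const last = lines[lines.length - 1] ?? ""
    if (last.trim() === "" || isNoiseLine(last)) lines.pop()
    else break
  }
  return lines.join("\n")
}

// A post body as the extractor might return it, with the UI label appended.
const raw = "Shipping the new release today.\nThanks for the feedback!\n\nTranslate\n"
console.log(stripTrailingNoise(raw, isNoise))
```

Only trailing lines are inspected here, so a label-like word that appears mid-post is left alone; whether the real extractor filters line by line or only at the tail is not stated in the commit.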

0 commit comments
