
mirror of https://github.com/waynesutton/markdown-site.git synced 2026-01-12 04:09:14 +00:00

Wayne Sutton 55f4ada61a new: npx create-markdown-sync CLI , ui , related post thumbnails features

2026-01-10 23:46:08 -08:00

Semantic Search

Type: page Date: 2026-01-11

Semantic search finds content by meaning, not exact words. Ask questions naturally and find conceptually related content.

Press Cmd+K then Tab to switch to Semantic mode. For exact word matching, see Keyword Search.

When to use each mode

Use case	Mode
"authentication error" (exact term)	Keyword
"login problems" (conceptual)	Semantic
Find specific code or commands	Keyword
"how do I deploy?" (question)	Semantic
Need matches highlighted on page	Keyword
Not sure of exact terminology	Semantic

How semantic search works

┌─────────────────────────────────────────────────────────────────────────┐
│                     SEMANTIC SEARCH FLOW                                │
└─────────────────────────────────────────────────────────────────────────┘

  ┌──────────────┐    ┌─────────────────┐    ┌──────────────────┐
  │ User query:  │───▶│ OpenAI API      │───▶│ Query embedding  │
  │ "how to      │    │ text-embedding- │    │ [0.12, -0.45,    │
  │  deploy"     │    │ ada-002         │    │  0.78, ...]      │
  └──────────────┘    └─────────────────┘    └────────┬─────────┘
                                                      │
                                                      ▼
                                           ┌─────────────────────┐
                                           │ Convex vectorSearch │
                                           │ Compare to stored   │
                                           │ post/page embeddings│
                                           └──────────┬──────────┘
                                                      │
                                                      ▼
                                           ┌─────────────────────┐
                                           │ Results sorted by   │
                                           │ similarity score    │
                                           │ (0-100%)            │
                                           └─────────────────────┘

1. Your query is converted to a vector (1536 numbers) using OpenAI's embedding model.
2. Convex compares this vector to the stored embeddings for all posts and pages.
3. Results are ranked by similarity score (higher = more similar meaning).
4. The top 15 results are returned.
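
The comparison and ranking in steps 2 to 4 can be sketched in plain TypeScript. This is an illustration only, not the project's code: Convex's vectorSearch does the comparison internally, and this sketch assumes cosine similarity as the measure. The `StoredDoc` shape and the function names are hypothetical.

```typescript
// Illustrative only: Convex's vectorSearch performs this comparison internally.
// StoredDoc, ScoredDoc and both function names are hypothetical.
type StoredDoc = { slug: string; embedding: number[] };
type ScoredDoc = { slug: string; score: number };

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored document against the query embedding,
// sort by descending score, and keep the top `limit` (15 in this doc).
function rankBySimilarity(
  query: number[],
  docs: StoredDoc[],
  limit: number = 15,
): ScoredDoc[] {
  return docs
    .map((doc) => ({ slug: doc.slug, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A brute-force scan like this is fine for explaining the idea; a vector index exists so that the database does not have to compare the query against every document this way.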

Technical comparison

Aspect	Keyword	Semantic
Speed	Instant	~300ms
Cost	Free	~$0.0001/query
Highlighting	Yes	No
API required	No	OpenAI
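
To put the per-query cost in context, here is a rough estimate of monthly spend. It uses the approximate ~$0.0001/query figure from the table and covers search queries only, not the embeddings generated during sync. The constant and function names are hypothetical.

```typescript
// Approximate figure from the comparison table; not a quoted price.
const COST_PER_QUERY_USD = 0.0001;

// Estimated spend on query embeddings for one month of semantic searches.
function estimateMonthlyQueryCost(
  queriesPerDay: number,
  daysPerMonth: number = 30,
): number {
  return queriesPerDay * daysPerMonth * COST_PER_QUERY_USD;
}
```

At 1,000 semantic searches a day, this comes to roughly $3 a month.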

Configuration

Semantic search requires an OpenAI API key:

npx convex env set OPENAI_API_KEY sk-your-key-here

If the key is not configured:

Semantic search returns empty results
Keyword search continues to work normally
Sync script skips embedding generation

Enable/Disable Semantic Search

Semantic search is disabled by default to avoid requiring API keys for forks. Enable it via src/config/siteConfig.ts:

semanticSearch: {
  enabled: true, // Enable semantic search (requires OPENAI_API_KEY)
},

When disabled (default):

Search modal shows only keyword search (no mode toggle)
Embedding generation skipped during sync (saves API costs)
No OpenAI API key required

When enabled:

Search modal shows both Keyword and Semantic modes
Embeddings generated during npm run sync
Requires OPENAI_API_KEY in Convex

To enable semantic search:

1. Set semanticSearch.enabled: true in siteConfig.ts
2. Set OPENAI_API_KEY in Convex: npx convex env set OPENAI_API_KEY sk-your-key-here
3. Run npm run sync to generate embeddings
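
The effect of the flag and the API key can be summarized as two small checks. This is a minimal sketch, assuming simplified config and environment shapes. The type and function names are hypothetical and are not taken from the project's source.

```typescript
// Hypothetical, simplified shapes; the real siteConfig.ts has many more fields.
type SiteConfig = { semanticSearch: { enabled: boolean } };
type ConvexEnv = { OPENAI_API_KEY?: string };
type SearchMode = "keyword" | "semantic";

// Keyword search is always offered; the Semantic mode (and the toggle)
// appears only when the flag is on.
function availableSearchModes(config: SiteConfig): SearchMode[] {
  return config.semanticSearch.enabled ? ["keyword", "semantic"] : ["keyword"];
}

// Sync generates embeddings only when the feature is enabled
// and an OpenAI key is configured.
function shouldGenerateEmbeddings(config: SiteConfig, env: ConvexEnv): boolean {
  return config.semanticSearch.enabled && Boolean(env.OPENAI_API_KEY);
}
```

Keeping both conditions in one place makes it harder for the search modal and the sync script to disagree about whether semantic search is active.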

How embeddings are generated

When you run npm run sync:

1. Content syncs to Convex (posts and pages).
2. The script checks for posts and pages without embeddings.
3. For each one, it combines the title and content into a single text.
4. It calls OpenAI to generate a 1536-dimension embedding.
5. It stores the embedding in the Convex database.

Embeddings are generated once per post/page. If content changes, a new embedding is generated on the next sync.
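
The selection and text-preparation steps can be sketched as follows. This is a hedged illustration, not the code in convex/embeddings.ts. The `Doc` shape, the separator between title and content, and the exact 8000-character cutoff (taken from the "~8000 characters" note under Limitations) are assumptions.

```typescript
// Hypothetical document shape for illustration.
type Doc = { slug: string; title: string; content: string; embedding?: number[] };

// Assumed cutoff, based on the "~8000 characters" figure under Limitations.
const MAX_EMBEDDING_CHARS = 8000;

// Only posts/pages that have no stored embedding need a call to OpenAI.
function docsNeedingEmbeddings(docs: Doc[]): Doc[] {
  return docs.filter((doc) => doc.embedding === undefined);
}

// Combine title and content into one string, then truncate it
// so the input stays within the embedding model's limit.
function buildEmbeddingInput(doc: Doc): string {
  return `${doc.title}\n\n${doc.content}`.slice(0, MAX_EMBEDDING_CHARS);
}
```

Putting the title first means it always survives truncation, even for very long posts.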

Files involved

File	Purpose
`convex/schema.ts`	`embedding` field and `vectorIndex` on posts/pages
`convex/embeddings.ts`	Embedding generation actions
`convex/embeddingsQueries.ts`	Queries for posts/pages without embeddings
`convex/semanticSearch.ts`	Vector search action
`convex/semanticSearchQueries.ts`	Queries for hydrating search results
`src/components/SearchModal.tsx`	Mode toggle (Tab to switch)
`scripts/sync-posts.ts`	Triggers embedding generation after sync

Limitations

No highlighting: Semantic search finds meaning, not exact words, so matches can't be highlighted
API cost: Each search query costs ~$0.0001, because the query itself must be embedded through the OpenAI API
Latency: ~300ms vs instant for keyword search (API round-trip)
Requires OpenAI key: Won't work without OPENAI_API_KEY configured
Input limit: Content is truncated to ~8000 characters before embedding, so text beyond that point in a long post or page does not affect its search results

Similarity scores

Results show a percentage score (0-100%):

90%+: Very similar meaning
70-90%: Related content
50-70%: Loosely related
<50%: Weak match (may not be relevant)
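
The bands above can be expressed as a small helper. This is a sketch under two assumptions: that the percentage comes from a raw similarity score clamped to the 0 to 1 range, and that each band includes its lower bound (so exactly 70% counts as "related"). The function names and labels are hypothetical.

```typescript
type MatchQuality = "very similar" | "related" | "loosely related" | "weak match";

// Convert a raw similarity score to a whole-number percentage.
// Assumption: scores outside 0..1 are clamped before display.
function toPercent(score: number): number {
  const clamped = Math.min(1, Math.max(0, score));
  return Math.round(clamped * 100);
}

// Map a percentage to the bands listed above.
// Assumption: each band includes its lower bound.
function describeScore(percent: number): MatchQuality {
  if (percent >= 90) return "very similar";
  if (percent >= 70) return "related";
  if (percent >= 50) return "loosely related";
  return "weak match";
}
```

For example, `describeScore(toPercent(0.93))` yields "very similar", while a score of 0.42 falls in the "weak match" band.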

Resources

Convex Vector Search
OpenAI Embeddings
Keyword Search - Full-text search documentation
