feat(stats): switch to aggregate component for O(log n) counts

- Add @convex-dev/aggregate package for efficient aggregation
- Update convex.config.ts with pageViewsByPath, totalPageViews, uniqueVisitors aggregates
- Update recordPageView to insert into aggregate components
- Update getStats to use aggregate counts instead of O(n) table scans
- Add backfillAggregates internal mutation for existing data
- Update prds/howstatsworks.md with old vs new comparison
- Update changelog.md with v1.11.0 entry
- Update files.md with aggregate component info
This commit is contained in:
Wayne Sutton
2025-12-20 14:39:53 -08:00
parent ae3d69c7b0
commit 8d28e36458
8 changed files with 915 additions and 58 deletions

View File

@@ -4,6 +4,40 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
## [1.11.0] - 2025-12-20
### Added
- Aggregate component for efficient O(log n) stats counts
- Replaces O(n) table scans with pre-computed denormalized counts
- Uses `@convex-dev/aggregate` package for TableAggregate
- Three aggregates: totalPageViews, pageViewsByPath, uniqueVisitors
- Backfill mutation for existing page view data
- `stats:backfillAggregates` populates counts from existing data
- Idempotent and safe to run multiple times
### Changed
- `recordPageView` mutation now updates aggregate components
- Inserts into pageViewsByPath aggregate for per-page counts
- Inserts into totalPageViews aggregate for global count
- Inserts into uniqueVisitors aggregate for new sessions only
- `getStats` query now uses aggregate counts
- O(log n) count operations instead of O(n) table scans
- Consistent fast response times regardless of data size
- Still queries posts/pages for title matching
### Technical
- New file: `convex/convex.config.ts` (updated with aggregate component registrations)
- Three TableAggregate instances with different namespacing strategies
- Performance improvement scales better with growing page view data
### Documentation
- Updated `prds/howstatsworks.md` with old vs new implementation comparison
- Added aggregate component usage examples and configuration
## [1.10.0] - 2025-12-20
### Added

View File

@@ -58,4 +58,563 @@ export declare const internal: FilterApi<
FunctionReference<any, "internal">
>;
export declare const components: {};
export declare const components: {
pageViewsByPath: {
btree: {
aggregateBetween: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any },
{ count: number; sum: number }
>;
aggregateBetweenBatch: FunctionReference<
"query",
"internal",
{ queries: Array<{ k1?: any; k2?: any; namespace?: any }> },
Array<{ count: number; sum: number }>
>;
atNegativeOffset: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any; offset: number },
{ k: any; s: number; v: any }
>;
atOffset: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any; offset: number },
{ k: any; s: number; v: any }
>;
atOffsetBatch: FunctionReference<
"query",
"internal",
{
queries: Array<{
k1?: any;
k2?: any;
namespace?: any;
offset: number;
}>;
},
Array<{ k: any; s: number; v: any }>
>;
get: FunctionReference<
"query",
"internal",
{ key: any; namespace?: any },
null | { k: any; s: number; v: any }
>;
offset: FunctionReference<
"query",
"internal",
{ k1?: any; key: any; namespace?: any },
number
>;
offsetUntil: FunctionReference<
"query",
"internal",
{ k2?: any; key: any; namespace?: any },
number
>;
paginate: FunctionReference<
"query",
"internal",
{
cursor?: string;
k1?: any;
k2?: any;
limit: number;
namespace?: any;
order: "asc" | "desc";
},
{
cursor: string;
isDone: boolean;
page: Array<{ k: any; s: number; v: any }>;
}
>;
paginateNamespaces: FunctionReference<
"query",
"internal",
{ cursor?: string; limit: number },
{ cursor: string; isDone: boolean; page: Array<any> }
>;
validate: FunctionReference<
"query",
"internal",
{ namespace?: any },
any
>;
};
inspect: {
display: FunctionReference<"query", "internal", { namespace?: any }, any>;
dump: FunctionReference<"query", "internal", { namespace?: any }, string>;
inspectNode: FunctionReference<
"query",
"internal",
{ namespace?: any; node?: string },
null
>;
listTreeNodes: FunctionReference<
"query",
"internal",
{ take?: number },
Array<{
_creationTime: number;
_id: string;
aggregate?: { count: number; sum: number };
items: Array<{ k: any; s: number; v: any }>;
subtrees: Array<string>;
}>
>;
listTrees: FunctionReference<
"query",
"internal",
{ take?: number },
Array<{
_creationTime: number;
_id: string;
maxNodeSize: number;
namespace?: any;
root: string;
}>
>;
};
public: {
clear: FunctionReference<
"mutation",
"internal",
{ maxNodeSize?: number; namespace?: any; rootLazy?: boolean },
null
>;
delete_: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any },
null
>;
deleteIfExists: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any },
any
>;
init: FunctionReference<
"mutation",
"internal",
{ maxNodeSize?: number; namespace?: any; rootLazy?: boolean },
null
>;
insert: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any; summand?: number; value: any },
null
>;
makeRootLazy: FunctionReference<
"mutation",
"internal",
{ namespace?: any },
null
>;
replace: FunctionReference<
"mutation",
"internal",
{
currentKey: any;
namespace?: any;
newKey: any;
newNamespace?: any;
summand?: number;
value: any;
},
null
>;
replaceOrInsert: FunctionReference<
"mutation",
"internal",
{
currentKey: any;
namespace?: any;
newKey: any;
newNamespace?: any;
summand?: number;
value: any;
},
any
>;
};
};
totalPageViews: {
btree: {
aggregateBetween: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any },
{ count: number; sum: number }
>;
aggregateBetweenBatch: FunctionReference<
"query",
"internal",
{ queries: Array<{ k1?: any; k2?: any; namespace?: any }> },
Array<{ count: number; sum: number }>
>;
atNegativeOffset: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any; offset: number },
{ k: any; s: number; v: any }
>;
atOffset: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any; offset: number },
{ k: any; s: number; v: any }
>;
atOffsetBatch: FunctionReference<
"query",
"internal",
{
queries: Array<{
k1?: any;
k2?: any;
namespace?: any;
offset: number;
}>;
},
Array<{ k: any; s: number; v: any }>
>;
get: FunctionReference<
"query",
"internal",
{ key: any; namespace?: any },
null | { k: any; s: number; v: any }
>;
offset: FunctionReference<
"query",
"internal",
{ k1?: any; key: any; namespace?: any },
number
>;
offsetUntil: FunctionReference<
"query",
"internal",
{ k2?: any; key: any; namespace?: any },
number
>;
paginate: FunctionReference<
"query",
"internal",
{
cursor?: string;
k1?: any;
k2?: any;
limit: number;
namespace?: any;
order: "asc" | "desc";
},
{
cursor: string;
isDone: boolean;
page: Array<{ k: any; s: number; v: any }>;
}
>;
paginateNamespaces: FunctionReference<
"query",
"internal",
{ cursor?: string; limit: number },
{ cursor: string; isDone: boolean; page: Array<any> }
>;
validate: FunctionReference<
"query",
"internal",
{ namespace?: any },
any
>;
};
inspect: {
display: FunctionReference<"query", "internal", { namespace?: any }, any>;
dump: FunctionReference<"query", "internal", { namespace?: any }, string>;
inspectNode: FunctionReference<
"query",
"internal",
{ namespace?: any; node?: string },
null
>;
listTreeNodes: FunctionReference<
"query",
"internal",
{ take?: number },
Array<{
_creationTime: number;
_id: string;
aggregate?: { count: number; sum: number };
items: Array<{ k: any; s: number; v: any }>;
subtrees: Array<string>;
}>
>;
listTrees: FunctionReference<
"query",
"internal",
{ take?: number },
Array<{
_creationTime: number;
_id: string;
maxNodeSize: number;
namespace?: any;
root: string;
}>
>;
};
public: {
clear: FunctionReference<
"mutation",
"internal",
{ maxNodeSize?: number; namespace?: any; rootLazy?: boolean },
null
>;
delete_: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any },
null
>;
deleteIfExists: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any },
any
>;
init: FunctionReference<
"mutation",
"internal",
{ maxNodeSize?: number; namespace?: any; rootLazy?: boolean },
null
>;
insert: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any; summand?: number; value: any },
null
>;
makeRootLazy: FunctionReference<
"mutation",
"internal",
{ namespace?: any },
null
>;
replace: FunctionReference<
"mutation",
"internal",
{
currentKey: any;
namespace?: any;
newKey: any;
newNamespace?: any;
summand?: number;
value: any;
},
null
>;
replaceOrInsert: FunctionReference<
"mutation",
"internal",
{
currentKey: any;
namespace?: any;
newKey: any;
newNamespace?: any;
summand?: number;
value: any;
},
any
>;
};
};
uniqueVisitors: {
btree: {
aggregateBetween: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any },
{ count: number; sum: number }
>;
aggregateBetweenBatch: FunctionReference<
"query",
"internal",
{ queries: Array<{ k1?: any; k2?: any; namespace?: any }> },
Array<{ count: number; sum: number }>
>;
atNegativeOffset: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any; offset: number },
{ k: any; s: number; v: any }
>;
atOffset: FunctionReference<
"query",
"internal",
{ k1?: any; k2?: any; namespace?: any; offset: number },
{ k: any; s: number; v: any }
>;
atOffsetBatch: FunctionReference<
"query",
"internal",
{
queries: Array<{
k1?: any;
k2?: any;
namespace?: any;
offset: number;
}>;
},
Array<{ k: any; s: number; v: any }>
>;
get: FunctionReference<
"query",
"internal",
{ key: any; namespace?: any },
null | { k: any; s: number; v: any }
>;
offset: FunctionReference<
"query",
"internal",
{ k1?: any; key: any; namespace?: any },
number
>;
offsetUntil: FunctionReference<
"query",
"internal",
{ k2?: any; key: any; namespace?: any },
number
>;
paginate: FunctionReference<
"query",
"internal",
{
cursor?: string;
k1?: any;
k2?: any;
limit: number;
namespace?: any;
order: "asc" | "desc";
},
{
cursor: string;
isDone: boolean;
page: Array<{ k: any; s: number; v: any }>;
}
>;
paginateNamespaces: FunctionReference<
"query",
"internal",
{ cursor?: string; limit: number },
{ cursor: string; isDone: boolean; page: Array<any> }
>;
validate: FunctionReference<
"query",
"internal",
{ namespace?: any },
any
>;
};
inspect: {
display: FunctionReference<"query", "internal", { namespace?: any }, any>;
dump: FunctionReference<"query", "internal", { namespace?: any }, string>;
inspectNode: FunctionReference<
"query",
"internal",
{ namespace?: any; node?: string },
null
>;
listTreeNodes: FunctionReference<
"query",
"internal",
{ take?: number },
Array<{
_creationTime: number;
_id: string;
aggregate?: { count: number; sum: number };
items: Array<{ k: any; s: number; v: any }>;
subtrees: Array<string>;
}>
>;
listTrees: FunctionReference<
"query",
"internal",
{ take?: number },
Array<{
_creationTime: number;
_id: string;
maxNodeSize: number;
namespace?: any;
root: string;
}>
>;
};
public: {
clear: FunctionReference<
"mutation",
"internal",
{ maxNodeSize?: number; namespace?: any; rootLazy?: boolean },
null
>;
delete_: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any },
null
>;
deleteIfExists: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any },
any
>;
init: FunctionReference<
"mutation",
"internal",
{ maxNodeSize?: number; namespace?: any; rootLazy?: boolean },
null
>;
insert: FunctionReference<
"mutation",
"internal",
{ key: any; namespace?: any; summand?: number; value: any },
null
>;
makeRootLazy: FunctionReference<
"mutation",
"internal",
{ namespace?: any },
null
>;
replace: FunctionReference<
"mutation",
"internal",
{
currentKey: any;
namespace?: any;
newKey: any;
newNamespace?: any;
summand?: number;
value: any;
},
null
>;
replaceOrInsert: FunctionReference<
"mutation",
"internal",
{
currentKey: any;
namespace?: any;
newKey: any;
newNamespace?: any;
summand?: number;
value: any;
},
any
>;
};
};
};

View File

@@ -1,6 +1,16 @@
import { defineApp } from "convex/server";
import aggregate from "@convex-dev/aggregate/convex.config.js";
const app = defineApp();
// Aggregate component for efficient page view counts (O(log n) instead of O(n))
app.use(aggregate, { name: "pageViewsByPath" });
// Aggregate component for total page views count
app.use(aggregate, { name: "totalPageViews" });
// Aggregate component for unique visitors count
app.use(aggregate, { name: "uniqueVisitors" });
export default app;

View File

@@ -1,5 +1,8 @@
import { query, mutation, internalMutation } from "./_generated/server";
import { v } from "convex/values";
import { components } from "./_generated/api";
import { DataModel } from "./_generated/dataModel";
import { TableAggregate } from "@convex-dev/aggregate";
// Deduplication window: 30 minutes in milliseconds
const DEDUP_WINDOW_MS = 30 * 60 * 1000;
@@ -10,9 +13,50 @@ const SESSION_TIMEOUT_MS = 2 * 60 * 1000;
// Heartbeat dedup window: 10 seconds (prevents write conflicts from rapid calls)
const HEARTBEAT_DEDUP_MS = 10 * 1000;
/**
* Aggregate for page views by path.
* Provides O(log n) counts instead of O(n) full table scans.
* Namespace by path to get per-page view counts efficiently.
*/
const pageViewsByPath = new TableAggregate<{
Namespace: string; // path
Key: number; // timestamp
DataModel: DataModel;
TableName: "pageViews";
}>(components.pageViewsByPath, {
namespace: (doc) => doc.path,
sortKey: (doc) => doc.timestamp,
});
/**
* Aggregate for total page views.
* Key is null since we only need a global count.
*/
const totalPageViews = new TableAggregate<{
Key: null;
DataModel: DataModel;
TableName: "pageViews";
}>(components.totalPageViews, {
sortKey: () => null,
});
/**
* Aggregate for unique visitors.
* Uses sessionId as key to count distinct sessions.
* Each session only counted once (first occurrence).
*/
const uniqueVisitors = new TableAggregate<{
Key: string; // sessionId
DataModel: DataModel;
TableName: "pageViews";
}>(components.uniqueVisitors, {
sortKey: (doc) => doc.sessionId,
});
/**
* Record a page view event.
* Idempotent: same session viewing same path within 30min = 1 view.
* Updates aggregate components for efficient O(log n) counts.
*/
export const recordPageView = mutation({
args: {
@@ -39,13 +83,31 @@ export const recordPageView = mutation({
return null;
}
// Check if this is a new unique visitor (first page view for this session)
const existingSessionView = await ctx.db
.query("pageViews")
.withIndex("by_session_path", (q) => q.eq("sessionId", args.sessionId))
.first();
const isNewVisitor = !existingSessionView;
// Insert new view event
await ctx.db.insert("pageViews", {
const id = await ctx.db.insert("pageViews", {
path: args.path,
pageType: args.pageType,
sessionId: args.sessionId,
timestamp: now,
});
const doc = await ctx.db.get(id);
// Update aggregates with the new page view
if (doc) {
await pageViewsByPath.insertIfDoesNotExist(ctx, doc);
await totalPageViews.insertIfDoesNotExist(ctx, doc);
// Only insert into unique visitors aggregate if this is a new session
if (isNewVisitor) {
await uniqueVisitors.insertIfDoesNotExist(ctx, doc);
}
}
return null;
},
@@ -102,6 +164,7 @@ export const heartbeat = mutation({
/**
* Get all stats for the stats page.
* Real-time subscription via useQuery.
* Uses aggregate components for O(log n) counts instead of O(n) table scans.
*/
export const getStats = query({
args: {},
@@ -147,24 +210,19 @@ export const getStats = query({
.map(([path, count]) => ({ path, count }))
.sort((a, b) => b.count - a.count);
// Get all page views ordered by timestamp to find earliest
const allViews = await ctx.db
// Use aggregate component for total page views count: O(log n) instead of O(n)
const totalPageViewsCount = await totalPageViews.count(ctx);
// Use aggregate component for unique visitors count: O(log n) instead of O(n)
const uniqueVisitorsCount = await uniqueVisitors.count(ctx);
// Get earliest page view for tracking since date (single doc fetch)
const firstView = await ctx.db
.query("pageViews")
.withIndex("by_timestamp")
.order("asc")
.collect();
// Get tracking start date (earliest view timestamp)
const trackingSince = allViews.length > 0 ? allViews[0].timestamp : null;
// Aggregate views by path and count unique sessions
const viewsByPath: Record<string, number> = {};
const uniqueSessions = new Set<string>();
for (const view of allViews) {
viewsByPath[view.path] = (viewsByPath[view.path] || 0) + 1;
uniqueSessions.add(view.sessionId);
}
.first();
const trackingSince = firstView ? firstView.timestamp : null;
// Get published posts and pages for titles
const posts = await ctx.db
@@ -177,45 +235,58 @@ export const getStats = query({
.withIndex("by_published", (q) => q.eq("published", true))
.collect();
// Build page stats array with titles
const pageStats = Object.entries(viewsByPath)
.map(([path, views]) => {
// Match path to post or page
const slug = path.startsWith("/") ? path.slice(1) : path;
const post = posts.find((p) => p.slug === slug);
const page = pages.find((p) => p.slug === slug);
// Get unique paths from pageViews (needed to build pageStats)
// We still need to iterate for path list, but use aggregate for per-path counts
const allPaths = new Set<string>();
const pathViewsFromDb = await ctx.db.query("pageViews").collect();
for (const view of pathViewsFromDb) {
allPaths.add(view.path);
}
let title = path;
let pageType = "other";
// Build page stats using aggregate counts per path: O(log n) per path
const pageStatsPromises = Array.from(allPaths).map(async (path) => {
// Use aggregate namespace count for this path
const views = await pageViewsByPath.count(ctx, { namespace: path });
// Match path to post or page for title
const slug = path.startsWith("/") ? path.slice(1) : path;
const post = posts.find((p) => p.slug === slug);
const page = pages.find((p) => p.slug === slug);
if (path === "/" || path === "") {
title = "Home";
pageType = "home";
} else if (path === "/stats") {
title = "Stats";
pageType = "stats";
} else if (post) {
title = post.title;
pageType = "blog";
} else if (page) {
title = page.title;
pageType = "page";
}
let title = path;
let pageType = "other";
return {
path,
title,
pageType,
views,
};
})
.sort((a, b) => b.views - a.views);
if (path === "/" || path === "") {
title = "Home";
pageType = "home";
} else if (path === "/stats") {
title = "Stats";
pageType = "stats";
} else if (post) {
title = post.title;
pageType = "blog";
} else if (page) {
title = page.title;
pageType = "page";
}
return {
path,
title,
pageType,
views,
};
});
const pageStats = (await Promise.all(pageStatsPromises)).sort(
(a, b) => b.views - a.views
);
return {
activeVisitors: activeSessions.length,
activeByPath,
totalPageViews: allViews.length,
uniqueVisitors: uniqueSessions.size,
totalPageViews: totalPageViewsCount,
uniqueVisitors: uniqueVisitorsCount,
publishedPosts: posts.length,
publishedPages: pages.length,
trackingSince,
@@ -247,3 +318,45 @@ export const cleanupStaleSessions = internalMutation({
},
});
/**
* Internal mutation to backfill aggregates from existing pageViews data.
* Run this once after deploying the aggregate component to populate counts.
* Uses idempotent insertIfDoesNotExist so it's safe to run multiple times.
*/
export const backfillAggregates = internalMutation({
args: {},
returns: v.object({
processed: v.number(),
uniqueSessions: v.number(),
}),
handler: async (ctx) => {
// Get all page views
const allViews = await ctx.db.query("pageViews").collect();
// Track unique sessions to avoid duplicate inserts
const seenSessions = new Set<string>();
let uniqueCount = 0;
// Process each view and update aggregates
for (const doc of allViews) {
// Insert into pageViewsByPath aggregate (one per view)
await pageViewsByPath.insertIfDoesNotExist(ctx, doc);
// Insert into totalPageViews aggregate (one per view)
await totalPageViews.insertIfDoesNotExist(ctx, doc);
// Insert into uniqueVisitors aggregate (one per session)
if (!seenSessions.has(doc.sessionId)) {
seenSessions.add(doc.sessionId);
await uniqueVisitors.insertIfDoesNotExist(ctx, doc);
uniqueCount++;
}
}
return {
processed: allViews.length,
uniqueSessions: uniqueCount,
};
},
});

View File

@@ -76,11 +76,11 @@ A brief description of each file in the codebase.
| `posts.ts` | Queries and mutations for blog posts, view counts |
| `pages.ts` | Queries and mutations for static pages |
| `search.ts` | Full text search queries across posts and pages |
| `stats.ts` | Real-time stats queries, page view recording, session heartbeat |
| `stats.ts` | Real-time stats with aggregate component for O(log n) counts, page view recording, session heartbeat |
| `crons.ts` | Cron job for stale session cleanup |
| `http.ts` | HTTP endpoints: sitemap, API (update SITE_URL/SITE_NAME when forking) |
| `rss.ts` | RSS feed generation (update SITE_URL/SITE_TITLE when forking) |
| `convex.config.ts` | Convex app configuration |
| `convex.config.ts` | Convex app configuration with aggregate component registrations (pageViewsByPath, totalPageViews, uniqueVisitors) |
| `tsconfig.json` | Convex TypeScript configuration |
### HTTP Endpoints (defined in `http.ts`)

10
package-lock.json generated
View File

@@ -8,6 +8,7 @@
"name": "markdown-site",
"version": "1.0.0",
"dependencies": {
"@convex-dev/aggregate": "^0.2.0",
"@mendable/firecrawl-js": "^1.21.1",
"@phosphor-icons/react": "^2.1.10",
"@radix-ui/react-icons": "^1.3.2",
@@ -351,6 +352,15 @@
"node": ">=6.9.0"
}
},
"node_modules/@convex-dev/aggregate": {
"version": "0.2.0",
"resolved": "https://registry.npmjs.org/@convex-dev/aggregate/-/aggregate-0.2.0.tgz",
"integrity": "sha512-AKHa6SilQ9w1gnqBghCbQj77sdE2SJSZ8n3FHH/0PX/MSHsvmDT6Z+ZzIVK6EnmqItwxyRAIfA2E+znnHQVvOQ==",
"license": "Apache-2.0",
"peerDependencies": {
"convex": "^1.24.8"
}
},
"node_modules/@esbuild/aix-ppc64": {
"version": "0.25.4",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.25.4.tgz",

View File

@@ -17,6 +17,7 @@
"deploy:prod": "npx convex deploy && npm run sync:prod"
},
"dependencies": {
"@convex-dev/aggregate": "^0.2.0",
"@mendable/firecrawl-js": "^1.21.1",
"@phosphor-icons/react": "^2.1.10",
"@radix-ui/react-icons": "^1.3.2",

View File

@@ -6,6 +6,105 @@ This document explains the real-time analytics system for the markdown site.
The stats page at `/stats` shows live visitor data and page view counts. All stats update automatically via Convex subscriptions. No page refresh required.
## Aggregate component (v1.15+)
Starting with v1.15, the stats system uses the `@convex-dev/aggregate` component for efficient O(log n) counts instead of O(n) table scans. This provides significant performance improvements as the page views table grows.
### Before (O(n) approach)
The old implementation collected all page views and iterated through them to calculate counts:
```typescript
// Old approach: O(n) full table scan
const allViews = await ctx.db
.query("pageViews")
.withIndex("by_timestamp")
.order("asc")
.collect();
// Manual aggregation by iterating through all documents
const viewsByPath: Record<string, number> = {};
const uniqueSessions = new Set<string>();
for (const view of allViews) {
viewsByPath[view.path] = (viewsByPath[view.path] || 0) + 1;
uniqueSessions.add(view.sessionId);
}
return {
totalPageViews: allViews.length,
uniqueVisitors: uniqueSessions.size,
};
```
Problems with this approach:
- Query time grows linearly with page view count
- Memory usage increases with table size
- Full table read on every stats query
- Slower response times as data grows
### After (O(log n) with aggregate component)
The new implementation uses the Convex aggregate component for denormalized counts:
```typescript
// New approach: O(log n) using aggregate component
const totalPageViewsCount = await totalPageViews.count(ctx);
const uniqueVisitorsCount = await uniqueVisitors.count(ctx);
const viewsPerPath = await pageViewsByPath.count(ctx, { namespace: path });
```
Benefits of the aggregate approach:
- O(log n) count operations regardless of table size
- Counts are pre-computed and maintained incrementally
- Minimal memory usage per query
- Consistent fast response times at any scale
### Aggregate definitions
Three TableAggregate instances track different metrics:
```typescript
// Total page views count (global count)
const totalPageViews = new TableAggregate<{
Key: null;
DataModel: DataModel;
TableName: "pageViews";
}>(components.totalPageViews, {
sortKey: () => null,
});
// Views by path (namespace per path for per-page counts)
const pageViewsByPath = new TableAggregate<{
Namespace: string;
Key: number;
DataModel: DataModel;
TableName: "pageViews";
}>(components.pageViewsByPath, {
namespace: (doc) => doc.path,
sortKey: (doc) => doc.timestamp,
});
// Unique visitors (sessionId as key for distinct count)
const uniqueVisitors = new TableAggregate<{
Key: string;
DataModel: DataModel;
TableName: "pageViews";
}>(components.uniqueVisitors, {
sortKey: (doc) => doc.sessionId,
});
```
### Backfill existing data
After deploying the aggregate component, run the backfill mutation once to populate counts from existing page views:
```bash
npx convex run stats:backfillAggregates
```
This is idempotent and safe to run multiple times. It uses `insertIfDoesNotExist` to avoid duplicates.
## Data flow
1. Visitor loads any page
@@ -147,7 +246,7 @@ useEffect(() => {
### recordPageView
Located in `convex/stats.ts`. Records view events with deduplication.
Located in `convex/stats.ts`. Records view events with deduplication and updates aggregate components.
Deduplication window: 30 minutes. Same session viewing same path within 30 minutes counts as 1 view.
@@ -174,12 +273,30 @@ export const recordPageView = mutation({
return null;
}
await ctx.db.insert("pageViews", {
// Check if this is a new unique visitor
const existingSessionView = await ctx.db
.query("pageViews")
.withIndex("by_session_path", (q) => q.eq("sessionId", args.sessionId))
.first();
const isNewVisitor = !existingSessionView;
// Insert new view event
const id = await ctx.db.insert("pageViews", {
path: args.path,
pageType: args.pageType,
sessionId: args.sessionId,
timestamp: Date.now(),
});
const doc = await ctx.db.get(id);
// Update aggregate components for O(log n) counts
if (doc) {
await pageViewsByPath.insertIfDoesNotExist(ctx, doc);
await totalPageViews.insertIfDoesNotExist(ctx, doc);
if (isNewVisitor) {
await uniqueVisitors.insertIfDoesNotExist(ctx, doc);
}
}
return null;
},
@@ -238,7 +355,7 @@ export const heartbeat = mutation({
### getStats
Returns all stats for the `/stats` page. Single query, real-time subscription.
Returns all stats for the `/stats` page. Single query, real-time subscription. Uses aggregate components for O(log n) counts instead of O(n) table scans.
What it returns:
@@ -246,12 +363,23 @@ What it returns:
|-------|------|-------------|
| activeVisitors | number | Sessions with heartbeat in last 2 minutes |
| activeByPath | array | Breakdown of active visitors by current page |
| totalPageViews | number | All recorded views since tracking started |
| uniqueVisitors | number | Count of distinct session IDs |
| totalPageViews | number | All recorded views since tracking started (via aggregate) |
| uniqueVisitors | number | Count of distinct session IDs (via aggregate) |
| publishedPosts | number | Blog posts with `published: true` |
| publishedPages | number | Static pages with `published: true` |
| trackingSince | number or null | Timestamp of earliest view event |
| pageStats | array | Views per page with title and type |
| pageStats | array | Views per page with title and type (per-path aggregate counts) |
### Aggregate usage in getStats
```typescript
// O(log n) counts using aggregate component
const totalPageViewsCount = await totalPageViews.count(ctx);
const uniqueVisitorsCount = await uniqueVisitors.count(ctx);
// Per-path counts using namespace
const views = await pageViewsByPath.count(ctx, { namespace: path });
```
### Title matching
@@ -331,7 +459,8 @@ No manual configuration required. Sync content, and stats track it.
| File | Purpose |
|------|---------|
| `convex/stats.ts` | All stats mutations and queries |
| `convex/stats.ts` | All stats mutations, queries, and aggregate definitions |
| `convex/convex.config.ts` | Aggregate component registration (pageViewsByPath, totalPageViews, uniqueVisitors) |
| `convex/schema.ts` | Table definitions for pageViews and activeSessions |
| `convex/crons.ts` | Scheduled cleanup job |
| `src/hooks/usePageTracking.ts` | Client-side tracking hook |
@@ -355,6 +484,7 @@ See `prds/howtoavoidwriteconflicts.md` for the full implementation details.
## Related documentation
- [Convex aggregate component](https://github.com/get-convex/aggregate)
- [Convex event records pattern](https://docs.convex.dev/understanding/best-practices/)
- [Preventing write conflicts](https://docs.convex.dev/error#1)
- [Optimistic concurrency control](https://docs.convex.dev/database/advanced/occ)