updated PLAN.md, finished phase 5

Joe Lothan 2026-05-18 00:26:50 -04:00
parent 4963866427
commit e5035d9a28

PLAN.md

@ -213,80 +213,36 @@ Binary: `pipeline/05_bundle_gen/` (6 files: main.go, bundle.go, convert.go, db.g
---
## Phase 5: Frontend (Stage 6)
## Phase 5: Frontend (Stage 6) [COMPLETED — v1]
Begins after Phase 4 is complete — we use real bundle data from the 100K pipeline run for frontend development.
### Steps 5.1-5.6 [COMPLETED]
### Step 5.1: Local Dev Server
Files: `frontend/index.html` and `frontend/site.js`
Serve the generated bundles from S3 locally for frontend development:
**Architecture:**
- Vanilla JS, no framework. Two files: HTML (with inline CSS) + JS.
- Fetches random bundle JSONs from `tabs/{N}.json`, renders tabs as rows filling the viewport.
- Seeded PRNG (mulberry32 seeded with `Date.now()`), so each visitor sees a different tab arrangement.
- Infinite scroll: loads more bundles as user approaches the bottom.
- Tracks loaded bundle IDs in a Set to avoid duplicates.
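
A minimal sketch of the seeded selection described above, assuming bundle indices run from `0` to `TOTAL_BUNDLES - 1` (the plan does not state the numbering). `mulberry32` is the well-known public-domain generator. `nextBundleId` and its linear-probe fallback are hypothetical names and choices for illustration, not necessarily what `site.js` does.

```javascript
// mulberry32: small 32-bit seeded PRNG; returns a function yielding floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0; // keep only the low 32 bits (Date.now() is larger than 2^32)
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: picks a bundle index not yet in `loaded`, records it,
// and returns it. Returns null once every bundle has been loaded.
// Assumption: indices are 0-based.
function nextBundleId(rand, totalBundles, loaded) {
  if (loaded.size >= totalBundles) return null;
  let id = Math.floor(rand() * totalBundles);
  // Linear probe past already-loaded ids so the call always terminates.
  while (loaded.has(id)) id = (id + 1) % totalBundles;
  loaded.add(id);
  return id;
}
```

In the page this would be driven by something like `const rand = mulberry32(Date.now())`, with each returned id fetched from `tabs/{N}.json`. The `null` return covers the "all bundles loaded" edge case.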
```bash
# Sync a few bundles locally for testing
aws s3 sync s3://everytab-site/tabs/ ./local-tabs/ --max-items 10
# Serve with any static file server
python -m http.server 8000
```
**Tab rendering:**
- Browser-specific tab styling via `navigator.userAgent` detection (Chrome, Firefox, Safari).
- Inactive tab appearance by default, selected/active style when iframe is open.
- Light mode default, auto-switches to dark mode via `prefers-color-scheme`.
- Bidirectional marquee: each row randomly scrolls left or right at different speeds (90-150s per cycle).
- Tabs duplicated in DOM for seamless marquee loop (`translateX(-50%)`).
- Hover shows full title as native tooltip.
- External link indicator (↗) on tabs that don't allow iframes.
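
The browser detection above could look roughly like this. It is a sketch under assumptions: `detectBrowser` is a hypothetical name, the returned labels are illustrative, and falling back to Chrome styling for unrecognized agents is a guess rather than something the plan specifies.

```javascript
// Hypothetical helper: maps a user-agent string to one of the three tab styles.
function detectBrowser(ua) {
  if (/Firefox\//.test(ua)) return "firefox";
  // Chromium-based browsers also advertise "Safari/", so test for Chrome first.
  if (/Chrome\/|Chromium\/|CriOS\//.test(ua)) return "chrome";
  if (/Safari\//.test(ua)) return "safari";
  return "chrome"; // assumed default when nothing matches
}
```

A caller might store the result on the root element, for example `document.documentElement.dataset.browser = detectBrowser(navigator.userAgent)`, and key the tab CSS off that attribute. The order of the checks matters because the Chrome user agent contains both `Chrome/` and `Safari/`.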
**Done when:** Can fetch real bundle JSON from a local dev server.
**Iframe viewer:**
- Inline, not overlay — opens between tab rows, pushes content down (75vh height).
- Header shows favicon, title, external link, and close button.
- Sandboxed iframe (`allow-scripts allow-same-origin allow-forms`).
- Close via X button, Escape key.
- Only one viewer open at a time.
### Step 5.2: Basic Tab Rendering
Build `frontend/index.html` and `frontend/site.js`:
1. HTML: minimal shell with a container div, inline CSS for tab styling
2. JS: fetch a bundle, render tabs as rows filling the viewport
3. Tab appearance: mimic Firefox tab shape (rounded top corners, slight border)
4. Each tab shows favicon (16x16 or 32x32 img from data URI) + truncated title
5. No-icon tabs show title only
Focus: get the visual density right. How many tabs fit across? How many rows fill the viewport? This determines `ENTRIES_PER_BUNDLE`.
**Done when:** Page renders tabs from a mock bundle. Visually looks like a page full of browser tabs.
### Step 5.3: Marquee Animation
Add horizontal marquee to each row:
- CSS `@keyframes` animation, translateX
- Each row at slightly different speed and direction (some left, some right)
- Smooth, subtle movement — not distracting, just enough to feel alive
- Rows need extra tabs beyond viewport width to avoid gaps during scroll
**Done when:** Rows scroll smoothly, no visual glitches at edges.
### Step 5.4: Interaction — Click, Iframe, Close
Implement tab click behavior:
1. If `iframe_ok`: show an overlay with iframe loading the site (`{protocol}://{hostname}`)
2. If `!iframe_ok`: open in a new tab (`target="_blank"` with `rel="noopener"`)
3. Visual indicator on tabs that will open externally (small icon/badge)
4. Close overlay: X button + click-outside + Escape key
**Done when:** Clicking tabs works correctly for both iframe and external cases.
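
The click decision in this step can be sketched as a pure function. The entry field names `protocol` and `hostname` are assumptions inferred from the `{protocol}://{hostname}` placeholder, and `clickAction` is a hypothetical helper. Only `iframe_ok`, the sandbox flags, and the link attributes come from the plan.

```javascript
// Hypothetical helper: decides what a tab click should do for one bundle entry.
// Assumed entry shape: { protocol, hostname, iframe_ok }.
function clickAction(entry) {
  const url = `${entry.protocol}://${entry.hostname}`;
  if (entry.iframe_ok) {
    // Embeddable site: load it in the sandboxed iframe viewer.
    return { kind: "iframe", src: url, sandbox: "allow-scripts allow-same-origin allow-forms" };
  }
  // Site refuses framing: open it in a new browser tab instead.
  return { kind: "external", href: url, target: "_blank", rel: "noopener" };
}
```

Keeping the decision separate from the DOM code makes both branches easy to check without a browser. The rendering code then only has to apply the returned attributes.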
### Step 5.5: Infinite Scroll + Random Bundle Loading
Implement:
1. Seeded PRNG using `Date.now()` — generates deterministic sequence of bundle indices
2. On page load: fetch first bundle, render
3. Scroll detection: when user approaches bottom, fetch next random bundle
4. Track loaded bundle IDs in a Set (no duplicates)
5. Append new rows below existing ones
6. Handle edge case: all bundles loaded (unlikely with 50K+ bundles but handle gracefully)
`TOTAL_BUNDLES` is a constant baked into the JS at build time.
**Done when:** Infinite scroll works, new bundles load seamlessly, no duplicate bundles.
### Step 5.6: Frontend Build Script
Write `pipeline/06_frontend/build.sh`:
1. Read total bundle count (from pipeline output or S3)
2. Inject `const TOTAL_BUNDLES = {M};` into site.js
3. Copy index.html + site.js to S3 `everytab-site/`
4. Invalidate CloudFront (if distribution exists)
**Done when:** Build script produces deployable frontend with correct bundle count.
**`TOTAL_BUNDLES`** is meant to be baked into the HTML at build time. The build script (`pipeline/06_frontend/build.sh`) is still TODO, so the value is currently hardcoded.
---
@ -570,10 +526,26 @@ On completion, each program prints a summary line and writes its stats JSON (wit
- Bundle sizes are very heterogeneous (39KB to 198KB) due to icon size variance. Average 216KB is well within our target.
- SVG favicons are ~3.5% of downloaded icons (5,128 out of 156K). Supporting SVG rasterization would recover ~1,077 hosts. Deferred to future improvement.
### Phase 5 — Completed 2026-05-18
**Changes from original plan:**
- Inline iframe viewer instead of full-screen overlay. Opens between tab rows, pushes content down (75vh).
- Browser-specific tab styling (Chrome/Firefox/Safari) via userAgent detection — original plan deferred this to v2.
- Light/dark mode via `prefers-color-scheme` — original plan just targeted Firefox dark theme.
- No progress bar in any Go program — per-item log lines + summary at end is the pattern across the project.
- `TOTAL_BUNDLES` hardcoded in HTML for now — build script (Step 5.6) still TODO.
**Lessons learned:**
- CSS marquee with alternating directions needs care: right-scrolling rows must start at `translateX(-50%)` and animate to `0`, not the reverse. Both directions use the same duplicated DOM structure.
- `width: max-content` on the tab row is essential: without it, the flex container is constrained to the viewport width, so the percentage-based `translateX` resolves against the wrong width.
- Tab hover expansion (removing max-width) causes layout shifts that make neighboring tabs impossible to click. Native tooltip (`tab.title`) is simpler and has no side effects.
- Hundreds of animating DOM elements cause frame drops on weaker GPUs. `will-change: transform` helps but slower animation speeds help more.
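
The direction rule from the first lesson can be sketched as follows. The project uses CSS `@keyframes`; this sketch swaps in the equivalent Web Animations API keyframe objects so the start and end transforms are explicit. `marqueeAnimation` and `randomRowMarquee` are hypothetical names, and the even left/right split is an assumption.

```javascript
// Hypothetical helper: keyframes and timing for one marquee row whose tabs are
// duplicated in the DOM, so -50% corresponds to exactly one copy of the content.
function marqueeAnimation(direction, durationSeconds) {
  const atStart = { transform: "translateX(0)" };
  const atHalf = { transform: "translateX(-50%)" };
  // Left-scrolling rows go 0 -> -50%; right-scrolling rows go -50% -> 0.
  const keyframes = direction === "left" ? [atStart, atHalf] : [atHalf, atStart];
  return {
    keyframes,
    options: { duration: durationSeconds * 1000, iterations: Infinity, easing: "linear" },
  };
}

// Picks a random direction and a cycle length in the 90-150 s range.
function randomRowMarquee(rand = Math.random) {
  const direction = rand() < 0.5 ? "left" : "right"; // assumed 50/50 split
  const seconds = 90 + rand() * 60;
  return marqueeAnimation(direction, seconds);
}
```

A row element could then be animated with `row.animate(cfg.keyframes, cfg.options)`, where `cfg` is the returned object. Both directions rely on the same duplicated DOM structure and on `width: max-content`, so that `-50%` equals the width of one copy.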
---
## Future Improvements
### Pipeline
- **WARC parser: retry on fetch errors** — Currently 3 fetch errors out of 100K (tolerable loss). Could add 1 retry with backoff for transient S3 errors.
- **WARC parser: batch DB inserts** — Currently one INSERT per icon. Using pgx batch or CopyFrom could improve DB write throughput and potentially unblock higher concurrency.
- **WARC parser: investigate throughput ceiling** — 300 hosts/sec at both 100 and 500 concurrency suggests a bottleneck. Profile to determine if it's S3 response latency, Postgres writes, or something else. For the full 30M run this determines wall-clock time (~28 hours at current rate).
@ -583,3 +555,10 @@ On completion, each program prints a summary line and writes its stats JSON (wit
- **Icon download: download large link_rel icons** — Currently skipping declared sizes >64x64. Re-run with broader filter for future high-res projects.
- **Bundle gen: SVG rasterization** — ~1,077 hosts have SVG-only favicons. Could add `rsvg-convert` or a Go SVG library to rasterize these.
- **Bundle gen: smarter downscaling** — Currently nearest-neighbor to 32x32 for >128px icons. Could use bilinear/Lanczos for better quality, or preserve aspect ratio for non-square icons.
### Frontend
- **Performance: reduce DOM / animation cost** — Pause marquee animation on off-screen rows (IntersectionObserver). Virtualize rows to reduce total DOM element count.
- **Cross-browser tab styling** — Polish Chrome/Firefox/Safari tab appearances to more closely match real browser tabs. Test on actual browsers, use screenshots as reference.
- **Mobile layout** — Current design assumes desktop viewport. Need responsive tab sizing and touch-friendly interaction.
- **Build script** — `pipeline/06_frontend/build.sh` to inject `TOTAL_BUNDLES` and deploy to S3 + CloudFront invalidation.
- **Stats page** — Serve `stats.json` and render pipeline stats (host count, icon coverage, crawl date) on the site.