rewrote icon selection in english rather than sql
parent 5a2e37ae06
commit 6cf6049698
1 changed files with 15 additions and 30 deletions
@@ -276,39 +276,24 @@ WHERE url_path = '/'

 **Tool:** SQL script

-**Process:** For each host, select the best icon from its completed downloads:
+**Process:** For each host, select the best icon from all its completed downloads.

-```sql
-UPDATE hosts h SET best_icon_s3_key = (
-SELECT i.s3_key FROM icons i
-WHERE i.host_id = h.id
-AND i.scan_state = 'completed'
-ORDER BY
--- Prefer standard square sizes
-CASE
-WHEN i.width = i.height AND i.width IN (64, 48, 32, 16) THEN 0
-WHEN i.width = i.height AND i.width <= 64 THEN 1
-WHEN i.width <= 64 AND i.height <= 64 THEN 2
-ELSE 3
-END,
--- Among valid options, prefer larger
-i.width DESC,
--- Prefer PNG/GIF/ICO over SVG/WebP for simpler processing
-CASE
-WHEN i.content_type IN ('image/png', 'image/gif', 'image/x-icon', 'image/vnd.microsoft.icon') THEN 0
-WHEN i.content_type IN ('image/webp') THEN 1
-WHEN i.content_type IN ('image/svg+xml') THEN 2
-ELSE 3
-END,
--- Smaller file size as tiebreaker
-i.file_size ASC
-LIMIT 1
-);
-```
+**Selection priority (decision flow):**

-**Note on SVG/WebP:** These are downloaded and stored during scanning but are lower priority for bundle selection. Rasterizing SVG to PNG adds complexity; WebP re-encoding to PNG may increase size. If a host ONLY has SVG/WebP icons, we still use them (convert in bundle generation). But if PNG/GIF/ICO alternatives exist, prefer those.
+1. Standard square sizes (32x32, 64x64, 48x48, 16x16) — ideal for tab display. Prefer larger.
+2. Other square sizes ≤64px — close enough. Prefer larger.
+3. Non-square but both dimensions ≤64px — acceptable. Prefer larger.
+4. Everything else (180x180, 192x192, SVG with no dimensions, etc.) — last resort, will be downscaled in bundle generation.

-**Stats emitted:** Hosts with icons selected, hosts without any icon, icon size distribution, format distribution of selected icons.
+Within the same tier: prefer PNG/GIF/ICO over WebP over SVG, then smaller file size as tiebreaker.

+Does not distinguish between `favicon_ico` and `link_rel` sources — purely based on what was actually downloaded and its dimensions/format.
+
+Uses `DISTINCT ON (host_id)` for efficient single-pass selection. See `pipeline/04_best_icon/select.sql`.
+
+**Note on SVG/WebP:** Lower priority because rasterizing SVG adds complexity and WebP-to-PNG re-encoding may increase size. Only selected when no raster alternatives exist.
+
+**Stats emitted:** Hosts with icons selected, hosts without any icon.
+
 ### Stage 5: Bundle Generation

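The added text points to `pipeline/04_best_icon/select.sql` for the `DISTINCT ON (host_id)` query, but that file is not part of this diff. The following is only a sketch of how the described priority might be written in PostgreSQL, reusing the table and column names from the removed inline query (`hosts`, `icons`, `s3_key`, `scan_state`, and so on). The CTE name `best` and the `NULLS LAST` modifier are assumptions made here for illustration; the real script may differ.

```sql
-- Sketch only: not the contents of pipeline/04_best_icon/select.sql.
-- "best" is a hypothetical CTE name.
WITH best AS (
    -- DISTINCT ON keeps the first row per host_id under the ORDER BY below.
    SELECT DISTINCT ON (i.host_id)
        i.host_id,
        i.s3_key
    FROM icons i
    WHERE i.scan_state = 'completed'
    ORDER BY
        i.host_id,
        -- Tier: standard squares, other small squares, small non-squares, rest
        CASE
            WHEN i.width = i.height AND i.width IN (64, 48, 32, 16) THEN 0
            WHEN i.width = i.height AND i.width <= 64 THEN 1
            WHEN i.width <= 64 AND i.height <= 64 THEN 2
            ELSE 3
        END,
        -- Within a tier, prefer larger. NULLS LAST is an assumption: without
        -- it, PostgreSQL sorts NULL widths first under DESC.
        i.width DESC NULLS LAST,
        -- Prefer PNG/GIF/ICO, then WebP, then SVG
        CASE
            WHEN i.content_type IN ('image/png', 'image/gif',
                                    'image/x-icon', 'image/vnd.microsoft.icon') THEN 0
            WHEN i.content_type = 'image/webp' THEN 1
            WHEN i.content_type = 'image/svg+xml' THEN 2
            ELSE 3
        END,
        -- Smaller file size as the final tiebreaker
        i.file_size ASC
)
UPDATE hosts h
SET best_icon_s3_key = best.s3_key
FROM best
WHERE h.id = best.host_id;
```

One behavioral difference from the removed correlated subquery: this form leaves hosts with no completed icons untouched, whereas the old query set their `best_icon_s3_key` to NULL. Whether that matters depends on whether the column can hold stale values from an earlier run.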