|
|
8c005c4f6c
|
two phase best icon selection with a temporary table
|
2026-05-25 20:55:29 -04:00 |
|
|
|
a819dabb57
|
updated number of async writers to warc_parse to accommodate faster db nvme write speeds
|
2026-05-25 19:41:25 -04:00 |
|
|
|
bfb7d8f883
|
updated gitignore
|
2026-05-25 19:40:42 -04:00 |
|
|
|
1afbc41599
|
automated ec2 setup and build
|
2026-05-25 18:29:37 -04:00 |
|
|
|
bf8b932cdc
|
switched from rds to i5 ec2 for nvme disk read/write speeds
|
2026-05-25 18:17:07 -04:00 |
|
|
|
c93d1736fe
|
tune unbound to take up less memory for our use case
|
2026-05-25 17:30:29 -04:00 |
|
|
|
cb8d23842c
|
better gated firefox specific code
|
2026-05-25 17:18:28 -04:00 |
|
|
|
8ceb31bcbb
|
increase bundlegen producer host amount to ensure workers aren't starved
|
2026-05-25 16:21:49 -04:00 |
|
|
|
4c7a0f54f7
|
disable keepalives so connections stop after data transfer complete
|
2026-05-25 16:20:31 -04:00 |
|
|
|
ca90b7071e
|
optimize db for bulk insert by turning off indexes and vacuum
|
2026-05-25 14:16:40 -04:00 |
|
|
|
eec486880a
|
about everytab tab bolded title
|
2026-05-21 01:02:08 -04:00 |
|
|
|
0f0acb642f
|
fixed firefox marquee rollover flicker
|
2026-05-21 00:56:50 -04:00 |
|
|
|
fe3d5f7039
|
speed is fixed, no longer dependent on viewport width
|
2026-05-21 00:44:00 -04:00 |
|
|
|
b53fd7844b
|
smoother marquee and fixed icon jitter/stutter in firefox
|
2026-05-21 00:38:33 -04:00 |
|
|
|
4fa40c7b47
|
improved write efficiency, though we are still bottlenecking on RDS - will switch to local postgres for future runs
|
2026-05-20 22:38:23 -04:00 |
|
|
|
baf657a8ed
|
updated PLAN.md and ARCHITECTURE.md with new instance type and performance concerns
|
2026-05-20 13:17:03 -04:00 |
|
|
|
b419b5bf6c
|
updated plan.md after 3M test
|
2026-05-20 13:14:06 -04:00 |
|
|
|
8dce702e8d
|
upped buffer sizes and switched to 2xlarge to increase speed
|
2026-05-20 12:59:12 -04:00 |
|
|
|
1df9a234cf
|
updated pipeline README to use compression and new flow
|
2026-05-20 11:54:48 -04:00 |
|
|
|
6352b9253f
|
upped swap to 8G
|
2026-05-20 11:54:17 -04:00 |
|
|
|
024e0513ba
|
upped icon downloading concurrency
|
2026-05-20 11:00:17 -04:00 |
|
|
|
91f48f249a
|
1T for ec2 hd
|
2026-05-20 10:19:02 -04:00 |
|
|
|
ead6366ed0
|
up ulimit for more connections
|
2026-05-20 10:18:48 -04:00 |
|
|
|
6d8ba61102
|
update warc parsing with new 3 stage producer, worker, consumer model, increasing speed and saturating cores
|
2026-05-20 10:18:15 -04:00 |
|
|
|
0efec72e45
|
print every 100 bundles
|
2026-05-20 10:17:35 -04:00 |
|
|
|
426abe1c90
|
upped concurrency of icon downloading
|
2026-05-20 09:47:18 -04:00 |
|
|
|
3bc355e503
|
improved bundle cli output with progress
|
2026-05-20 09:46:59 -04:00 |
|
|
|
86cff37533
|
download cc-index to home not tmp (which is tmpfs)
|
2026-05-20 09:35:06 -04:00 |
|
|
|
9308b5e039
|
download cc-index first with aws cli instead of streaming it
|
2026-05-20 08:14:22 -04:00 |
|
|
|
564919c5cc
|
added downloaded_at timestamp to icon table
|
2026-05-20 01:35:13 -04:00 |
|
|
|
ec33b2e857
|
bump up s3 warc retries to 6 to avoid 503 errors
|
2026-05-20 01:30:46 -04:00 |
|
|
|
081866f62e
|
update bundle gen to use channels and goroutines to saturate disk and not block on db access + bundle coalescing and uploading
|
2026-05-20 01:28:52 -04:00 |
|
|
|
902928235c
|
updated best icon selection logic
|
2026-05-20 01:15:08 -04:00 |
|
|
|
03e343a136
|
cap number of favicons to 50 per host
|
2026-05-20 00:53:24 -04:00 |
|
|
|
cd896427eb
|
shuffle icon link batches before putting them in the channel
|
2026-05-20 00:50:40 -04:00 |
|
|
|
27203ff085
|
updated bot rate
|
2026-05-20 00:50:17 -04:00 |
|
|
|
963d9209ca
|
cleaner dns error handling
|
2026-05-20 00:35:55 -04:00 |
|
|
|
c9ea462e97
|
check all CSP headers for iframe disallowing
|
2026-05-20 00:32:56 -04:00 |
|
|
|
a8177a1583
|
improve stats generation
|
2026-05-20 00:31:38 -04:00 |
|
|
|
0c9ad5bfd6
|
count iframes only if there isn't an error
|
2026-05-20 00:29:28 -04:00 |
|
|
|
3264288752
|
capped random favicons for frontend at 100
|
2026-05-20 00:17:12 -04:00 |
|
|
|
56ae26cbef
|
added bmp decoder to bundler
|
2026-05-20 00:11:53 -04:00 |
|
|
|
7d24b406aa
|
redundant min
|
2026-05-20 00:10:04 -04:00 |
|
|
|
eb40995c60
|
just overwrite bundles, don't delete then re-add
|
2026-05-20 00:09:53 -04:00 |
|
|
|
d6ef34a1dc
|
go mod tidy
|
2026-05-20 00:07:48 -04:00 |
|
|
|
258c6c5f3a
|
updated ARCHITECTURE.md
|
2026-05-19 23:46:06 -04:00 |
|
|
|
2f1547a912
|
switched bundle host field to url to retain http
|
2026-05-19 23:38:14 -04:00 |
|
|
|
7f36e99443
|
updated random value to double precision float
|
2026-05-19 23:37:50 -04:00 |
|
|
|
41c0eb5c49
|
updated PLAN.md with another 3M run to test code changes
|
2026-05-19 13:42:19 -04:00 |
|
|
|
a28cd2b056
|
updated pipeline README
|
2026-05-19 13:06:48 -04:00 |
|