everytab/pipeline/02_warc_parse
2026-05-25 19:41:25 -04:00
db.go improved write efficiency, though we are still bottlenecking on RDS - will switch to local postgres for future runs 2026-05-20 22:38:23 -04:00
log.go improve stats generation 2026-05-20 00:31:38 -04:00
main.go updated number of async writers in warc_parse to accommodate faster db nvme write speeds 2026-05-25 19:41:25 -04:00
parser.go cap number of favicons to 50 per host 2026-05-20 00:53:24 -04:00
process.go added warc parser 2026-05-17 20:25:59 -04:00
warc.go bump up s3 warc retries to 6 to avoid 503 errors 2026-05-20 01:30:46 -04:00