PizzagateBot ago
I have about 2000 more posts to archive, and then I'll have a full backup, including JavaScript-expanded pages and comments, in warc.gz format. It's almost 30k posts and takes about 3-4 weeks running 24/7 to do it without getting blocked. I programmed it to be multi-threaded, but then I found out I was getting blocked with a captcha before long unless I throttled it to slower than a single thread can go anyway. I also screwed up the first backup I completed: I switched it to run headless halfway through and didn't notice that headless mode wasn't capturing the expanded comments, for whatever reason. It has been a good experience. I'll upload it somewhere whenever it finishes.
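For anyone wondering what that throttle works out to, here is a rough Python sketch. It is not the actual archiver code: `seconds_per_post` and `throttled` are made-up names for illustration, and it only assumes the numbers above (about 30k posts, 3-4 weeks, running 24/7) plus a single thread that waits between requests.

```python
import time


def seconds_per_post(total_posts, weeks):
    """Average delay per post if the crawl runs 24/7 for `weeks` weeks."""
    return weeks * 7 * 24 * 3600 / total_posts


def throttled(items, min_interval, sleep=time.sleep, clock=time.monotonic):
    """Yield items one at a time, at least `min_interval` seconds apart.

    `sleep` and `clock` are injectable so the pacing can be checked
    without actually waiting.
    """
    last = None
    for item in items:
        if last is not None:
            # Wait out whatever is left of the interval since the last item.
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item


if __name__ == "__main__":
    # Roughly one post per minute or slower, based on the figures above.
    print(round(seconds_per_post(30_000, 3), 1))  # 60.5
    print(round(seconds_per_post(30_000, 4), 1))  # 80.6
```

So the pace is somewhere around one page every 60-80 seconds, which is why extra threads don't help: a loop like `for url in throttled(urls, 60): ...` on one thread already sits idle most of the time.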
sensitive ago
@TruthGeek, good question, please ask in https://voat.co/v/AskPizzagate. Have to remove per rules 1 and 4.
jstrotha0975 ago
Glad people are keeping records, I don't have the patience to do that.
PizzagateBot ago
Hi! I created the following archive link(s) for this voat submission:
WARC files are created with https://webrecorder.io/
WARCs can be viewed offline with WARC replay tools like https://github.com/webrecorder/webrecorderplayer-electron
The final WARC will be created one week after posting.