We explore serving original wikitext from static files and rendering as html with javascript. Our experiments with small-client-wiki encourage us. If served from github, we could accept pull requests.
We copy the latest markup for all pages. This takes about four hours with the raid resync competing for io.
ssh c2.com \
  'cd cgi; tar czf - wiki.wdb' | \
  tar xzvf - wiki.wdb
We try serving this to the client but find the GS group separator characters don't make it through jquery.get.
In perl we split page text into alternating keys and values with a split on $SEP. In retrospect it might be the 0200 (high) bit of that byte that is causing trouble.
my $SEP = "\263"; # something unlikely
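A minimal ruby sketch of why a byte like this could go missing, assuming the client decodes the response as UTF-8 (we have not confirmed that this is what jquery.get does here). Octal 263 has the high bit set, and on its own it is not a valid UTF-8 sequence, so a decoder would typically substitute U+FFFD for it. The helper names are ours, for illustration only.

```ruby
# Sketch only: shows how a lone \263 byte fares under UTF-8 decoding.
# Assumption: the receiving side treats the payload as UTF-8 text.
SEP = "\263".b # octal 263 == 0xB3, the 0200 bit is set

# True when the first byte of the string has its high (0200) bit set.
def high_bit_set?(byte_string)
  (byte_string.getbyte(0) & 0200) != 0
end

# Decode raw bytes as UTF-8, replacing invalid sequences with U+FFFD.
def decode_as_utf8(bytes)
  bytes.dup.force_encoding("UTF-8").scrub
end

sample  = "text".b + SEP + "hello".b
decoded = decode_as_utf8(sample)

puts high_bit_set?(SEP)
puts decoded.include?("\uFFFD")
```

If this is what happens in the browser, the separator arrives as a replacement character and a split on the original byte no longer matches.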
I decide to replace these with a c program.
while ((ch=getchar())>0) {
  if (ch==0263) {
    putchar('<'); ... putchar('>');
  } else {
    putchar(ch);
  }
}
I then convert this to a ruby hash, which I output as json.
Hash[raw.split(/<<<<gs>>>>/).each_slice(2).to_a]
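The split above can be sketched end to end as follows. This is an illustration under stated assumptions, not the actual conversion script: the marker text is taken from the regex above, the method name `page_hash` is ours, and the sample record with its `text` field is made up.

```ruby
require "json"

# Marker substituted for the separator byte, as in the split above.
MARKER = "<<<<gs>>>>"

# Turn "key<M>value<M>key<M>value" into a hash of key => value.
def page_hash(raw)
  Hash[raw.split(MARKER).each_slice(2).to_a]
end

# Hypothetical record; the field names are illustrative only.
raw  = ["rev", "3", "text", "Hello world"].join(MARKER)
page = page_hash(raw)
json = JSON.generate(page)

puts json
```

One thing to watch: split drops trailing empty strings, so a record whose last value is empty would leave a key without a partner. Passing a negative limit, as in `raw.split(MARKER, -1)`, keeps the empty field.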
A complete conversion takes six minutes.
real 6m21.116s
user 3m12.223s
sys 3m40.350s
The json format lets us collect statistics with jq pipelines like this study of rev numbers.
(cd static/pages
 ls | while read i
 do cat $i
 done) | \
 jq '.rev' | \
 sort | uniq -c | sort -n
Small revs are the most frequent.
1365 "9"
1372 "8"
1616 "7"
1803 "6"
2029 "5"
2272 "4"
2298 "1"
2452 "2"
2514 "3"
With some adjustments we get the largest rev counts.
1 4520
1 4544
1 5353
1 7095
1 10023
1 15008
1 157406
1 736411
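The same kind of rev study can be sketched in ruby. This is only an approximation of the pipeline above, assuming each page file is a json object whose `rev` is a string, as the quoted values suggest. The sample documents are stand-ins for the files under static/pages, and `rev_counts` is a name we made up. Converting rev to an integer makes the ordering numeric rather than textual.

```ruby
require "json"

# Count how often each rev occurs, least frequent first,
# in the spirit of `sort | uniq -c | sort -n`.
def rev_counts(pages)
  counts = Hash.new(0)
  pages.each { |page| counts[page["rev"].to_i] += 1 }
  counts.sort_by { |rev, n| [n, rev] }
end

# Stand-in data; real input would be read from static/pages.
docs  = ['{"rev":"3"}', '{"rev":"1"}', '{"rev":"3"}', '{"rev":"12"}']
pages = docs.map { |doc| JSON.parse(doc) }

rev_counts(pages).each { |rev, n| puts format("%7d %d", n, rev) }
```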
# Remodel
In October 2016 we had a second disk error that required resyncing the raid array. With ATA in userspace? this is really slow, taking days. I shut down apache to give it more io bandwidth. When I turned it back on several days later I was pounded by new traffic, probably folks cloning a copy of wiki in case it disappeared permanently. I shut down just the wiki to get some relief, then announced a prolonged closure.
I set up a repo and described my plan in three phases, with checkbox items in issues #1, #2 and #3. I hoped to get through the first over the weekend, but it took me a week, committing and commenting most every day. By Sunday a week later I was making improvements one would expect from a remodel, not all of them in the checklists.
# Restoration