1p0/n54 archive interlinks, errors, search status

User1

RETIRED Admin, pm OFF
There are 2 known 1p0 rip 'issues' file wise:

There was literally no way to get a "100% clean" rip under our circumstances, so what we got there really is about as good as I can ever get it to be for the actual automated rip portion. The archive is just so damn big that errors were unavoidable (in fact somewhat remarkable we even managed to finesse the old server into giving up any kind of a rip at all, let alone a flawless one)... but, 99% of the actual post content files DID get ripped successfully, so for the most part, "it's all there" on our server, content wise. There are maybe a couple/few hundred actual absent or 'garbage' post content files by my last estimate, out of somewhere like 150,000+ post files total. That's not a bad result when you look at the sum. We will find and manually replace any missing/bad pages over the course of time through normal usage probably.

The other issue is nav links that errored out on rip. For these, the contents did actually get ripped and are on the 1p0 server, and from the main numbered index pages, the trees for these threads are correctly linked. The glitch seems restricted to just the nav links INSIDE a post (oddly the trend seems to be mostly inside the topmost post on some random threads). Even here, if you really go dig around 1p0, you will see the huge majority of these did translate and are working, but yes, there are a significant number of them that didn't rip right for a variety of reasons... (including our very own best of :rolleyes: geez!)

I do know there was a period back in about '05(?) when N54 reconfigged all their links to be queries with ?s in them for a while. That whole block of urls could not be translated for linking, in spite of the fact we did actually get the files for them. As we find such links they will have to be manually edited to point to the correct 1p0 file locations. The random post files where the trees inside a post still point to the n54 location for whatever other reason will have to be manually fixed as they turn up as well.

For now, at least, if you click a 'bad' 1p0 link, most of them will transport you to the mirrored location on n54, so you DO still receive the content you clicked for (just from the wrong server) and you can just click your browser back button, or the 'return to 1p0' link at top-right in n54 posts, to nav back when you're done looking at that thread, similar to how we always navigated the old forum anyway...

We will overcome each of those hurdles, it's just going to be an ongoing process through time and use and targeted manual revision updates as such error pages are located... Really all these issues are workable in any case, IF/WHEN we can get a functional search engine running on the 1p0 archive - that is the single biggest 'issue' holding 1p0 back from assuming "51%" of the archive mantle at this stage - and this issue is in work as we speak.
 
this may also help

I do have UltraEdit now, so for starters it "should" be possible for me to run full searches on my hard drive for some html string like <a href="http://network54.com/* (twice, with and without www) and build up a definitive list at least of every file with a linking error, so we don't have to hunt them down blind. I may even be able to use this, coupled with an error log or two from when I made the sitemap index text files for google, to build up a whole comprehensive list of every error, including any absent/garbage files as well... While waiting for more word from the server host re: mnogosearch installation, I'll give UltraEdit here a shot and see what I can come up with to share...
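
For anyone who would rather script the same hunt than run it through UltraEdit, here is a rough Python sketch of the idea: walk a local copy of the rip and list every html file that still has an <a href> pointing at network54.com, with or without the www. The function name find_n54_links and the .html/.htm extensions are just my assumptions for illustration; nothing here is taken from the actual 1p0 setup, so adjust to whatever the rip really looks like.

```python
import re
import sys
from pathlib import Path

# Matches <a ... href="http://network54.com/..."> with or without "www."
# (also accepts https). The pattern is an assumption about how the links look.
N54_LINK = re.compile(
    r'<a\s+[^>]*href="(https?://(?:www\.)?network54\.com/[^"]*)"',
    re.IGNORECASE,
)


def find_n54_links(root):
    """Return {file path: [n54 urls]} for html files under root that still link to n54."""
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        # Assumed extensions; skip directories and anything that isn't html.
        if not path.is_file() or path.suffix.lower() not in {".html", ".htm"}:
            continue
        # errors="replace" so one oddly encoded page doesn't stop the scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        urls = N54_LINK.findall(text)
        if urls:
            hits[str(path)] = urls
    return hits


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for file_name, urls in find_n54_links(root).items():
        # One line per offending file: path, then how many n54 links it holds.
        print(f"{file_name}\t{len(urls)}")
```

Saved under some name like find_n54_links.py (hypothetical) and pointed at the folder holding the rip, it prints one line per file that needs fixing along with a count of n54 links in it. Since the pattern grabs the whole url, the old query-style links with ?s in them should get caught too.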
 
update

Mnogosearch is successfully installed on the 1p0 server! :clap:

Now I need to do a big bunch of config experimentation and set up the indexing database for it, then incorporate the search interface into the 1p0 pages in google's place, and we should be able to do something with it... more to come
 