Have you ever wished for a time machine to collect evidence from the past, for example because you need a page from a website that has since been shut down? Then this little article is for you.
Here is a time machine for Internet homepages: http://wayback.com/
It is free.
Wayback provides archived copies of 339 billion (!) webpages, going back to 1996.
Simply type a URL, or even some search terms, into the search box, press Enter, and wait.
In the search result, use the timeline above the calendar to pick a year. Only the days highlighted with a circle contain an archived copy.
And here is what my firm's webpage looked like 20 years ago: http://bit.ly/Schweiger1999
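If you need to look up many pages, the manual search above can also be scripted. The following is only a minimal sketch: it assumes the Internet Archive's public availability endpoint at archive.org/wayback/available and the web.archive.org/web/<timestamp>/<url> link pattern, neither of which is named in this article, and the function names and the sample response are purely illustrative.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed endpoints (not taken from this article).
AVAILABILITY_API = "https://archive.org/wayback/available"
SNAPSHOT_PREFIX = "https://web.archive.org/web"


def availability_query(page_url: str, timestamp: str) -> str:
    """Build the query that asks for the capture closest to `timestamp` (YYYYMMDD)."""
    return AVAILABILITY_API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def snapshot_url(page_url: str, timestamp: str) -> str:
    """Build a direct link that should resolve to the capture nearest to `timestamp`."""
    return f"{SNAPSHOT_PREFIX}/{timestamp}/{page_url}"


def closest_snapshot(response_text: str):
    """Return (url, timestamp) of the closest capture, or None if none is reported."""
    closest = json.loads(response_text).get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def fetch_closest(page_url: str, timestamp: str):
    """Ask the (assumed) endpoint over the network; not called in this sketch."""
    with urllib.request.urlopen(availability_query(page_url, timestamp), timeout=30) as response:
        return closest_snapshot(response.read().decode("utf-8"))


# Hypothetical response, shaped the way the endpoint is assumed to answer.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/19990125000000/http://example.com/",
            "timestamp": "19990125000000",
            "status": "200",
        }
    }
})

print(availability_query("example.com", "19990101"))
print(closest_snapshot(SAMPLE_RESPONSE))
```

The timestamp in the result tells you which day the capture was actually taken, which is worth recording next to the link when you collect evidence.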
Wayback Can Provide Court-Proof Evidence
The Second Circuit, based in New York, was asked to review an appeal by an Italian computer hacker who sought to exclude screenshots of websites he ran, which tied him to the virus and botnet for which he was ultimately convicted. Prosecutors had taken the screenshots of his webpages from the Internet Archive and used them as trial evidence, and he wanted them thrown out.
The Italian computer hacker lost his appeal.
The Second Circuit ruling is in line with a similar ruling from the Third Circuit back in 2011 (United States v. Bansal), where a witness testified “from personal knowledge” how the Wayback Machine worked and how reliable it was. The court decided this provided “sufficient proof” that its mirrored pages were authentic.
That may help you one day.
What You Need to Know
The lag time for new material, i.e. new webpages and updates, to become automatically accessible is about six months. It takes that long because the crawler has to run through all 340 billion webpages and check each one for changes.
Wayback also has a feature for archiving any publicly accessible webpage on demand, immediately. This feature also works for online PDF files. Once the crawling and indexing is complete, a URL to the archived copy will be provided. Check out the manual for this feature if you need it.
I suggest that you submit all infringing webpages that you want to monitor. I do that routinely, and I re-archive them on a regular basis.
(Please note that you can also archive a webpage on your own computer. My little article about that is here: https://www.linkedin.com/pulse/online-content-evidence-secured-easy-way-free-entire-schweiger/)
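Re-submitting a watch list of pages by hand gets tedious, so it could be scripted roughly as follows. This is a sketch under assumptions: it supposes that on-demand archiving can be triggered by requesting https://web.archive.org/save/ followed by the page address, and that the final address after redirects points at the new capture. Neither detail comes from this article, so check the manual before relying on it. The function names and the pause between requests are my own choices.

```python
import time
import urllib.request

# Assumed on-demand archiving endpoint (not taken from this article).
SAVE_PREFIX = "https://web.archive.org/save/"


def save_request_url(page_url: str) -> str:
    """Build the address that is assumed to trigger a capture of `page_url`."""
    return SAVE_PREFIX + page_url


def archive_pages(page_urls, pause_seconds=10.0, opener=urllib.request.urlopen):
    """Request a capture for each page.

    Returns a dict mapping each page to the final URL after redirects,
    or to a short error text if the request failed.
    """
    results = {}
    for index, page_url in enumerate(page_urls):
        if index:
            time.sleep(pause_seconds)  # be polite: do not hammer the service
        try:
            with opener(save_request_url(page_url), timeout=120) as response:
                results[page_url] = response.geturl()
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            results[page_url] = f"failed: {error}"
    return results
```

Calling archive_pages(["http://example.com/infringing-page"]) once a month, for instance from a scheduled task, would keep a dated trail of captures. The opener parameter exists only so that the function can be tried out without touching the network.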
If you create a user account on Wayback, you can also archive your own material, like images, audio files, and text files, even if they are not on a webpage. This makes you immortal, in a way. I wish I could have done that with my first Master's thesis, from 1993. It was quite spectacular for the time when I wrote it, and it would be a nice piece of prior art in the area of robotics, but today it is buried somewhere in a cardboard box in a warehouse in my old hometown.
This archiving also works for old software. Check out the “Classic PC Games” department. There you can find what you played 20 years ago. You can download it, install it on your computer today, and play it again.
How to Delete an Archived Webpage
In principle, yes, you can have a page deleted from the archive. Wayback says in its terms and conditions that if asked by an author or publisher, it “may remove that portion of the Collections without notice.” Its FAQ says that site owners can “send an email request for us to review”.
The problem is that delete requests are not always granted.
How to Exclude a Webpage from Being Archived
Wayback has based its approach to exclusion requests on a policy created by UC Berkeley (archived version here). Under this policy, archivists should provide a ‘self-service’ approach that site owners can use to remove their materials using robots.txt files.
Robots.txt files are instructions left on sites for crawlers, telling them what they should not look at. Under the policy, a site owner could simply add one of these files at the top level of their site with a specific instruction for the Internet Archive, and then submit their site using a form.
The problem is that this mechanism does not work all the time.
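To see what such a robots.txt instruction looks like, and to check which crawlers a given file turns away, a small test with Python's standard library is enough. This is only an illustration: the crawler name ia_archiver is my assumption for how the Internet Archive's crawler identifies itself, and the helper is_blocked is a name I made up. As said above, a rule that parses correctly is no guarantee that it will be honored.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt placed at the top level of a site.
# "ia_archiver" is an assumed crawler name, not confirmed by this article.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, page_url: str) -> bool:
    """Return True if the rules in `robots_txt` forbid `user_agent` from fetching `page_url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, page_url)


print(is_blocked(ROBOTS_TXT, "ia_archiver", "https://example.com/page.html"))
print(is_blocked(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page.html"))
```

The first check reports the page as blocked for the named crawler, while the second crawler is not affected, because the file contains no rule that applies to it.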
Deep Pockets May Be Able to Shut Down Wayback
Wayback is a non-profit organization that had about $17.5m in revenue in 2016. One big lost lawsuit could finish it. On the other hand, a nice donation may help it get through a temporary financial drought.
That may be an important piece of information if one day you have to work on a delicate and sensitive case.
A word to the wise is sufficient.
Of Course, Google Offers a FREE Alternative
The alternative is Google Cache: https://cachedviews.com/
Google crawls the web and takes a snapshot of each page as a backup in case the current page is not available. These snapshots then become part of Google’s cache. The cached pages can be extremely useful: if a site is temporarily down, you can still access its pages by visiting Google’s cached version.
Call to Action
Try it out today.
Open a free Wayback account.
Check out the earliest available version of your firm's webpage. Then create a Wayback snapshot of the webpage as it is today.
Then try out Google CachedViews.
Martin “find again” Schweiger