Frequently Asked Questions

I just downloaded a site, but only the home page works

You are probably viewing the website on your own computer, or you forgot to upload the .htaccess file to your server. Use our step-by-step installation guide to see a working website.

We use a file called “.htaccess” to rewrite URLs. This means that a request for a URL such as example.com/page/ or example.com/page.php is actually answered with an HTML file. However, you won't be able to see this in your browser: the content stays accessible on the “pretty URL”. This technique is useful for SEOs who want incoming backlinks to keep pointing to the original URL.
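To illustrate the idea, here is a minimal Python sketch of the lookup that such a rewrite performs: the requested URL is mapped to a stored HTML file while the address in the browser stays the same. The function name and the mapping scheme (page/ to page/index.html, page.php to page.php.html) are assumptions made for illustration only; they are not the actual rules in the .htaccess file we ship.

```python
from pathlib import PurePosixPath


def rewrite_target(url_path: str) -> str:
    """Return the (hypothetical) HTML file that answers a requested URL path."""
    # Directory-style URLs ("/" or "/page/") map to an index file in that directory.
    if url_path.endswith("/"):
        return url_path.lstrip("/") + "index.html"

    path = PurePosixPath(url_path.lstrip("/"))

    # Files that are already HTML are served unchanged.
    if path.suffix in (".html", ".htm"):
        return str(path)

    # Script-style URLs such as "page.php" are answered with a saved HTML snapshot.
    return str(path) + ".html"


for requested in ("/", "/page/", "/page.php", "/about.html"):
    print(requested, "->", rewrite_target(requested))
```

Without a server that applies this kind of mapping, a request for page.php finds no file of that name, which is why only the home page appears to work when you open the files directly on your own computer.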

There is also a surprising number of people who mix up the demo version and the paid version. If you ordered both, make sure to use the files from the paid version. And yes, that seems really obvious.

Note that an .htaccess file only works on Apache servers. We can translate this file to work on other servers, such as Microsoft IIS or Nginx, for a small additional fee. More than 95% of our customers have a hosting account that runs on Apache, so this is rarely needed.

Can you recover the original PHP/CFM/ASPX files?

PHP is a server-side scripting language that normally generates HTML pages.

This means you need access to the backend to download PHP files. The Internet Archive never had access to the server/backend of a website, so it does not have these PHP files.

The only thing the archive has is the HTML output that was generated by the PHP file. We can restore this under the old .php URL, but it is technically an HTML file.

WordPress is also written in PHP, so for the WordPress integration we reverse engineer the PHP pages. These pages will not always function exactly as they did in the original website, but they will look the same.

WordPress also uses a MySQL database, which we reverse engineer as well if you order the WordPress conversion.

Why can large websites take several hours to scrape?

There are a few reasons why large websites are slow:

  • The Internet Archive is slow.
  • We triple-check every broken link to make sure that it is indeed a broken link and not a broken archive.org server. This means that websites with many broken links in particular tend to be slow.
  • The archive blocks our IP if we scrape too fast. We are in direct contact with the Internet Archive, and they asked us to use a custom User Agent so they can track our behavior. They have the legal right to shut us down, so we have to be nice to them.
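The last two points can be sketched roughly as follows. This is a simplified Python illustration, not our production scraper: the function name, the User Agent string, the retry count default and the delay are all assumptions chosen for the example, and the HTTP request itself is passed in as a hypothetical `fetch_status` callable so the retry logic can be read on its own.

```python
import time
from typing import Callable

# Hypothetical User Agent string; the real value agreed with the archive is not given here.
USER_AGENT = "ExampleRecoveryBot/1.0 (+https://example.com/contact)"


def is_really_broken(
    url: str,
    fetch_status: Callable[[str, str], int],
    attempts: int = 3,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Report a link as broken only if every attempt returns an error status."""
    for attempt in range(attempts):
        status = fetch_status(url, USER_AGENT)
        if status < 400:
            # A single successful response is enough: the link works.
            return False
        if attempt < attempts - 1:
            # Pause before retrying so the archive is not hit too fast.
            sleep(delay_seconds)
    return True
```

Every broken link costs up to three requests plus the pauses in between, which is why a site with many broken links takes noticeably longer to process.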

If you want to help the Internet Archive, then support them with a donation, so they can invest in faster infrastructure. They are good guys who rely on open source techniques, which is unfortunately one of the reasons their speed is limited. We give a free recovery if you send us a screenshot of a donation of $25 or more.

What about copyrights?

We don't know the laws of every country. Republishing archived content is probably illegal in some countries.

However, for most websites the copyright is credited in the footer to the domain, such as "Copyright © example.com". So if you own the domain, you could make a case for owning the copyright.

It's a grey area for sure, and there is probably nobody who can tell you with certainty, since there is not much legal precedent.

For expired content, it's unlikely anybody will ever find out or care. We make sure the content is not published elsewhere on the internet, so it will be difficult for the other party to provide evidence of financial damages.

In reality, most content on the Internet is duplicated elsewhere, so it's unlikely that these types of borderline cases will cause you any problems. The worst-case scenario is receiving an official DMCA takedown request. You then have to remove the content before a certain date to avoid legal action.

We have never heard from a customer who actually got into legal trouble because of using our services.