Question 1

I just downloaded a site, but only the home page works

Accepted Answer

You are probably viewing the website on your own computer, or you forgot to upload the .htaccess file to your server. Use our step-by-step installation guide to see a working website.

We use a file called ".htaccess" to rewrite URLs. This means that a URL such as example.com/page/ or example.com/page.php is actually served from an HTML file. You won't see this in your browser, because the content stays accessible on the “pretty URL”. This technique is useful for SEO, because incoming backlinks keep working on the original URL.
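The rewriting described above can be pictured with a small Python sketch. It only illustrates the idea and is not the content of the actual .htaccess file; the function name `resolve_pretty_url` and the `page.html` naming scheme are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Script-style extensions that are dropped before looking up the HTML file
# (an assumed list, for illustration only).
SCRIPT_EXTENSIONS = (".php", ".aspx", ".asp", ".cfm")


def resolve_pretty_url(url: str) -> str:
    """Map a requested URL to the static HTML file that would be served.

    Hypothetical naming scheme: /page/ and /page.php both map to page.html,
    and the site root maps to index.html.
    """
    path = urlsplit(url).path.strip("/")
    if not path:
        return "index.html"
    for ext in SCRIPT_EXTENSIONS:
        if path.lower().endswith(ext):
            path = path[: -len(ext)]
            break
    return path + ".html"


print(resolve_pretty_url("https://example.com/page/"))     # prints page.html
print(resolve_pretty_url("https://example.com/page.php"))  # prints page.html
```

The visitor keeps seeing the original URL in the address bar; only the file that answers the request differs.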

There are also a surprising number of people who mix up the demo version and the paid version. If you ordered both, then make sure to use the files from the paid version. And yes, that seems really obvious.

Note that an .htaccess file only works on Apache servers. We can translate this file to work on other servers, such as Microsoft IIS or Nginx, for a small additional fee. More than 95% of our customers have a hosting account that runs on Apache, so this is rarely necessary.

Question 2

Can you recover the original backend files, such as the database or PHP/CFM/ASPX files?

Accepted Answer

PHP is a server-side scripting language that normally generates HTML output.

This means you need access to the backend (=server side) to download PHP files. The Internet Archive never had access to the server/backend of a website, so it does not have these PHP files.

The only thing that the archive has is the HTML output that was generated by the PHP file. We can restore this under the old .php URL, but it is technically an HTML file.

WordPress is also written in PHP, so for the WordPress integration we reverse engineer the PHP pages. These pages will not always function exactly as they did on the original website, but they will look the same.

WordPress also uses a MySQL database, which we also reverse engineer if you order the WordPress conversion.

**The above is a short answer. We also recommend reading our blog post about the differences between frontend and backend.**

Question 3

Can you recover the backend files, such as the original WordPress database?

Accepted Answer

No, but we can reverse engineer a new database to some extent. A backend file cannot be seen by website visitors. That is the entire definition of the concept "backend". For example, a visitor cannot download your database or PHP files. Therefore a web scraper cannot download those files either. A visitor or web scraper can only download frontend files, such as JavaScript, HTML/CSS and pictures.

A web scraper basically has the same limitations as a normal visitor. It has no access to the backend of a website. Neither does it have access to a part of the website that is protected by a login process, such as domain.com/wp-login.php.

The Wayback Machine also uses a web scraper to create the archive of websites on archive.org. The Wayback Machine has therefore never had access to the backend of any site. And if a file is not on archive.org, then we obviously cannot recover it either.

This also means that any functionality of your website that connects to the backend will no longer function correctly. For example, a contact form needs to connect to the database (=backend) to save data. Maybe it also uses other backend files to send automatic emails. The result is that a contact form on a recovered website looks the same as before, but it won't work anymore.

Some URLs end in .PHP or .ASP. This doesn't mean that a website user can see those files. The user can only see the response that was created by those files. This response is normally an HTML file, even though the URL ends with .PHP or .ASP. The result is that we cannot recover the original PHP/ASP files and their functionality. We only recover the layout/looks of those pages.

If you order our WordPress conversion, then we reverse engineer a new "backend". We then create a new WordPress dashboard which you can access on domain.com/wp-login.php. This new dashboard doesn't have all the theme options (to change colors, widgets etc.) but it does allow you to edit the current content in a visual WYSIWYG editor.

For the WordPress conversion, we also reverse-engineer a new MySQL database. However, it's not perfect. For example, your old contact forms will look the same, but won't work anymore. We can fix contact forms for an additional price or you can manually do this yourself.

Question 4

Why Can Large Websites Take Several Hours To Scrape?

Accepted Answer

There are a few reasons why large websites are slow:

The Internet Archive is slow.
We triple check every broken link to make sure that it is indeed a broken link and not a broken archive.org server. This means that especially websites with many broken links tend to be slow.
The archive blocks our IP if we scrape too fast. We are in direct contact with the Internet Archive and they requested us to use a custom User Agent so they can track our behavior. They have the legal right to shut us down, so we have to be nice to them.
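The triple check from the list above could look roughly like the following Python sketch. It is not our production code: the function name `is_really_broken`, the `fetch_ok` callback and the pause between attempts are assumptions made for illustration.

```python
import time
from typing import Callable


def is_really_broken(
    fetch_ok: Callable[[str], bool],
    url: str,
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> bool:
    """Report a link as broken only if every attempt fails."""
    for attempt in range(attempts):
        if fetch_ok(url):
            return False  # one success proves the link works
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # pause so the archive is not hammered
    return True


# Usage with a fake fetcher that fails twice and then succeeds,
# imitating a temporary hiccup of the archive server.
calls = {"count": 0}


def flaky_fetch(url: str) -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


print(is_really_broken(flaky_fetch, "https://example.com/page1", delay_seconds=0))
# prints False
```

Every genuinely broken link costs three requests plus the pauses in between, which is why sites with many broken links take longer.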

If you want to help the Internet Archive, then support them with a financial donation, so they can invest in a faster infrastructure. They are good guys who rely on open-source technology, which is unfortunately one of the reasons their speed is limited. We give a free recovery if you send us a screenshot of a donation of $25 or more.

Question 5

What about copyrights?

Accepted Answer

We don't know the laws of every country. Most of our customers use our software to recover a site that they created themselves, which is probably legal everywhere.

If you are not the original creator of that site, then it's probably illegal in some countries to recover the content of a website.

It's a grey area for sure, and there is probably nobody who can tell you with certainty, since there is not much legal precedent.

For expired content, it's unlikely anybody will ever find out or care. We make sure the content is not published elsewhere on the internet, so it will be difficult for the other party to provide evidence of financial damages.

In reality, most content on the Internet is duplicated elsewhere, so it's unlikely that these types of borderline cases will cause you any problems. The worst-case scenario is receiving an official DMCA takedown request. You then have to remove the content before a certain date to avoid legal action. If you use our software to create a Private Blog Network (PBN), and you are afraid of legal problems, then we recommend removing or replacing the original contact details (phone number, street address).

We have never heard from a customer who actually got into legal trouble because of using our services.

Question 6

Do you offer an unlimited monthly plan for WordPress site restorations?

Accepted Answer

We do not offer an unlimited subscription for WordPress conversion, because it takes our developer 1-2 hours per domain. The process is only partially automatic, which is why the price is higher than the regular HTML solution.

Please note that the HTML scraper can still recover websites that were originally made with WordPress. They will look exactly the same as the original website. You just won't be able to use the WordPress dashboard (unless you opt for the WordPress conversion).

Question 7

What is the delivery time for WordPress conversions?

Accepted Answer

The time of delivery depends on the number of pages of the website. A small website is scraped in less than an hour and a large website might take up to a few days. After the scraping has finished, our developer usually delivers the WordPress conversion within 24 to 48h. We guarantee delivery within 72h (not counting weekends).

Question 8

Is there any way to restore a page from the Internet Archive so that links work directly to online pages rather than to archived pages?

Accepted Answer

Our script removes the archive prefix automatically. We restore all links to the way they were when the website was still online. There will be no traces left of the Wayback Machine.
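As an illustration of what removing the archive prefix means, here is a minimal Python sketch. It is not our actual script: the regular expression and the name `strip_wayback_prefix` are assumptions, based on the usual shape of archived links (a timestamp of up to 14 digits, optionally followed by a two-letter modifier such as `im_`).

```python
import re

# Archived links usually look like
#   https://web.archive.org/web/<timestamp><optional modifier>/<original URL>
# and also appear in host-relative form (/web/<timestamp>/...).
WAYBACK_PREFIX = re.compile(
    r"^(?:(?:https?:)?//web\.archive\.org)?/web/\d{1,14}(?:[a-z]{2}_)?/",
    re.IGNORECASE,
)


def strip_wayback_prefix(link: str) -> str:
    """Return the original link; links without an archive prefix are unchanged."""
    return WAYBACK_PREFIX.sub("", link, count=1)


print(strip_wayback_prefix("https://web.archive.org/web/20160315050823/www.example.com"))
# prints www.example.com
print(strip_wayback_prefix("/web/20160315050823im_/http://example.com/logo.png"))
# prints http://example.com/logo.png
```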

Question 9

What is the difference if the archive.org circle around the date is blue or green?

Accepted Answer

A blue circle means a status code of 2xx, such as 200. This is the normal status code for a regular web page on the Wayback Machine. A blue circle is usually a safe choice.
A green circle signifies a 3xx status code, which means a redirect. Try to avoid the green dots when picking a date to scrape. It's better to get the target URL which the redirect leads to.
Orange means an error with a 4xx status code.
A red dot around the date means a server-side error, which carries a 5xx status code.
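The color scheme above follows the standard HTTP status classes, which can be written down as a small Python sketch. The function name `calendar_color` is made up for this example.

```python
def calendar_color(status_code: int) -> str:
    """Map an HTTP status code to the calendar color described above."""
    if 200 <= status_code <= 299:
        return "blue"    # normal page, usually a safe choice
    if 300 <= status_code <= 399:
        return "green"   # redirect, better to pick the target URL instead
    if 400 <= status_code <= 499:
        return "orange"  # client-side error, such as 404
    if 500 <= status_code <= 599:
        return "red"     # server-side error
    raise ValueError(f"unexpected status code: {status_code}")


print(calendar_color(200))  # prints blue
print(calendar_color(301))  # prints green
```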

Question 10

When I go to the scraping result page on waybackmachinedownloader.com, I see many links to waybackmachinedownloader.com. Why do you create these links?

Accepted Answer

These are links to pages that were not available on archive.org. Search engines do not like broken links, so our software automatically redirects broken links to the front page of a website (= the root domain).

If you preview the scraping results on our website, they will look like links to waybackmachinedownloader.com, because that is the front page of our website. However, after uploading the files to your domain, those links will point to yourdomain.com.
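The effect can be reproduced in Python with the standard `urljoin` function. This sketch assumes that a replaced broken link is written as the root-relative link `/`; the preview page URL is hypothetical.

```python
from urllib.parse import urljoin

ROOT_LINK = "/"  # assumed form of a replaced broken link

# Hypothetical preview page on our website.
preview_page = "https://www.waybackmachinedownloader.com/preview/page1/"
# The same page after uploading the files to your own domain.
live_page = "https://yourdomain.com/page1/"

print(urljoin(preview_page, ROOT_LINK))  # prints https://www.waybackmachinedownloader.com/
print(urljoin(live_page, ROOT_LINK))     # prints https://yourdomain.com/
```

The link itself is identical in both cases; only the domain that hosts the page changes.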

Question 11

Do you really download all captured pages? Or just download one page on the URL I'll give you?

Accepted Answer

We download entire websites - not just one page. For every domain, we download up to 20,000 URLs/pages.

If you can browse to a page by starting from the front page from a certain date, then we download that page. Our software works like a human user who clicks on all links on the front page. Then it visits those links and again clicks on all the links it can find on those pages. It continues like this until it has found all pages.
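The click-through process described above is essentially a breadth-first crawl. Below is a minimal Python sketch of that idea, not our actual scraper: the `get_links` callback and the small in-memory site are assumptions for illustration, and the default limit mirrors the 20,000 pages mentioned above.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(
    front_page: str,
    get_links: Callable[[str], Iterable[str]],
    max_pages: int = 20000,
) -> List[str]:
    """Visit pages breadth-first, starting from the front page."""
    order = [front_page]
    seen = {front_page}
    queue = deque([front_page])
    while queue:
        page = queue.popleft()
        for link in get_links(page):
            if link in seen:
                continue
            if len(order) >= max_pages:
                return order
            seen.add(link)
            order.append(link)
            queue.append(link)
    return order


# A tiny made-up site: each page maps to the links found on it.
site = {
    "/": ["/a", "/b"],
    "/a": ["/c", "/"],
    "/b": [],
    "/c": ["/a"],
    "/orphan": [],  # no page links here, so this sketch never finds it
}

print(crawl("/", lambda page: site.get(page, [])))
# prints ['/', '/a', '/b', '/c']
```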

For every URL, we download only 1 version. So if domain.com/page1/ was archived by archive.org on multiple dates, then we only scrape 1 date.

This date is usually close to the date that you picked for the front page. If you picked a front page with a date from 2016, and our scraper finds an inner page with 2 dates, then we recover the page that is closest to 2016. For example, if you choose https://web.archive.org/web/20160315050823/www.example.com as the front page, and we see that example.com/page1 was archived in 2015 and 2018, then we recover the 2015 version.
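Picking the closest date can be sketched in Python as follows. The function name `closest_snapshot` and the two candidate timestamps are made up for this example; only the front page timestamp comes from the URL above.

```python
from datetime import datetime
from typing import Iterable

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"  # 14-digit timestamp as used in archive URLs


def closest_snapshot(front_page_timestamp: str, candidates: Iterable[str]) -> str:
    """Return the candidate timestamp nearest to the front page timestamp."""
    target = datetime.strptime(front_page_timestamp, TIMESTAMP_FORMAT)
    return min(
        candidates,
        key=lambda ts: abs(datetime.strptime(ts, TIMESTAMP_FORMAT) - target),
    )


# Front page from March 2016; the inner page was archived in 2015 and 2018.
print(closest_snapshot("20160315050823", ["20150601000000", "20180601000000"]))
# prints 20150601000000
```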

For more information, see this blog post: Improved backend now also recovers pages without internal link path

Question 12

About large WordPress conversions (2k+ pages)

Accepted Answer

If your website is extraordinarily large (2000 pages or more) then make sure to read this section. There is no problem for the HTML version, but in our experience, this causes problems with the WordPress conversion. With the WordPress conversion, we have to reverse engineer the theme and database. The database in particular causes performance issues, since a reverse-engineered database tends to be bigger and slower.

For large websites, we can either refund the WordPress part of your order ($80) or create a hybrid solution. In this case, the front page and all the pages that are two clicks away from the front page will be integrated into WordPress. The remaining pages will work normally as static HTML pages. Those pages will look the same for visitors, but you will only be able to edit them as HTML files. If you don't opt for the refund within the next 24 hours, our developer will assume you want the hybrid solution, and send you the WordPress installation within 2-3 days.
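To show which pages would end up in WordPress under the hybrid solution, here is a rough Python sketch. It assumes that "two clicks away" means every page reachable within at most two clicks from the front page; the function name and the example site are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable, List


def pages_within_clicks(
    front_page: str,
    get_links: Callable[[str], Iterable[str]],
    max_clicks: int = 2,
) -> List[str]:
    """Return all pages reachable from the front page in at most max_clicks."""
    clicks = {front_page: 0}
    queue = deque([front_page])
    while queue:
        page = queue.popleft()
        if clicks[page] == max_clicks:
            continue  # links on this page would be one click too far
        for link in get_links(page):
            if link not in clicks:
                clicks[link] = clicks[page] + 1
                queue.append(link)
    return sorted(clicks)


# Made-up site: / -> /a -> /b -> /c
chain = {"/": ["/a"], "/a": ["/b"], "/b": ["/c"], "/c": []}

print(pages_within_clicks("/", lambda page: chain.get(page, [])))
# prints ['/', '/a', '/b']
```

In this example, /c is three clicks away, so it would stay a static HTML page.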

Question 13

I followed the installation guide and my browser says there are "too many redirects". What to do?

Accepted Answer

When you have an unexpected redirect loop error (“domain.com redirected you too many times”) or 500 error:

1. For WordPress users, this is usually caused by incorrect file/folder permissions. Permissions should be 755 for all folders and sub-folders, and 644 for all files. For information on how to set permissions, see this guide.

2. The redirect loop or the 500 error can also be caused by corrupted files after using cPanel to unzip the files. Try to extract (unzip) the files locally and then transfer them via FTP software in an uncompressed form.
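If you have shell or script access to the server, the permissions from step 1 can also be set with a short Python script. This is a sketch under the assumption that the site lives in a single folder; the function name `fix_permissions` and the example path are hypothetical, and you should only run it on the folder that contains your website.

```python
import os


def fix_permissions(site_root: str) -> None:
    """Set 755 on all folders and 644 on all files below site_root."""
    os.chmod(site_root, 0o755)
    for dirpath, dirnames, filenames in os.walk(site_root):
        for name in dirnames:
            os.chmod(os.path.join(dirpath, name), 0o755)
        for name in filenames:
            os.chmod(os.path.join(dirpath, name), 0o644)


# Example call with a hypothetical path:
# fix_permissions("/home/youruser/public_html")
```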

Question 14

Can you restore an archived website from/to Joomla or Drupal? Or can you restore Joomla/Drupal sites and convert them to WordPress?

Accepted Answer

For a Joomla site, we scrape the HTML output from the Joomla installation. Then, if you order the WordPress conversion, we convert the HTML pages into a WordPress installation. This does come with some limitations, which you can read about on https://www.waybackmachinedownloader.com/en/wordpress-conversion/

The same process is applicable to sites built with CMS platforms such as Drupal, Shopify and Magento. Please note that due to the limitations described on the URL above, a site with a lot of custom backend code, such as a Magento or Shopify site, won't give good results. An ecommerce website will look the same as on archive.org, but it won't function the same because many objects rely heavily on backend code.

In general, our HTML to WordPress conversion is useful for sites with a lot of content, but with little interaction with the visitor. Good examples are magazines, news sites, blogs etc.

Question 15

How many files can you download? What is the total size of the website?

Accepted Answer

You can view all files that are available on archive.org via these URLs: https://web.archive.org/details/bowdbeal.com and https://web.archive.org/web/*/bowdbeal.com/* Simply replace "bowdbeal.com" with your domain, to see the total number of available files for your domain.
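If you need these URLs for several domains, the replacement can be automated with a few lines of Python. The helper name `archive_listing_urls` is made up for this sketch; it only fills your domain into the two URL patterns shown above.

```python
def archive_listing_urls(domain: str) -> list:
    """Build the two archive.org overview URLs for a domain."""
    domain = domain.strip().lower()
    # Accept input such as "https://Example.com/" as well as "example.com".
    for scheme in ("https://", "http://"):
        if domain.startswith(scheme):
            domain = domain[len(scheme):]
    domain = domain.rstrip("/")
    return [
        f"https://web.archive.org/details/{domain}",
        f"https://web.archive.org/web/*/{domain}/*",
    ]


for url in archive_listing_urls("https://Example.com/"):
    print(url)
# prints https://web.archive.org/details/example.com
# prints https://web.archive.org/web/*/example.com/*
```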

Note: we might not be able to scrape all files, because the Wayback Machine does not always respond to our requests. In general, we try about 5 times to download a file, using different IP addresses, before we give up on it.

Question 16

How do I test a website before uploading?

Accepted Answer

If it's an HTML-only website, then you can simply upload the same files to a subdomain like test.domain.com

For WordPress conversions, there are three ways to test it:

Test it locally on something like WAMP. We use this for quality control ourselves.
Host it on another domain or IP address. Then change your "hosts file" in Windows to tell your browser that the domain is hosted on a different IP. Here is a Windows guide and a guide for Mac/Linux users.
Pay us an extra $20. Then we make the files compatible for another domain, so you can test it on another (sub)domain. Please then specify which domain you want to use for testing, for example, test.domain.com.

Question 17

What is the difference between the HTML and WordPress version?

Accepted Answer

All internet sites are HTML sites. WordPress consists of PHP code that creates HTML files as its output. As a website visitor, you cannot see the PHP code. You can only see the HTML code that your browser translates into visually appealing pages.

We can recover the HTML files. However, the original WordPress installation and dashboard, which you can access via /wp-admin/, is part of the backend. By definition, website visitors and web scrapers cannot see the backend.

Our WordPress version is a reverse-engineered HTML-to-WordPress conversion, based on our HTML version.

If a site was originally built with WordPress, then you do not necessarily need to order the HTML-to-WordPress conversion. You can simply use only the HTML output of the original WP installation. However, if you want to edit the pages with the visual editor in the WordPress dashboard, then you need the HTML-to-WordPress conversion.

Question 18

I have to recover a WordPress website from the Wayback Machine. Which plan should I buy?

Accepted Answer

It depends on how you want to use the site. Please first read https://www.waybackmachinedownloader.com/en/wordpress-conversion/

There are basically 3 options:
1. HTML-only version for $19. If a site was originally built with WordPress, then you do not necessarily need to order the HTML-to-WordPress conversion. You can simply use only the HTML output of the original WP installation. However, if you want to edit the pages with the visual editor in the WordPress dashboard, then you need the HTML-to-WordPress conversion.
2. Standard HTML-to-WordPress conversion. This is what most customers buy, and it costs an additional $80. The WordPress dashboard has limited functionality, but you can easily edit text, images, links etc.
3. The high-end HTML-to-WordPress conversion. This version of the HTML-to-WordPress conversion comes with an elegant and user-friendly WordPress dashboard. It is the best choice for future-proof websites that need regular editing and full compatibility with third-party plugins such as visual page builders. The price depends on the site, but it's usually between $300 and $2,000.
