What can be recovered from the Web Archive?

News and updates

Users sometimes ask why a website was not fully restored, or why it does not work the way they expected. There are several answers. The first is that the website is restored from the Web Archive, so only what was archived can be restored, and nothing more.

The Wayback Machine saves only the publicly visible part of a website. It cannot save the internal structure: the admin panel, the database, and so on. If the website was dynamic, it will be static after being restored from the archive. Contact forms, comment boxes, and online purchase elements will not work. There are a few exceptions, namely when the feature is implemented entirely in JavaScript that the Web Archive saved. Be careful with such scripts, because they often send data to or load data from third-party domains. Where there used to be, for example, a visitor counter script, the same third-party address may now, after the site has been restored, serve anything, including malware.

After restoring the website, we recommend using our CMS to check all external links in the JavaScript code by searching for http:// and https://, and finding out what each of them does.
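If you prefer to run this check on downloaded copies of the scripts, it can be sketched in a few lines of Python. This is only an illustration, not how our CMS does it: the function name find_external_urls, the sample script, and the domains example.com and stats.example.net are made up for the example.

```python
import re
from urllib.parse import urlparse

# Matches http:// or https:// URLs up to the first quote, whitespace, or bracket.
URL_PATTERN = re.compile(r"""https?://[^\s'"<>)\\]+""")


def find_external_urls(js_code, own_domain):
    """Return a sorted list of URLs in js_code that point outside own_domain."""
    external = set()
    for url in URL_PATTERN.findall(js_code):
        try:
            host = urlparse(url).hostname or ""
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal): report it anyway.
            host = ""
        # Treat the domain itself and its subdomains as internal.
        if host != own_domain and not host.endswith("." + own_domain):
            external.add(url)
    return sorted(external)


# Hypothetical script content, for illustration only.
sample = '''
var counter = "http://stats.example.net/hit.js";
var logo = "https://www.example.com/img/logo.png";
'''

print(find_external_urls(sample, "example.com"))
```

Every address it prints is a third-party link that should be reviewed by hand before the site goes live.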

Another reason a restored website may not work as expected is that its CSS styles were not loaded. On some websites, the styles are hosted on a different domain. Our spider script processes only links within one domain and does not follow external links. You can easily check for this by looking in the site's code for the URL of the stylesheets. If it looks like https://another_domain.com/styles/main.css, it is better to download the CSS files from the Web Archive and upload them to the site manually using our CMS.
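The same check can be done with a small script instead of reading the page source by eye. The sketch below uses only the Python standard library. The class StylesheetFinder, the function external_stylesheets, and the sample page are assumptions made for this example and are not part of our system.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class StylesheetFinder(HTMLParser):
    """Collects the href of every <link rel="stylesheet"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "stylesheet" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


def external_stylesheets(html, own_domain):
    """Return stylesheet URLs hosted outside own_domain and its subdomains."""
    finder = StylesheetFinder()
    finder.feed(html)
    result = []
    for href in finder.hrefs:
        host = urlparse(href).hostname  # None for relative paths
        if host and host != own_domain and not host.endswith("." + own_domain):
            result.append(href)
    return result


# Hypothetical page, for illustration only.
page = """
<link rel="stylesheet" href="/css/local.css">
<link rel="stylesheet" href="https://another_domain.com/styles/main.css">
"""

print(external_stylesheets(page, "example.com"))
```

Any URL in the result is a stylesheet the spider would have skipped, so it is a candidate for manual download and upload.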

Finally, the third and most common cause of a restored site working incorrectly is a badly chosen recovery time interval. The date on which a site was archived on archive.org does not mean that the entire site, from start to finish, was archived at that moment. In fact, the files, styles, scripts, and images were all saved at different times. Setting too narrow a time period in our system usually means that a significant part of the website will not be restored. Choosing the right timestamp is not always easy, so we have an article on how to do it: https://archivarix.com/en/blog/3-how-do ... rchiveorg/
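To see why a narrow interval loses files, here is a toy calculation in Python. It assumes you already have a list of capture timestamps per file, in the 14-digit YYYYMMDDhhmmss form that the Wayback Machine uses. The file names, the timestamps, and the function covered_fraction are invented for the example and do not describe how our system selects files.

```python
def covered_fraction(captures, start, end):
    """Share of files with at least one capture inside [start, end].

    Timestamps are 14-digit strings, so plain string comparison
    orders them chronologically.
    """
    if not captures:
        return 0.0
    hit = sum(
        1
        for stamps in captures.values()
        if any(start <= ts <= end for ts in stamps)
    )
    return hit / len(captures)


# Invented sample data: each file was saved at a different time.
captures = {
    "/index.html": ["20190314101500"],
    "/css/main.css": ["20180702083000"],
    "/js/app.js": ["20190401120000"],
    "/img/logo.png": ["20170115090000", "20190320140000"],
}

# One day around the date the home page was archived.
narrow = covered_fraction(captures, "20190314000000", "20190314235959")
# Two full years.
wide = covered_fraction(captures, "20180101000000", "20191231235959")

print(narrow, wide)  # 0.25 1.0
```

With the one-day window only the home page falls inside the interval, while the two-year window picks up the styles, scripts, and images as well.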