Guides: Web archives: Home

Can't find what you're looking for?

Links break all the time. People reorganize websites or simply remove something they've put online.

Below are some suggestions for how to get around this persistent problem.

Finding webpages and whole websites that have disappeared, or finding old versions of them

If you follow a link and don't find what you expected, the first thing to do is search that particular website to see whether the content has moved somewhere else. Many websites have a search function: look for a text box or magnifying glass icon and try searching that way. Alternatively, use the site: operator on Google or other search engines to restrict your search to a particular website. For example:

library hours site:unt.edu

will search all webpages whose domain name ends in "unt.edu" and that contain both the words "library" and "hours" (that is, it searches for those words anywhere on the UNT website).
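If you need to run the same restricted search for several websites, the query can be assembled in a few lines. This is a minimal sketch, not an official client: the function names `site_query` and `google_search_url` are made up for illustration, and it only builds the query text and the address of a Google results page for you to open in a browser.

```python
from urllib.parse import urlencode


def site_query(terms: str, domain: str) -> str:
    """Return a query that restricts `terms` to one website via the site: operator."""
    return f"{terms} site:{domain}"


def google_search_url(terms: str, domain: str) -> str:
    """Build the address of a Google results page for the restricted query."""
    return "https://www.google.com/search?" + urlencode({"q": site_query(terms, domain)})


print(site_query("library hours", "unt.edu"))
# library hours site:unt.edu
print(google_search_url("library hours", "unt.edu"))
# https://www.google.com/search?q=library+hours+site%3Aunt.edu
```

The colon is percent-encoded as %3A in the address; the search engine decodes it back to site:unt.edu.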

If that doesn't work, you might need to find a copy in a web archive. A number of organizations crawl the web and save copies of websites. You can use Time Travel to search many of these web archives at once, including the Internet Archive's Wayback Machine (the best known), looking for "mementos" (prior versions of webpages).
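Lookups like this can also be scripted. The sketch below covers only the Wayback Machine, not the multi-archive Time Travel search, and uses the Internet Archive's availability lookup at archive.org/wayback/available. The helper names and the sample response are illustrative assumptions; the sample is hand-written in the shape of the JSON the service returns, so the code runs without a network connection.

```python
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build a lookup address; `timestamp` is digits such as YYYYMMDD."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the address of the closest archived copy, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hypothetical response, hand-written for illustration.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20130919044612/http://example.com/",
            "timestamp": "20130919044612",
            "status": "200",
        }
    }
}

print(availability_url("example.com", "20130101"))
print(closest_snapshot(sample))
print(closest_snapshot({"archived_snapshots": {}}))  # no copy found: None
```

To query the live service, open the address from `availability_url` with `urllib.request.urlopen` and pass the decoded JSON to `closest_snapshot`.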

There are some additional web archives for particular types of websites:

- If you are interested in content from the website of a US federal government agency (many of which have addresses ending in .gov or .mil), use the End of Term Web Archive to search and browse just these websites. It contains copies of US government websites captured at the end of each recent presidential term. If you are instead interested in a federal agency or commission that reached the end of its charge or was significantly changed, or in content from before 2008, try the CyberCemetery.
- To access archives created by Texas state agencies, and to search across them, use TRAIL.
- If you are interested in old versions of UNT websites:
  - UNT Web Archives (browse by semester)
  - UNT Web Archive (browse by URL)

Note that web archiving works by following links that appear on pages; therefore, websites whose documents can be reached only through a search form are generally poorly captured.
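The limitation is easy to see in a toy example. The sketch below is a deliberately simplified stand-in for a crawler, not how any particular archive is implemented: it collects only the targets of ordinary links, using Python's standard-library HTML parser. The page content and the agency.example address are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> links, as a simple crawler would."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags are followed; forms are ignored entirely.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


# Invented page: one ordinary link plus a search form.
PAGE = """
<a href="/reports/2019.html">2019 report</a>
<form action="/search" method="get">
  <input type="text" name="q">
  <input type="submit" value="Find documents">
</form>
"""

collector = LinkCollector("https://agency.example/")
collector.feed(PAGE)
print(collector.links)
# ['https://agency.example/reports/2019.html']
```

The report linked directly from the page is discovered, but nothing behind the search form is, because the crawler has no way of knowing which queries to submit.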

Saving a copy of a webpage or whole website

- WebCite is a membership-supported organization that allows an author or editor to take a snapshot of a Web resource cited within an article and cite that snapshot.
- Perma.cc is similar but dedicated just to legal literature.
- Webrecorder captures your interactions with web pages and lets you save privately or publicly.
- You can have the Internet Archive's Wayback Machine take a snapshot on demand (see instructions).

If you are not able to capture a webpage using one of these tools, or if you need to make a private, portable copy, consider using a tool such as FireShot to capture a webpage as a PDF or image file.
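For the Wayback Machine option above, an on-demand capture can be requested by visiting an address that starts with https://web.archive.org/save/ followed by the page's own address. The sketch below assumes that address pattern; the function names are illustrative, the service may rate-limit or refuse requests, and the instructions linked above remain the authoritative procedure. Only the address-building step is run here; `request_snapshot` needs a network connection and is shown but not called.

```python
from urllib.request import Request, urlopen

SAVE_PREFIX = "https://web.archive.org/save/"


def save_page_now_url(page_url: str) -> str:
    """Return the address that asks the Wayback Machine to capture `page_url`."""
    return SAVE_PREFIX + page_url


def request_snapshot(page_url: str, timeout: float = 60.0) -> int:
    """Send the capture request and return the HTTP status code (network needed)."""
    request = Request(
        save_page_now_url(page_url),
        headers={"User-Agent": "snapshot-sketch"},  # arbitrary illustrative value
    )
    with urlopen(request, timeout=timeout) as response:
        return response.status


print(save_page_now_url("https://example.com/page"))
# https://web.archive.org/save/https://example.com/page
```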

Building web or social-media archives (datasets of webpages or posts for study)

Twitter provides a special way to gain access to their archive of tweets for academic research.

YouTube has also begun offering API access through its YouTube Researcher Program.

The UNT Libraries can crawl the Web to collect news stories, social media posts, or other webpages related to certain topics, gathering the data for study by researchers. (For example, see a “Yes All Women” Twitter Dataset.) To request creation of a Web dataset, or if you have any questions about web archiving activities at UNT Libraries, contact Mark Phillips.

Alternatively, you can scrape the web on your own using a tool such as Web Scraper.
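If you prefer a script to a point-and-click tool, the same idea can be sketched with Python's standard library. This is an illustration of scraping in general, not of how Web Scraper works: it pulls the text of every h2 heading out of a page and writes one CSV row per heading. The choice of h2, the column names, and the sample HTML are assumptions made for the example; a real page needs its own selectors, and you should check the site's terms of use before collecting from it.

```python
import csv
import io
from html.parser import HTMLParser


class HeadlineScraper(HTMLParser):
    """Collect the text content of every <h2> element."""

    def __init__(self) -> None:
        super().__init__()
        self.in_h2 = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        # Text inside nested tags (such as <em>) also arrives here.
        if self.in_h2:
            self.headlines[-1] += data


def scrape_to_csv(source_url: str, html: str) -> str:
    """Return CSV text with one row per headline found in `html`."""
    scraper = HeadlineScraper()
    scraper.feed(html)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["source_url", "headline"])
    for headline in scraper.headlines:
        writer.writerow([source_url, headline.strip()])
    return out.getvalue()


# Invented sample page.
SAMPLE = (
    "<h2>City council <em>votes</em> on budget</h2><p>Body text</p>"
    "<h2> Library extends hours </h2>"
)
print(scrape_to_csv("https://news.example/today", SAMPLE))
```

Keeping the source address in every row makes it straightforward to share the list of sources later, even when the scraped text itself cannot be redistributed.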

Perhaps, though, you don't need to build your own dataset. Some have already been created by others and made available:

Web-archive datasets on the Archive-It website
DocNow Tweet Catalog

Analyzing your data

Content Analysis, a guide by John Martin (last updated Dec 13, 2024)

Sharing your dataset

While copyright or licensing restrictions will likely prevent you from sharing the collection of documents that you study, you can still share your list of sources, codebook, scripts, and other data that would allow another researcher to replicate your findings. The UNT Libraries can help you: see our information on research data management.
