British periodicals dataset. Collection II

Title
British periodicals dataset. Collection II, 1681-1939.
Publication
[Ann Arbor, Michigan] : [ProQuest LLC], [between 2010 and 2018?]
Physical Description
1 online resource (approximately 16,600 files)
Local Notes
Access is available to the Yale community.
Notes
Title and variant titles devised by cataloger.
The XML directory contains one zip file per year (for example, 1869_0). Once expanded, the uncompressed folder has a different name that does *not* represent the year. Instead, it matches the 'xxxx' part of the corresponding zip file in the PDF directory (see below), such as 1869_0_xxxx.zip. Inside the XML folder is a folder named for the year and a sequence number, such as 1869_0, which contains article-segmented XML files. Reconnecting the XML files to the PDF files programmatically would take some work.
In addition to the PDF folder below, there are six zip archives at the top level, alongside the XML and PDF folders. Five of these are duplicates, and one [1877_5_2879.zip] is unique and should be moved into the 00010101_99991231 folder.
The PDF directory [00010101_99991231] contains one or more zip files per year (for example, 1869_0_xxxx and 1741_1_xxxx). The numbers after the year do *not* refer to the month. Inside these archives are page-segmented PDFs that do not contain OCR (Optical Character Recognition) text. These are named identically to the files in the XML folder, above.
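As a rough illustration of the naming scheme described in this note, the following Python sketch pairs XML zip names with PDF zip names that share the same year and sequence number. It is not part of the dataset or its documentation: the function name `pair_archives`, the regular expressions, and the assumption that the 'xxxx' part contains no underscores or dots are all hypothetical and may need adjusting against the actual files.

```python
import re
from pathlib import PurePath

# Assumed name shapes, based only on the examples in the note:
#   XML zip: <year>_<sequence>          e.g. 1869_0 (".zip" optional)
#   PDF zip: <year>_<sequence>_<xxxx>   e.g. 1877_5_2879.zip
XML_ZIP = re.compile(r"^(\d{4})_(\d+)(?:\.zip)?$")
PDF_ZIP = re.compile(r"^(\d{4})_(\d+)_([^_.]+)(?:\.zip)?$")


def pair_archives(xml_names, pdf_names):
    """Map each XML zip name to the PDF zip names with the same year and sequence."""
    pdf_by_key = {}
    for name in pdf_names:
        match = PDF_ZIP.match(PurePath(name).name)
        if match:
            key = (match.group(1), match.group(2))  # (year, sequence)
            pdf_by_key.setdefault(key, []).append(name)

    pairs = {}
    for name in xml_names:
        match = XML_ZIP.match(PurePath(name).name)
        if match:
            key = (match.group(1), match.group(2))
            pairs[name] = sorted(pdf_by_key.get(key, []))
    return pairs
```

Names that do not fit either assumed pattern are skipped rather than guessed at, so an empty list for an XML zip signals that no matching PDF zip was found under these assumptions.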
Description based on record for source database.
Access and use
Access restricted by licensing agreement and agreement to terms of use.
Summary
Dataset of articles for text data mining (TDM) from 300 British periodicals on literature, music, art, drama, archaeology, and architecture dating from 1681-1939, selected from the ProQuest microfilm collections English literary periodicals, British periodicals in the creative arts, and additional titles. The set contains data in PDF format (segmented into articles) and XML format (segmented into pages).
Variant and related titles
British periodicals online dataset. Collection II
British periodicals online dataset. Collection 2
British periodicals II dataset
English literary periodicals dataset
British periodicals in the creative arts dataset
Archives unbound dataset collection.
Format
Books / Data Sets / Online
Language
English
Added to Catalog
April 29, 2021
System details note
System requirements: PDF viewer and XML reader.
Genre/Form
Text corpora.
Data sets.
Periodicals.
Also listed under
ProQuest (Firm), publisher.