Bibliographer: Matthew Parker

Masquerades, or What You Will

This project uses modern NLP libraries to analyze historical works, identifying trends in a given text relative to a large corpus.

For my experimental bibliography, I decided to perform a computational text analysis of my novel. After researching different methods and libraries for textual analysis, I chose NLTK and Gensim to perform latent Dirichlet allocation (LDA), and then coded a functional model using a small toy corpus and a sample query text. The model returns the topic from the corpus most closely associated with a query text. The main problem I encountered was that computational text analysis requires a plaintext copy of my novel, and none currently existed. This issue was somewhat mitigated by the existence of a plaintext copy of another novel by the same (anonymous) author: although the OCR is imperfect, I obtained a relatively similar plaintext sample from Google Books and used it in place of the original novel. Because of the computational complexity of the operations involved, the run over the full corpus is still in progress. Code can be found here.

Traditional Description

Masquerades, or What You Will. By the author of Eliza Warwick. Dublin, 1781. Volume 1 of 2. 273 pp.

Transcription: MASQUERADES; | OR, | WHAT YOU WILL. | BY THE | AUTHOR OF ELIZA WARWICK, &c. | IN TWO VOLUMES. | VOL. 1 | [ORNAMENT] | DUBLIN. | Printed for Messrs. Price, Sleater, W. Watson | W. Wilson, Ennis, Walker, Moncrieffe, | Jenkin, Burnette, E. Cross, Exshaw, | Burton, Parker, and Byrn. | M DCC LXXXI.

Pagination: v.1 273 pp.; v.2 309 pp.

Format: Duodecimo.

Contents: v.1 a1r: front matter, a1v: title, b1r: half-title and text, b1v-n5r: text; v.2 a1r: front matter, a1v: title, b1r: half-title and text, b1v-o10r: text

Notes: Sourced from the British Library edition, accessed through Eighteenth Century Collections Online (Gale document number CW3312000553). The last page of the first edition marks the end of volume 1 and summarizes letters that were cut in order to abridge the series.

Experimental Description

Topic distribution for "Masquerades, or What You Will" over the canon corpus: 0.019*"time" + 0.012*"room" + 0.011*"man" + 0.009*"moment" + 0.009*"house" + 0.008*"hand" + 0.008*"life" + 0.008*"company" + 0.008*"letter" + 0.007*"morning"
Topic distribution for "Twelfth Night, or What You Will" over a Shakespeare corpus: 0.022*"thou" + 0.015*"man" + 0.013*"thee" + 0.013*"lord" + 0.011*"hath" + 0.008*"heart" + 0.008*"time" + 0.007*"sir" + 0.007*"love" + 0.006*"life"

Common elements: "time", "man", "life"
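The overlap between the two topics can be checked mechanically. The sketch below parses the two topic strings exactly as printed above and intersects their terms; the helper name parse_topic is hypothetical and is not part of Gensim.

```python
import re


def parse_topic(topic):
    """Parse a weight*"term" topic string into a {term: weight} dict."""
    pairs = re.findall(r'([\d.]+)\*"([^"]+)"', topic)
    return {term: float(weight) for weight, term in pairs}


masquerades = (
    '0.019*"time" + 0.012*"room" + 0.011*"man" + 0.009*"moment" + '
    '0.009*"house" + 0.008*"hand" + 0.008*"life" + 0.008*"company" + '
    '0.008*"letter" + 0.007*"morning"'
)
twelfth_night = (
    '0.022*"thou" + 0.015*"man" + 0.013*"thee" + 0.013*"lord" + '
    '0.011*"hath" + 0.008*"heart" + 0.008*"time" + 0.007*"sir" + '
    '0.007*"love" + 0.006*"life"'
)

m_terms = parse_topic(masquerades)
t_terms = parse_topic(twelfth_night)

# Terms present in both topics, ordered by their weight in the novel's topic.
shared = sorted(set(m_terms) & set(t_terms), key=lambda term: -m_terms[term])
print(shared)  # ['time', 'man', 'life']
```

All three shared terms are common English words, so the overlap may say more about general vocabulary than about any thematic link between the two works.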

Code can be found here.