On Distant and Close Reading Again

Our current plan for developing the “Pico Project” focuses on applying “topic modelling” techniques to detect sources of the Conclusiones CM in the available digitised corpora of some medieval authors (Albertus Magnus, Thomas Aquinas and possibly John Duns Scotus). Similar attempts have already been carried out successfully: cf. Timothy Allen et al., “Plundering Philosophers: Identifying Sources of the Encyclopédie,” http://quod.lib.umich.edu/j/jahc/3310410.0013.107/--plundering-philosophers-identifying-sources?rgn=main;view=fulltext, and Glenn H. Roe, “Intertextuality and Influence in the Age of Enlightenment: Sequence Alignment Applications for Humanities Research,” http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/intertextuality-and-influence-in-the-age-of-enlightenment-sequence-alignment-applications-for-humanities-research/. A procedure associated with distant reading can thus help identify possible sources of a given text.
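To make the idea of machine-assisted source detection concrete, here is a minimal sketch in Python. It does not implement topic modelling, nor the methods of the studies cited above: it swaps in a much simpler technique, the overlap of shared word sequences (word n-grams) scored with the Jaccard coefficient. The function names, the passage labels and the sample strings are all assumptions made for illustration. The strings are toy sentences composed for this sketch, not quotations from the Conclusiones or from any of the corpora.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (as tuples) found in a lowercased text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def rank_sources(target, candidates, n=3):
    """Rank candidate passages by Jaccard overlap of word n-grams with the target."""
    target_set = shingles(target, n)
    scores = []
    for label, passage in candidates.items():
        cand_set = shingles(passage, n)
        union = target_set | cand_set
        score = len(target_set & cand_set) / len(union) if union else 0.0
        scores.append((label, score))
    # Highest overlap first: these are the passages worth reading closely.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy strings composed for this sketch (hypothetical, not real quotations).
target = "the intellect understands by abstracting from images"
candidates = {
    "passage_a": "it is said that the intellect understands by abstracting from matter",
    "passage_b": "the heavens move in a circle around the earth",
}

ranking = rank_sources(target, candidates)
for label, score in ranking:
    print(label, round(score, 3))
```

On real Latin corpora one would presumably need lemmatisation and normalisation of spelling before any such comparison. The sketch only shows the general shape of the procedure: a distant pass over many passages that yields a short list of candidates for close inspection.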

So far, nothing new. But topic modelling may also be brought to bear on the practice of Linked Open Data annotation, a procedure more commonly associated with close reading. The topics it identifies, that is, groups of terms that frequently occur together, can provide empirically observed data from which to define controlled languages and, possibly, to develop specific ontologies for annotating the texts. If we do this, then instead of “Using semantic enrichment to enhance big data solutions” (see http://www-01.ibm.com/software/ebusiness/jstart/semantic/), we may use topic modelling to enhance annotation and semantic enrichment or, to put it differently, we may use distant reading to enhance close reading.
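The step from groups of co-occurring terms to candidate entries of a controlled language can be sketched as follows. This is only an illustration under stated assumptions. Plain co-occurrence counting stands in for a real topic model such as LDA. The passages are toy lists of terms invented for the example. The function names and the "candidate_concept_N" labels are hypothetical placeholders that a human annotator would replace with proper concept names.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(passages, min_count=2):
    """Count the passages in which each pair of terms co-occurs; keep frequent pairs."""
    counts = Counter()
    for terms in passages:
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}


def group_terms(pairs):
    """Merge frequent pairs into groups of related terms (connected components)."""
    groups = []
    for a, b in pairs:
        touching = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return sorted(sorted(g) for g in groups)


# Toy data invented for this sketch: each passage is a list of extracted terms.
passages = [
    ["intellect", "soul", "form"],
    ["intellect", "soul", "body"],
    ["heaven", "motion", "sphere"],
    ["heaven", "motion", "intellect"],
]

groups = group_terms(cooccurring_pairs(passages))

# Each group becomes a candidate entry, to be reviewed and named by an annotator.
vocabulary = {
    f"candidate_concept_{i}": terms for i, terms in enumerate(groups, start=1)
}
print(vocabulary)
```

The design choice worth noting is that the machine proposes only candidates. Deciding whether a group of terms deserves a place in the controlled language, and under what name, remains a task for close reading.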