2022 Session (22 - 26 June)



Thursday, 6/23

Time Activity

9:30 - 10:15am


10:15 - 10:45am

Introduction to text encoding and XML markup

  • Orientation to XML elements and attributes. For further reading, see Obdurodon’s “What is XML and Why should humanists care?”
  • Ordered Hierarchy of Content Objects (OHCO): the XML Tree hierarchy
  • Well-formedness vs. Validity
  • Character entities: ampersands and angle bracket characters in XML
  • Example XML files:
  • The process of document analysis and document data modeling
  • Issues with imposing an ordered hierarchy on messy documents (see Ozymandias example): What do we do with overlapping hierarchies?
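
The well-formedness and character-entity points above can be tried out in a few lines. This is a minimal sketch using Python's standard library rather than oXygen; the sample strings and the helper name `is_well_formed` are illustrative assumptions, not workshop files. It checks well-formedness only; checking validity would require a schema, which this sketch does not cover.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One root element, properly nested children, quoted attribute values.
good = '<poem title="Ozymandias"><line n="1">I met a traveller</line></poem>'

# Overlapping tags: <a> closes before <b> does, so there is no clean tree.
overlapping = "<a><b>text</a></b>"

# A raw ampersand is not allowed in character data.
raw_ampersand = "<note>Byron & Shelley</note>"

# escape() rewrites &, < and > as the entities &amp;, &lt; and &gt;.
escaped = "<note>" + escape("Byron & Shelley") + "</note>"

for label, text in [("good", good), ("overlapping", overlapping),
                    ("raw ampersand", raw_ampersand), ("escaped", escaped)]:
    print(label, is_well_formed(text))
```

The overlapping case is a small version of the overlapping-hierarchies problem: the parser rejects it because XML requires a single tree.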

10:45am - lunch

TEI XML Orientation

Lunch Break

return by ~1:30pm

1:30pm - 2:30pm

TEI ODD walk-through exercise

2:30pm - 5pm

MS Paleography workshop: Mitford letters

5 - 5:30pm

Discussion and review of MS workshop code

Friday, 6/24


9:30am - 11:30am

Introducing Regular Expressions: patterns in text files

Lunch Break

return by ~1:45 pm

2 - 4:15 pm

  • Regular Expressions Workshop: Into the Weeds! Files to Up-convert from the Digital Mitford Project [to be posted]
    • Steps:
      • Perform document analysis: study the file and look for patterns you can search for. Jot them down.
      • Remove everything you don’t need at the start and end of the document.
      • Simplify and reduce whitespace.
      • Work from the inside out: try starting with what you can match readily and with the most numerous line-by-line tags.
      • When working with larger structures (chapters, scenes, speeches, line groups), try the clopen (close-open) strategy: if you find the start of a thing, you have found the end of the previous thing, so replace each start with an end-tag followed by a start-tag. Remember that if you use clopen, you have to clean up afterwards: you’ll have an extra end-tag at the start and you’ll be missing the end-tag at the end.
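
The steps above can be sketched in code. This is a minimal illustration with Python's `re` module, not the oXygen find-and-replace dialog used in the workshop; the sample text and the element names `book`, `chapter` and `p` are assumptions made for the example, not Digital Mitford markup.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical plain-text input, already trimmed and whitespace-normalized.
raw = """Chapter 1
First paragraph.
Second paragraph.
Chapter 2
Third paragraph."""

# Inside out: wrap every line that is not a chapter heading in <p>.
step1 = re.sub(r"^(?!Chapter )(.+)$", r"<p>\1</p>", raw, flags=re.MULTILINE)

# Clopen: each heading closes the previous chapter and opens the next one.
step2 = re.sub(r"^Chapter (\d+)$", r'</chapter>\n<chapter n="\1">',
               step1, flags=re.MULTILINE)

# Cleanup: drop the extra end-tag at the start, add the missing end-tag
# at the end, then wrap everything in a single root element.
step3 = re.sub(r"\A</chapter>\n", "", step2) + "\n</chapter>"
result = "<book>\n" + step3 + "\n</book>"

# Parsing confirms the up-converted text is well-formed XML.
root = ET.fromstring(result)
print(result)
```

Skipping the cleanup step leaves the output not well-formed, which is a quick way to see why clopen always needs it.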

4:30 - 5:30pm

Introduction to XPath

Saturday, 6/25


9:30 - 10am

Wrapping up!

  • Last full day: Housekeeping and travel arrangements
  • Taking stock: Research questions, project ideas, applications.
  • Some options we like for publishing TEI XML editions:

10 - 11:25am

XPath intensive with Digital Mitford Site Index

In oXygen open URL: https://digitalmitford.org/si.xml
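
For practice outside oXygen, the same kind of XPath query can be approximated in Python. This is only a sketch: the inline sample, its `xml:id` values and the `sex` attribute values are invented for illustration and are not taken from the site index, and Python's `ElementTree` supports only a limited subset of XPath, so expressions that work in oXygen may not work here.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace, so queries need a prefix mapping.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical miniature of a TEI personography (not the real site index).
sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <listPerson>
      <person xml:id="Example_A" sex="f"><persName>Example Person A</persName></person>
      <person xml:id="Example_B" sex="m"><persName>Example Person B</persName></person>
      <person xml:id="Example_C" sex="f"><persName>Example Person C</persName></person>
    </listPerson>
  </body></text>
</TEI>"""

root = ET.fromstring(sample)

# Roughly //person : every person element at any depth.
people = root.findall(".//tei:person", TEI_NS)
ids = [p.get(XML_ID) for p in people]

# Roughly //person[@sex='f']/persName : names filtered by an attribute value.
women = root.findall(".//tei:person[@sex='f']/tei:persName", TEI_NS)
women_names = [n.text for n in women]

print(ids)
print(women_names)
```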

11:30am - 12:30pm

XQuery or XSLT demonstration: pulling and remixing data

Lunch Break

return by ~2pm

2 - 3 pm

Document Data Modeling with the Digital Mitford Journal: Discussion

3 - 4:30 pm

Class choice or project-specific work

Sunday, 6/26


9:30am - 12:30pm

Conclusions/Farewells/Last Questions! For those who can stay, hands-on practice with anything we have introduced in this workshop.

2pm onward

Mitford Editors work session

Oxygen XML Editor: Thanks to SyncroSoft for generously contributing complimentary extended trial licenses for their <oXygen/> XML editor for the use of our Coding School participants.

eXist-db: eXist-db is an open-source native XML database and application platform. TEI Publisher is an open-source product of eXist Solutions.