
Working with the XML Transcription Files

XML Readers' Guide

The XML Readers' Guide describes the problems met in transcribing Harley MS 4431, and the solutions adopted. Most of the entries which follow concern XML (eXtensible Markup Language) elements and attributes, but there are also entries concerned with more general matters, for example Accents, Modus Operandi, Punctuation (in editions), Punctuation (Scribes'), and Reference.

The Readers' Guide can be downloaded here.

Guide to working with the XML transcriptions

All our files are designed to be used with the Firefox Web Browser.

The tools we have developed are all available online, using the links given below. To download them to your hard drive please position your cursor on the file-name, click with the right-hand mouse button, and then click on 'Open Link in New Window' with the left-hand mouse button.

Before working on an XML file you must choose one of the many XML editors available. The research team has used Editix or Oxygen, both of which are designed for use with a PC or a Mac. Your XML editor should, if possible, be copied into an empty folder (directory). See also Alexandre Brillant, XML Cours et exercices, Paris: Eyrolles, 2007, pp. 132-138.

The names of the XML text files developed by the project follow a standard pattern: 'book.xml', where 'book' is the 4-letter acronym for the Christine de Pizan text which has been selected. (Please use lower case for the acronym, since capitals in filenames can be lost in the transfer from a PC to a Unix web-server and then to a Mac.)
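The naming convention above can be checked mechanically. The following is a minimal sketch, assuming that an acronym consists of exactly four unaccented lower-case letters; the function names and the filename 'abcd.xml' are hypothetical and are not part of the project's tools.

```python
import re

# Assumption: a book acronym is exactly four lower-case ASCII letters,
# followed by the extension ".xml".
BOOK_FILENAME = re.compile(r"[a-z]{4}\.xml")


def is_valid_book_filename(name: str) -> bool:
    """Return True if name follows the 'book.xml' pattern."""
    return BOOK_FILENAME.fullmatch(name) is not None


def normalise_book_filename(name: str) -> str:
    """Lower-case a filename so that it is unaffected by case changes in transfer."""
    return name.lower()


if __name__ == "__main__":
    # 'abcd.xml' is a made-up name used only for illustration.
    for candidate in ("abcd.xml", "ABCD.xml", "abcde.xml"):
        print(candidate, is_valid_book_filename(candidate))
```

Running normalise_book_filename before a file is uploaded is one way to avoid the loss of capitals mentioned above.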

Additional Files

Having loaded the XML file into your XML editor you should load into the same folder (directory) the four additional files described below:

1. halfway.dtd, a Document Type Definition which lists in alphabetical order the TEI P5 elements and associated attributes used in the transcription of Harley MS 4431.

2. classes.css, a Cascading Style Sheet which sets font size and colour in the HTML file to be displayed on the web.

3. fleur.png, the small red star which points to notes in the HTML file to be displayed on the web.

4. javascript.js, which contains the JavaScript (ECMAScript) code required for the interactive glosses and name pop-up windows.
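Before opening a transcription it can be useful to confirm that the four additional files sit in the same folder as the XML file. The sketch below is one possible check, not a project tool; the function name missing_companions and the example path are hypothetical.

```python
from pathlib import Path

# The four additional files named in the list above.
REQUIRED_FILES = ("halfway.dtd", "classes.css", "fleur.png", "javascript.js")


def missing_companions(xml_path):
    """Return the names of required files absent from the XML file's folder."""
    folder = Path(xml_path).resolve().parent
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # 'abcd.xml' is a made-up filename used only for illustration.
    missing = missing_companions("abcd.xml")
    if missing:
        print("Missing from this folder:", ", ".join(missing))
    else:
        print("All four additional files are present.")
```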

XSL Files

In addition you will need to choose one of the five XSL transformation files listed below, each of them designed for a different purpose.

1. student.xsl is our main transformation program. It makes a scholarly edition of Harley MS 4431 in its entirety, or of an individual book, as the reader chooses; it is aimed at specialised and non-specialised Anglophone and Francophone audiences, whose interest in Christine's texts is literary above all. (It ignores the codicological elements which mark page-breaks, running titles, catchwords, signatures etc.)

2. diplo.xsl removes all elements and attributes, together with all modern punctuation, and thus creates a diplomatic edition.

3. glossary.xsl lists the glosses included in the transcription files, presenting them in the form of a table.

4. propernames.xsl lists the Proper Names found in Harley MS 4431.

5. rime.xsl lists the rhyme-words found in Harley MS 4431, displaying them as a spreadsheet in Excel.
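To give a rough idea of what removing all elements and attributes (item 2) means, the sketch below uses Python's standard library rather than XSL. It is not the project's diplo.xsl: it keeps only the text content of a fragment and does not attempt to remove modern punctuation, since that depends on how the punctuation is encoded in the transcription. The sample line and the function name strip_markup are invented for illustration.

```python
import xml.etree.ElementTree as ET


def strip_markup(xml_text: str) -> str:
    """Return the text content of an XML fragment, dropping all tags and attributes."""
    root = ET.fromstring(xml_text)
    # itertext() yields every text node in document order.
    return "".join(root.itertext())


if __name__ == "__main__":
    # Invented sample line; the real transcriptions use a richer set of TEI P5 elements.
    sample = '<l n="1">Cy commence <hi rend="red">le livre</hi></l>'
    print(strip_markup(sample))  # prints: Cy commence le livre
```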

Search Tools

1. MAQUETTE makes it possible to do a combined text and image search.

2. SIFT, an extremely versatile search tool, enables the reader to interrogate a text visually.

3. LOCI, under development, offers innovative ways of interrogating master.xml or the files of any individual text. LOCI is only one of several very useful programs available on that website.

4. 'Ctrl F' works with all the project files, as do the search tools included with XML editors.

5. XPath 2.0. For a comprehensive XPath 2.0 manual, see XPath 2.0.

MAQUETTE, SIFT and LOCI have been developed by Charles Mansfield, as have the Python tools described below.
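As a small illustration of path-based searching (item 5), the sketch below queries an invented fragment with Python's standard xml.etree.ElementTree module. That module supports only a limited subset of XPath 1.0, so expressions that rely on XPath 2.0 features need a different processor. The sample markup and the function name lines_numbered are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Invented fragment using the TEI elements <lg> (line group) and <l> (verse line).
SAMPLE = """<lg>
  <l n="1">first line</l>
  <l n="2">second line</l>
</lg>"""


def lines_numbered(root, n):
    """Return the text of every <l> element whose n attribute equals n."""
    return ["".join(el.itertext()) for el in root.findall(f".//l[@n='{n}']")]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(lines_numbered(root, 2))  # prints: ['second line']
```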

Python Scripts for Download

1. comments.py converts simple tagging to XML and numbers lines incrementally.

2. flip.py reverses a string so that, for example, 'The' becomes 'ehT'.

3. undash.py removes hyphens (minus signs).

4. xtirp8.py converts accented characters to UTF-8 entities in hex for XHTML.

5. Python Imaging Library.
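The behaviour described for items 2 to 4 can be sketched in a few lines of Python. These are illustrative re-implementations written from the descriptions above, not the contents of the downloadable scripts; the function names are hypothetical, undash is assumed to target only the ASCII hyphen-minus, and the entity conversion is assumed to apply to every character outside the ASCII range.

```python
def flip(text: str) -> str:
    """Reverse a string, e.g. 'The' becomes 'ehT'."""
    return text[::-1]


def undash(text: str) -> str:
    """Remove every ASCII hyphen-minus character (U+002D)."""
    return text.replace("-", "")


def to_hex_entities(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


if __name__ == "__main__":
    print(flip("The"))                # prints: ehT
    print(undash("run-ning"))         # prints: running
    print(to_hex_entities("déesse"))  # prints: d&#xE9;esse
```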

Other Useful Files

get.html is the pilot version of our file which creates a link to the Dictionnaire du moyen français in Nancy.

Microsoft Developer Network provides information about how to achieve special effects in print, e.g. 'underline', 'overline', 'line-through' etc.

Text © 2010 Edinburgh University Library
All manuscript images © 2005 The British Library
All rights reserved