TLG Index Utilities

I have worked very sporadically on utilities that permit access to the TLG's various indexes and lists from the command line or in conjunction with the `cgreek' package for emacs. The only product of my efforts at this point is a small collection of utilities for searching the TLG list of word forms (tlgwlist.inx) and retrieving word counts for word forms from the large index of word counts (tlgwcnts.inx). The package tlgindexutil contains some additional emacs functions to accompany cgreek (cgreek-tlgindexutil.el), plus sources and a Makefile for building two executables (tlgwlist and tlgwcounts) and a shell script front end for tlgwcounts. For the details and installation instructions, see the README.

How It Works

This package provides the emacs function cgreek-tlg-find-wordform for looking up word forms in the TLG word index. If you type M-x cgreek-tlg-get-wordcounts, you will be prompted for a betacode string to search for. The string is interpreted as a regular expression; to match only at the beginning of a word, put a space at the beginning. You will then be offered a list of matching word forms with counts of their occurrences in the entire TLG corpus. A search on the string "YH/FIZ" (ψήφιζ), for instance, lists every word form matching that string, each with its count.
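As a rough illustration of the leading-space convention (this is not output from the package), the sketch below runs grep over three made-up sample forms. It assumes one form per line, each preceded by a space; the layout of the real word list may differ, and the sample forms are only placeholders.

```shell
# Hypothetical sample: three Beta Code forms, one per line, each preceded
# by a space. The real list's layout may differ.
sample=$(printf ' %s\n' 'YH/FIZE' 'E)YH/FIZE' 'YHFI/ZW')

# Unanchored pattern: matches the string anywhere in a form.
anywhere=$(printf '%s\n' "$sample" | grep -c 'YH/FIZ')

# Leading space: matches only where the string begins a form.
at_start=$(printf '%s\n' "$sample" | grep -c ' YH/FIZ')

echo "anywhere: $anywhere, at start: $at_start"
```

The unanchored pattern also picks up the augmented form E)YH/FIZE, while the pattern with the leading space does not.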

In this result buffer, you may then move to any line containing a word form and press the space bar to get a list of the works containing it.

You may limit this list to a single author by specifying a TLG author number (you can find these in the "*TLG authtab*" buffer in the cgreek package). If you do not specify an author, you get a list of all matches in all authors. Moving the cursor to a line in this result buffer and pressing the space bar will open the work in question. You'll then have to find the actual occurrences by searching (e.g. with C-s): the TLG indexes themselves list only the works in which forms occur, not the locations within those works.

If you use this package (or find that it's unusable), I would appreciate your comments.

Robin Smith
Department of Philosophy
Texas A&M University
College Station, TX 77843-4237
rasmith@aristotle.tamu.edu
Voice (409) 845-5696 FAX (409) 845-0458

Requirements

  1. A copy of the Thesaurus Linguae Graecae CD-ROM
  2. GNU Emacs 20.4 or later (version 20.7 is available by direct download if you're in a hurry).
  3. You'll also want the leim package (Library of Emacs Input Methods) to go with it.
  4. The cgreek-emacs20 package
  5. A C compiler and grep (standard Unix tools)
  6. A version of flex or lex (standard Unix tool)

Installation

This is strictly a build-it-from-source installation, though the build should not raise any serious problems if you have a working C compiler and a version of lex. It should work on Unices (Linux, the BSDs, Solaris, etc.) if you can get Emacs 20 and the cgreek package to work. Something like it may also work on Windows systems running Meadow, though I haven't heard any verification of this.

  1. Presumably, you already have a copy of the TLG CD-ROM, without which the rest of this is not of much use.
  2. You need GNU Emacs 20.4 or higher, with MULE. The most recent version of Emacs 20 is 20.7. See the Official Mule Page and the GNU Emacs home page for more information.
  3. Install cgreek-emacs20, if you haven't already. (If you're trying to make this work with Meadow, the Emacs for Windows, you will want the Meadow version.)
  4. Download the tlgindexutil archive. This link points to the current version (at the moment, tlgindexutil-0.1.2). If clicking on the link gives you a screenful of unintelligible nonsense, try shift-click (in Lynx, just type 'd').
  5. Move this to a directory where you'd like to work on it (in this example, "~/cgreek-emacs20/"):
    mv tlgindexutil-0.1.2.tgz ~/cgreek-emacs20
  6. Change to this directory:
    cd ~/cgreek-emacs20
  7. Extract the archive:
    tar -zxf tlgindexutil-0.1.2.tgz
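  8. Quick Default Install. The default installation assumes the default location for the TLG CD-ROM (the TLGDIR setting in the Makefile) and installs into ~/cgreek-emacs20/tlg/. If those defaults suit you, then do this: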
    make install
  9. Changing the Defaults. You can change both the location of the TLG CD-ROM and the install location by setting TLGDIR and INSTALLDIR on the make command line. For example, this command makes the installation look for the CD-ROM in /mnt/cdrom/ and install in /home/smith/tlgstuff/:
    make install TLGDIR=/mnt/cdrom/ INSTALLDIR=/home/smith/tlgstuff/
  10. If you get error messages, check the Makefile to see if anything looks wrong. You must have both a C compiler and a version of flex or lex.
  11. Start emacs and open the file cgreek-tlgindexutil.el:
    C-x C-f ~/cgreek-emacs20/tlg/cgreek-tlgindexutil.el
    Near the top of the file, you should see this code:
    (defvar cgreek-tlgwordlist-program
      "~/cgreek-emacs20/tlg/tlgwlist"
      "*Program to create word form list from tlgwlist.inx.")
    
    (defvar cgreek-tlgwordcounts-program
      "~/cgreek-emacs20/tlg/tlgwcounts"
      "*Program to search tlgwcnts.inx for word form counts.")
    
    (defvar cgreek-tlg-wordlist-expanded
      "~/cgreek-emacs20/tlg/tlgwlist.expanded"
      "*Expanded version of  tlgwcnts.inx.")
      

    If you have installed things in the default directory ~/cgreek-emacs20/tlg/, just leave this alone. However, if you've changed the install location, then be sure you change all occurrences of ~/cgreek-emacs20/tlg/ to the directory in which you installed. Save any changes, then evaluate the buffer to load the commands (`M-x eval-buffer'). The commands should now work.
  12. If you want to make the tlg utilities part of your default emacs environment, I suggest including this line near the end of dotemacs.el (which will be in ~/cgreek-emacs20/ or the equivalent directory on your system):
    (load "cgreek-tlgindexutil")
    For faster loading, byte-compile dotemacs.el (can be done from a dired window or by loading the file and choosing the option Byte-compile And Load under the Emacs-Lisp menu).
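If the commands do not work after loading, one thing worth checking is whether the three files named in cgreek-tlgindexutil.el actually exist in the install directory. The following is a minimal sketch, assuming the install step places tlgwlist, tlgwcounts and tlgwlist.expanded together in that directory; check_tlg_install is a hypothetical helper name, not part of tlgindexutil.

```shell
# Report which of the files named in cgreek-tlgindexutil.el are missing
# from an install directory. check_tlg_install is a hypothetical helper,
# not something shipped with tlgindexutil.
check_tlg_install() {
    # Fall back to the default install location if no directory is given.
    dir=${1:-"$HOME/cgreek-emacs20/tlg"}
    missing=0
    for f in tlgwlist tlgwcounts tlgwlist.expanded; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=$((missing + 1))
        fi
    done
    # Exit status is the number of missing files (0 means all present).
    return "$missing"
}
```

Call it with no argument to check the default location, or pass the directory you gave as INSTALLDIR, for example check_tlg_install /home/smith/tlgstuff.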

Bugs

This is still very preliminary software, so if it works at all, it's bound to have bugs. Please report any problems (or successes, even) to rasmith@tamu.edu.
Last modified: July 8 2005