CTAN Comprehensive TeX Archive Network

LaTeX-Indexer

The LaTeX-Indexer is a free, open-source, platform-independent tool that automates index generation for LaTeX documents. It extracts words from .tex files, generates frequency distributions using PGFplots, lets users select and tag terms (including variants and sub-variants), and compiles the indexed document with MakeIndex. It is released under GPL-3 and, according to its authors, saves approximately 80% of indexing time.

Installation

On macOS, install the required dependencies and the LaTeX-Indexer as follows:

brew install pandoc
curl -O https://mirrors.ctan.org/.../indexer.zip
unzip indexer.zip

Ensure you have the following prerequisites:

  • An up-to-date LaTeX installation
  • Pandoc
  • Java Version 21 or higher

Usage

Run the LaTeX-Indexer with:

java -jar indexer.jar /path/to/your/file

Commands

The LaTeX-Indexer supports the following commands, entered at the prompt:

  • h, help: Displays a list of all available commands with brief descriptions.
  • p, parse: Re-parses the .tex document to update the word list. This runs automatically at startup but can be rerun to refresh the list.
  • l, list: Lists parsed words with optional parameters:
    • -n <number> (number of words to display, default 20)
    • -c <a|f> (sort alphabetically or by frequency, default frequency)
    • -p <prefix> (filter words by prefix)
    • -r <true|false> (reverse order, default false)
    • -h for detailed help
  • g, generate: Creates a .tex file with a frequency plot using PGFplots, rendered with pdfLaTeX. Supports the same parameters as list, plus:
    • -f <filename> for a custom plot file name
    • -h for details
  • s, subvariant: Defines words as subvariants of a specified word, indexing them under the main word. Enter as s <word1> <word2> ..., then provide subvariant words when prompted. Use -h for help.
  • v, variation: Defines words as variations of a specified word, indexing their occurrences under the main word. Enter as v <word1> <word2> ..., then provide variation words. Use -h for help.
  • a, add: Automatically adds specified words to the index. Enter as a <word1> <word2> .... The tool checks if words exist in the document before adding them. Use -h for help.
  • i, interactive: Interactively adds a single word to the index, prompting the user to confirm each occurrence. Enter as i <word>. For each occurrence, the tool shows the line and context, allowing the user to choose [Y]es, [N]o, or [A]bort. Use -h for help.
  • q, quit: Exits the program.
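
The options of the list command can be pictured with a short Python sketch. This is not the tool's own code (the LaTeX-Indexer is a Java program); the function name `list_words` and its parameter names are hypothetical, and the sketch assumes that frequency order means most frequent first, with ties broken alphabetically.

```python
from collections import Counter


def list_words(counts, n=20, criterion="f", prefix="", reverse=False):
    """Return up to n (word, count) pairs, loosely mirroring -n, -c, -p and -r."""
    # -p: keep only words that start with the given prefix
    items = [(word, count) for word, count in counts.items()
             if word.startswith(prefix)]
    if criterion == "a":
        # -c a: alphabetical order
        items.sort(key=lambda pair: pair[0])
    else:
        # -c f: assumed to mean most frequent first, ties alphabetical
        items.sort(key=lambda pair: (-pair[1], pair[0]))
    if reverse:
        # -r true: reverse whichever order was chosen
        items.reverse()
    # -n: limit the number of words shown
    return items[:n]


counts = Counter({"index": 7, "macro": 3, "indexer": 5, "pandoc": 5})
print(list_words(counts, n=3))
print(list_words(counts, criterion="a", prefix="ind"))
```

The first call corresponds roughly to entering l -n 3 at the prompt, the second to l -c a -p ind.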

Tips

While indexing an entire book at once is possible, the authors recommend processing individual chapter files for better manageability.

Limitations

The LaTeX-Indexer is built on Pandoc, which is versatile and supports a large number of markup formats. However, Pandoc may not recognize a particular LaTeX package. In that case it ignores the code written with that package, i.e. it skips the package's environments. When this happens, Pandoc prints an extensive warning to the command line at program startup to let the user know.
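
To see what Pandoc extracts from a given file, the conversion can be reproduced outside the tool. The following Python sketch is an illustration only, not the LaTeX-Indexer's implementation: the helper names `latex_to_plain` and `count_words` and the `min_length` cutoff are hypothetical, and the sketch assumes the pandoc executable is on the PATH.

```python
import re
import subprocess
from collections import Counter


def latex_to_plain(path):
    """Convert a .tex file to plain text with the Pandoc command-line tool."""
    result = subprocess.run(
        ["pandoc", "--from", "latex", "--to", "plain", path],
        capture_output=True, text=True, check=True,
    )
    # stdout holds the converted text, stderr holds any warnings Pandoc printed
    return result.stdout, result.stderr


def count_words(text, min_length=3):
    """Count lowercase words made of letters only (hypothetical length cutoff)."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(word for word in words if len(word) >= min_length)


# Works on any plain text; for a real document, pass the first value
# returned by latex_to_plain("chapter.tex") instead of this sample string.
sample = "The index lists index terms; the Index helps."
print(count_words(sample).most_common(2))
```

Comparing the resulting counts with the source file gives a quick indication of whether an environment was skipped.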

Future Work

A possible future workaround for this problem is to catch the warning, call latexmk on the specified file, and then run Pandoc on the resulting PDF to parse its content. It would then be necessary, however, to go over the .tex files with a nested loop to find occurrences of the specified words when adding the \index{} macro, because no information about the words' locations in the source file would be available.
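
Searching the source for occurrences and attaching \index{} macros could look roughly like the Python sketch below. It is a simplified illustration, not the authors' planned implementation: the function name `add_index_macros` and the `entries` mapping are hypothetical. The `main!sub` form is standard MakeIndex subentry syntax; using it for subvariants, and the plain main word for variations, is an assumption about how the tool files its entries.

```python
import re


def add_index_macros(lines, entries):
    """Append an index macro after each whole-word occurrence.

    `entries` maps a lowercase word to the index key it is filed under,
    e.g. "cats" -> "cat" (variation) or "kitten" -> "cat!kitten" (subentry).
    """
    if not entries:
        return list(lines)
    # One alternation of all words, matched on word boundaries, ignoring case
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in entries) + r")\b",
        re.IGNORECASE,
    )

    def tag(match):
        key = entries[match.group(0).lower()]
        return match.group(0) + "\\index{" + key + "}"

    # Caveat: this naive scan also matches words inside LaTeX commands or
    # inside existing index macros; a real tool would have to skip those.
    return [pattern.sub(tag, line) for line in lines]


entries = {"cat": "cat", "cats": "cat", "kitten": "cat!kitten"}
for line in add_index_macros(
        ["Cats sleep a lot.", "A kitten is a young cat."], entries):
    print(line)
```

Here "Cats" is filed under cat, and "kitten" becomes a subentry of cat; words that merely contain an indexed word, such as "concatenate", are left untouched.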

Contributing

Contributions are welcome! To contribute:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature).
  3. Commit your changes (git commit -m 'Add your feature').
  4. Push to the branch (git push origin feature/your-feature).
  5. Open a pull request.

License

This project is licensed under the GPL-3 License. See the LICENSE file for details.

Version

1.0.0

Contact

For questions or feedback, you are welcome to open an issue!

Authors

David Degenhardt and Frederik Leyvraz, 2025

latex-indexer – Automate index generation for LaTeX documents

Package: latex-indexer
Repository: https://gitlab.ti.bfh.ch/texnicians/latex-indexer
Version: 1.0.0 (2025-06-13)
Licenses: GNU General Public License, version 3
Maintainer: David Degenhardt
Topics: Index, Index proc ...