Paced

Copying

Copyright (C) 2017-2018 Free Software Foundation, Inc.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Introduction

Paced (Predictive Abbreviation Completion and Expansion using Dictionaries) scans a group of files (determined by “population commands”) to construct a usage table (dictionary). Words (or symbols) are sorted by their usage, and may be later presented to the user for completion. A dictionary can then be saved to a file, to be loaded later.

Population commands determine how a dictionary should be filled with words or symbols. A dictionary may have multiple population commands, and population may be performed asynchronously. Once population is finished, the contents are sorted, with more commonly used words at the front. Dictionaries may be edited through EIEIO’s customize-object interface.

Completion is done through completion-at-point. The dictionary to use for completion can be customized.

Similar Packages

A few Emacs packages have goals similar to paced’s, and they provided some of the inspiration and motivation behind it.

pabbrev

The pabbrev package by Phillip Lord automatically scans text of the current buffer while Emacs is idle and presents the user with the most common completions.

One of the major downsides to pabbrev is that the data it collects doesn’t persist between Emacs sessions. For a few files that are always open, such as org agenda files, pabbrev works great. If you want to train it from a few files that aren’t always open, you’ll have to open each file and retrain pabbrev from that file. And you’ll have to do this every time you restart Emacs.

On the positive side, pabbrev keeps up-to-date usage and prefix hashes of all buffers of the same mode, and scanning, or “scavenging”, blends seamlessly into the background. Completion is just a hash table lookup, so it can handle completion in microseconds. There’s also no setup required; it starts working right away. The downside is that dictionaries aren’t flexible; each dictionary corresponds to a major mode, and there’s no way to change that.

predictive

The predictive package by Toby Cubitt scans text of the current buffer on user command. The usage data is stored in a dictionary, which can then be saved to disk. Extensions are provided to completion-at-point, or predictive’s built-in frontend can be used. As a safety precaution against adding typos, it only updates words that already exist in a dictionary, unless the user allows new words to be added.

Completion is also done intelligently, grouping commonly used words together and optionally suggesting shorter words before longer words.

While the frontend and backend are separate, the frontend is required to populate a dictionary. There is no way to exclude part of the buffer’s text from dictionary population. The safety precaution of only adding a word if it already exists in the dictionary was also tedious, since I didn’t need it.

Installation

Requirements

Emacs 25.1
async 1.9.1

Paced may be installed from source, or from GNU ELPA.

From ELPA:

M-x package-install RET paced RET

From Source:

bzr branch https://bzr.savannah.gnu.org/r/paced-el paced

After installing from source, add the following to your init file (typically .emacs):

(add-to-list 'load-path "/full/path/to/paced/")
(require 'paced)

However you install paced, you must also make sure dictionaries are loaded on startup:

(paced-load-all-dictionaries)

Basic Setup

Paced needn’t have a lot of setup to run. In fact, the simplest setup is as follows:

  1. Create a new dictionary, “Default” (See Creating a Dictionary)
  2. Set paced-global-dictionary-enable-alist to ((t . "Default")) (See Selective Dictionaries)
  3. Run M-x global-paced-mode
  4. To add a file to the dictionary, use M-x paced-add-buffer-file-to-dictionary

This will create a default dictionary and populate it from buffers you specify.
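Steps 2 and 3 can also live in your init file. The following is a minimal sketch, assuming the “Default” dictionary from step 1 already exists and that global-paced-mode accepts the usual minor-mode argument of 1 to enable it:

```elisp
;; Load saved dictionaries on startup (see Installation).
(paced-load-all-dictionaries)

;; Step 2: the condition t is always non-nil, so "Default" is used everywhere.
(setq paced-global-dictionary-enable-alist '((t . "Default")))

;; Step 3: turn paced on globally.
(global-paced-mode 1)
```

Steps 1 and 4 stay interactive, since they depend on which dictionary and files you want.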

Dictionaries

Creating a Dictionary

Now that you’ve got paced installed, it’s time to create a new dictionary.

M-x paced-create-new-dictionary RET DICTIONARY_NAME RET DICTIONARY_FILE RET

Let’s explain those two arguments:

First, you’ve got DICTIONARY_NAME. This is a string that will be used to reference the new dictionary. We recommend something short, like ’new-dict’, ’my-dict’, ’writing’, etc.

Next is DICTIONARY_FILE, the file where the dictionary will be stored. This is typically a file in paced-dictionary-directory, from which all dictionaries are loaded by paced-load-all-dictionaries (more on that later). For now, it’s enough to know that paced-load-all-dictionaries is the easiest way to load dictionaries when paced is loaded.

After you’ve run the above command, you will be taken to the customization buffer. This is where you can set population commands.

Editing a Dictionary

In order to edit a dictionary, paced provides paced-edit-named-dictionary and paced-edit-current-dictionary.

The edit buffer provides the options to change the population commands, case handling, dictionary storage name, and sort method. Each of these is documented in the edit buffer.

Selective Dictionaries

Paced provides a mechanism called the “enable list” that allows a user to enable certain dictionaries for completion under certain conditions.

There are two enable lists: a global (paced-global-dictionary-enable-alist) and local (paced-local-dictionary-enable-alist) one. They both work the same, with the local one taking precedence. Each entry in the list has a condition and a key.

The conditions are one of the following:

  • A mode name, such as org-mode or text-mode, indicating that the named dictionary should be active in any mode derived from that mode.
  • A symbol, in which case the named dictionary is active whenever the value of that symbol is non-nil. This includes the symbol t.
  • A function symbol, in which case the function is called with no arguments to determine if the given dictionary should be enabled. If the function returns non-nil the dictionary is enabled.
  • A lambda function, in which case it is called with no arguments, and if it returns non-nil, the dictionary is enabled.
  • The form (or CONDITION1 CONDITION2 …), which enables the given dictionary if any of the conditions are met.
  • The form (and CONDITION1 CONDITION2 …), which enables the given dictionary if all of the conditions are met.

Remember that paced-mode must be active for completion to occur. Neither list activates it; they only determine which dictionary is active.

The key is the dictionary name you set during dictionary creation.
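As an illustration, here is a sketch of a global enable list that uses several of the condition types above. The dictionary names “notes”, “writing” and “code” and the “/projects/” path test are made up for the example. This section doesn’t say which entry wins when several conditions match, so putting the more specific entries first is only a guess.

```elisp
(setq paced-global-dictionary-enable-alist
      '(;; Mode name: active in org-mode and any mode derived from it.
        (org-mode . "notes")
        ;; Combined condition: either mode enables "writing".
        ((or text-mode markdown-mode) . "writing")
        ;; Lambda: called with no arguments; non-nil enables "code".
        ((lambda ()
           (and buffer-file-name
                (string-match-p "/projects/" buffer-file-name)))
         . "code")
        ;; The symbol t is always non-nil, so this acts as a catch-all.
        (t . "Default")))
```

Each entry is a cons cell of the form (CONDITION . KEY), the same shape as the ((t . "Default")) value used in Basic Setup.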

Dictionary Files

Paced provides paced-load-all-dictionaries to load all dictionaries in paced-dictionary-directory. Paced determines which dictionaries to load based on two variables: paced-dictionary-directory-whitelist-regexp and paced-dictionary-directory-blacklist-regexp. Paced can also be told to search recursively by setting paced-load-all-dictionaries-recursively to t. All four of these variables may be set using Emacs’s customization interface.

An individual dictionary file may also be loaded:

M-x paced-load-dictionary-from-file RET /path/to/file RET

Once a dictionary has been modified, it may then be saved:

M-x paced-save-named-dictionary RET dictionary name RET

Or, all dictionaries may be saved:

M-x paced-save-all-dictionaries RET

Dictionaries may also be automatically saved whenever changed by setting paced-repopulate-saves-dictionary to t. Population is covered in the next section.
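For those who prefer setting things in their init file rather than through the customization interface, the variables in this section might be set as follows. This is only a sketch: the directory path is an example, and we’re assuming these variables can be set with a plain setq.

```elisp
;; Directory searched by `paced-load-all-dictionaries' (example path).
(setq paced-dictionary-directory "~/.emacs.d/paced-dictionaries/")

;; Also look in subdirectories of that directory.
(setq paced-load-all-dictionaries-recursively t)

;; Save a dictionary automatically whenever it changes.
(setq paced-repopulate-saves-dictionary t)

;; Load every dictionary file allowed by the whitelist and blacklist regexps.
(paced-load-all-dictionaries)
```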

Printing a Dictionary

Paced allows a user to print the contents of a dictionary to a buffer. Uses for this might be to tweak population commands or exclude functions, or to simply make sure a dictionary is populating correctly.

To use this feature, run:

M-x paced-print-named-dictionary RET NAME-OF-DICTIONARY RET

Or for the current dictionary:

M-x paced-print-current-dictionary RET

Population Commands

Part of the beauty of paced is the ease of reconstructing a dictionary. When you’ve got a bunch of files from which you want to populate your dictionary, it’d be a pain to go to each of them and say “populate from this one, next, populate from this one, next”.

Instead, paced provides population commands. Each dictionary has one or more population commands it uses to recreate its contents, run in order during population.

In order to trigger population, run the following:

M-x paced-repopulate-named-dictionary RET DICTIONARY-NAME RET

Built-in Commands

There are five built-in population commands:

file
    Populates a dictionary from all words in a given file.
buffer
    Populates a dictionary from all words in a given buffer, which must exist during population.
file-function
    Like the file command, but allows a custom setup function. This function is called with no arguments in a temporary buffer containing the file’s contents, and must return non-nil if population may continue.
directory-regexp
    Populates from all files in a directory that match the given regexp. Also optionally allows recursion.
file-list
    Populates from all files returned by a generator function.

Properties

When setting the population commands of a dictionary, one may also set certain properties. Each property is a variable binding, bound while the population command runs.

A few variables are of note here:

paced-exclude-function
    Function of no arguments that returns non-nil if the thing at point should be excluded from population.
paced-thing-at-point-constituent
    Symbol defining the kind of thing on which population works. Typically set to either 'symbol or 'word.
paced-character-limit
    Maximum length a thing may have and still be included in a dictionary. If set to 0 (default), no limit is imposed.

For convenience, properties that are intended for all population commands of a given dictionary may be set in the dictionary itself. In the event of a conflict, population command properties take precedence over dictionary properties.
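To make this concrete, here is a small sketch of a function that could serve as paced-exclude-function. The name my-paced-in-string-p is hypothetical and not part of paced. It relies on syntax-ppss, so it only gives useful answers when the population buffer’s syntax table knows about string delimiters.

```elisp
(defun my-paced-in-string-p ()
  "Return non-nil if point is inside a string literal.
Hypothetical helper for `paced-exclude-function'; not part of paced."
  ;; Element 3 of the parser state is non-nil when point is inside a string.
  (nth 3 (syntax-ppss)))
```

In the edit buffer, this would go in as a property with the variable paced-exclude-function and the Lisp expression 'my-paced-in-string-p. Other properties, such as paced-character-limit, are added the same way.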

Custom Commands

Since the population commands all derive from paced-population-command, it’s possible to add additional commands.

As an example, let’s make a population command that populates a dictionary from a file like so:

alpha 5
beta 7
gamma 21
delta 54
epsilon 2

We want to make a population command that takes a file like this, with a word in one column and its weight in the other, and adds each word to a dictionary with that weight.

There are two ways to approach this, but we’re going to start with the basic one.

A population command needs two methods: paced-population-command-source-list and paced-population-command-setup-buffer. The first returns a list of sources from which to populate, and the second sets up a temporary buffer based on those sources.

For our command, we want to return the specified file, and replicate each word by the amount given.

Inheriting from paced-file-population-command gives us the source list and file slot for free, so only the buffer setup is left to write.

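(defclass paced-weight-file-population-command (paced-file-population-command)
  () ;; No additional slots; the file slot is inherited
  "Population command for files with one WORD WEIGHT pair per line.")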

Now, we need to set up the buffer to replicate the words.

(require 'subr-x) ;; Provides `string-join'

(cl-defmethod paced-population-command-setup-buffer ((cmd paced-weight-file-population-command) source)
  "Insert SOURCE and replace each WORD WEIGHT line with WEIGHT copies of WORD."
  ;; Use the built-in `paced--insert-file-contents' to insert contents.
  (paced--insert-file-contents source)
  ;; Jump to the start of the buffer
  (goto-char (point-min))
  ;; Search for lines with the form WORD WEIGHT
  (while (re-search-forward (rx line-start ;; Start of line
                                (submatch (one-or-more (not (syntax whitespace)))) ;; Our word
                                (syntax whitespace) ;; Space between word and weight
                                (submatch (one-or-more (any digit))) ;; Weight
                                line-end) ;; End of line
                            nil t)
    (let* ((word (match-string 1))
           (weight (string-to-number (match-string 2)))
           ;; Repeat WORD WEIGHT times
           (new-text (string-join (make-list weight word) " ")))
      ;; Replace the matched text with our repeated word.  The two t
      ;; arguments keep the word's case and insert it literally.
      (replace-match new-text t t))))

That’s all there is to it. When you go to edit a dictionary, the “weight-file” population command will automatically be added as an option for a population command.

The even easier way to do this would’ve been to use paced-file-function-population-command, but it doesn’t make for a good example in this case.
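For the curious, that approach might look something like the sketch below. The function name my-paced-expand-weights is hypothetical. It assumes, as described under Built-in Commands, that the setup function is called with no arguments in a temporary buffer that already holds the file’s contents, and that it must return non-nil for population to continue.

```elisp
(require 'subr-x) ; Provides `string-join' on older Emacs versions

(defun my-paced-expand-weights ()
  "Replace each WORD WEIGHT line in the buffer with WEIGHT copies of WORD.
Hypothetical setup function; returns t so that population continues."
  (goto-char (point-min))
  ;; A word, then spaces or tabs, then a number, filling the whole line.
  (while (re-search-forward "^\\([^ \t\n]+\\)[ \t]+\\([0-9]+\\)$" nil t)
    (let ((word (match-string 1))
          (weight (string-to-number (match-string 2))))
      ;; The two t arguments keep the word's case and insert it literally.
      (replace-match (string-join (make-list weight word) " ") t t)))
  t)
```

The function would then be given as the setup function of a file-function population command in the edit buffer, with no new class needed.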

Asynchronous Population

A common problem is that population can take a long time. Some of us populate dictionaries from org agenda files, which can get pretty big.

To solve this, paced uses the async package. Setup is seamless; just stick whatever code you need in ~/.emacs.d/paced-async.el, and use one of the two asynchronous repopulation commands:

A named dictionary:

M-x paced-repopulate-named-dictionary-async RET NAME RET

Or the current dictionary:

M-x paced-repopulate-current-dictionary-async RET

A few things to note about this:

  1. Dictionaries will be automatically saved by this method after population
  2. Asynchronous population doesn’t change anything until after population is finished, so a user may continue to use their dictionary while population is happening. This also means that multiple populations may run in parallel without interfering with one another.
  3. Because async runs population in a separate Emacs process, any custom code required for population must be in paced-async.el. This includes additional population command types. The following variables are the exception and don’t need to be set there:
    • load-path
    • paced-thing-at-point-constituent
    • paced-async-load-file

Example Setups

Org Agenda Files

As some of us record everything about our lives in our agenda files, it might be helpful to have a dictionary tuned to ourselves.

We use a file-list command that returns the agenda files, and an exclude function to block out all of Org’s extra features such as source code and drawers.

The generator for file-list is easy:

org-agenda-files

Done.

Now, the exclude function, which is set through the properties option. It can be added to paced-async.el:

(require 'org)

(defun org-at-tag-p ()
  "Return non-nil if point is within the tags of a heading."
  (let ((p (point)))
    ;; Ignore errors from `org-get-tags-string'.
    (ignore-errors
      ;; Checks the match string for a tag heading, setting match-string 1 to the
      ;; tags.  Also sets match-beginning and match-end.
      (org-get-tags-string)
      (when (match-string 1)
        (<= (match-beginning 1) p (match-end 1))))))

(defun org-at-keyword-p ()
  "Return non-nil if point is at a keyword such as #+TITLE."
  (save-excursion
    (beginning-of-line)
    (looking-at-p "^#\\+")))

(defun org-at-heading-prefix-p ()
  "Return non-nil if looking at the leading stars of a heading."
  (looking-at outline-regexp))

(defun org-at-hline-p ()
  "Return non-nil if point is on a horizontal rule line."
  (save-excursion
    (beginning-of-line)
    (looking-at-p "^-----")))

(defun org-paced-exclude ()
  "Return non-nil if the thing at point should be excluded from population."
  (or
   ;; Drawers
   (org-between-regexps-p org-drawer-regexp ":END:") ;; Doesn't catch END
   (org-in-regexp ":END:") ;; but this does

   (org-at-tag-p) ;; tags
   (org-at-keyword-p) ;; Keywords, such as #+TITLE
   (org-at-heading-prefix-p) ;; Leading stars of a heading
   (org-at-item-bullet-p) ;; Item Bullets
   (org-at-timestamp-p) ;; Timestamps
   (looking-at-p org-todo-regexp) ;; TODO keywords
   (org-at-hline-p) ;; H-lines

   (org-at-comment-p) ;; comments
   (org-in-regexp org-any-link-re) ;; links
   (org-in-block-p '("src" "quote" "verse")) ;; blocks
   (org-at-planning-p) ;; deadline, etc.
   (org-at-table-p) ;; tables
   ))

As explained earlier, this can be put inside properties in the customize buffer as such:

Properties :
[INS] [DEL] Variable: paced-exclude-function
Lisp expression: 'org-paced-exclude

And you’re done. See how easy that was?

Project Files

Now we get to the interesting one. There are tons of ways to collect project files in Emacs, so we’re going to stick with one for now, namely Emacs’s built-in VC package.

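(require 'vc) ;; Provides `vc-file-tree-walk'

(defun vc-paced-find-project-files (path-to-project-root)
  "Use VC to collect all version-controlled files under PATH-TO-PROJECT-ROOT."
  (let (file-list)
    ;; Push each VC-managed file onto the list as the tree is walked
    (vc-file-tree-walk path-to-project-root (lambda (f) (push f file-list)))
    file-list))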

We’d then need to use the following for our file-list generator:

Generator : (lambda nil (vc-paced-find-project-files "/home/me/programming/paced"))

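Now, we (probably) don’t want commented code to get in our way, so we’ll use a small function for excluding it:

(defun paced-at-comment-p ()
  "Return non-nil if point is inside a comment."
  ;; Element 4 of the parser state is non-nil only inside a comment;
  ;; element 8 would also match strings.
  (nth 4 (syntax-ppss)))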

Use that for paced-exclude-function, and you’re done. We can’t necessarily recommend this for any programming language, as there are dedicated solutions for almost everything, but it makes an excellent fallback.

Markdown Files

Another common request is Markdown files. In order for this to work, you’ll need to install markdown-mode:

M-x package-install RET markdown-mode RET

After that, add the following to your paced-async.el file:

(require 'markdown-mode)

(defun paced-markdown-exclude-p ()
  "Taken from `markdown-flyspell-check-word-p'."
  ;; Exclude anything markdown mode thinks flyspell should skip.
  (or
   ;; Ignore code blocks
   (markdown-code-block-at-point-p)
   (markdown-inline-code-at-point-p)
   ;; Ignore comments
   (markdown-in-comment-p)
   ;; Ignore special text
   (let ((faces (get-text-property (point) 'face)))
     (if (listp faces)
         (or (memq 'markdown-reference-face faces)
             (memq 'markdown-markup-face faces)
             (memq 'markdown-plain-url-face faces)
             (memq 'markdown-inline-code-face faces)
             (memq 'markdown-url-face faces))
       (memq faces '(markdown-reference-face
                     markdown-markup-face
                     markdown-plain-url-face
                     markdown-inline-code-face
                     markdown-url-face))))))

That excludes anything that the developers of markdown-mode felt should be excluded from flyspell.

Set this as your exclude function in your dictionary’s settings, then add each markdown file by hand.

Repopulating Dictionary After Saving

This is a common request, and with the power of async, it’s an easy one to fulfill. The following repopulates the current buffer’s dictionary every time you save a file that has a dictionary. This may seem daunting, but the dictionary remains usable during population, and multiple populations won’t interfere with one another.

;; Repopulate the current dictionary after saving
(add-hook 'after-save-hook 'paced-repopulate-current-dictionary-async)

Add that to your .emacs file, and paced will take it from there.

If you decide that’s too much, do the following:

M-: (remove-hook 'after-save-hook 'paced-repopulate-current-dictionary-async) RET

Repopulating Dictionary After Spellchecking the Buffer

Another request, although much trickier to do. This one involves using Emacs’s advice mechanism:

(define-advice ispell-pdict-save (:after (&optional _no-query _force-save) paced-populate)
  ;; Repopulate the current dictionary after running spell check
  (paced-repopulate-current-dictionary-async))

If you decide this isn’t for you, do the following to revert the changes:

M-: (advice-remove #'ispell-pdict-save #'ispell-pdict-save@paced-populate) RET

Contributing

We’re happy to receive any help you can provide.

First, check out the source code on Savannah: https://savannah.nongnu.org/projects/paced-el

bzr branch https://bzr.savannah.gnu.org/r/paced-el/ paced

Build the Makefile with EDE:

  1. Open any file from paced (See Working with EDE if you encounter “Corrupt object on disk” error)
  2. Run C-c . C or M-x ede-compile-project

Bugs

There are two ways to submit bug reports:

  1. Using the bug tracker at Savannah
  2. Sending an email using paced-submit-bug-report

When submitting a bug report, be sure to include a description of the dictionary or population command that caused the problem, with as much detail as possible.

Development

If you’re new to Bazaar, we recommend using Emacs’s built-in VC package. It eases the overhead of dealing with a brand new VCS by providing a few standard commands. For more information, see the info page on it (In Emacs, this is C-h r m Introduction to VC RET).

To contribute with Bazaar, you can do the following:

# Hack away and make your changes
$ bzr commit -m "Changes I've made"
$ bzr send -o file-name.txt

Then, use paced-submit-bug-report and attach “file-name.txt”. We can then merge that into the main development branch.

There are a few rules to follow:

  • New population commands should be named paced-POPULATION-COMMAND-TYPE-population-command
  • Run ’make check’ to verify that your mods don’t break anything
  • Avoid additional or altered dependencies if at all possible
  • Dictionary commands come in threes (“the operation triad”):
    1. paced-dictionary-OPERATION, a cl-defmethod which performs OPERATION on a dictionary
    2. paced-OPERATION-named-dictionary, an interactive-only function that prompts for a dictionary name and performs OPERATION on that dictionary:

           (interactive (list (paced-read-dictionary)))
           (paced-ensure-registered name)
           (paced-dictionary-OPERATION (paced-named-dictionary name))
      
    3. paced-OPERATION-current-dictionary, an interactive function that performs OPERATION on the current dictionary

           (interactive)
           (paced-dictionary-OPERATION (paced-current-dictionary-or-die))
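
As a sketch, here is what the triad might look like for a made-up “describe” operation. The operation itself is hypothetical, and the class name paced-dictionary used as the method specializer is an assumption. The command names follow the pattern of existing commands such as paced-print-named-dictionary and paced-print-current-dictionary.

```elisp
(require 'cl-lib)
(require 'paced)

;; 1. The method that does the actual work.
;;    `paced-dictionary' is assumed to be the dictionary class.
(cl-defmethod paced-dictionary-describe ((dict paced-dictionary))
  "Hypothetical operation: report the class of DICT in the echo area."
  (message "Dictionary class: %s" (eieio-object-class dict)))

;; 2. Prompt for a dictionary name and run the operation on it.
(defun paced-describe-named-dictionary (name)
  "Describe the dictionary called NAME."
  (interactive (list (paced-read-dictionary)))
  (paced-ensure-registered name)
  (paced-dictionary-describe (paced-named-dictionary name)))

;; 3. Run the operation on the current dictionary.
(defun paced-describe-current-dictionary ()
  "Describe the current dictionary."
  (interactive)
  (paced-dictionary-describe (paced-current-dictionary-or-die)))
```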
      

Documentation

Documentation is always helpful to us. Please be sure to do the following after making any changes:

  1. Update the info page in the repository with C-c C-e i i
  2. If you’re updating the HTML documentation, switch to a theme that can easily be read on a white background; we recommend the “adwaita” theme

Working with EDE

EDE can be a little finicky at times, but we feel the benefits, namely package dependency handling and Makefile generation, outweigh the costs.

One of the issues that many will likely encounter is the error “Corrupt object on disk”. This is most often due to EDE not loading all its subprojects as needed. If you find yourself dealing with this error often, place the following in your .emacs file:

;; Target types needed for working with paced
(require 'ede/proj-elisp)
(require 'ede/proj-aux)
(require 'ede/proj-misc)

These are the three target types that paced uses: elisp for compilation and autoloads; aux for auxiliary files such as documentation; and misc for tests.

When creating a new file, EDE will ask if you want to add it to a target. Consult with one of the paced devs for guidance, but usually selecting “none” and letting one of us handle it is a good way to go.

Changelog

1.1.3

  • Fixed bug with printing an empty dictionary
  • Fixed bug with paced crashing on non-existent thing at point

1.1.2

  • Fixed bug with printing dictionaries

1.1.1

  • Fixed bug with asynchronous population throwing an error on no dictionary
  • Set paced-throw-error-on-no-current to nil by default

1.1

  • Cleaned up the code to reflect the “operation triad”
    • -OP, OP-on-named, OP-on-current
    • Retained backwards compatibility by obsoleting a bunch of functions, but didn’t remove any of them
    • Also removed the use of dict- in global variables and functions
  • Added the ability to print the contents of a dictionary in a separate buffer
  • Added the option to limit the words added during population by size
  • Various documentation improvements

1.0.1

Bug fix release

  • Save dictionaries right after they’re created
  • Added “force” parameter to save functions

1.0

Initial release.

Email: dunni@gnu.org
