<kaz@kylheku.com>
TXR is a pragmatic, convenient tool ready to take on your daily hacking challenges with its dual personality: its whole-document pattern matching and extraction language for scraping information from arbitrary text sources, and its powerful data-processing language to slice through problems like a hot knife through butter. Many tasks can be accomplished with TXR "one liners" directly from your system prompt. TXR is relatively new: the project started in 2009.
It is difficult to give a small introduction to TXR because it is no longer a small language. The PDF rendition of the reference manual, which takes the form of a large Unix man page, is 951 pages long, excluding any index or table of contents. There are many ways to solve a given data processing problem with TXR: many skills and techniques can be used.
"The best lisp for text processing is TXR Lisp. [...] But [it is] a rabbit hole that goes so deep, you'd better have nothing but free time if you want to grok it all"
(anonymous comment on 4Chan)
"So far, txr-lisp felt most ergonomic to me when I tested various languages for their suitability to implement MAL in"
(comment by user wasamasa in a #Lisp IRC channel)
TXR Lisp supports compilation to code for a register-based virtual machine. Individual functions can be compiled, as well as files (.tl to .tlo). Compiled files may be catenated together and loaded with load. Individual compiled files, as well as catenated files, may be compressed with gzip.
Though it does not generate native code, the compiler is optimizing: it performs jump threading, dead code elimination, constant folding, and data flow optimizations. The compiler is actively being improved.
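As a minimal sketch of the compilation workflow described above, the following TXR Lisp forms could be entered at the interactive prompt. The function square and the file name hello.tl are hypothetical, chosen only for illustration. The sketch assumes that compile accepts a function name and that compile-file, given only an input path, writes its output next to the source with the .tlo suffix; the reference manual is the authority on the exact behavior.

```txr
;; Hypothetical example function; the name "square" is illustrative only.
(defun square (x) (* x x))

;; Compile the individual function to virtual machine code.
(compile 'square)

;; Compile a whole source file. Assumes a file hello.tl exists in the
;; current directory; the output is expected to be hello.tlo.
(compile-file "hello.tl")

;; Load the compiled file produced by the previous step.
(load "hello.tlo")
```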
Application deployment is possible. The save-exe function creates a copy of the TXR executable, under a name of your choosing, and containing an expression that is executed at startup, typically used to load the rest of the application relative to the same directory. The executable just has to be accompanied by all the needed library modules; the details are in the reference manual.
TXR is light in terms of resources. The executable is some 1.7 megabytes of code (a little more than twice the size of GNU Awk), and the satellite library modules add up to another megabyte and a half. It has no external dependencies other than libffi. The entire project is easy to build; it just requires GNU Make and GCC or Clang, and a few shell utilities needed by the configure script. There are some generated sources, which are shipped and so the tools for them are not required.
TXR has a small memory footprint. When the executable is started up to its interactive prompt, the memory use is similar to GNU Bash. When compiling its standard library of TXR Lisp code, including complex files such as the compiler, TXR requires a peak of only around 18 megabytes.
TXR is a fusion of many different ideas, a few of which are original, and it is influenced by many languages, such as Common Lisp, Scheme, Awk, M4, POSIX Shell, Prolog, Ruby, Python, Arc, Clojure, S-Lang and others.
TXR consists of two languages, which can be used separately or tangled together: the TXR Pattern Language, and TXR Lisp.
A comparison may be drawn between the TXR Pattern Language and the Unix utility Awk. Both provide an implicit, convenient way of scanning input. Whereas Awk implicitly reads a file, breaking it into records and fields which are accessible as positional variables, TXR has quite a different way of making input handling implicit: namely via a nested, recursive pattern matching notation which binds variables. This approach still handles delimited fields with relative convenience, but generalizes into handling messy, loosely structured data, or data which exhibits different regularities in different sections, etc. Constructs in TXR (the pattern language) aren't imperative statements, but rather pattern-matching directives: each construct terminates by matching, failing, or throwing an exception. Searching and backtracking behaviors are implicit. It has features like structured named blocks with nonlocal exits, structured exception handling, named pattern matching functions, and numerous other features. TXR's pattern language is powerful enough to parse grammars, yet simple to use in an ad-hoc way on trivial tasks. Speaking of Awk, TXR in fact contains an implementation of Awk, in the form of a Lisp macro, which brings us to the next topic.
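To make the contrast with Awk concrete, here is a minimal sketch of a TXR query that handles colon-delimited fields. It assumes input in the style of /etc/passwd; the file name passwd.txr and the variable names are illustrative assumptions rather than anything prescribed by TXR.

```txr
@# Sketch only: collect the first four colon-separated fields of each line.
@# The variable names (name, pw, uid, gid, tail) are arbitrary.
@(collect)
@name:@pw:@uid:@gid:@tail
@(end)
@# Print the user name and numeric user ID from every collected line.
@(output)
@(repeat)
@name @uid
@(end)
@(end)
```

Assuming the query is saved as passwd.txr, it could be run as txr passwd.txr /etc/passwd. The line inside @(collect) is a pattern rather than a statement: each unbound variable matches text up to the literal colon that follows it, and @(collect) gathers the bindings from every matching line into lists, which @(repeat) then walks in the output section.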
The other language in TXR is TXR Lisp. This is not an implementation of an existing Common Lisp or Scheme, but a new dialect, which contains many new ideas. TXR Lisp is feature-rich, and oriented toward succinct, convenient expressiveness. While staying completely true to the Lisp heritage, it takes cues from new scripting and functional languages.
Users of mainstream Lisp will find that skills transfer well to and from TXR. There will be features users will miss from TXR Lisp when using other Lisps, and likely vice versa.
The TXR project values brevity in programs: programs should be short and clear. If you're struggling to come up with a nice solution to a problem, the TXR project wants to hear from you; give a shout to the mailing list. If a program is significantly clearer and shorter in another language, that is of interest to the project; TXR may be able to absorb the technique or something equivalent.
The TXR project is looking for hackers to develop features.
TXR has clean, easy to understand and maintain internals that are a pleasure to work with. Be sure to read the HACKING guide.
Here is a collection of TXR Solutions to a number of problems from Rosetta Code.
TXR is truly free software because it is distributed under the two-clause BSD license which allows every conceivable use, commercial and non-commercial.
If you find TXR to be a valuable tool in your arsenal, here is one way to show your appreciation and support! Developing stuff like this takes countless hours.
WARNING: Do not use TXR packaged by Homebrew! The TXR project has received reports that Homebrew packages of TXR contain an unstable, unreliable executable. There are reports from users that they are not even able to build the Homebrew TXR package themselves (in order to investigate the problem). The Homebrew build formula txr.rb does not run make tests, so when the build does succeed, it is not verified to be good. It is suspected that Homebrew may be using compiler code generation features that cause instability.