Welcome to the Tensile home page
Introduction
Tensile (formerly NSL) is a programming language intended primarily for processing
text documents in various input formats and in various languages. It is designed to be
as lightweight as possible while still being able to solve a wide range of tasks.
It can be used as a stand-alone tool, as well as a CGI engine. It is not intended to be embeddable
like Tcl, but since the interpreter is very compact, it can be attached to an application without
great overhead.
Tensile should be easy to learn (though the current lack of documentation may be a considerable
obstacle ;). Its syntax is much simpler than that of Perl or even awk and is more like that of Tcl or csh. It has, however, some peculiarities in syntax, as well as in programming techniques, so it will probably take some time to get accustomed to.
However, Tensile is not a quick-development language. Its core does not and shall not include
'complete solutions'. In spite of its rather high level, it should be regarded as a toolbox with
which a programmer may implement what is needed. Only such an approach (IMHO) can
keep the language small, efficient, and easy both to learn and to use.
The Tensile interpreter is really compact: a stripped binary image (under Linux/Intel) occupies only
about 100k! (This doesn't include support functions, which are moved to
a separate library, libutils, which is about 80k.) It also requires
DB1 or GDBM (but these are present on most systems) and libltdl, which comes
with the package (just in case). And I am strongly inclined to maintain
this compactness: new features should not be coded into the core but rather implemented as
separate modules. Besides, several compile-time options are provided to switch some of the
language elements on or off.
Tensile is meant to be portable. Its core doesn't use any features beyond ISO C
+ POSIX library functions.
A list of (more or less) successful builds on various platforms will be available
really soon.
ATTENTION!
The fate of Tensile depends on you!
Your feedback is vitally important for developing Tensile; how else could I
learn what Tensile lacks, what bugs and defects it has, and so on?
So, to everyone who has become interested in Tensile: if you use it (or have tried it), please
send your comments, requests, advice, curses etc.
History
I started working on Tensile in 2000 when I was faced with the task of extracting structured data
from free-form texts. One of the basic problems was that those texts were not in plain-text format
and contained a lot of non-Latin1 characters in various fonts, most of which complied with no standard
encoding. After several attempts to hard-code everything I needed in C, I realized that a higher-level
language was necessary. However, the peculiarities of the task would have required a deep knowledge of any
such language (like Perl). As my knowledge was insufficient and I am by nature too lazy to
deepen it, I decided to write my own HLL which would meet my needs better than any existing
one.
So, in June 2001 the core of the language was complete. Then I thought that
my work might be useful for other members of the programmers' community. On
the other hand, there is a lot of work to be done before Tensile can
become a widely usable language. With this in mind, I decided to take
advantage of the services provided by SourceForge. However, SF seemed unwilling
to provide any feedback, so I left it for GNU Savannah
(savannah.gnu.org).
The Savannah services for Tensile can be found
here.
Until 01 Jun 2002 the language was called NSL (an abbreviation of
'New Programming Language'). However, after some thought it was renamed Tensile, which is
a non-obvious abbreviation of ThE New Scripting LanguagE. It was the only word that
more or less fitted the original abbreviation, and at the same time it suggests the high flexibility
and extensibility of the language.
The freshest downloads can be obtained from
Savannah and
ILI RAN.
FTP access is available via sunsite.dk.
The reference guide for Tensile can be found
here. Note that it is
incomplete; if you have questions, use the Savannah
mailing lists or feel free to
mail me directly.
There are three main concepts that set Tensile apart from other languages.
Automata
All complex string translations (such as case mapping, converting from one encoding to another,
producing collation sequences etc.) are done with the help of user-defined finite-state
automata*.
Unlike many other languages, Tensile does not use Unicode internally, since IMHO it would greatly hurt
portability and compactness (well, I may be wrong :).
Automata may be grouped into sequences where the output of one automaton is the input for another
(similar to Unix pipes). Such sequences are referred to below as autoseqs.
* More precisely, what Tensile uses are pushdown
transducers,
not finite-state automata, but the former term is rarely used nowadays, so I use the latter as
the more comprehensible one.
The idea was inspired by J. Plaice's and Y. Haralambous's
Omega
project, but the implementation is entirely different and shares no code with it.
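The chaining of automata into an autoseq can be pictured with a small sketch. It is written in Python, not in Tensile, and it deliberately uses plain finite-state transducers rather than the pushdown transducers Tensile actually uses. All names here (make_automaton, autoseq, squeeze_step, capitalize_step) are invented for the illustration and are not part of Tensile.

```python
from functools import reduce
from typing import Callable, Tuple

# A transition function maps (state, input char) to (next state, emitted text).
Step = Callable[[str, str], Tuple[str, str]]
Automaton = Callable[[str], str]


def make_automaton(step: Step, start: str) -> Automaton:
    """Wrap a transition function into a string-to-string translator."""
    def run(text: str) -> str:
        state, out = start, []
        for ch in text:
            state, emitted = step(state, ch)
            out.append(emitted)
        return "".join(out)
    return run


def squeeze_step(state: str, ch: str) -> Tuple[str, str]:
    """Collapse each run of spaces into a single space."""
    if ch == " ":
        return ("space", "" if state == "space" else " ")
    return ("text", ch)


def capitalize_step(state: str, ch: str) -> Tuple[str, str]:
    """Upper-case the first character of every word."""
    if ch == " ":
        return ("gap", ch)
    return ("word", ch.upper() if state == "gap" else ch)


def autoseq(*automata: Automaton) -> Automaton:
    """Chain automata so that each one consumes the previous one's output."""
    return lambda text: reduce(lambda acc, run: run(acc), automata, text)


squeeze = make_automaton(squeeze_step, start="text")
capitalize = make_automaton(capitalize_step, start="gap")
title_case = autoseq(squeeze, capitalize)

print(title_case("hello   wide  world"))  # prints "Hello Wide World"
```

Each automaton knows nothing about its neighbours; the sequence is what turns two simple translations into a more complex one, much as a Unix pipe does.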
Storages
Tensile does not have real structured data types (in principle, it has a single
data type, the string). Their role is played by storages. A storage is
an abstraction of a data collection with an opaque internal structure. The user
always works with a storage as a set of 'key-value' pairs, no matter whether it is
in fact an array, a table (a dictionary), or an SQL query result: only
the form of the key differs. The user may now define her own storage types.
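To make the idea of a storage more concrete, here is a rough sketch in Python (not Tensile). It assumes a hypothetical three-operation interface (get, put, pairs) which is invented for the example; the real Tensile storage interface may look quite different. Keys and values are all strings, in keeping with the single data type.

```python
from typing import Dict, Iterator, List, Protocol, Tuple


class Storage(Protocol):
    """Hypothetical common interface: every storage is a set of key-value pairs."""

    def get(self, key: str) -> str: ...
    def put(self, key: str, value: str) -> None: ...
    def pairs(self) -> Iterator[Tuple[str, str]]: ...


class ArrayStorage:
    """Backed by a list; the key is a decimal index written as a string."""

    def __init__(self, items: List[str]) -> None:
        self._items = list(items)

    def get(self, key: str) -> str:
        return self._items[int(key)]

    def put(self, key: str, value: str) -> None:
        self._items[int(key)] = value

    def pairs(self) -> Iterator[Tuple[str, str]]:
        return ((str(i), v) for i, v in enumerate(self._items))


class TableStorage:
    """Backed by a dictionary; the key is an arbitrary string."""

    def __init__(self, items: Dict[str, str]) -> None:
        self._items = dict(items)

    def get(self, key: str) -> str:
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value

    def pairs(self) -> Iterator[Tuple[str, str]]:
        return iter(self._items.items())


def dump(storage: Storage) -> List[str]:
    """Works on any storage, since it only relies on the key-value view."""
    return [f"{key}={value}" for key, value in storage.pairs()]


print(dump(ArrayStorage(["red", "green"])))
print(dump(TableStorage({"sky": "blue"})))
```

The dump function never learns which kind of collection it was given; only the form of the key ("0" versus "sky") differs between the two.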
Streams
Tensile programs deal not with OS-level files, but with streams. A stream is thought of as a flow
of raw text interlaced with markup tags. The exact way in which those tags are formed is hidden by a
stream driver, so that an application always deals with HTML-like structures.
As of now, only plain-text and HTML stream drivers are implemented (and also
special streams for CGI support).
Besides, a user may define her own stream types by means of the language itself.
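The role of a stream driver can be sketched as follows, again in Python rather than Tensile. Two assumptions are made purely for the illustration: that a stream is delivered as a list of (kind, payload) events, and that the plain-text driver presents blank-line-separated paragraphs as p elements. Neither is taken from the Tensile implementation.

```python
from html.parser import HTMLParser
from typing import Iterable, List, Tuple

Event = Tuple[str, str]  # kind is "open", "close" or "text"


class _HtmlCollector(HTMLParser):
    """Turns HTML markup into a flat list of tag and text events."""

    def __init__(self) -> None:
        super().__init__()
        self.events: List[Event] = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


def html_driver(source: str) -> List[Event]:
    collector = _HtmlCollector()
    collector.feed(source)
    collector.close()
    return collector.events


def plain_text_driver(source: str) -> List[Event]:
    """Assumed convention: each blank-line-separated paragraph becomes a p element."""
    events: List[Event] = []
    for block in source.split("\n\n"):
        text = " ".join(block.split())
        if text:
            events += [("open", "p"), ("text", text), ("close", "p")]
    return events


def paragraphs(events: Iterable[Event]) -> List[str]:
    """Driver-independent consumer: collect the text found inside p elements."""
    result, depth = [], 0
    for kind, payload in events:
        if kind == "open" and payload == "p":
            depth += 1
        elif kind == "close" and payload == "p":
            depth -= 1
        elif kind == "text" and depth > 0:
            result.append(payload)
    return result


print(paragraphs(html_driver("<p>one</p><p>two</p>")))
print(paragraphs(plain_text_driver("one\n\ntwo")))
```

The consumer sees the same HTML-like structure in both cases, which is the point of hiding tag formation behind a driver.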
Note that streams define only the layout of data, not the way the data are stored. For
those stream types which do not impose a physical representation, the
latter is determined by a flow, which is analogous to the protocol
prefix of a URL. The following flow types currently exist in the core: ordinary files, pipes and
Tensile strings. Other flow types may be (and are) defined by extension modules and,
with some limitations, by Tensile programs.
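The analogy with a URL protocol prefix can be illustrated by a small dispatcher. This is a Python sketch under stated assumptions: the prefix spellings "file:", "pipe:" and "string:" and the names FLOWS and open_flow are hypothetical, chosen only to mirror the three core flow types listed above.

```python
import io
import subprocess
from typing import Callable, Dict, TextIO


def _open_file(rest: str) -> TextIO:
    """Ordinary file: the rest of the specification is a path."""
    return open(rest, "r", encoding="utf-8")


def _open_pipe(rest: str) -> TextIO:
    """Pipe: run a shell command and expose its standard output as text."""
    result = subprocess.run(
        rest, shell=True, capture_output=True, text=True, check=True
    )
    return io.StringIO(result.stdout)


def _open_string(rest: str) -> TextIO:
    """String: the rest of the specification is the data itself."""
    return io.StringIO(rest)


# Registry of flow types; an extension could add its own entry here.
FLOWS: Dict[str, Callable[[str], TextIO]] = {
    "file": _open_file,
    "pipe": _open_pipe,
    "string": _open_string,
}


def open_flow(spec: str) -> TextIO:
    """Pick the physical representation from the prefix before the first colon."""
    kind, sep, rest = spec.partition(":")
    if not sep or kind not in FLOWS:
        raise ValueError(f"unknown flow type in {spec!r}")
    return FLOWS[kind](rest)


print(open_flow("string:raw text with <b>tags</b>").read())
```

Whatever the flow, the caller gets back the same kind of readable object, so a stream driver can be layered on top without caring where the data physically live.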
A few more things should be mentioned.
- Tensile is an extensible language (as its name shows).
The user can use functions written in
a compiled language via loadable modules,
as well as define new stream and storage types.
- Tensile possesses all the control statements which a decent language should
possess. Their form is more or less modelled after those of C.
- Tensile is not unaware of modular programming. A program may load a piece of code to run in a separate
environment or subinterpreter. Environments communicate via
pools of shared variables and by exporting/importing sharable objects,
which are storages, streams, procedures, autoseqs and environments themselves.
- Tensile is not an OO language, and I hope it never will be.
However, OOP may be
modelled in the language if necessary, since it can forward a procedure call
to a given environment.
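How forwarding a procedure call to an environment can stand in for objects may be easier to see in a sketch. The following is Python, not Tensile, and the names Environment, forward and make_counter are invented for the illustration: each environment plays the role of an object, its variables are the object's state, and a forwarded call plays the role of a method call.

```python
from typing import Callable, Dict


class Environment:
    """A separate namespace holding its own variables and procedures."""

    def __init__(self, **variables: str) -> None:
        self.variables: Dict[str, str] = dict(variables)
        self.procedures: Dict[str, Callable[..., str]] = {}

    def define(self, name: str, proc: Callable[..., str]) -> None:
        self.procedures[name] = proc


def forward(env: Environment, name: str, *args: str) -> str:
    """Run procedure `name` inside `env`, against that environment's variables."""
    return env.procedures[name](env, *args)


def increment(env: Environment, step: str = "1") -> str:
    """Values stay strings, as in a language with a single string data type."""
    env.variables["count"] = str(int(env.variables["count"]) + int(step))
    return env.variables["count"]


def make_counter(start: str = "0") -> Environment:
    """Build a fresh environment acting as a counter 'object'."""
    counter = Environment(count=start)
    counter.define("increment", increment)
    return counter


a = make_counter()
b = make_counter("10")
forward(a, "increment")
forward(a, "increment", "5")
forward(b, "increment")
print(a.variables["count"], b.variables["count"])  # prints "6 11"
```

The two counters share the same procedure but keep separate state, which is all that a simple object model needs.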
TO DO
A lot of things:
- document it properly!
- port to other platforms and test thoroughly
- implement various useful auxiliary tools, code libraries,
thunks to widely used libraries and so on
The author and maintainer of this project is Artem V. Andreev.
Mail me at artem@AA5779.spb.edu