-a and -r has been extended, ndselect has been improved, man pages are provided.
See the User Manual for additional information.
Users are strongly encouraged to upgrade to this new release.
Numdiff (also written numdiff) is a small program for comparing putatively similar files line by line and field by field, ignoring small numeric differences and/or differences in numeric format. In other words, Numdiff compares files that contain numerical fields, possibly mixed with non-numerical ones.
Whenever you compare a pair of such files, what you usually want is a list of the numerical fields in the second file that differ numerically from the corresponding fields in the first file. Well-known tools like diff, cmp or wdiff cannot be used for this purpose: they cannot tell whether a difference between two numerical fields is due only to the notation or is an actual difference in numerical value.
In addition, you may sometimes want to ignore differences in numerical values as long as they do not exceed a certain threshold, that is, to neglect all small numerical differences. Programs like diff and wdiff cannot do this either, since they do not even know what a numerical difference is.
These are the reasons why I decided to implement Numdiff. In writing this program I was inspired by ndiff, a GPL'ed program by Nelson H. F. Beebe of the University of Utah, Salt Lake City, see
http://www.math.utah.edu/~beebe/software/ndiff
ndiff is a good tool and I used it for a while. But I did not completely like the way it works, and so numdiff was conceived. Although ndiff inspired numdiff, the two are completely different at the source-code level: numdiff has been written entirely from scratch, with the addition of source code from GNU bc, GNU diff and Gnulib.
When comparing files, Numdiff assumes by default that the fields are separated by white-space characters (spaces, horizontal tabs and newlines), but users can also specify their own list of separators through the option -s, see the User Manual.
Numdiff has many features that ndiff lacks: for instance, it recognizes complex numbers and allows you to specify different sets of field delimiters for the two files being compared. In addition, starting from version 5, Numdiff includes a filter that keeps it from getting confused when one file contains one or more lines that have no corresponding lines in the other file. This feature is also missing in ndiff.
I know that many people may find Numdiff simply useless. But people working in Scientific Computing or Numerical Analysis may find it useful for their job. Often they need to compare a file containing the output produced by a given numerical program running in a certain environment with another file containing the output produced by the same program in a different environment (by different environment I mean, e.g., a different operating system or a different compiler on the same system). Or they need to compare the output of a numerical program written to solve a certain problem with the output of another program that solves the same problem using a different algorithm. Finally, sometimes they have to compare the output of a numerical program with a sample file containing a list of expected data (which could have been computed theoretically or come from laboratory experiments). In all these situations Numdiff can turn out to be very helpful, since it also lets the user specify a tolerance for absolute and/or relative differences and then reports only the fields which differ enough to exceed these tolerances.
To end this presentation, let me say that Numdiff is a console application, i.e. a computer program designed to be used via a text-only interface, such as a text terminal or the command line interface of some operating systems. This means no mouse, no windows, no buttons, no silly icons. All modern operating systems with a Graphical User Interface (GUI) provide a program that emulates a text terminal. This program has different names depending on the operating system you are using: console, terminal emulator, xterm, rxvt, and so on. To use Numdiff you have to open the console/terminal emulator, type some strange commands there, and then press the Enter key to execute them :) If you do not know how to get started with a terminal emulator, search the web for a user guide and, after reading it carefully, come back here.
Since one example is often more useful than many words...
Let us suppose that file1 contains the list of numbers:
1.25 -3.45 1.23456789E-2 -5.98765432e+5 100.00
and file2 the following one:
1.250001 -3.450003 1.23456788E-2 -5.98765431e+5 100.000022
We can compare these two files by calling numdiff (the name of the program must be written in lower case!) and passing it file1 and file2 as arguments:
numdiff file1 file2
The output of this command will be:
----------------
##1 #:1 <== 1.25
##1 #:1 ==> 1.250001
@ Absolute error = 1.0000000000e-6, Relative error = 8.0000000000e-7
##1 #:2 <== -3.45
##1 #:2 ==> -3.450003
@ Absolute error = 3.0000000000e-6, Relative error = 8.6956521739e-7
##1 #:3 <== 1.23456789E-2
##1 #:3 ==> 1.23456788E-2
@ Absolute error = 1.0000000000e-10, Relative error = 8.1000001393e-9
##1 #:4 <== -5.98765432e+5
##1 #:4 ==> -5.98765431e+5
@ Absolute error = 1.0000000000e-3, Relative error = 1.6701030958e-9
##1 #:5 <== 100.00
##1 #:5 ==> 100.000022
@ Absolute error = 2.2000000000e-5, Relative error = 2.2000000000e-7
+++ File "file1" differs from file "file2"
This text should be self-explanatory. The tags ##l and #:f, where l and f are integers, refer to the line number and to the position of the field within the line, respectively. Thus,
##1 #:1 <== 1.25
##1 #:1 ==> 1.250001
@ Absolute error = 1.0000000000e-6, Relative error = 8.0000000000e-7
means that the first field of the first line is 1.25 in the first file and 1.250001 in the second file. The absolute difference between these two numbers is 1.0000000000e-6, while the relative difference is 8.0000000000e-7.
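The two figures can be rechecked by hand. The sketch below is not Numdiff's own code: it assumes, judging only from the sample output on this page, that the relative error is the absolute error divided by the smaller of the two magnitudes. The function name field_errors is made up for illustration.

```python
from decimal import Decimal


def field_errors(first: str, second: str):
    """Return (absolute error, relative error) for two numeric fields.

    Assumption: relative error = absolute error / min(|first|, |second|).
    This rule is inferred from the sample output, not from Numdiff's source.
    """
    a, b = Decimal(first), Decimal(second)
    abs_err = abs(a - b)
    smaller = min(abs(a), abs(b))
    # Leave the relative error undefined when the smaller magnitude is zero.
    rel_err = abs_err / smaller if smaller != 0 else None
    return abs_err, rel_err


# First field of the first line in file1 and file2.
abs_err, rel_err = field_errors("1.25", "1.250001")
print(abs_err, rel_err)
```

Decimal arithmetic is used here so that values such as 1.250001 are not perturbed by binary floating-point rounding before the subtraction.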
Numdiff can also print a statistical report about the numerical differences found between the two files. To this end it is sufficient to specify the option -S. If you are interested only in the statistical report and want to drop the detailed list of all differences from the output, then you additionally have to specify the option -q. The output of the command numdiff -S -q file1 file2 is:
5 numeric comparisons have been done, all of them have produced an outcome beyond the tolerance threshold

Largest absolute error in the set of the major numerical differences: 1.0000000000e-3
Corresponding relative error: 1.6701030958e-9
First occurrence (#line, #field) in the first file: 1, 4
First occurrence (#line, #field) in the second file: 1, 4

Largest relative error in the set of the major numerical differences: 8.6956521739e-7
Corresponding absolute error: 3.0000000000e-6
First occurrence (#line, #field) in the first file: 1, 2
First occurrence (#line, #field) in the second file: 1, 2

Sum of all absolute errors: 1.0260001000e-3
Sum of the major absolute errors: 1.0260001000e-3

Arithmetic mean of all absolute errors: 2.0520002000e-4
Arithmetic mean of the major absolute errors: 2.0520002000e-4

Square root of the sum of the squares of all absolute errors: 1.0002469695e-3
Quadratic mean of all absolute errors: 4.4732404362e-4

Square root of the sum of the squares of the major absolute errors: 1.0002469695e-3
Quadratic mean of the major absolute errors: 4.4732404362e-4
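The aggregate figures in this report follow directly from the five absolute errors listed in the detailed output. The sketch below recomputes them in double precision; the variable names are illustrative and the code is not taken from Numdiff.

```python
import math

# Absolute errors of the five fields, copied from the detailed report.
abs_errors = [1.0e-6, 3.0e-6, 1.0e-10, 1.0e-3, 2.2e-5]

# Sum and arithmetic mean of the absolute errors.
total = math.fsum(abs_errors)
mean = total / len(abs_errors)

# Square root of the sum of the squares, and the quadratic mean.
two_norm = math.sqrt(math.fsum(e * e for e in abs_errors))
quadratic_mean = two_norm / math.sqrt(len(abs_errors))

print(total, mean, two_norm, quadratic_mean)
```

Since every comparison in this example exceeds the (default) tolerance, the statistics for "all" errors and for the "major" errors coincide.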
You can specify an absolute error tolerance (or a relative error tolerance) by means of the option -a (-r). If an absolute error tolerance is specified, numdiff only reports the absolute differences exceeding that tolerance. For instance, the output of numdiff -a 1.0e-5 file1 file2 will be
----------------
##1 #:4 <== -5.98765432e+5
##1 #:4 ==> -5.98765431e+5
@ Absolute error = 1.0000000000e-3, Relative error = 1.6701030958e-9
##1 #:5 <== 100.00
##1 #:5 ==> 100.000022
@ Absolute error = 2.2000000000e-5, Relative error = 2.2000000000e-7
+++ File "file1" differs from file "file2"
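To see why only fields 4 and 5 survive, the selection can be mimicked with a short sketch. It assumes that a field is reported when its absolute error is strictly greater than the tolerance; the exact boundary behaviour of Numdiff is not stated on this page.

```python
from decimal import Decimal

# Contents of file1 and file2 from the example above, split into fields.
file1 = "1.25 -3.45 1.23456789E-2 -5.98765432e+5 100.00".split()
file2 = "1.250001 -3.450003 1.23456788E-2 -5.98765431e+5 100.000022".split()

# Absolute tolerance, as passed to the option -a in the example.
tolerance = Decimal("1.0e-5")

# Keep the 1-based positions of the fields whose absolute error
# exceeds the tolerance (strict comparison is an assumption).
reported = [
    position
    for position, (x, y) in enumerate(zip(file1, file2), start=1)
    if abs(Decimal(x) - Decimal(y)) > tolerance
]
print(reported)
```

The first three fields differ by at most 3.0e-6, which is below the tolerance, so they drop out of the report.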
Numdiff can also recognize non-numerical differences between two files. If a field in either of the two compared files is non-numerical, then, instead of performing a numeric comparison, Numdiff will simply perform a literal (character by character) comparison. For example, if the file example1 contains the line
1.0 xyz 3.0 x y
and the file example2 the line
abc 1.1 3.3 x z
then numdiff example1 example2 will display
----------------
##1 #:1 <== 1.0
##1 #:1 ==> abc
@ @@
##1 #:2 <== xyz
##1 #:2 ==> 1.1
@ @@
##1 #:3 <== 3.0
##1 #:3 ==> 3.3
@ Absolute error = 3.0000000000e-1, Relative error = 1.0000000000e-1
##1 #:5 <== y
##1 #:5 ==> z
@ @@
+++ File "example1" differs from file "example2"
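The rule described above (numeric comparison only when both fields are numbers, literal comparison otherwise) can be sketched as follows. This is an illustration under simplifying assumptions, not Numdiff's parser: it treats as numeric whatever Python's Decimal accepts as a finite number, and the helper names are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def parse_number(field):
    """Return the field as a Decimal, or None if it is not a finite number."""
    try:
        value = Decimal(field)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None


def compare_fields(a, b):
    """Return (kind of comparison, whether the two fields differ)."""
    x, y = parse_number(a), parse_number(b)
    if x is not None and y is not None:
        return "numeric", x != y
    # At least one field is non-numerical: compare character by character.
    return "literal", a != b


line1 = "1.0 xyz 3.0 x y".split()
line2 = "abc 1.1 3.3 x z".split()

# Collect (field position, kind of comparison) for every differing field.
differences = []
for position, (a, b) in enumerate(zip(line1, line2), start=1):
    kind, differs = compare_fields(a, b)
    if differs:
        differences.append((position, kind))
print(differences)
```

Field 4 does not appear because the literal comparison of x with x finds no difference, which matches the report above.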
The most appealing feature of Numdiff is its ability to detect insertions and deletions of lines, similarly to what diff does, through the activation of a filter. Suppose that the files list1 and list2 contain the data
Additional_line_which_creates_confusion
Additional_line_which_creates_confusion
+1.000
+2.510
+10.022
and
+1.003
+2.500
+10.000
Final_line_which_creates_confusion
respectively. What you would expect to find in the report displayed by Numdiff is that list1 contains two lines at the beginning which are not present in list2, that the last line of list2 is not present in list1, and finally that the three numerical values in list2 differ from the corresponding values in list1, together with the absolute and relative errors. But the output of the command numdiff list1 list2 differs from your expectations, since this is what Numdiff reports:
----------------
##1 #:1 <== Additional_line_which_creates_confusion
##1 #:1 ==> +1.003
@ @@
----------------
##2 #:1 <== Additional_line_which_creates_confusion
##2 #:1 ==> +2.500
@ @@
----------------
##3 #:1 <== +1.000
##3 #:1 ==> +10.000
@ Absolute error = 9.0000000000e+0, Relative error = 9.0000000000e+0
----------------
##4 #:1 <== +2.510
##4 #:1 ==> Final_line_which_creates_confusion
@ @@
----------------
##5 <== +10.022
==> *** End of file "list2" reached
Likely the files "list1" and "list2" do not have the same number of lines !
+++ File "list1" differs from file "list2"
By default Numdiff indeed compares the first line of the first file (in this case list1) with the first line of the second file (list2), the second line with the second, the third with the third, and so on. If one of the two compared files contains one or more lines for which there exist no corresponding lines in the other file, Numdiff gets confused and displays misleading output. The filtering mechanism implemented in Numdiff since version 5 can detect such situations and re-synchronize the two files to obtain the expected result. For instance, the command numdiff -z @ list1 list2, which activates the filter through the option -z @, will print
----------------
##1 <== Additional_line_which_creates_confusion
==>
----------------
##2 <== Additional_line_which_creates_confusion
==>
----------------
##3 #:1 <== +1.000
##1 #:1 ==> +1.003
@ Absolute error = 3.0000000000e-3, Relative error = 3.0000000000e-3
----------------
##4 #:1 <== +2.510
##2 #:1 ==> +2.500
@ Absolute error = 1.0000000000e-2, Relative error = 4.0000000000e-3
----------------
##5 #:1 <== +10.022
##3 #:1 ==> +10.000
@ Absolute error = 2.2000000000e-2, Relative error = 2.2000000000e-3
----------------
<==
##4 ==> Final_line_which_creates_confusion
+++ File "list1" differs from file "list2"
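The general idea of re-synchronizing two files can be illustrated with Python's standard difflib module. This is only a sketch of the concept and makes no claim about how Numdiff's filter or the option -z actually work: here every numeric field is replaced by a placeholder (a made-up "blurring" step), so that lines differing only in their numeric values look identical to the aligner.

```python
from difflib import SequenceMatcher


def is_number(field):
    """Tell whether Python's float() accepts the field."""
    try:
        float(field)
    except ValueError:
        return False
    return True


def blur(line):
    """Replace each numeric field by a placeholder before aligning."""
    return " ".join("<num>" if is_number(f) else f for f in line.split())


list1 = [
    "Additional_line_which_creates_confusion",
    "Additional_line_which_creates_confusion",
    "+1.000",
    "+2.510",
    "+10.022",
]
list2 = ["+1.003", "+2.500", "+10.000", "Final_line_which_creates_confusion"]

# Align the blurred lines; the opcodes say which line ranges correspond.
matcher = SequenceMatcher(
    None, [blur(l) for l in list1], [blur(l) for l in list2], autojunk=False
)
opcodes = matcher.get_opcodes()

for tag, i1, i2, j1, j2 in opcodes:
    if tag == "equal":
        # Matching lines: these are the pairs to compare numerically.
        for a, b in zip(list1[i1:i2], list2[j1:j2]):
            print("compare", a, "with", b)
    else:
        # Lines present in only one of the two files.
        print(tag, list1[i1:i2], list2[j1:j2])
```

With this alignment the two extra lines at the top of list1 and the extra line at the end of list2 are set aside, and the three numeric lines are paired in the same way as in the report above.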
The use of the filter can sometimes be tricky; see the User Manual for more examples and additional explanations.
Numdiff has many more options and features. In the User Manual you can find a detailed description of them.
On Unix(R) and GNU systems, like GNU/Linux, configuration, building and installation of Numdiff can be performed through the standard three steps:
./configure
make
make install
This works under the assumption that the target system supplies an ANSI C compiler, a POSIX implementation of the make utility, and an sh-compatible shell. The compiler should at least accept the option -o to write its output to a specified file, the option -D to predefine macros, the option -l to search for a specified library, and the options -I and -L to add a given directory to the search path for include and library files, respectively. If you also want to install the documentation in the GNU Info format, then you additionally need a proper installation of GNU Texinfo. Finally, a proper installation of GNU Gettext is needed if you care about support for languages other than English (at the moment only the Italian localization is available).
If you leave Natural Language Support enabled and also want to install the localization files, after make you will have to run
make install-nls
By default, make install will install all the files in /usr/local/bin, /usr/local/info, etc. You can specify an installation prefix different from /usr/local by using the option --prefix in the configure step, for instance --prefix=$HOME:
./configure --prefix=$HOME
Type ./configure --help to obtain the complete list of all available options.
Once Numdiff has been installed, you can remove all the installed files with a simple make uninstall. If you have also installed the localization files through make install-nls, then, in order to remove these too, use make uninstall-nls in place of make uninstall.
Look at chapter 4 of the User Manual if you need more information on how to compile, build and install Numdiff.
The target installation directory specified by means of the configuration option --prefix cannot contain white space: make install does not work at all when the target installation directory is in a path which includes a white-space character (blank or tab). It is fairly easy to fix this issue in Makefile.in by putting double quotes around each use of $(DESTDIR), but unfortunately the installation script GNU shtool (of which Numdiff includes the current version) is broken in this respect as well.
Numdiff (also written numdiff) is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Numdiff is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
Bug reports have to be sent to the address <ivprimi (at) libero (dot) it>. Please put Numdiff in the subject and indicate the version of the operating system you are running (in particular, do not forget to specify whether it is a 32- or a 64-bit system) and, if you know it, the version of the compiler used to build Numdiff. Please also state whether your version of Numdiff uses the GNU MP library or not. Before writing an email, be sure you are running the latest stable version of Numdiff: I do not provide support for older versions.
The tar-gzipped archive with the source code of Numdiff can be downloaded from
http://savannah.nongnu.org/download/numdiff
The latest stable release of Numdiff is version 5.9.0. Together with the source code, the archive contains a very detailed user manual (in English). The manual, which was written by using GNU Texinfo, is available in the following formats:
Permission is granted to copy, distribute and/or modify this manual under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation. A copy of the license is always included in the section entitled "GNU Free Documentation License". You can also obtain a copy of the GNU Free Documentation License from http://www.gnu.org/copyleft/.
The manual of Numdiff can also be browsed online here.
First I want to thank all the people involved so far in the Free Software community, starting with those directly involved in the GNU project (http://www.gnu.org). Without their great work, this little one would never have been done.
I also have to thank Aurelio Marinho Jargas (verde@aurelio.net), author of txt2tags (http://txt2tags.sf.net), a free (GPL'ed) and wonderful text formatting and conversion tool, which I used in writing this web page.
Many thanks also to Mr. Norman Clerman of Opcon Associates, Inc. for several suggestions he gave me to improve the readability and the effectiveness of the output produced by Numdiff. He also pointed out the need to implement a filter for resynchronizing the lines between two files in case of addition or deletion of one or more lines. I have to give him credit for the urge to prepare the versions 4.x and 5.x of Numdiff.
Finally, I want to thank my friends Mariapia Palombaro, who removed some errors while reviewing the first version of this document, and Paolo Caramanica, who suggested that I add more information to the output of the option -S of Numdiff.