-a and -r has been extended,
ndselect has been improved, man pages are provided.
See the User Manual for additional information.
Users are strongly encouraged to upgrade to this new release.
-v.
See the User Manual for additional information.
Numdiff (which I will also write numdiff)
is a little program that can be used to compare
putatively similar files line by line and field by field,
ignoring small numeric differences and/or different numeric formats.
Equivalently, Numdiff is a program able to compare appropriately
files that contain numerical fields (though not only numerical ones).
By default, Numdiff assumes the fields are separated by white-space
characters (spaces, horizontal tabulations and newlines), but
the user can also specify their own list of separators through the
option -s (see the User Manual).
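The idea of splitting a line into fields can be sketched in a few lines of Python. This is only an illustration and not Numdiff's code: the function name split_fields is hypothetical, the separators are assumed to be single characters, and the sketch collapses runs of consecutive separators, which may or may not match what the option -s does in every case.

```python
import re

def split_fields(line, separators=" \t\n"):
    """Split a line into fields on any of the given separator characters.

    Assumption of this sketch: consecutive separators count as one,
    so no empty fields are produced.
    """
    pattern = "[" + re.escape(separators) + "]+"
    return [field for field in re.split(pattern, line) if field]

# Default white-space separators
print(split_fields("1.25 \t -3.45\n"))
# A user-supplied separator list (here a single semicolon)
print(split_fields("1.25;-3.45;100.00", ";"))
```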
When you compare two such files,
what you usually want to obtain is a list of the numerical fields in
the second file which numerically differ from the corresponding
fields in the first file.
Well-known tools like diff, cmp or wdiff
cannot be used for this purpose:
they cannot recognize whether a difference between two numerical
fields is only due to the notation or is actually a difference of
numerical values.
Moreover, you may also want to ignore differences in
numerical values as long as they do not exceed a certain threshold.
In other words, you may wish to neglect all small numerical
differences too.
However, programs like diff and wdiff cannot be used
to ignore small numerical differences, since they do not even know
what a numerical difference is.
That is why I decided to implement Numdiff.
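The distinction between a difference of notation and a difference of value can be illustrated with a tiny Python sketch (an illustration only, unrelated to Numdiff's source code; the two sample strings are made up for the example). A textual tool sees two different strings, while a numerical comparison sees the same number.

```python
from decimal import Decimal

first = "100.00"
second = "1.0E+2"

# What a character-based tool such as diff effectively checks
same_text = (first == second)
# What a numerically aware comparison checks
same_value = (Decimal(first) == Decimal(second))

print(same_text)   # False: the strings differ
print(same_value)  # True: both denote one hundred
```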
In writing this program I was inspired by ndiff,
a GPL'ed software by Nelson H. F. Beebe of the University of Utah
in Salt Lake City, see
http://www.math.utah.edu/~beebe/software/ndiff
ndiff is a good tool and I used it for a while,
but I did not completely like the way it works and so numdiff
was born. Although ndiff inspired numdiff, they are completely
different from the viewpoint of the source code: numdiff has been
written entirely from scratch, with the addition of source code
from GNU bc, GNU diff and Gnulib.
Numdiff also has many features that ndiff lacks:
for instance, it recognizes complex numbers and allows you
to specify different sets of field delimiters for the
two files to compare. Starting from version 5,
Numdiff includes a filter which prevents it from getting confused
when one of the files to compare contains
one or more lines for which there exist no
corresponding lines in the other file. This
feature is also absent in ndiff.
I know that many people may find Numdiff simply useless. But people working in Scientific Computing or in Numerical Analysis may find it useful in their work. One might, for instance, need to compare a file containing the output produced by a given numerical program when it runs in a certain environment with another file containing the output produced by the same program in a different environment. By a different environment I mean, e.g., a different operating system or a different compiler on the same system. At other times one has to compare the output of a numerical program, written to solve a certain problem, with the output of another program which solves the same problem using a different algorithm. Finally, one might compare the output of a numerical program with a sample file containing a list of expected data (which could have been computed theoretically or obtained from experiments in a laboratory). In all these situations Numdiff can turn out to be very helpful, since it also lets the user specify a tolerance for absolute and/or relative differences and then reports only the fields which differ enough to exceed these tolerances.
To end this presentation, I have to say that
Numdiff is a console application, i.e. a computer program designed to be used
via a text-only computer interface, such as a text terminal or
the command line interface of some operating systems.
This means no mouse, no windows, no buttons, no silly icons.
All modern operating systems provide, together with the Graphical User Interface (GUI),
a program to emulate a text terminal. This program
has different names depending on the operating system that you are using:
console, terminal emulator, xterm, rxvt, and so on.
To use Numdiff you have to open the console/terminal emulator,
type some strange commands there, and then press the Enter key
to execute them :)
If you do not know how to start with a terminal emulator,
search the web for a user guide and, after reading it carefully,
come back here.
Because a sample is often more useful than many words...
Let us suppose that file1 contains the list
of numbers:
1.25 -3.45 1.23456789E-2 -5.98765432e+5 100.00
while file2 contains the following one:
1.250001 -3.450003 1.23456788E-2 -5.98765431e+5 100.000022
We can compare these two files by calling numdiff
(the name of the program must be written in lower case!)
and passing it file1 and file2 as arguments:
numdiff file1 file2
The output of this command will be:
----------------
##1 #:1 <== 1.25
##1 #:1 ==> 1.250001
@ Absolute error = 1.0000000000e-6, Relative error = 8.0000000000e-7
##1 #:2 <== -3.45
##1 #:2 ==> -3.450003
@ Absolute error = 3.0000000000e-6, Relative error = 8.6956521739e-7
##1 #:3 <== 1.23456789E-2
##1 #:3 ==> 1.23456788E-2
@ Absolute error = 1.0000000000e-10, Relative error = 8.1000001393e-9
##1 #:4 <== -5.98765432e+5
##1 #:4 ==> -5.98765431e+5
@ Absolute error = 1.0000000000e-3, Relative error = 1.6701030958e-9
##1 #:5 <== 100.00
##1 #:5 ==> 100.000022
@ Absolute error = 2.2000000000e-5, Relative error = 2.2000000000e-7
+++ File "file1" differs from file "file2"
This text should be self-explanatory. The tags ##l and #:f,
where l and f are integer numbers, refer respectively to the
line number and to the position of the field within the line.
Then
##1 #:1 <== 1.25
##1 #:1 ==> 1.250001
@ Absolute error = 1.0000000000e-6, Relative error = 8.0000000000e-7
means that the first field of the first line is given by
1.25 in the first file, 1.250001 in the second one.
The absolute difference between these two numbers is
1.0000000000e-6, while the relative difference is
given by 8.0000000000e-7.
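The two figures can be reproduced by hand. The following Python sketch is only an illustration, not Numdiff's implementation: the function name is hypothetical, and the choice of dividing by the smaller of the two magnitudes is an assumption inferred from the sample figures shown here, not a documented rule (see the User Manual for the actual definition and the related options).

```python
from decimal import Decimal

def abs_and_rel_error(first, second):
    """Return (absolute error, relative error) for two numeric strings.

    Assumption: the relative error is the absolute error divided by the
    smaller of the two magnitudes. When that magnitude is zero this
    sketch simply returns None for the relative error.
    """
    a, b = Decimal(first), Decimal(second)
    abs_err = abs(a - b)
    smaller = min(abs(a), abs(b))
    rel_err = abs_err / smaller if smaller != 0 else None
    return abs_err, rel_err

abs_err, rel_err = abs_and_rel_error("1.25", "1.250001")
print(abs_err, rel_err)
```

With the first pair of fields this gives an absolute error of 0.000001 and a relative error of 8E-7, in agreement with the report.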
Numdiff can also print a sort of statistical report about
the numerical differences discovered in the two files. To this
end it is sufficient to specify the option -S.
If you are interested only in the statistical report
and you want to remove from the output the detailed list
of all differences, then you have to specify additionally
the option -q.
The output of the command numdiff -S -q file1 file2 is:
5 numeric comparisons have been done,
all of them have produced an outcome beyond the tolerance threshold
Largest absolute error in the set of relevant numerical differences: 1.0000000000e-3
Corresponding relative error: 1.6701030958e-9
Largest relative error in the set of relevant numerical differences: 8.6956521739e-7
Corresponding absolute error: 3.0000000000e-6
Sum of all absolute errors: 1.0260001000e-3
Sum of the relevant absolute errors: 1.0260001000e-3
Arithmetic mean of all absolute errors: 2.0520002000e-4
Arithmetic mean of the relevant absolute errors: 2.0520002000e-4
Square root of the sum of the squares of all absolute errors: 1.0002469695e-3
Quadratic mean of all absolute errors: 4.4732404362e-4
Square root of the sum of the squares of the relevant absolute errors: 1.0002469695e-3
Quadratic mean of the relevant absolute errors: 4.4732404362e-4
+++ File "file1" differs from file "file2"
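For readers who want to check these aggregate figures, here is a small Python sketch that recomputes them from the five absolute errors listed in the detailed report. It is a worked example only, not Numdiff's code; the variable names are made up, and the working precision of 30 digits is an arbitrary choice.

```python
from decimal import Decimal, getcontext

getcontext().prec = 30  # arbitrary working precision for this sketch

# The five absolute errors reported for file1 and file2
abs_errors = [Decimal(s) for s in ("1e-6", "3e-6", "1e-10", "1e-3", "2.2e-5")]
n = len(abs_errors)

total = sum(abs_errors)                       # sum of all absolute errors
mean = total / n                              # arithmetic mean
sum_of_squares = sum(e * e for e in abs_errors)
two_norm = sum_of_squares.sqrt()              # square root of the sum of squares
quadratic_mean = (sum_of_squares / n).sqrt()  # quadratic mean

for value in (total, mean, two_norm, quadratic_mean):
    print(f"{value:.10e}")
```

Since no tolerance is given in this example, every difference is "relevant", which is why the report shows the same values for "all" and for "the relevant" errors.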
You can specify an absolute error tolerance (or a relative
error tolerance) by the option -a (-r).
If the user specifies an absolute error tolerance,
numdiff only reports the absolute differences
exceeding that tolerance. For instance, the output
of numdiff -a 1.0e-5 file1 file2 will be
----------------
##1 #:4 <== -5.98765432e+5
##1 #:4 ==> -5.98765431e+5
@ Absolute error = 1.0000000000e-3, Relative error = 1.6701030958e-9
##1 #:5 <== 100.00
##1 #:5 ==> 100.000022
@ Absolute error = 2.2000000000e-5, Relative error = 2.2000000000e-7
+++ File "file1" differs from file "file2"
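The effect of the absolute tolerance can be mimicked with a short Python sketch. It is an illustration under simple assumptions (every field is numeric, and a difference is reported only when it is strictly greater than the tolerance); the function name is hypothetical and nothing here is taken from Numdiff's source code.

```python
from decimal import Decimal

file1 = "1.25 -3.45 1.23456789E-2 -5.98765432e+5 100.00".split()
file2 = "1.250001 -3.450003 1.23456788E-2 -5.98765431e+5 100.000022".split()

def fields_beyond_abs_tolerance(fields1, fields2, tolerance):
    """Return the 1-based positions whose absolute difference exceeds tolerance."""
    tol = Decimal(tolerance)
    reported = []
    for position, (x, y) in enumerate(zip(fields1, fields2), start=1):
        if abs(Decimal(x) - Decimal(y)) > tol:
            reported.append(position)
    return reported

reported = fields_beyond_abs_tolerance(file1, file2, "1.0e-5")
print(reported)  # [4, 5]
```

Only the fourth and fifth fields survive the filter, as in the output of numdiff -a 1.0e-5 file1 file2.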
Numdiff can also recognize non-numerical differences
between the files given to it as arguments. If a certain
field in at least one of the two files is of non-numerical type,
then, instead of performing a numeric comparison, Numdiff will
simply do a literal (character by character) comparison.
If the file example1 contains the line
1.0 xyz 3.0 x y
and the file example2 the line
abc 1.1 3.3 x z
then numdiff example1 example2 displays
----------------
##1 #:1 <== 1.0
##1 #:1 ==> abc
@ @@
##1 #:2 <== xyz
##1 #:2 ==> 1.1
@ @@
##1 #:3 <== 3.0
##1 #:3 ==> 3.3
@ Absolute error = 3.0000000000e-1, Relative error = 1.0000000000e-1
##1 #:5 <== y
##1 #:5 ==> z
@ @@
+++ File "example1" differs from file "example2"
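The rule "numeric comparison if both fields are numbers, literal comparison otherwise" can be sketched as follows in Python. This is a simplified illustration, not Numdiff's code: the helper names are hypothetical, and what counts as a number here is whatever Python's Decimal accepts as a finite value, which need not coincide with the formats Numdiff recognizes.

```python
from decimal import Decimal, InvalidOperation

def parse_number(field):
    """Return the field as a Decimal, or None if it is not a finite number."""
    try:
        value = Decimal(field)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None

def differing_fields(fields1, fields2):
    """List (position, kind, absolute error) for every pair of fields that differ."""
    report = []
    for position, (x, y) in enumerate(zip(fields1, fields2), start=1):
        a, b = parse_number(x), parse_number(y)
        if a is not None and b is not None:
            if a != b:
                report.append((position, "numeric", abs(a - b)))
        elif x != y:  # at least one field is non-numeric: compare literally
            report.append((position, "literal", None))
    return report

report = differing_fields("1.0 xyz 3.0 x y".split(), "abc 1.1 3.3 x z".split())
for entry in report:
    print(entry)
```

Fields 1, 2 and 5 are reported as literal differences and field 3 as a numeric one, while field 4 (x in both files) is not reported at all.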
The most appealing feature of Numdiff is the
ability to detect insertions/deletions
of lines, similarly to what diff does,
through activation of a filter.
Let us suppose that the files list1 and
list2 contain the data
Additional_line_which_creates_confusion
Additional_line_which_creates_confusion
+1.000
+2.510
+10.022
and
+1.003
+2.500
+10.000
Final_line_which_creates_confusion
respectively. What you would expect to find in
the report displayed by Numdiff is that list1 contains two
lines at the beginning which are not present in list2,
that the last line of list2 is not present
in list1, and finally that the three numerical values
in list2 differ from the corresponding values
in list1, with an indication of the absolute and relative errors.
But the output of the command numdiff list1 list2, namely
----------------
##1 #:1 <== Additional_line_which_creates_confusion
##1 #:1 ==> +1.003
@ @@
----------------
##2 #:1 <== Additional_line_which_creates_confusion
##2 #:1 ==> +2.500
@ @@
----------------
##3 #:1 <== +1.000
##3 #:1 ==> +10.000
@ Absolute error = 9.0000000000e+0, Relative error = 9.0000000000e+0
----------------
##4 #:1 <== +2.510
##4 #:1 ==> Final_line_which_creates_confusion
@ @@
----------------
##5 <== +10.022
==>
*** End of file "list2" reached
Likely the files "list1" and "list2" do not have the same number of lines !
+++ File "list1" differs from file "list2"
differs from your expectations. Indeed, by default Numdiff compares
the first, second, third line
of the first file (in this case list1) with
the first, second, third line of the second file (list2),
and so on. If one of the two files to compare
contains one or more lines for which there
exist no corresponding lines in the other file,
then Numdiff gets confused and displays a wrong output.
The filtering mechanism implemented in Numdiff since version 5
can detect such situations and re-synchronize the
two files to obtain the final expected result.
For instance, the command numdiff -z @ list1 list2,
which activates the filter through the option -z @,
outputs
----------------
##1 <== Additional_line_which_creates_confusion
==>
----------------
##2 <== Additional_line_which_creates_confusion
==>
----------------
##3 #:1 <== +1.000
##1 #:1 ==> +1.003
@ Absolute error = 3.0000000000e-3, Relative error = 3.0000000000e-3
----------------
##4 #:1 <== +2.510
##2 #:1 ==> +2.500
@ Absolute error = 1.0000000000e-2, Relative error = 4.0000000000e-3
----------------
##5 #:1 <== +10.022
##3 #:1 ==> +10.000
@ Absolute error = 2.2000000000e-2, Relative error = 2.2000000000e-3
----------------
<==
##4 ==> Final_line_which_creates_confusion
+++ File "list1" differs from file "list2"
The use of the filter can sometimes be tricky; see the User Manual for more examples and additional explanations.
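To give a feeling for what re-synchronizing two files means, here is a rough Python sketch built on the standard difflib module. It is emphatically not the algorithm used by Numdiff and it says nothing about the exact meaning of the option -z: it is a swapped-in technique that pairs up lines with the same "shape" (every numeric field replaced by a placeholder) and marks the remaining lines as unmatched. The names shape and align are hypothetical.

```python
import difflib

def shape(line):
    """Replace numeric fields by a placeholder, so that lines with the same
    layout compare equal even when their numbers differ."""
    fields = []
    for field in line.split():
        try:
            float(field)
            fields.append("<num>")
        except ValueError:
            fields.append(field)
    return tuple(fields)

def align(lines1, lines2):
    """Return pairs of 1-based line numbers; None marks a missing counterpart."""
    matcher = difflib.SequenceMatcher(
        None, [shape(l) for l in lines1], [shape(l) for l in lines2],
        autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pairs.extend(zip(range(i1 + 1, i2 + 1), range(j1 + 1, j2 + 1)))
        else:  # lines present in only one of the two files
            pairs.extend((i, None) for i in range(i1 + 1, i2 + 1))
            pairs.extend((None, j) for j in range(j1 + 1, j2 + 1))
    return pairs

list1 = ["Additional_line_which_creates_confusion",
         "Additional_line_which_creates_confusion",
         "+1.000", "+2.510", "+10.022"]
list2 = ["+1.003", "+2.500", "+10.000",
         "Final_line_which_creates_confusion"]

pairs = align(list1, list2)
print(pairs)
```

For the two sample files the sketch pairs lines 3, 4 and 5 of list1 with lines 1, 2 and 3 of list2 and leaves the three "confusing" lines unmatched, which is the same pairing shown in the filtered report. Once the lines are paired, the field-by-field numeric comparison can proceed as usual.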
Numdiff has many more options and features. In the User Manual you can find a detailed description of them.
On Unix(R) and GNU systems, like GNU/Linux, configuration, building and installation of Numdiff can be performed through the standard three steps:
./configure
make
make install
provided that the system supplies
an ANSI C compiler, a POSIX implementation of the make utility
and an sh-compatible shell.
The compiler should at least accept the option -o
to write its output to a specified file,
the option -D to predefine macros,
the option -l to search for a specified library,
and the options -I and -L
to add a given directory to the search path for include and
library files respectively.
If you also want to install the documentation in the GNU Info
format, then you additionally need a proper installation of GNU Texinfo.
Finally, a proper installation of GNU Gettext is needed
if you care about support for languages other than English
(at the moment only the Italian localization is available).
If you leave Natural Language Support enabled and you
also want to install the localization files, then, after
make, you will have to type and run
make install-nls
By default, make install will install all the files in
/usr/local/bin, /usr/local/info, etc. You can specify
an installation prefix other than /usr/local using the
option --prefix in the configure step,
for instance --prefix=$HOME:
./configure --prefix=$HOME
Type ./configure --help to obtain
the complete list of all the available options.
Once Numdiff has been installed, you can remove all the
files previously installed with a simple make uninstall.
If you have also installed the localization files through
make install-nls, then, in order to remove
these too, use make uninstall-nls in place
of make uninstall.
Look at chapter 4 of the User Manual if you need more information on how to compile, build and install Numdiff.
Numdiff (also written numdiff) is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
Numdiff is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
Bug reports should be sent to the address ivprimi@libero.it . Please put Numdiff in the subject and indicate the version of the operating system you are running and, if you know it, the version of the compiler used to build Numdiff. Before writing an email, be sure that you are running the latest stable version of Numdiff; I do not provide support for older versions.
The tar-gzipped archive with the source code of Numdiff can be downloaded from
http://savannah.nongnu.org/download/numdiff
The latest stable release of Numdiff is version 5.6.1. Together with the source code, the archive contains a very detailed user manual (in English). The manual, which has been written by using GNU Texinfo, is available in the following formats:
Permission is granted to copy, distribute and/or modify this manual under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with the Front-Cover Texts being "Numdiff User Manual, version 5.0", and with no Back-Cover Texts. A copy of the license is always included in the section entitled "GNU Free Documentation License". You can also obtain a copy of the GNU Free Documentation License from http://www.gnu.org/copyleft/.
The manual of Numdiff can also be browsed online here.
First I want to thank all the people involved so far in the Free Software community, starting with those directly involved in the GNU project (http://www.gnu.org). Without their great work, this little one would never have been done.
Moreover, I have to thank Aurelio Marinho Jargas (verde@aurelio.net), author of txt2tags (http://txt2tags.sf.net), a free (GPL'ed) and wonderful text formatting and conversion tool, which I used in writing this web page.
I want to thank Mr. Norman Clerman of Opcon Associates, Inc. for several suggestions he gave me to improve the readability and the effectiveness of the output produced by Numdiff. He also pointed out the need to implement a filter to resynchronize the lines between two files in case of addition or deletion of one or more lines. I have to give him credit for the urge to prepare the versions 4.x and 5.x of Numdiff.
Moreover, I want to thank my friends Mariapia Palombaro,
who removed some errors while
reviewing the first version of this document, and
Paolo Caramanica, who suggested that I add more
information to the output of the option -S.