world wide web
A non-hierarchical, non-relational, unstructured, node-linkable
text-based distributed database. The unstructured nature of
the internet has proved to be its strength and shows how,
from an epistemological point of view, planned systems can
actually end up being less useful than unplanned systems.
For example, hierarchical and relational database structures
both have considerable theoretical advantages over an
unstructured text-based database, but in practice there is
no database more useful than the world wide web, with
the possible exception of the internet itself.
wiki
A web-site which is editable through an html form or
web-page interface, using 'simple' codes.
vaporware
Software which is never written although it is spoken about
as if it exists. Hypertext and a free Unix were first
created as vaporware before being implemented.
syntax sugar
Programming constructs which exist in order to make programs
more readable. For example, a function which makes a program
more readable.
text transformation
The process of producing different file formats from a normal
text file.
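As a rough illustration of text transformation, the sketch below turns a plain text source into html paragraphs. It is written in Python only for the sake of the example; the function name text_to_html and the sample text are assumptions and do not come from any particular tool.

```python
import html


def text_to_html(text: str) -> str:
    """Turn blank-line-separated paragraphs of plain text into html <p> elements."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    out = []
    for paragraph in paragraphs:
        # Join the wrapped lines of one paragraph into a single line.
        joined = " ".join(paragraph.split())
        # Escape &, < and > so the text cannot be mistaken for markup.
        out.append(f"<p>{html.escape(joined)}</p>")
    return "\n".join(out)


source = "Plain text is easy\nto transform.\n\nUse < and & with care."
print(text_to_html(source))
```

The same source file could be fed to a different function to produce another output format, which is the point of keeping the source as plain text.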
document format
This refers to the way that a file stores a piece of text
data internally. Some formats are 'proprietary', that is,
the format is copyrighted. Other formats are closed, which
means that there is no publicly available specification
for the document format. Some document formats are plain text
formats and others are binary formats.
plain text
Text which belongs to a particular text character set such as
ASCII, Latin-1 or Unicode. WYSIWYG editing tools normally do
not store the file as plain text. For example, a Microsoft
Word document is not plain text. That which is not plain text
is more difficult to transform to another format.
troff format
An old text layout format originally developed for unix by
Joe Ossanna and later maintained by Brian Kernighan, who also
co-wrote AWK and one of the first books about the C language.
The tools for dealing with this format have been rewritten
for the GNU project and are called groff.
texinfo, info
Another document format which is supposed to be the standard
for the gnu project.
TeX, LaTeX
A document format and typesetting system. TeX was created by
Donald Knuth; LaTeX is a macro package built on top of TeX,
originally written by Leslie Lamport. The system is oriented
toward the setting out
of mathematical and scientific documents but can be used
for other types of documents.
tool chain
A set of software programs which are used together to
produce an output format. This term is especially used
in relation to xml document formats and particularly the
docbook format. If you see this term used it means there
is probably no 'one-click' method of producing a nicely
formatted document from a source format. You will probably
have to type commands at the shell console.
docbook
An xml format for marking up a wide variety of types of
groff
The gnu version of the troff document typesetting system.
A type of operating system.
the unix philosophy, the unix mindset
This is a philosophy based on the unix toolset which says
that complex commands are best built up out of smaller
commands which are glued together using pipes. The philosophy
refers essentially to the idea of keeping programs and
software small and focussed only on one job. This is in
opposition to the encroachment of 'bloatware', which is
software which attempts to do absolutely everything and
takes 2 minutes just to start up.
The unix philosophy can only work if there is a common
way to tie the various small programs together. This
is accomplished in the unix shell using pipes (|).
Plain text in general is the common data format for these
small shell tools.
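The idea of small tools joined by a common data format can be sketched outside the shell as well. The Python below is only an analogy, not how the unix tools are implemented: each function does one job on a stream of text lines, and nesting the calls plays the role of the pipe. The function names and the sample log lines are invented for the example.

```python
from collections import Counter


def grep(pattern, lines):
    """Pass through only the lines that contain `pattern` (plain substring match)."""
    return (line for line in lines if pattern in line)


def first_field(lines):
    """Keep only the first whitespace-separated field of each non-empty line."""
    return (line.split()[0] for line in lines if line.strip())


def count_sorted(items):
    """Count identical items and list them, most frequent first, ties by name."""
    counts = Counter(items)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Made-up sample data standing in for a plain text file.
log_lines = [
    "alice login ok",
    "bob login failed",
    "alice logout ok",
    "carol login ok",
]

# Each stage reads the output of the previous one, like programs joined by |.
result = count_sorted(first_field(grep("ok", log_lines)))
print(result)
```

None of the three functions knows about the others. They can be recombined freely because each one only reads and writes lines of plain text.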
monolithic software
Software which attempts to do everything possible and
include every possible feature. Unfortunately the Java
language seems to encourage monolithic programming, as
evidenced by the NetBeans and Eclipse IDE programs.
emacs
A monolithic text editor which James Gosling was involved
in creating or maintaining.
A very frustrating but non-monolithic text editor.
sed
The unix stream editor. A potentially useful program to
linguists and those needing to transform text data.
Unfortunately it requires a knowledge of regular expressions
or text patterns in order to use it effectively.
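A stream editor reads text line by line and applies the same edit to each line. The Python sketch below imitates that idea with one substitution, comparable in spirit to a command of the form s/pattern/replacement/g. It uses Python's regular expression syntax, which is not identical to that of every stream editor, and the function name substitute and the sample lines are assumptions made for the example.

```python
import re


def substitute(pattern, replacement, lines):
    """Apply one regular-expression substitution to every line of a stream."""
    regex = re.compile(pattern)
    for line in lines:
        # Replace every match on the line, not just the first one.
        yield regex.sub(replacement, line)


lines = ["colour of the sky", "favourite colour"]
edited = list(substitute(r"colou?r", "color", lines))
print(edited)
```

The pattern colou?r matches both spellings because the ? makes the preceding u optional, which is the kind of text pattern the entry above refers to.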
regular expression
A way of describing a pattern within a piece of text.
Regular expressions have the advantage of being succinct and
capable but are difficult to read, difficult to maintain
and require the user to spend significant time learning to
use them well. Regular expressions are present in many unix
and unix style systems, languages and pieces of software, such
as Perl, PHP, Java, sed, grep, awk, vim, emacs etc. However
the exact regular expression syntax can vary slightly between
different implementations which makes it easy to make small
but significant errors.
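The trade-off between succinctness and readability can be seen in a small sketch. It uses Python's re module, whose syntax is one of the many variants mentioned above. The same pattern is written twice, once compactly and once with the re.VERBOSE flag, which allows whitespace and comments inside the pattern. The sample sentence is made up.

```python
import re

text = "Backups ran on 2001-02-03 and 2004-05-06."

# Compact form: short, but the reader has to decode it.
compact = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Verbose form: the same pattern with groups, spread out and commented.
DATE = re.compile(
    r"""
    (\d{4})   # year: four digits
    -
    (\d{2})   # month: two digits
    -
    (\d{2})   # day: two digits
    """,
    re.VERBOSE,
)
parts = DATE.findall(text)

print(compact)
print(parts)
```

Not every implementation offers a verbose mode, so a pattern written this way may have to be rewritten before it can be used in another tool.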
The unix toolset
A set of programs used from the console or shell which are a
part of the unix operating system. The tools are usually joined
together using 'pipes' and have a few cryptic letters for
their name. The unix toolset uses the philosophy of each program
doing only one task.
A variant of unix or a unix type of operating system, such
as linux or FreeBSD or QNX.
markup
A way of providing information about text by inserting
certain text codes within the text. The markup may provide
information of a semantic nature (some types of xml) or may
provide information about the way that the document should be
displayed (html, troff, LaTeX).
tag
A particular type of markup used within html and xml which has the
format <tagname attributes>
The process of preparing a document to be displayed on some
display device.
pipe
This is a way of linking programs together by using the
output of one program as the input of another program.
A pipe is usually denoted by the symbol |.
A deliberate attempt to break standards.
not the silver bullet that people would like to think.
Code which is dressed up as data. A dog's breakfast.
a dog's breakfast
A terrible mess, something which doesn't look good.
minix
A small operating system made by Andrew Tanenbaum, a professor
at a Dutch university, who later declared that linux was
'obsolete'. Linus Torvalds used minix as his initial
development platform and inspiration.
GNU Hurd
An example of vaporware; a unix-like operating system kernel.
source file
A file which is to be transformed to another format or used
for some other purpose.
output file
The file which results from the transformation of a text file
into another format.