A Few Terms


(*) world wide web
A non-hierarchical, non-relational, unstructured, text-based distributed database whose nodes can link to one another. The unstructured nature of the web has proved to be its strength, and it shows that, from an epistemological point of view, planned systems can end up being less useful than unplanned ones. For example, hierarchical and relational database structures both have considerable theoretical advantages over an unstructured text-based database, but in practice no database is more useful than the world wide web, with the possible exception of the internet itself.
(*) wiki
A website whose pages can be edited through an HTML form or other web-page interface, using 'simple' markup codes.
(*) vaporware
Software which is spoken about as if it exists although it has not been written, and may never be. Hypertext and a free Unix both began as vaporware before being implemented.
(*) syntax sugar
Syntax, or a helper function, which exists only to make programs more readable. It adds no new capability; anything written with it could also be written the long way.
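
As a small illustration (a sketch in Python, a language chosen only for the example; the names `words`, `lengths_loop` and `lengths_sugar` are made up), a list comprehension is often described as sugar for an explicit loop. Both forms below build the same list, but the second is shorter to read.

```python
words = ["sed", "awk", "grep"]

# The long-hand form: build the list with an explicit loop.
lengths_loop = []
for word in words:
    lengths_loop.append(len(word))

# The sugared form: a list comprehension expressing the same loop in one line.
lengths_sugar = [len(word) for word in words]

print(lengths_loop == lengths_sugar)
```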
(*) text transformation
The process of producing files in different formats from a plain text file.
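
A minimal sketch of one such transformation, assuming the plain text uses blank lines to separate paragraphs and that the wanted output is HTML. The function name `text_to_html` is hypothetical; real conversion tools handle far more than paragraphs.

```python
import html


def text_to_html(text):
    """Turn blank-line-separated plain-text paragraphs into HTML <p> elements."""
    paragraphs = [block for block in text.split("\n\n") if block.strip()]
    lines = []
    for block in paragraphs:
        # Collapse internal line breaks and escape characters special to HTML.
        body = html.escape(" ".join(block.split()))
        lines.append("<p>" + body + "</p>")
    return "\n".join(lines)


source = "Tools & pipes.\n\nSecond\nparagraph."
output = text_to_html(source)
print(output)
```

Here `source` plays the part of the source file and `output` the part of the output file.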
(*) document format
This refers to the way that a file stores a piece of text data internally. Some formats are 'proprietary', that is, the format is owned and controlled by a company. Other formats are closed, which means that there is no publicly available specification for the document format. Some document formats are plain text formats and others are binary formats.
(*) plain text
Text made up only of characters from a particular character set such as ASCII, Latin-1 or Unicode. WYSIWYG editing tools normally do not store the file as plain text. For example, a Microsoft Word document is not plain text. That which is not plain text is more difficult to transform to another format.
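
The sketch below (Python, with a sample word chosen only for illustration) shows why the character set matters: the same four characters become different bytes under Latin-1 and under the UTF-8 encoding of Unicode, and they cannot be stored as ASCII at all.

```python
# The word "cafe" with an acute accent on the final e (code point U+00E9).
text = "caf\u00e9"

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # the accented letter takes two bytes

try:
    text.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    # ASCII has no code for the accented letter.
    fits_in_ascii = False

print(latin1_bytes, utf8_bytes, fits_in_ascii)
```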
(*) troff format
An old text layout format originally developed for Unix by Joe Ossanna and later maintained by Brian Kernighan, who also co-wrote AWK and one of the first books about the C language. The tools for dealing with this format have been reimplemented for the GNU project and are called groff.
(*) texinfo, info
Another document format, which is the official documentation format of the GNU project. Texinfo is the source format; info is the format generated from it for reading on screen.
(*) TeX, LaTeX
A document format and typesetting system. TeX was created by Donald Knuth, and LaTeX is a set of macros built on top of it by Leslie Lamport. The system is oriented toward the setting out of mathematical and scientific documents but can be used for other types of documents.
(*) tool chain
A set of software programs which are used together to produce an output format. This term is especially used in relation to XML document formats, and particularly the DocBook format. If you see this term used, it means there is probably no 'one-click' method of producing a nicely formatted document from a source format. You will probably have to type commands at the shell console.
(*) DocBook
An XML format for marking up a wide variety of types of documents.
(*) groff
The GNU version of the troff document typesetting system.
(*) unix
A family of multi-user, multitasking operating systems descended from the system first developed at Bell Labs around 1970.
(*) the unix philosophy, the unix mindset
This is a philosophy based on the Unix toolset which says that complex commands are best built up out of smaller commands which are glued together using pipes. The philosophy refers essentially to the idea of keeping programs small and focussed on only one job. This is in opposition to the encroachment of 'bloatware', which is software that attempts to do absolutely everything and takes 2 minutes just to start up.

The Unix philosophy can only work if there is a common way to tie the various small programs together. This is accomplished in the Unix shell using pipes (|). Plain text is the common data format for these small shell tools.
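
The idea can be sketched by analogy in Python, where each small function does one job and lines of plain text are the only thing passed between them. This is an illustration, not how the shell works internally; the function names `read_lines`, `grep`, `sort_lines` and `head` are made up and only loosely mirror the tools they are named after.

```python
from itertools import islice


def read_lines(text):
    """Emit the input one line at a time (loosely, the role of `cat`)."""
    yield from text.splitlines()


def grep(lines, word):
    """Keep only the lines that contain `word`."""
    return (line for line in lines if word in line)


def sort_lines(lines):
    """Emit the lines in sorted order."""
    return iter(sorted(lines))


def head(lines, count):
    """Pass through at most the first `count` lines."""
    return islice(lines, count)


text = "sed edits streams\ngrep finds patterns\nawk finds and edits\nsort orders lines"

# Roughly the shape of: cat file | grep edits | sort | head -n 1
result = list(head(sort_lines(grep(read_lines(text), "edits")), 1))
print(result)
```

Each function knows nothing about the others, so any of them can be swapped out or reused in a different chain.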

(*) monolithic software
Software which attempts to do everything possible and include every possible feature. Unfortunately the Java language seems to encourage monolithic programming, as evidenced by the NetBeans and Eclipse IDE programs.
(*) Emacs
A monolithic text editor. The original was written by Richard Stallman and others in the 1970s; James Gosling wrote an early version for Unix, and Stallman later wrote GNU Emacs.
(*) Vim
A very frustrating but non-monolithic text editor.
(*) sed
The Unix stream editor. A potentially useful program for linguists and others who need to transform text data. Unfortunately it requires a knowledge of regular expressions, or text patterns, in order to use it effectively.
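
What "stream editing" means can be sketched in Python: read the input one line at a time, apply a substitution, and write each line straight out again. The function `stream_edit` and the sample text are made up for illustration; the behaviour is roughly comparable to the sed command `s/colour/color/g`, not a reimplementation of sed.

```python
import io
import re


def stream_edit(source, pattern, replacement, sink):
    """Copy `source` to `sink` line by line, replacing every match of `pattern`."""
    regex = re.compile(pattern)
    for line in source:
        sink.write(regex.sub(replacement, line))


source = io.StringIO("colour of the sea\nno match here\ncolour, colour\n")
sink = io.StringIO()
stream_edit(source, r"colour", "color", sink)
result = sink.getvalue()
print(result, end="")
```

Because only one line is held at a time, the same approach works on input far larger than memory.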
(*) regular expression
A way of describing a pattern within a piece of text. Regular expressions have the advantage of being succinct and capable, but they are difficult to read, difficult to maintain and require the user to spend significant time learning to use them well. Regular expressions are present in many Unix and Unix-style systems, languages and pieces of software, such as Perl, PHP, Java, sed, grep, awk, vim, emacs etc. However, the exact regular expression syntax varies slightly between implementations, which makes it easy to make small but significant errors.
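
A short sketch using Python's `re` module shows both the succinctness and the cryptic look. The pattern and sample sentence are invented for the example. Note that the `\d` shorthand is a Perl-style feature and is not part of the POSIX syntax, so the same pattern would need respelling for some other tools.

```python
import re

# One or more digits, a dash, then one or more digits, e.g. "12-34".
# The parentheses capture the two numbers separately.
pattern = re.compile(r"\b(\d+)-(\d+)\b")

text = "Pages 12-34 and 101-150 are missing; page 7 is fine."
ranges = pattern.findall(text)
print(ranges)
```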
(*) The unix toolset
A set of programs, used from the console or shell, which are a part of the Unix operating system. The tools are usually joined together using 'pipes' and have names made up of a few cryptic letters. The Unix toolset follows the philosophy of each program doing only one task.
(*) *nix
A variant of Unix or a Unix-like operating system, such as Linux, FreeBSD or QNX.
(*) markup
Markup is a way of providing information about text by inserting certain text codes within the text. The markup may provide information of a semantic nature (some types of XML) or may provide information about the way that the document should be displayed (HTML, troff, LaTeX).
(*) tag
A particular type of markup used within HTML and XML which has the format <tagname attributes>.
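
To make the format concrete, here is a minimal sketch that uses Python's standard `html.parser` module to pull the tag names and attributes out of a fragment of HTML. The class name `TagLister` and the sample fragment are made up for illustration.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record the name and attributes of every opening tag seen."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # `attrs` arrives as a list of (name, value) pairs.
        self.tags.append((tag, dict(attrs)))


parser = TagLister()
parser.feed('<p class="intro">See the <a href="index.html">index</a>.</p>')
print(parser.tags)
```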
(*) render
The process of preparing a document to be displayed on some display device.
(*) pipe
This is a way of linking programs together by using the output of one program as the input of another program. A pipe is usually denoted by the symbol |.
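
The sketch below builds a real pipe between two processes using Python's `subprocess` module. It is an illustration under assumptions: the two 'programs' are tiny Python one-off scripts standing in for tools such as grep and wc, and the names `KEEP_UNIX_LINES`, `COUNT_LINES` and `run_pipeline` are made up.

```python
import subprocess
import sys

# Two tiny stand-in programs, each run as a separate Python process.
KEEP_UNIX_LINES = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if 'unix' in line:\n"
    "        sys.stdout.write(line)\n"
)
COUNT_LINES = "import sys\nprint(sum(1 for _ in sys.stdin))\n"


def run_pipeline(text):
    """Feed `text` to the first program and pipe its output into the second."""
    first = subprocess.Popen(
        [sys.executable, "-c", KEEP_UNIX_LINES],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    second = subprocess.Popen(
        [sys.executable, "-c", COUNT_LINES],
        stdin=first.stdout,  # the pipe: output of one is input of the other
        stdout=subprocess.PIPE,
        text=True,
    )
    first.stdout.close()  # the second process now holds the read end
    first.stdin.write(text)
    first.stdin.close()
    output, _ = second.communicate()
    first.wait()
    return output.strip()


print(run_pipeline("unix tools\nsomething else\nfree unix\n"))
```

Neither program knows the other exists; each just reads its standard input and writes its standard output.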
(*) enhancement
A deliberate attempt to break standards.
(*) xml
Not the silver bullet that people would like to think it is.
(*) xsl
Code which is dressed up as data. A dog's breakfast.
(*) A dog's breakfast
A terrible mess; something which doesn't look good.
(*) minix
A small operating system made by Andrew Tanenbaum, a professor in Amsterdam, who later declared that Linux was 'obsolete'. Linus Torvalds used Minix as his initial development platform and inspiration.
(*) GNU Hurd
A Unix-like operating system kernel from the GNU project, and an example of vaporware.
(*) source file
A file which is to be transformed to another format or used for some other purpose.
(*) output file
The file which results from the transformation of a source file into another format.