Jumble
-
(*)
What is the Jumble project?
-
Jumble is an attempt to write a text transformation engine,
similar to those used by wiki applications to turn 'plain text'
documents into rendered visual documents. At this stage,
a rendered visual document means an HTML web page, but hopefully
other formats will be supported later, such as PDF, TeX, man pages
and DocBook XML, which is an extremely ambitious list.
-
(*)
How does this system differ from the numerous wiki
transformation engines and various text2html scripts that
exist in all sorts of colours, shapes and sizes?
-
From a technical point of view, this project differs in that
an attempt is made to parse the document rather than just
use patterns and regular expressions to transform the
document.
From a user's perspective, this system does not use a
standard wiki markup language but tries to use the
minimum possible semantic markup which will allow the
parser code to understand what sort of document and
text is being dealt with. This approach means that
the source text documents remain, in some sense, clean,
in that they do not become weighed down with codes and
hashes and funny punctuation marks.
I believe this is an advantage in that the authors of the
documents are not distracted by all these wiki codes.
-
(*)
What is the current state of the project?
-
The code contains some useful components for recognizing
and rendering various structures in plain text documents,
including FAQ blocks, lists, links, paragraph breaks,
etc. At the document level, the faq component is at
a useful stage.
-
(*)
How can I use the code?
-
You can use the code here to transform a text file in a
'frequently asked questions' format into an HTML web page.
To do this, download the file Web.jar (about 60k) and then,
in a DOS box, type:
java -cp Web.jar FaqDocument textfile > webpage.html
or, if you only have the Microsoft Virtual Machine, you can
try, for example:
jview /cp Web.jar FaqDocument textfile > webpage.html
This will create a web page in an faq format and leave the
text file unchanged. To see what sort of format
the text file should be in, look at the source for
the current page at web-faq.txt. If you do not have
any Java Virtual Machine (Runtime Engine) then you cannot
use this code until you get one.
-
(*)
Is there a wiki here?
-
No, but there are a few tools here which could be used to create a wiki.
There is a component to create faq lists in HTML and there is
a component to create directory listings. This means that there
is some code to do wiki style transformations of text but there
is no interface for entering that text.
-
(*)
What is a wiki?
-
The wiki was invented by Ward Cunningham ( http://c2.com/ )
and is a way to edit web pages without using HTML.
The philosophy of a wiki is that any visitor to a web page should
be allowed to edit it, using a normal HTML form, although in
practice sites normally place restrictions on the way that
visitors can edit the site.
-
(*)
What is the edit directory?
-
The /edit/ directory contains the beginnings of a text editor
written in Java. The editor is oriented towards saving
on an SSH server via SFTP. This is mainly because this is
the only way to save to the SourceForge server.
-
(*)
How do I write an faq document?
-
Look at the file web-faq.txt, which is the source for
this web page (assuming that you are reading the page
http://bumble.sf.net/lang/web/web-faq.html ).
This will show you the format for writing an faq document such
as this one. The format is reasonably simple.
-
(*)
What are these strange codes with square brackets in the
text files?
-
These are a way to provide some structure in an unstructured
text document. The codes indicate to the transformation engine
the sort of text which it is dealing with. For example, this faq
document is enclosed in [ faq] [ /faq] tags to indicate to the
transformation engine that it is dealing with an FAQ style
document. HTML also uses tags, but many more. The idea of this
transformation engine is to use as few tags as possible and to
make them of a semantic nature rather than of a visual or layout
nature.
For example, the FAQ tags say something about the 'meaning'
of the text within the tags rather than saying anything about
how the document should be laid out or formatted when it is
displayed visually. The idea of this is to remove from the writer
the burden of having to decide how the document should look while
he or she is attempting to write. The writer can decide how the
document should look afterwards. Also, if you look at the
text file which these HTML pages were generated from,
you will see that the 'source' is quite clean.
That is to say, there are very few tags in the text files, which
makes them easier to read and, I believe, easier to maintain.
This is based on the principle that it is better and more
creative to think about one sort of thing at a time.
-
(*)
Are there any similar systems to this available?
-
There are many wiki systems available, many of them far more
powerful than the current system. There are also some
systems which emphasize having minimal tags in the source files,
for example the Markdown system. Markdown
seems to have the same philosophy as the current system. Also
there appears to be a PHP Markdown. These systems are
no doubt much more advanced than the current one, and you would be
well advised to use them if you want to create a web site.
The current system is only in its initial stages and may not
be continued.
It appears that Markdown still allows the writer to put
formatting code into the text document. Therefore I feel that
my system has a slightly different outlook from Markdown.
This system attempts to encourage the writer to think about
semantic content rather than visual content, but does not
force the writer to categorize his or her writing.
-
(*)
Why not use XML for the document format?
-
XML is a strict format which requires, or at least
encourages, the author to make decisions about the categories
of information which his or her writing will be dealing with.
Often writers do not wish to make these sorts of decisions,
or are not able to make them, because their ideas
about the nature of what they are writing about are vague
and will develop during the course of their writing. For
this reason I prefer a non-strict format which attempts to
make guesses about the semantic content of the text.
-
(*)
Is it possible to change the format for the faq document?
-
Most wiki systems and text transformations use regular
expressions, but this system actually parses the text
document to find the structures. This is slower and more
complex than a regular expression system, but it allows
more precise control of the way the document is transformed
and allows a type of 'query' to be made of the document
about its content. For example, the FAQ class can determine
how many questions there are in the document and how
many answers, whereas a regular expression system would have
difficulty in finding that information.
This means that the syntax of the text documents is determined
by the parsing which the objects do of the document, and so
changing the document syntax requires changing those parsing
routines. In some cases this is simple, but in other cases not so
simple.
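As a rough illustration of the 'query' idea, the sketch below walks through a document line by line, keeps track of whether it is inside a question or an answer, and counts both. This is not the project's FaqDocument code. The class name FaqCount is made up for this example, and so is the assumed format: a line containing only (*) opens a question and the next line containing only - opens its answer. The real source format may well differ.

```java
import java.util.List;

// Hypothetical sketch, not the project's FaqDocument class.
// Assumed format: a line "(*)" opens a question, the next line "-"
// opens its answer, and a further "-" line closes the answer.
class FaqCount {
    enum State { OUTSIDE, QUESTION, ANSWER }

    /** Returns {questions, answers}; empty questions or answers are not counted. */
    static int[] count(List<String> lines) {
        State state = State.OUTSIDE;
        boolean hasText = false;   // does the current block contain any text yet?
        int questions = 0, answers = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.equals("(*)")) {
                if (state == State.ANSWER && hasText) answers++;
                state = State.QUESTION;
                hasText = false;
            } else if (line.equals("-") && state == State.QUESTION) {
                if (hasText) questions++;
                state = State.ANSWER;
                hasText = false;
            } else if (line.equals("-") && state == State.ANSWER) {
                if (hasText) answers++;
                state = State.OUTSIDE;
                hasText = false;
            } else if (!line.isEmpty()) {
                hasText = true;
            }
        }
        if (state == State.ANSWER && hasText) answers++;  // answer running to end of file
        return new int[] { questions, answers };
    }

    public static void main(String[] args) {
        List<String> doc = List.of(
            "(*)", "What is a wiki?", "-", "A way to edit web pages.", "-",
            "(*)", "Is there a wiki here?", "-");
        int[] c = count(doc);
        System.out.println(c[0] + " questions, " + c[1] + " answers");
    }
}
```

The sample in main has a second question with no answer text, so the two counts differ. That kind of mismatch is easy to report once the document has been parsed into states.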
-
(*)
What other documents are available?
-
There is a brief faq in the top level directory.
That file describes the overall bent of this site.
There is also an FAQ in the /lang directory of this site.
-
(*)
What does the '[dir]' tag mean?
-
This tag instructs the transformation engine to insert a
directory listing in the output HTML document.
The directory listing is the listing of a directory
on the computer where the transformation engine is run,
which would usually be the web server.
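To give an idea of what such a component has to do, here is a minimal sketch that lists a directory and writes it out as an HTML list of links. It is not the project's own directory component. The class name DirListing and the choice of a plain ul list are assumptions for this example.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical sketch of what a '[dir]' style component might emit.
// The class name and the <ul> layout are assumptions, not the project's code.
class DirListing {
    /** Escapes the characters that are special in HTML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds an HTML list with one link per file name, sorted by name. */
    static String toHtml(String[] names) {
        String[] sorted = names.clone();
        Arrays.sort(sorted);
        StringBuilder html = new StringBuilder("<ul>\n");
        for (String name : sorted) {
            // Note: names are HTML-escaped but not URL-encoded in this sketch.
            String safe = escape(name);
            html.append("  <li><a href=\"").append(safe).append("\">")
                .append(safe).append("</a></li>\n");
        }
        return html.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list();   // null if dir is not a readable directory
        System.out.print(toHtml(names == null ? new String[0] : names));
    }
}
```

Because the listing is read from the machine where the code runs, the output reflects that machine's directory at the moment of transformation.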
-
(*)
What does the '[image]' tag mean?
-
This tag instructs the transformation engine to insert
an image in the rendered document.
-
(*)
What does the '[webdir]' tag mean?
-
This tag allows the insertion of a set of links from
another web page in the rendered document, for example-
'[webdir http://www.yahoo.com]'
would insert all the links from the yahoo page into the
rendered document. Please note that this component is
only at a development stage. For example, the links from
the page are not transformed to make them usable from
a different server.
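The missing step mentioned here, making a copied link usable from a different server, amounts to resolving each link against the address of the page it came from. The sketch below shows one way this could be done with the standard java.net.URI class. It is only an illustration: the class name LinkResolver and the sample addresses are made up, and the webdir component does not currently do this.

```java
import java.net.URI;

// Hypothetical sketch: turn a link copied from another page into an
// absolute address, so that it still works when shown on a different server.
class LinkResolver {
    /** Resolves href against the address of the page it was found on. */
    static String resolve(String pageUrl, String href) {
        URI base = URI.create(pageUrl);
        // A base such as http://www.yahoo.com has an empty path; give it "/"
        // first so that relative links are merged onto a proper path.
        if (base.getPath() == null || base.getPath().isEmpty()) {
            base = base.resolve("/");
        }
        return base.resolve(href).toString();
    }

    public static void main(String[] args) {
        String page = "http://www.yahoo.com/news/index.html";
        System.out.println(resolve(page, "../images/logo.gif"));
        System.out.println(resolve(page, "/sports/"));
        System.out.println(resolve(page, "http://example.org/other.html"));
    }
}
```

Links that are already absolute pass through unchanged, so the same call can be applied to every link found on the page.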
-
(*)
How can I stop a tag from being transformed by the code?
-
You can enclose the tag in single quote characters, as I have
done in the examples above. (Actually only the leading quote
matters.) If they had not been enclosed in quotes, they
would have been transformed by the code engine.
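The quoting rule can be pictured with a small scanner that collects square-bracket tags and skips any tag whose opening bracket directly follows a single quote. This is a sketch under that one assumption, not the engine's actual parsing code, and the class name TagScanner is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "leading quote" rule: a tag is left alone
// when its opening bracket directly follows a single quote character.
class TagScanner {
    /** Returns the contents of every [tag] that is not protected by a leading quote. */
    static List<String> activeTags(String text) {
        List<String> tags = new ArrayList<>();
        int i = 0;
        while ((i = text.indexOf('[', i)) >= 0) {
            int close = text.indexOf(']', i);
            if (close < 0) break;                       // unterminated tag: stop scanning
            boolean quoted = i > 0 && text.charAt(i - 1) == '\'';
            if (!quoted) tags.add(text.substring(i + 1, close));
            i = close + 1;
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(activeTags("The '[dir]' tag is quoted but [image] is not."));
    }
}
```

Only the character before the bracket is checked, which matches the remark that the trailing quote does not matter.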
-
(*)
Is this page dynamically generated?
-
No, which means that any file listing which occurs on this page may not
be entirely up to date.
-
(*)
Can I use lists in documents?
-
There is a PlainList.java class which will recognize
lists and render them in HTML, but I have left it out of the
faq document class for reasons of simplicity. There is also
a PlainListDocument.java class, which is a document
which can contain some text and a list, but this is not
really that useful. The syntax of a list is, for example-
u- The first item
- the second item
- the last item, followed by a blank line.
-
(*)
How do I put a link in a document?
-
You can just type the link normally, for example-
http://www.google.com or www.google.com
which should be rendered as http://www.google.com .
For a relative link you can use the fake link: protocol, for example-
link://TextLink.java or link:///lang/web/TextLink.java
which should be rendered as TextLink.java etc.
Also, in order to change the display text of the link, you
can use a format such as, for example-
http://www.google.com 'google'
which should render as google. This syntax is not as good as
putting the text before the link, but it is easier to parse.
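The three link forms described above could be handled along the following lines. This is only a sketch and not the project's TextLink.java. The class name TextLinkSketch is made up, and it assumes that the display text defaults to the link target when no quoted text follows, which may not be exactly what the real class does for link:// targets.

```java
// Hypothetical sketch of the link rules described above, not TextLink.java.
// Assumption: without quoted text, the display text is the link target itself.
class TextLinkSketch {
    /** Converts one link token, plus optional 'quoted' display text, to an HTML anchor. */
    static String toAnchor(String token, String quotedText) {
        String href;
        if (token.startsWith("link://")) {
            href = token.substring("link://".length());  // relative or site-absolute path
        } else if (token.startsWith("www.")) {
            href = "http://" + token;                    // bare host name: add the scheme
        } else {
            href = token;
        }
        String text = href;
        if (quotedText != null && quotedText.length() >= 2
                && quotedText.startsWith("'") && quotedText.endsWith("'")) {
            text = quotedText.substring(1, quotedText.length() - 1);
        }
        // Note: no HTML escaping is done in this sketch.
        return "<a href=\"" + href + "\">" + text + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(toAnchor("www.google.com", null));
        System.out.println(toAnchor("link://TextLink.java", null));
        System.out.println(toAnchor("http://www.google.com", "'google'"));
    }
}
```

Putting the display text after the link means the parser only has to look one token ahead once it has recognized a link, which is probably why this order is easier to parse.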
-
(*)
How do I include an example block in a document?
-
The MixedText class has support for example blocks. Since
the FaqDocument class also uses the MixedText class, you
may also use example blocks in faqs. The text inside an example
block is not transformed in any other way. The example block is
started with text such as "for example :" or "for example -",
written with no space between the word 'example' and the punctuation.
Look at the source text file for a better idea.
-
(*)
What kind of markup does this system use?
-
First it is convenient to understand the connotations of
the phrase 'plain text'. This is a vague phrase but it means
something quite specific within the context of computer
systems. Historically it means text using a limited set
of characters or codes which could be represented by the
Latin alphabet as well as some punctuation marks. This was
often referred to as the ASCII character set.
The concept is not that simple to explain but is easiest
to grasp by exclusion. A Microsoft Word document is NOT
plain text because it contains codes whose purpose is
to provide information about the visual formatting of the
document, and those codes are not themselves characters or
letters.
Plain text is important because the internet is built on
plain text protocols and because plain text acts as a
useful interchange format between different types of
computers.
The transformation code here uses plain text transformations
and plain text markup codes. The upshot of this is that
documents can be written in any text editor, such as
Microsoft Notepad or Unix Vim, and then transformed to a
'fancy' format such as HTML.
-
(*)
Why is the faq a popular format?
-
Maybe because it has overtones of a dialog, in a Socratic sense.
It also allows people to make up questions for themselves to
answer, which some people enjoy.