To learn a new language, one must acquire new words. This is unavoidable and requires dull learning by rote and some semantic effort: there is no French word for 'shallow', but 'know' splits into 'connaître' and 'savoir', etc. In addition, one must learn 'new grammar'; e.g. French nouns have gender, which need not be related to sex, and which must be considered in adjective concord[1]. My purpose is to reduce this new grammar as much as possible, so a minimal amount of work is needed beyond word memorization. As a slogan:
If you know what the word means, you can use it.
This, of course, contains a huge fallacy: the meaning of a word is precisely the set of all the occasions when it may be used. But what I mean[2] is that if you knew what 'be' means, you could deal with it, without having to know about 'is', 'was', 'been', etc.
Another slogan for minimal grammar:
If a natural language makes do without it, so can I.
This says that some features may be redundant. For instance, I won't worry about translating 'the', because Russian, Chinese and even Latin manage without it.
Ideally, a word is a word is a word; I don't want it classified as 'noun', 'verb', 'preposition' etc. I don't even want a role for the word in the sentence: the grammar must manage without the terms 'subject', 'object', etc[3]. Then, all that is left is the subordination relation: a word may modify another. To define 'modify' I will use a naive grade school attitude: a modifier answers a question about the head.
that book
that what? that book : 'book' modifies 'that'
which book? that book : 'that' modifies 'book'
In this example, the relation is symmetrical, which is just fine. Actually, in grade school we are not allowed to ask 'that what?' -- we must have 'that' subordinate to 'book' and not vice-versa, which makes parsing diagrams directed graphs. However, my grammar does not need such precision. Of course, there is a big difference in meaning: 'book' is more complete than 'that', which needs finger pointing; this is the reason why 'that what?' is a no-no. But I would like to ignore semantics as much as possible -- alas, it won't let itself be ignored.
Here is yet another slogan, this time about vocabulary:
The Greeks had a word for it, and so do I.
Philologists have gone on for centuries about the richness of Greek, or Sanskrit or whatnot. It seems to me that they liked the ability of such languages to create new words on the fly; even more they loved discussing the possible meanings of these words. So, in order to ease the process of word acquisition, I will put in some rules for creating compound words, and leave the meaning hazy, to the utmost gratification of philo/sophers/logists. I give up any pretense of precisely defining word sense -- that would involve such notions as verbs of motion, transient occurrences, animate/inanimate, affect -- all the 1000 Roget categories and more. So I will depend on common sense, and all I can say is that my compound words are vague, possibly ambiguous, and are offered as suggestions only.
Nor will I try to produce a minimal set of basic words, from which the others may be composed. Consider the family:
two, double, twice, both, dual, dyadic, pair, even, twin, twine, twist, duplicate, second, the one... the other[4]
All of them are semantically related to 'two', but it would be hopeless to define precisely the relations, in the hope of generalizing.
Among natural languages, Chinese and Malay seem nearest to my type of language --- there is fluidity between parts of speech, and little inflection[5] to memorize. However, in both languages the classes noun/verb are quite evident, and there are syntactic paradigms to learn, i.e. there are such classes as subject, object, complement etc.
Among artificial languages I know of, the nearest to mine seems to be Allnoun. But here comes my pet peeve: I cannot stand recursion and long to infinite series of parentheses. These are OK for computers, but my memory, being neither magnetic nor siliceous, boggles. Still, I like Allnoun in its intransigence, and I adopted as the first rule of translation from English 'replace every word by a noun'.
Finally, the vocabulary is mostly English[6] -- just as Volapük is mostly English -- and the language is called, obviously, LAN.
Some more slogans
Etymology is semantics.
Of course it ain't, 'nice' does not mean 'ignorant', and 'artery' does not carry air. One must ignore usage, diachronic etymology -- the fun part -- and especially shun such compounds as 'understand', which has nothing to do with 'under' or 'stand', but clearly proceeds from both. Such self-explaining compounds, as 'lukewarm' = 'body' + 'warm' = 'body temperature' may be rare.
Then maybe derivation is semantics?
Although poetry is lost, one can gain much flexibility from LAN compounds like:
fat+disapprove = obese
fat+approve = portly, embonpoint
Since there are only autosemantemes in LAN, morphological derivation is word-compounding (no slogan here) and LAN etymology is basically decomposing compounds -- the reverse of word derivation.
English word = LAN word.
That would really be great, especially for automatic translation. Unfortunately, it won't even work in the form 'English word = English word'; consider 'the human race' vs. 'the rat-race' and some rarer 'race' meaning 'pluck'. Even extremely technical words carry this ambiguity: the biceps of the anatomist is not the biceps of the prosodist, although both terms are as precise as they get. So I will just take the easy way out, providing a simple-minded automatic translator, and lazily avoiding race1, race2 and race3; to ease my conscience somewhat, I also translate LAN formally via triplets, q.v.
Sense beats grammar any time.
Or, if you prefer, poetic license. Although word order in LAN prescribes strict dependency, if that does not make sense, think of a different ordering. This, of course, may be more than exponentially ambiguous -- that's why we have human brains.
Yet another slogan
All languages are foreign languages.
A very good approximation -- after all, few of us are native speakers of even 0.1% of the various existing languages (about 6 different ones, at the last count). And it seems to me that this shifts the practical import of linguistics from a detailed description of a given language to a coarse model of all the languages, or to the invariants of translation. It is not particularly interesting to flag 'ungrammatical utterances', because most speakers (who are mostly foreign speakers) will keep uttering them; rather, find out how it is possible that they get understood, or to what degree they get understood.
What I have in mind is a linguistics of meaning, and I think that meaning is defined by translation (Huh?) Or at least we would have a very different idea of what meaning is if everybody spoke the same language. Like the Port-Royal gentlemen, we would find it obvious that pure reason dictates the accords of past participles in compound tenses. Realizing the differences when the same thing gets said in different languages[7], one is led to the idea that under the variety of forms there is an invariant -- the meaning.
A famous German philosopher[8] once said, “Some concepts cannot be expressed simply and some concepts cannot be expressed in French”. I couldn't agree less -- if the idea cannot be translated, then it's a very poor idea, IMHO with little meaning. On the other hand, it is true that the translation/explanation may involve clumsy paraphrases, boring examples, whatever; it may certainly fall short of aphoristic brilliance or any semblance of style. Still, I feel that the philosopher owes us this kind of explanation, even if he'd rather say 'Dummkopf!' and go on with his parerga and paralipomena. The penalty for not supplying the explanation is being ignored.
This has taken us rather far from the original discussion of meaning, but what better connection than a philosopher?
One more slogan:
Lexical functions are the enemy!
In English one says: “dead center”, “deep silence”, “deep hatred”, and the expressions mean: “precisely at the center”, “much silence”, “much hatred” -- nothing to do with death or depth!
These are all examples of lexical functions, i.e. the qualifier is a function of the head noun. In other words, the noun requires that particular qualifier for the generic meaning of “intense”, in this case.
Such expressions mark one as a fluent speaker, but they contribute little to the meaning (“deep silence”), or are idiomatic, therefore untranslatable. So let's dump them! Replace by very broad generics: “much silence”, or by explanatory forms: “privative + sound” means precisely no sound at all, while “silence” may be just “too_little + sound”.
Some examples of a language codifying lexical functions are, of course, Esperanto “--id” (progeny) and “--ar” (group_of):
kid = progeny + goat,
lamb = progeny + sheep,
calf = progeny + bovine … etc.
an exaltation of larks = group_of + lark,
a pride of lions = group_of + lion,
a school of fish = group_of + fish … etc.
In natural languages they might appear as affixes, e.g. English “un-” and “-ly” (and the Romance “mente”, very similar to “-ly”). But they are not always applicable and do not always carry the same meaning: “hard work” is not “hardly working”!
Phonology I: primary and secondary phonemes
In tokens (words of LAN origin, which cannot be decomposed; 'atomic' words) only the following sounds/letters appear:
1. 5 vowels: A, E, I, O, U (like the same letters in Spanish).
2. 15 consonants: B, D, F, G, H, K, L, M, N, P, R, S, T, V, Z.
These are pronounced more or less like in English, but:
· G is always hard (egg, give)
· R is trilled (un-English), not too long (the Spanish 'ere')
· S always as English 'bliss'
· There is no phonemic difference between aspirated and unaspirated consonants (English 'pin/spin'), or alveolar/dental versions of /d/, /t/, etc...
These are the primary sounds/letters of LAN, which may also be called token-letters.
There are several secondary sounds/letters which appear in compound words and foreign words:
Vowels:
6 : an, French le (schwa)
4 : French tu, German für
9 : French peu, oeuf, German schön
Semivowels:
Y : yet, day (German j)
W : wet, German Au
Consonants:
X : she
3 : measure
C : chip
Q : thing
J : jet
7 : Italian pizza
8 : thin
2 : this
Like
a. tokens – a closed class of words, following strict phonetic rules, not analyzable into smaller morphemes or semantemes; all will be found in the dictionary. E.g.:
PI = person, FE = female, UDU = two
b. foreign words – an open class of words, not analyzable into smaller morphemes or semantemes; such words may violate some of the token phonetic rules, and are bracketed by []. These are basically technical terms, proper names, etc. All must appear in the dictionary. E.g.:
[PARI] = Paris, [KROMOSOMI] = chromosome, [JAI ALAI] = jai alai
c. compounds – an open class of words, which need not appear in the dictionary. Their phonetics allows each one to be decomposed into a set of tokens or foreign words, and their meaning is suggested by these components. Examples:
FE+PI = female person = woman, she
PI+UDU = two persons = couple
[PARI]+FE =
The token class is closed because a token must consist of up to 5 primary letters, with at least 1 and at most 2 vowels, such that vowels and consonants alternate (see again the examples above). Here are all the possible forms of a token:
A, AB, BA, ABA, BAB, ABAB, BABA, BABAB,
where A stands for an arbitrary primary vowel and B for an arbitrary primary consonant. Some examples for each type:
type ‘A’: A = but
type ‘AB’: UN = opposite, AD = add
type ‘BA’: PI = person, FE = female, BO = beauty
type ‘ABA’: UDU = two
type ‘BAB’: FID = find
type ‘ABAB’: AGEN = again
type ‘BABA’: NARO = narrow
type ‘BABAB’: BUSIN = business
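The shape rule for tokens is mechanical enough to check by program. The following is only a sketch of that check; the names `is_token_shape`, `token_shape` and `ALL_SHAPES` are mine, not part of LAN, and the code assumes that a token is written with the 5 primary vowels and 15 primary consonants only.

```python
from itertools import product
from typing import Optional

PRIMARY_VOWELS = "AEIOU"
PRIMARY_CONSONANTS = "BDFGHKLMNPRSTVZ"


def is_token_shape(shape: str) -> bool:
    """A shape is a string of A (vowel) and B (consonant) marks."""
    return (
        1 <= len(shape) <= 5                # up to 5 letters
        and 1 <= shape.count("A") <= 2      # at least 1, at most 2 vowels
        and "AA" not in shape               # vowels and consonants alternate
        and "BB" not in shape
    )


# Enumerate every A/B string of length 1..5 and keep the legal ones.
ALL_SHAPES = sorted(
    ("".join(p) for n in range(1, 6) for p in product("AB", repeat=n)
     if is_token_shape("".join(p))),
    key=lambda s: (len(s), s),
)


def token_shape(word: str) -> Optional[str]:
    """Return the shape of a well-formed token, or None if it is not one."""
    shape = ""
    for letter in word.upper():
        if letter in PRIMARY_VOWELS:
            shape += "A"
        elif letter in PRIMARY_CONSONANTS:
            shape += "B"
        else:
            return None  # secondary letters (Y, W, X, digits...) never occur in tokens
    return shape if is_token_shape(shape) else None


print(ALL_SHAPES)
print(token_shape("BUSIN"), token_shape("UDU"), token_shape("XI"))
```

The enumeration gives exactly eight shapes, which is why the token class is closed: with 5 vowels and 15 consonants there are only finitely many strings of each shape.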
In foreign words, such as
[ENTROPI] [KATASTROFI] [PRESIDE] [ALEGORI] [PARI] [JAI ALAI]
the brackets [] are part of the language, realizable as sound: [ is the token LA and ] is the secondary consonant X. The brackets signal that the word, although not a token, is not compound, and should not be further analyzed. International words are modified to contain, if possible, only token phonemes/letters, and end in a vowel. Thus:
[PARI] = la Parix
[PRESIDE] = la presidex
[JAI ALAI] = la jai alaix
Non-token phonetics and phonotactics (jai alai, presidex) set such words apart from tokens; the X-ending clearly shows the end of the foreign word, which may contain breaks (la jai alaix).
Tokens and foreign words may be combined in compounds, e.g.
TI+PE = this, place = here
UN+BO = opposite, beauty = ugly, ugliness
SI+NOGU = the one who, know = expert
BO+BUSIN = beauty, business = art
PI+BO*BUSIN = person, beauty, business = artist
[JAI ALAI]+LAGI+PE = jai alai, play, place = jai alai court
The + and * signs (called the loose bond and the tight bond, respectively) are also part of the language, and are realized in various ways, depending on the tokens they join. The result is a pronounceable combination which still keeps clear the boundaries between the components. The meaning of a compound may be vague, and sometimes is independent of the ordering of the components, e.g.
TI+PE = PE+TI = here
The sign * shows a stronger bond than + (just as in arithmetic * has higher precedence):
PI+BO*BUSIN = person, (beauty, business) = person of art = artist
That may be compared to:
PI*BO+BUSIN = (person, beauty), business = feeding and care of beautiful people (?!)
It goes without saying that * should be avoided whenever possible.
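The precedence of the two bonds can be pictured as a two-level split: first at the loose bonds, then at the tight ones. Here is a small sketch of that reading; the function name `group_compound` is my own, and it assumes that the signs + and * never occur inside a bracketed foreign word.

```python
def group_compound(compound: str) -> list:
    """Group the components of a compound: '*' binds tighter than '+'."""
    groups = []
    for loose_part in compound.split("+"):      # loose bonds separate the groups
        tight_parts = loose_part.split("*")     # tight bonds hold a group together
        if len(tight_parts) > 1:
            groups.append(tuple(tight_parts))
        else:
            groups.append(tight_parts[0])
    return groups


print(group_compound("PI+BO*BUSIN"))   # ['PI', ('BO', 'BUSIN')]
print(group_compound("PI*BO+BUSIN"))   # [('PI', 'BO'), 'BUSIN']
```

A compound without any * comes back as a flat list, which matches the advice to avoid the tight bond whenever possible.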
Here are some examples of bond realization:
fyepi = FE+PI = female person = woman, she
piudu = PI+UDU = two persons = couple
la Parixfe = [PARI]+FE =
The boundaries appear as
all of which are not allowed in tokens.
These three categories -- tokens, foreign words and compounds -- are the untagged words of the language, as opposed to words tagged with the syntactical prefixes and suffixes that we describe below. 'Word', unqualified, will mean untagged word. After describing word morphology, one may summarize word semantics: tokens cannot be analyzed, foreign words are specially marked so one does not try to analyze them, and compounds must be analyzed to be understood.
Grammar.
The words are invariable: they are not modified to show such categories as plurality, verb tenses, noun cases or degree of comparison in the adjective.
There are no parts of speech: GU is 'good' (adjective), but also 'well' (adverb) and 'goodness' (noun); SO means 'with' (preposition), 'together' (adverb) and 'join' (verb).
The whole grammar is actually syntax. The only relation shown is the very vague one of subordination between head and modifier, e.g. in the English phrase 'two white kittens', the word 'kittens' is the head and 'two' and 'white' are modifiers, subordinate to 'kittens'.
Grammar Rule #1: if word2 immediately follows word1, then word2 is
a modifier of word1.
By obvious analogy, the modifier may be called ‘tail’, as it
follows its head according to this rule. So we may start to talk about kittens
in LAN:
Kitten = KAT+ID (= cat, young one)
White = HITE
Two = UDU
And so:
Two kittens = KAT+ID UDU
White kitten = KAT+ID HITE
In the statement of Rule #1, order is essential (precedes, follows). In practice, it does not always matter. UDU KAT+ID may be translated as 'a pair of kittens', with precisely the same meaning as KAT+ID UDU[7].
On the other hand, HITE KAT+ID may be a little different: 'the whiteness of kittens', a nice poetical concept, not quite the same as 'white kitten'. Again, there is no way to separate the meanings 'white', 'whiteness' of the token 'HITE'. If you must make the distinction, use compounds or modifiers; and always consider: is that detail truly meaningful?
Now we proceed to 'two white kittens'. It may be:
UDU KAT+ID HITE = a pair of kittens white
In this order, KAT+ID modifies UDU, and HITE modifies KAT+ID; quite logical (and very un-English: the modifier 'white' follows, instead of preceding). Other orderings:
KAT+ID HITE UDU = kitten, double whiteness?
UDU now modifies HITE. Not particularly meaningful, might be considered nonsense (but grammatically correct: colorless green ideas sleep furiously). It is easy to give examples where several orderings are meaningful.
Introduce some new words:
LIL = little
LAGI = play
Then:
LAGI KAT+ID LIL = the play of little kittens
KAT+ID LAGI LIL = the kitten plays a little
KAT+ID LIL LAGI = a kitten of (little play) = a not very playful kitten
The last translation emphasizes the fact that LAGI is a modifier of LIL, not of KAT+ID. Plain sequencing of words one after another represents this logical connection, in which each word has at most one modifier (following it) and at most one head (preceding).
To deal with more complex situations, use
Grammar Rule #2a: to show that word1 is the head of word2, prepend the initial open syllable of token1 as a top prefix to word2.
Grammar Rule #2b: to show that word1 is the modifier of word2, prepend the initial open syllable of token1 as a bottom prefix to word2.
Notice the shift from word1 to token1; it means that, if word1 is itself prefixed, the syllable following the prefix, which is actually the first in a token, will be used. Schematically:
Rule #2a:
A__ word = A__ … A/word = A/word … A__
BA__ word = BA__ … BA/word = BA/word … BA__
(top prefixes A/, BA/)
Rule #2b:
word A__ = A__ … A\word = A\word … A__
word BA__ = BA__ … BA\word = BA\word … BA__
(bottom prefixes A\, BA\)
The underscores denote the rest of the word; for instance, A__ is any word starting with A. The ellipses denote several intervening words; if the prefix is used, the modifier and head may appear in any order. The notation above explains the strange names ‘top prefix’ and ‘bottom prefix’: in BA\ the syllable is under the sign, thus ‘bottom’, and in BA/ it is over the sign, thus ‘top’. The signs are, of course, realized as sound.
The two parts of rule #2 are very similar, but #2a is used much more frequently than #2b; so using the unmodified word 'prefix' we mean top prefix, an instance of rule #2a.
Using Rule #2, we can express 'two white kittens' in any of the
forms:
1. KAT+ID HITE KA/UDU
2. KAT+ID UDU KA/HITE
3. KAT+ID KA/HITE KA/UDU
4. KA/HITE KA/UDU KAT+ID, etc...
The prefix makes clear that, in (3) UDU modifies KAT+ID, not HITE.
On the other hand, (4) is really too KA-KA-phonic! What if we had to translate 'two male white kittens' (male = MA)? Even more KA- syllables!
KAT+ID HITE KA/UDU KA/MA
KA/UDU KA/MA KA/HITE KAT+ID
KA/UDU KAT+ID MA KA/HITE, etc...
Notice that there is no particular order of the modifiers (unlike English, where 'white male two kittens' would be ungrammatical, and one must say 'two white male kittens').
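How a prefix is derived from its head word can be sketched in a few lines. This is my own illustration, not a definitive procedure: the helper names are hypothetical, it handles only words that begin with a token (bracketed foreign words are left out), and it treats an already prefixed word by skipping everything up to its last / or \ sign.

```python
import re

PRIMARY_CONSONANTS = "BDFGHKLMNPRSTVZ"
PRIMARY_VOWELS = "AEIOU"


def initial_open_syllable(word: str) -> str:
    """Return the A or BA that opens the first token of the word."""
    bare = re.split(r"[/\\]", word)[-1]   # skip a prefix already attached
    match = re.match(f"[{PRIMARY_CONSONANTS}]?[{PRIMARY_VOWELS}]", bare)
    if match is None:
        raise ValueError(f"{word!r} does not start with a token")
    return match.group(0)


def with_top_prefix(head: str, modifier: str) -> str:
    """Rule #2a: mark `modifier` as subordinate to `head`."""
    return f"{initial_open_syllable(head)}/{modifier}"


print(with_top_prefix("KAT+ID", "UDU"))   # KA/UDU
print(initial_open_syllable("KA/HITE"))   # HI
```

Two different heads can yield the same prefix (any two tokens starting with the same open syllable), so the prefix narrows down the head without always identifying it.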
To avoid the repetition of prefixes, use
Grammar Rule #3: to show that word1, word2, ..., wordN have the same head, use the chain
word1< word2< ... wordN
The words in the chain - except the last - are tagged with the tag <; they must be consecutive. The tag is realized as –NS after vowels and as –AY after consonants. Using this rule, one may translate 'two white male kittens' as:
1. KAT+ID HITE< UDU< MA
2. KA/HITE< UDU< MA KAT+ID
Form (1) is usually preferred, as it is shortest (pronounced: kaytid hitens uduns
ma; 7 syllables) versus (2) with 8 syllables (ka’hitens uduns ma kaytid)[8].
Notice that in (2) MA is untagged and followed by KAT+ID; still MA is not the
head of KAT+ID, but its modifier, as shown by <.
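Rules #1 and #3 together amount to a simple left-to-right procedure, sketched below. The function name `links` is mine, and the sketch is deliberately limited: it reads only unprefixed words and the tag <, so it covers form (1) but not form (2), and it expects the phrase without its final period.

```python
def links(phrase: str) -> list:
    """Return the (head, modifier) pairs implied by word order and the tag <."""
    result = []
    previous = None            # the word just read
    head_of_previous = None    # the head that word was attached to
    previous_tagged = False    # did that word carry the tag < ?
    for raw in phrase.split():
        tagged = raw.endswith("<")
        word = raw.rstrip("<")
        if previous is None:
            head = None                # first word: nothing to modify
        elif previous_tagged:
            head = head_of_previous    # Rule #3: same head as the tagged word
        else:
            head = previous            # Rule #1: modify the preceding word
        if head is not None:
            result.append((head, word))
        previous, head_of_previous, previous_tagged = word, head, tagged
    return result


print(links("KAT+ID HITE< UDU< MA"))
# [('KAT+ID', 'HITE'), ('KAT+ID', 'UDU'), ('KAT+ID', 'MA')]
```

Without any tag the same function gives the plain chain of Rule #1, e.g. `links("KAT+ID LIL LAGI")` pairs KAT+ID with LIL and LIL with LAGI.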
Similar to rule #3 is
Grammar Rule #4: to show that word1, word2, ..., wordN are the heads of the same modifier, use the chain
word1> word2> ... wordN
The consecutive words in the chain - except the last - are tagged with the tag >. This is realized as –NK after vowels and –UY after consonants. Now one may translate 'white kittens and cats' as:
1. KAT> KAT+ID HITE
2. KA/HITE KAT+ID< KAT, etc...
However, rule #4 is not always sufficient, and this is when one must use rule #2b:
(#4) the cat's head and tail = HEDA> TALI KAT
versus:
(#2b) the cat's round head and long tail = KA\HEDA RONU KA\TALI LON KAT
(using the tokens: HEDA=head, RONU=round, TALI=tail, LON=long)
It is possible to add to a word both tags < and >; e.g.
'I see white cats and kittens'
I = MI, see = SEGE, several = SE
MI SEGE KAT<> KAT+ID SE< HITE.
<> shows that both KAT and KAT+ID are modifiers of SEGE (by Rule #1) and heads of SE (again by Rule #1); while SE< shows that SE and HITE modify the same head(s). By the way, here we have a complete sentence, showing verb conjugation 'I see' and plural 'SE'; or, better put, showing how such English grammatical concepts are translated.
The sentence ends in a period, which is realized as falling tone followed by a rest. The period is also a cut, meaning it interrupts the dependency chains preceding it: a word following the period is not a modifier of a preceding word. Another such cut is the question mark, realized as rising tone followed by a rest, and, less obviously, any rest before starting to speak. Then there are the five vowels:
· E, the copula: e.g.
1. [TOBERMORI]
2. [TOBERMORI] TORI [SAKI]
It may join words (as in 1) or phrases (as in 2); think of it as an equal sign. A phrase is just a group of subordination chains, delimited by cuts.
The following correspond to English conjunctions, and join phrases:
· A, but
· O, or
· U, therefore
· I, and; this does not mean 'also', as in 'bread and butter', but is the vaguest cut between phrases:
I went to the mall, and I saw some red shoes, but they were too tight, and then I met Lucy, and she told me about her daughter, and I said I was sorry I could not have lunch with her.
There is an additional set of compound cuts, as explained below
under Rule #6. We may formulate the following:
Grammar Rule #5: Cuts (a closed class consisting of single vowels
and punctuation marks) are used to interrupt subordination chains, and to show
the relations between phrases in a sentence.
Grammar Rule #6: Asides are tagged sentences, used like cuts.
Asides, like cuts, appear between phrases, and interrupt subordination chains. They clarify the connection between phrases, by showing attitude ('but, unfortunately'), modifying information ('is, to my best knowledge'), etc... Asides are sentences, enclosed in the tags {}. The tags are realized as follows:
{ a token, YE
} an ending: -EY after a consonant, -W after a vowel
Using some new words : DOBU = doubt, PA = past, PI = person, TU =
thou:
{[MARI]} TU TI+PE? = Mary! are you there?
pronounced:
Ye la Marixey tu tyipe?
[MARI] {
pronounced: La Marix ye e paw la diplomaxpi.
[MARI] {
pronounced: La Marix ye e dobuw la diplomaxpi.
What is special about asides is:
1. cuts get modifiers:
2. the tags {} do not nest, so an aside cannot contain another
aside.
Asides should be used as little as possible! They seem necessary
for past and future of the copula, and essential for vocatives and
interjections. Vocatives and interjections, according to school grammar, have
'no function in a sentence', i.e. are not (part of) subject, object, predicate
or complement; no function = no syntactic role = no place in LAN grammar, which
is wholly syntax. Still they are used freely in many languages, so I found a
place for them in mine. In addition, many languages must show attitude of
speaker, deference, etc..., which fit well into asides.
Examples of avoiding asides:
TO+HERA MI< [MARI]
Listen to me Mary; you there?
(TO = in order to; HERA = hear)
The expression is much flatter than a neat vocative; but at one
time I thought flatness (lack of emotion, emphasis or involvement) was a
feature, not a bug.
DOBU [MARI] PI+[DIPLOMA].
there is doubt Mary modified by graduate.
The words in italics must be supplied to make the sentence somewhat English, but the meaning is quite clear, and this would be the preferred form. In the same way, one would tolerate the somewhat muddied meaning, and say:
PA [MARI] [DIPLOMA]+PI.
past modified by Mary modified by graduate.
And with that, the grammar ends! The whole purpose of LAN was to build a language with minimal grammar, taking as 'grammar' the stuff you have to memorize - besides vocabulary - when studying a foreign language:
· amo, amas, amat, ...
· Milch is feminine, Wein masculine, Bier neuter
· the future of 'can' is 'will be able'
So we have our minimal grammar; does it work? The answer is provided by a few translations, with commentary. I have included in my sample some poems, to get a feeling for style as opposed to mere sense conveyance - as they say, poetry is what is lost in translation, so sense must be the conserved quantity. And I made my life easy, translating Housman and Heine; I would not dare translate Rilke or Dylan Thomas, but then, who would?
Phonology III: Marker Realization.
Markers are all the grammatical signs used in LAN text, besides letters and punctuation. They are divided into three categories:
One may also classify markers by their position in the complete word:
All the markers are realized by a combination of
In a prefixed word, the stress falls on the second syllable; in an unprefixed word, one may freely stress any but the second syllable. Thus, word stress on the second syllable identifies completely prefix markers. To distinguish between / and \, the vowel is modified for \, the lower prefix marker: Y or W is added before or after the vowel.
The loose bond between two vowels or two consonants is left unmarked – the so-called null loose bond. The loose bond between a consonant and a vowel is realized as Y before the last letter of the first word.
The tight bond between a consonant and a vowel is realized as W before the last letter of the first word. The tight bond between two vowels may be optionally realized as Y or W between them.
The other realizations are summarized in the table below. Notice, in particular, that ] must appear as AX in the interior of a compound word, but may appear as yA in a stand-alone foreign word:
[HISTORI] = LA-HISTORYA = history
[HISTORI]+VE = LA-HISTORIXVE = historically
(VE = manner, way of doing)
When there is a choice between y and w, it is used to avoid the ugly combinations iy, yi, uw, wu. If these cannot be avoided, they are pronounced (but not written) 6y, y6, 6w, w6 – i.e. the semivowel remains, but the vowel is pronounced as schwa.
Marker Realization Table
In this table:
Position | Marker | Realization | Alternative | Notes
initial | {XXX | du⌷XXX | |
prefix | A/ | A’ | | stressed 2nd syllable
 | A\ | ywA’ | Ayw’ | y or w, preceding or following A and stressed 2nd syllable
medial | A+E | AE | | null loose bond
 | A+B | yAB | | y precedes last letter
 | AB+E | AyBE | | y precedes last letter
 | AB+K | ABK | ABəK | null loose bond; schwa may be inserted
 | A*E | wAE | AywE | w precedes last letter or intervocalic y or w
 | A*B | wAB | | w precedes last letter
 | AB*E | AwBE | | w precedes last letter
 | AB*K | AwBK | AwBəK | w precedes last letter; schwa may be inserted
final | A< | Ans⌷ | |
 | AB< | yAB⌷ | |
 | A> | Ank⌷ | |
 | AB> | AyB⌷ | |
 | A<> | A(n)ks⌷ | A(n)sk⌷ | n may be omitted
 | AB<> | wAB⌷ | |
 | A} | Ard⌷ | |
 | AB} | AwB⌷ | |
free | [XXX | la–XXX | |
 | XXX] | XXXAx | XXXyA⌷ | ends in vowel-x or in y-vowel in final position
Morphology IV.
Numerals have a special formation and special phonology. The open,
literally infinite class of numbers is built by pronouncing each character in
their written form:
1 2 3 4 5 6 7 8 9 0 = UNU UDU UTU UKU UFU ULU URU UGU UVU UZU
decimal point = IPE
fraction slash = IVE
minus = MINU
E (10 to the power) = PEV
10 = UNU+UZU
123 = UNU+UDU+UTU
12.3 = UNU+UDU+IPE+UTU
23/45 = UDU+UTU+IVE+UKU+UFU
-1.23E-4 = MINU+UNU+IPE+UDU+UTU+PEV+MINU+UKU
They are somewhat exceptional as being compounds where order is
invariable, and they are pronounced differently from other compounds: instead
of hiatus, the U of the digits is elided.
123 = UNU+UDU+UTU = unudutu
12.3 = UNU+UDU+IPE+UTU = unudipetu
23/45 = UDU+UTU+IVE+UKU+UFU = udutivekufu
-1.23E-4 = MINU+UNU+IPE+UDU+UTU+PEV+MINU+UKU = minunipedutupevminuku
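The spelling and elision of numbers can be sketched as a small program. The elision rule coded here is only my reading of the examples above, not a stated rule: where two vowels meet at a junction, a U on either side of the junction is dropped (the left one first), and two identical vowels collapse into one; any other vowel pair is left as it is. The names `spell` and `pronounce` are hypothetical.

```python
SYMBOLS = {
    "1": "UNU", "2": "UDU", "3": "UTU", "4": "UKU", "5": "UFU",
    "6": "ULU", "7": "URU", "8": "UGU", "9": "UVU", "0": "UZU",
    ".": "IPE", "/": "IVE", "-": "MINU", "E": "PEV",
}
VOWELS = "aeiou"


def spell(number: str) -> str:
    """Write a number character by character as a LAN compound."""
    return "+".join(SYMBOLS[ch] for ch in number)


def pronounce(compound: str) -> str:
    """Join the parts of a numeral compound, eliding U at vowel junctions."""
    spoken = ""
    for part in compound.lower().split("+"):
        if spoken and spoken[-1] in VOWELS and part[0] in VOWELS:
            if spoken[-1] == "u":
                spoken = spoken[:-1]    # drop the U that ends the left part
            elif part[0] == "u" or part[0] == spoken[-1]:
                part = part[1:]         # drop the U (or doubled vowel) on the right
            # any other vowel pair: no example to go by, left untouched
        spoken += part
    return spoken


for number in ("123", "12.3", "23/45", "-1.23E-4"):
    print(number, "=", spell(number), "=", pronounce(spell(number)))
```

The four numbers printed are the ones worked out by hand above, so the output can be compared line by line with the text.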
The numbers are usually preceded by NU (number) or RO (ordinal)
and are strongly stressed on the last syllable, to show where the end is. There
are also a few abbreviations:
zero zero = AHA
zero zero zero = ASA,
e.g.: 2500 = UDU+UFU+AHA = udufaha,
20003 = UDU+ASA+UTU = udasatu
The combinations ASA+AHA, AHA+AHA, etc... simplify AA to A:
three hundred thousand = UTU+AHA+ASA, UTU+ASA+AHA = utahasa, utasaha
a million = UNU+ASA+ASA = unasasa
This is the Lojban way to treat numbers, and the only reasonable one. Written numbers are understood by anyone, so let us copy in spoken numbers the written form, and be done with quatre-vingt-dix-sept, baq, pik, kalab and gross dozens.
A few more details:
· mixed fractions are expressed as sums, using the token AD = add: 2⅔ = UDU+AD+UDU+IVE+UTU = udadudivetu;
· percents and per mills are IVE+UNU+AHA = ivenaha, IVE+UNU+ASA = ivenasa respectively.
In pronunciation, IPE, IVE, MINU, PEV and AD are surrounded by short rests, which may be marked by dashes: udut-ive-kufu, unud-ipe-tu.
Of course, mathematical stuff is much cleaner written down than spoken aloud.
Minimal grammar – revisited.
Consider again the translation of a short sentence from ‘Loreley’:
At the end the waves swallow boatman and boat.
IFO VAVA
The grammar – i.e. syntax – is best represented by a grade school parsing diagram. Each arrow is a subordination pair, or link.
[Figure: parsing diagrams of the English sentence and of its LAN translation]
Notice that ‘waves’ is actually a link: ‘VAVA SE’ = ‘waves many’,
or ‘
On the other hand ‘IFO’ is a fancy adverb meaning ‘ending in the future’.
Then we may summarize the LANGU diagram in a table:
Links | Translation | ‘Better’ English
VAVA SE | wave modified by many | waves
VAVA VALO | wave modified by swallowing | wave swallows
VALO IFO | swallowing modified by ending in the future | at the end will swallow
VALO SU | swallowing involving a direct object | swallow ...
SU MA+BOTA | the direct object modified by boatman | ... boatman
SU BOTA. | the direct object modified by boat. | ... boat
One could simply speak out the links:
1. VAVA SE I VAVA VALO I VALO IFO I VALO SU I SU MA+BOTA I SU BOTA.
This can even be translated using the last column:
There is more than one wave, and the waves swallow, and swallowing will end in the future, and swallowing affects something, affects[9] the boatman, and affects the boat.
The addition of the cut ‘I’ is needed to prevent every word from modifying the preceding word[10]. The links could also be rearranged in any order; that would not change the meaning, but some orderings may be easier to understand than others:
2. SU MA+BOTA I VAVA SE I VAVA VALO I VALO SU I SU BOTA I VALO IFO.
3. VALO IFO I SU BOTA I VAVA VALO I SU MA+BOTA I VALO SU I VAVA SE.
4. VAVA VALO I SU MA+BOTA I VALO SU I SU BOTA I VALO IFO I VAVA SE.
etc.[11]
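That the orderings (1-4) say the same thing can be checked by treating each sentence as an unordered set of links. The sketch below does just that; the function name `read_links` is mine, and it assumes the only cut used inside the sentence is ‘I’, with the final period dropped.

```python
def read_links(sentence: str, cut: str = "I") -> set:
    """Split a spoken-out sentence at the cut and collect its links as a set."""
    found = set()
    current = []
    for word in sentence.replace(".", " ").split():
        if word == cut:
            found.add(tuple(current))   # the cut closes the current link
            current = []
        else:
            current.append(word)
    if current:
        found.add(tuple(current))       # the last link is closed by the period
    return found


form1 = "VAVA SE I VAVA VALO I VALO IFO I VALO SU I SU MA+BOTA I SU BOTA."
form2 = "SU MA+BOTA I VAVA SE I VAVA VALO I VALO SU I SU BOTA I VALO IFO."
form3 = "VALO IFO I SU BOTA I VAVA VALO I SU MA+BOTA I VALO SU I VAVA SE."

print(read_links(form1) == read_links(form2) == read_links(form3))   # True
print(len(read_links(form1)))                                        # 6
```

A set is used on purpose: it forgets the order of the links, which is exactly the claim that rearranging them does not change the meaning.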
One could replace the head-words by their first syllable, as a high tone prefix:
5. VAVA SE VA/VALO IFO VA/SU MA+BOTA SU/BOTA.
The cuts are no longer needed, but there is some ambiguity: ‘VA/SU’ could be ‘VAVA SU’ as well as ‘VALO SU’. One could rearrange this form, too:
6. SE VAVA VALO IFO VA/SU MA+BOTA SU/BOTA.
7. VALO IFO VA/VAVA SE VA/SU MA+BOTA SU/BOTA.
Or one could use the markers < and > :
8. VAVA SE< VALO IFO< SU MA+BOTA< BOTA.
Now, although (8) is the clearest and least repetitive form, all the other forms are grammatical, and, one might say, mean the same thing. (1-4) are suitable only for computers -- although the lists fully represent the dependency structure, human short-term memory cannot fit them in. (5-7) are for real people, since some interpretation of ambiguity is needed. As for (8), I would assert it has some style.
Here we also see the structure of LAN:
Words may be coupled in dependency pairs (‘VAVA VALO’) or may follow each other in dependency chains (‘VAVA VALO MA+BOTA’); dependency chains may be joined by various means into trees (parsing diagrams), e.g.:
VAVA SE< VALO MA+BOTA: a junction of the chains: VAVA SE, VAVA VALO MA+BOTA
VAVA VALO MA+BOTA VA/BOTA: a junction of the chains: VAVA VALO MA+BOTA, VALO BOTA
Finally, cuts are used between such trees to separate word pairs which do not form a subordination link. Sentences are groups of trees separated by cuts. The hierarchy:
word < subordination pair < chain < tree < sentence
is the LAN equivalent of the usual English scheme:
word < phrase < clause < sentence.
[1] Contrast with English grammatical gender, which manifests itself only in the choice among 'he', 'she' or 'it'.
[2] what I think meaning means
[3] when I use 'noun', 'verb', 'object', etc. the terms refer to the English translation
[4] in some languages, e.g. Swedish, 'the second' is etymologically 'the other'
[5] There is definitely some; both languages have measure words, worse than der,die,das.
[6] If you think that's unfair, interchange every r and t and every u and e -- guaranteed to make it unrecognizable to everybody.
[7] Chomsky, whom I mistrust because of his politics, saw that as the intensive language (the rules generating everything one may say) and the extensive language (everything ever said).
[8] I think it's Hegel, or maybe Schopenhauer… can't find the reference on the net.
[7] so UDU means ‘two’ and ‘a pair’. What else? Any or all of: double, twice, second, dual, dyadic, even, twin, twine, etc…
[8] and the following is even better: idkat hitens uduns ma; see ‘Light and Heavy Words’.
[9] ‘affects’ vs. ‘involves’ : the translation of ‘SU’
[10] in this case, every third word modifies its precedent.
[11] this looks like a PROLOG database, with one relation only – ‘modifies’ – and some pairs in random order.