Spec clarifications for Section 8.

Based on Mark Adler's review comments.
2015-04-08 11:07:00 +02:00 · dcdc68e68b
parent 8618383b9b
commit dcdc68e68b
2 changed files with 525 additions and 466 deletions
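
For illustration only, not part of the patch: a minimal Python sketch of two
pieces of Section 8 as they read after this commit, namely splitting word_id
into index and transform_id, and the UppercaseAll loop with the corrected
indentation of the `i = i + N` steps. The function names, the `ndbits`
parameter (standing in for NDBITS[length], whose values are not shown in this
diff), the enclosing `while` loop, and the treatment of length < 4 as invalid
are assumptions.

```python
def split_word_id(word_id: int, length: int, ndbits: int) -> tuple[int, int]:
    """Return (index, transform_id) for a static dictionary reference.

    ndbits is a hypothetical stand-in for NDBITS[length].
    """
    # Assumption: NWORDS[length] is 0 for length < 4, so no word exists there.
    if length < 4 or length > 24:
        raise ValueError("invalid dictionary word length")
    nwords = 1 << ndbits              # NWORDS[length] for length >= 4
    index = word_id % nwords
    transform_id = word_id >> ndbits
    if transform_id > 120:            # 121 transforms, ids 0..120
        raise ValueError("invalid transform_id")
    return index, transform_id


def uppercase_all(word: bytes) -> bytes:
    """Apply the UppercaseAll pseudocode to a copy of word."""
    out = bytearray(word)
    i = 0
    # Assumption: the loop runs while i is inside the word.
    while i < len(out):
        if out[i] < 192:
            if 97 <= out[i] <= 122:   # ASCII 'a'..'z'
                out[i] ^= 32
            i += 1                    # advance even if nothing was changed
        elif out[i] < 224:
            if i + 1 < len(out):
                out[i + 1] ^= 32
            i += 2
        else:
            if i + 2 < len(out):
                out[i + 2] ^= 5
            i += 3
    return bytes(out)
```

With the pre-patch indentation the increments sat inside the inner `if`, so a
byte that failed the test (for example a digit) would never advance `i`. In
this sketch every branch advances unconditionally, so the loop terminates on
any input, including a truncated multi-byte sequence.
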
--- a/docs/draft-alakuijala-brotli-02.nroff
+++ b/docs/draft-alakuijala-brotli-02.nroff
@@ -1106,7 +1106,7 @@ The number of static dictionary words for a given length is:
 .nf
   NWORDS[length] = 0                       (if length < 4)
-   NWORDS[length] = (1 << NDBITS[lengths])  (if length >= 4)
+   NWORDS[length] = (1 << NDBITS[length])   (if length >= 4)
 .fi
 DOFFSET and DICTSIZE are defined by the following recursion:
@@ -1122,7 +1122,7 @@ index is:
   offset(length, index) = DOFFSET[length] + index * length
-Each static dictionary word has NTRANSFORMS different forms, given by
+Each static dictionary word has 121 different forms, given by
 applying a word transformation to a base word in the DICT array. The
 list of word transformations is given in Appendix B. The static
 dictionary word for a <length, distance> pair can be reconstructed as
@@ -1131,21 +1131,21 @@ follows:
 .nf
   word_id = distance - (max allowed distance + 1)
   index = word_id % NWORDS[length]
-   base_word = DICT[offset(length, index)..offset(length, index+1))
+   base_word = DICT[offset(length, index)..offset(length, index+1)-1]
   transform_id = word_id >> NDBITS[length]
 .fi
 The string copied to the output stream is computed by applying the
 transformation to the base dictionary word. If transform_id is
-greater than NTRANSFORMS - 1 or length is greater than 24, the
+greater than 120 or length is greater than 24, the
 compressed data set is invalid.
-Each word transformation has the follwing form:
+Each word transformation has the following form:
   transform_i(word) = prefix_i + T_i(word) + suffix_i
 where the _i subscript denotes the transform_id above. Each T_i
-is one of the following 20 elementary transforms:
+is one of the following 21 elementary transforms:
 .nf
   Identity, OmitLast1, ..., OmitLast9, UppercaseFirst, UppercaseAll,
@@ -1169,7 +1169,7 @@ The form of these elementary transforms are as follows:
 .fi
 For the purposes of UppercaseAll, word is parsed into UTF-8
-characters an coverted to upper-case by taking 1 - 3 bytes at a time,
+characters and converted to upper-case by taking 1 - 3 bytes at a time,
 using the algorithm below:
 .nf
@@ -1179,15 +1179,15 @@ using the algorithm below:
      if word[i] < 192:
         if word[i] >= 97 and word[i] <= 122:
            word[i] = word[i] ^ 32
-            i = i + 1
+         i = i + 1
      else if word[i] < 224:
         if i + 1 < length(word):
            word[i + 1] = word[i + 1] ^ 32
-            i = i + 2
+         i = i + 2
      else:
         if i + 2 < length(word):
            word[i + 2] = word[i + 2] ^ 5
-            i = i + 3
+         i = i + 3
 .KE
 .fi
@@ -1196,6 +1196,9 @@ executed only once.
 Appendix B. contains the list of transformations by specifying the
 prefix, elementary transform and suffix components of each of them.
 Note that the OmitFirst8 elementary transform is not used in the list
 of transformations. The strings in Appendix B. are in C string format
 with respect to escape (backslash) characters.
 .ti 0
 9. Compressed data format
--- a/docs/draft-alakuijala-brotli-02.txt
+++ b/docs/draft-alakuijala-brotli-02.txt