I frequently write DocBook documents collaboratively with other authors.
So that each of us can be working on the document in parallel, we use a
version control system (CVS to be specific) to manage the document.
This works great, and because DocBook/XML is a plain-text format the
version control system is able to resolve most changes/merges with no
problem.
What is a problem, though, is that with most DocBook editors, changing a
single word in a paragraph causes a very large "diff" in the file. This
makes it difficult to find what exactly changed in the document from
version to version.
For example, a single spelling change in a large paragraph could cause
the entire paragraph to be replaced in the new version. This can make
tracing through document histories VERY painful and frustrating.
As an example, take the following document:
<sect1>
<title>Section 1</title>
<para>This is the first paragraph of text. CHANGE THIS. There
is more text to follow, but this should show the problem fine.
If only the save option wrote this paragraph out better.</para>
</sect1>
Now imagine that I wanted to change the section of text at "CHANGE
THIS". Then it would look like:
<sect1>
<title>Section 1</title>
<para>This is the first paragraph of text. Diff. There is more
text to follow, but this should show the problem fine. If only
the save option wrote this paragraph out better.</para>
</sect1>
The problem is that by changing 2 words I have created a file diff for
every line of the paragraph (because words were shifted to new lines).
XXE already supports a save option that allows for a max line length,
but this doesn't really solve the problem. It can help minimize it, so
that only the lines on or after the changed word are considered
different, but the rest of the paragraph still shows up in the diff.
One "simple" solution may be to add an option to the save options that
creates new lines at end-of-sentence punctuation marks. This could
probably be implemented in the same area as the current max line
length code.
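To make the idea concrete, here is a rough sketch in Python (not XXE's own code). The function name split_sentences and the rule "break after '.', '!' or '?' when followed by whitespace" are assumptions for illustration only. A real implementation would also have to leave whitespace-sensitive elements such as programlisting alone.

```python
import re
from typing import List

# Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(paragraph: str) -> List[str]:
    """Return the paragraph text as a list with one sentence per entry."""
    # Collapse existing line breaks and runs of spaces first, so the
    # result does not depend on how the text was wrapped before.
    normalized = " ".join(paragraph.split())
    return [s for s in _SENTENCE_END.split(normalized) if s]


if __name__ == "__main__":
    text = (
        "This is the first paragraph of text. CHANGE THIS. There\n"
        "is more text to follow, but this should show the problem fine.\n"
        "If only the save option wrote this paragraph out better."
    )
    print("\n".join(split_sentences(text)))
```

A rule this simple will also break after abbreviations such as "e.g." when they are followed by a space. That only adds extra line breaks. It is still deterministic, so the diffs stay small.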
So, with this option the example would look like:
<sect1>
<title>Section 1</title>
<para>This is the first paragraph of text.
CHANGE THIS.
There is more text to follow, but this should show the problem fine.
If only the save option wrote this paragraph out better.
</para>
</sect1>
With this option, a single change would look like:
<sect1>
<title>Section 1</title>
<para>This is the first paragraph of text.
Diff.
There is more text to follow, but this should show the problem fine.
If only the save option wrote this paragraph out better.
</para>
</sect1>
The diff in this case is much better because it only shows the single
line that changed instead of showing the entire paragraph.
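The difference can be checked with a small script. This is only a sketch: it uses Python's standard difflib module as a stand-in for the diff that CVS would produce, and the helper name changed_line_counts is made up for illustration. The line lists are the paragraph lines from the examples in this message.

```python
import difflib


def changed_line_counts(old, new):
    """Return (removed, added) line counts between two lists of lines."""
    removed = added = 0
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# Paragraph wrapped at a fixed line length, before and after the edit.
WRAPPED_OLD = [
    "<para>This is the first paragraph of text. CHANGE THIS. There",
    "is more text to follow, but this should show the problem fine.",
    "If only the save option wrote this paragraph out better.</para>",
]
WRAPPED_NEW = [
    "<para>This is the first paragraph of text. Diff. There is more",
    "text to follow, but this should show the problem fine. If only",
    "the save option wrote this paragraph out better.</para>",
]

# Same paragraph written one sentence per line, before and after the edit.
SENTENCE_OLD = [
    "<para>This is the first paragraph of text.",
    "CHANGE THIS.",
    "There is more text to follow, but this should show the problem fine.",
    "If only the save option wrote this paragraph out better.",
    "</para>",
]
SENTENCE_NEW = [
    "<para>This is the first paragraph of text.",
    "Diff.",
    "There is more text to follow, but this should show the problem fine.",
    "If only the save option wrote this paragraph out better.",
    "</para>",
]

if __name__ == "__main__":
    print(changed_line_counts(WRAPPED_OLD, WRAPPED_NEW))    # (3, 3)
    print(changed_line_counts(SENTENCE_OLD, SENTENCE_NEW))  # (1, 1)
```

With the fixed-width layout every line of the paragraph is reported as changed. With one sentence per line only the edited sentence is.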
Any opinions on this feature? Other ideas? Would it be difficult to
implement?
If my description is unclear, please ask questions and I will try to
clarify the problem.
Thanks,
Allen