Wednesday, March 28, 2012

Text change management

We have a web application that has a lot of large textareas for data
collection. We need to track who makes changes, when and what changes
were made. Right now I'm storing the entire text in a separate SQL table
before the changes are saved. Needless to say this table is growing
quite large and is beginning to cause performance problems.

A lot of the changes are small, like capitalization or punctuation. Is
there anything that is available that somehow calculates just the
change that occurs? It's kind of hard to explain, but I want to store
only the modifications. But I also need to rebuild the previous
document if necessary.

I know something has to exist but I can't even think of a term to
search for.

Any guidance would be great,
AD

Industry term.

Track Changes in Microsoft Word is used to keep track of the changes you make
to a document. You can then choose to accept or reject those changes.

Let's say Bill creates a document and emails it to his colleague, Gus, for
feedback. Gus can edit the document with Track Changes on. When Gus sends the
document back to Bill, Bill can see what changes Gus has made.

Track Changes is also known as redline, or redlining.
This is because some industries traditionally draw a vertical red line in
the margin to show that some text has changed.

It is easier if you have complete control of the editor being used to make
the changes, so you can keep track of the changes as they are entered,
record them, and mark their position.

Otherwise you need to analyze the text afterwards to see what has changed.

What is causing the performance issue for you?

Sorry this is only a 1 cent contribution.

SA

http://www.heikniemi.net/hc/archives/000124.html

SA, thanks for the reply.

I have a web application that has lots of large textareas. These are
modified frequently and we need to track who makes what changes and
when. For example, one textarea could have 3000 words. Someone
browses to this record, adds 7 words, and saves it.

I'm currently saving all original 3000 words in a separate table.

Someone else browses to this 10 minutes later and changes the
capitalization of 4 words. I save all 3007 words in the change table.

Now multiply this by >100 changes a day and multiply it again by >1000
textareas and you can see my problem.

I need to figure out how to save just the modifications and not the
whole textarea.

AD

Thanks for the reply but this...

http://www.heikniemi.net/hc/archives/000124.html

doesn't really address my issue.

AD,

This is not very easy. I am just thinking along with you here.

You could record every text change on the client side as it is made, then
save those changes on the server side.

Or you could compare the new text with the original on the server side and
note the changes.
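
The server-side comparison could look roughly like the sketch below. It is
only an illustration, not something from the thread: it assumes Python and
its standard difflib module, and the function names make_delta and
apply_delta are made up for this example. The text is split into words and
whitespace runs so that a small edit produces a small delta and the spacing
is preserved exactly.

```python
import difflib
import re

def tokenize(text):
    # Split into runs of non-whitespace and whitespace; joining the
    # tokens back together reproduces the text exactly.
    return re.findall(r"\S+|\s+", text)

def make_delta(old_text, new_text):
    """Return only the changed spans needed to turn old_text into new_text."""
    old_tokens = tokenize(old_text)
    new_tokens = tokenize(new_text)
    matcher = difflib.SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Token range in the old text, plus the tokens that replace it.
            delta.append((i1, i2, new_tokens[j1:j2]))
    return delta

def apply_delta(old_text, delta):
    """Rebuild the new text from the old text and a delta from make_delta."""
    tokens = tokenize(old_text)
    # Work from the end so that earlier positions stay valid.
    for i1, i2, replacement in reversed(delta):
        tokens[i1:i2] = replacement
    return "".join(tokens)
```

Only the delta would need to be stored per save, along with the user and
timestamp. Whether this is fast enough for 3000-word fields at your volume
is something you would have to measure.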

What happens when you make 100 changes to the changes? Then you have to go
through all of the change transactions to get to the final one.

So there is a trade-off here: space vs. speed.

You might also have a background process that runs in the middle of the
night and compresses your data.

Compressed data has to be uncompressed before it can be viewed. Uncompressed
data can be viewed with no further processing.

What about zipping things up and unzipping them before viewing? Zipping can
reduce the size dramatically, but it requires processing.
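
As a rough sketch of the zipping idea, assuming Python and its standard zlib
module (the helper names pack and unpack are made up for this example), an
old version could be compressed before it is written to the history table
and decompressed only when someone asks to see it:

```python
import zlib

def pack(text):
    # Compress the old version before storing it (level 9 = smallest output).
    return zlib.compress(text.encode("utf-8"), 9)

def unpack(blob):
    # Decompress a stored version when it needs to be viewed.
    return zlib.decompress(blob).decode("utf-8")

if __name__ == "__main__":
    sample = "A lot of the changes are small, like capitalization. " * 200
    blob = pack(sample)
    print(len(sample.encode("utf-8")), "bytes before,", len(blob), "bytes after")
```

The repetitive sample text here compresses far better than real prose will,
so the numbers it prints say little about your data.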

You still have to send all 3000 words down to the client no matter what.

Are you saying that the slowness is on the SQL Server side?

SA

Thanks for your help.

What I'm thinking is this: say the text is 3000 words. Someone edits it and
adds 7 words. Instead of saving the previous 3000 words, I could save
the new text (3007 words) and somehow save a modification log or
change log that can be used to recreate the original 3000 words if
necessary. I don't know if it would be a text comparison or some sort
of binary comparison to create this modification log.

I'm thinking there has to be some sort of commercial tool that does
this.
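
A minimal sketch of that idea, sometimes called a reverse delta: keep the
latest text in full and, for each save, log only the edits needed to get
back to the previous version. This assumes Python and its standard difflib
module doing a character-level text comparison. The class TrackedText and
the helper names are hypothetical, and a real version would also record a
timestamp with each entry.

```python
import difflib
from dataclasses import dataclass, field

def reverse_delta(new_text, old_text):
    """Edits that turn new_text back into old_text (changed spans only)."""
    matcher = difflib.SequenceMatcher(None, new_text, old_text, autojunk=False)
    return [(i1, i2, old_text[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

def apply_delta(text, delta):
    # Apply from the end so that earlier offsets stay valid.
    for start, end, replacement in reversed(delta):
        text = text[:start] + replacement + text[end:]
    return text

@dataclass
class TrackedText:
    current: str = ""                         # latest version, stored in full
    log: list = field(default_factory=list)   # (user, reverse delta), oldest first

    def save(self, user, new_text):
        # Log how to get from the new text back to the text it replaces.
        self.log.append((user, reverse_delta(new_text, self.current)))
        self.current = new_text

    def rebuild(self, steps_back):
        """Return the text as it was `steps_back` saves ago."""
        if not 0 <= steps_back <= len(self.log):
            raise ValueError("steps_back is out of range")
        text = self.current
        # Walk the log backwards, newest entry first.
        for _user, delta in reversed(self.log[len(self.log) - steps_back:]):
            text = apply_delta(text, delta)
        return text
```

Reading the latest version costs nothing extra, while going back many
versions means applying many deltas in turn. Character-level comparison
gives the smallest deltas but can be slow on long texts.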

AD

How about using ZIP for the OLD copy?

File1-Version 1.0
Zip this as File-1-mmddyyyyhhmiss.zip
Any changes implemented to original file
Zip this as File-1-mmddyyyyhhmiss.zip

That way you always have a single copy of the latest version, and the rest
of them are saved as ZIP files.

I like this zip idea. Is there any way to implement this at the SQL
Server level?

AD
