Wednesday, March 7, 2012

Representation for Deleted Entities: difficult question

Our customer (of our ecommerce system) wants to be able to preserve
deleted entities in the database so that they can do reporting,
auditing etc.

The system is quite complex where each end user can belong to multiple
institutional affiliations (which can purchase on behalf of the user).
The end user also has a rich trail of past transactions affiliations
etc. Thus in the schema each user entity is related to many others
which in turn relate to yet others and so on.

In the past when a user was deleted all of his complex relationships
were also deleted in a cascading fashion. But now the customer wants
us to add a "deleted" flag to each user so that a user is never
_really_ deleted but instead his "deleted" flag is set to true. The
system subsequently behaves as if the user did not exist but the
customer can still do reports on deleted users.

I pointed out that it is not as simple as that because the user entity
is related to many, many others so we would have to add this "deleted"
flag to every relationship and every other entity and thus have
"deleted" past purchases, "deleted" affiliations - a whole shadow
schema full of such ghost entities. This would overtime degrade
performance since now each query in the system has to add a clause:
"where deleted = 0".

I assume this is a standard problem since many organizations must have
this need of preserving deleted records (for legal or other reasons).
I tried to talk them into creating a simple audit file where all the
deletions will be recorded in XML but they were not too happy with
that.

Is there a more satisfying solution to this than have this "deleted"
flag?

Thanks for your help,

- robertYou didn't post database you're using.

In Oracle you could partition the main table (from where all
'cascaded' is coming from) into deleted/undeleted records. Put a view
on top of the table just as original table would look like with where
clause 'undeleted' and the undeleted partition would always be used.
Only certain versions of oracle allow rows to migrate from partition
to partition, u can look it up.

However, you would still suffer from 'performance' issues on the other
tables, if your data is truly huge.

Best solution would probably be to separate all deleted/undeleted data
into separate tables, and built a union view on top of them for
reporting purposes.

my2c
Someone might have a better idea.

On 13 May 2004 11:32:21 -0700, robertbrown1971@.yahoo.com (Robert
Brown) wrote:

>Our customer (of our ecommerce system) wants to be able to preserve
>deleted entities in the database so that they can do reporting,
>auditing etc.
>The system is quite complex where each end user can belong to multiple
>institutional affiliations (which can purchase on behalf of the user).
>The end user also has a rich trail of past transactions affiliations
>etc. Thus in the schema each user entity is related to many others
>which in turn relate to yet others and so on.
>In the past when a user was deleted all of his complex relationships
>were also deleted in a cascading fashion. But now the customer wants
>us to add a "deleted" flag to each user so that a user is never
>_really_ deleted but instead his "deleted" flag is set to true. The
>system subsequently behaves as if the user did not exist but the
>customer can still do reports on deleted users.
>I pointed out that it is not as simple as that because the user entity
>is related to many, many others so we would have to add this "deleted"
>flag to every relationship and every other entity and thus have
>"deleted" past purchases, "deleted" affiliations - a whole shadow
>schema full of such ghost entities. This would overtime degrade
>performance since now each query in the system has to add a clause:
>"where deleted = 0".
>I assume this is a standard problem since many organizations must have
>this need of preserving deleted records (for legal or other reasons).
>I tried to talk them into creating a simple audit file where all the
>deletions will be recorded in XML but they were not too happy with
>that.
>Is there a more satisfying solution to this than have this "deleted"
>flag?
>Thanks for your help,
>- robert

......
We use Oracle 8.1.7.4 on Solaris 2.7 boxes
remove NSPAM to email|||"Robert Brown" <robertbrown1971@.yahoo.com> wrote in message
news:240a4d09.0405131032.6c2e9802@.posting.google.c om...
> I pointed out that it is not as simple as that because the user entity
> is related to many, many others so we would have to add this "deleted"
> flag to every relationship and every other entity and thus have
> "deleted" past purchases, "deleted" affiliations - a whole shadow
> schema full of such ghost entities. This would overtime degrade
> performance since now each query in the system has to add a clause:
> "where deleted = 0".

Blanket statements like this are rarely true.

> I assume this is a standard problem since many organizations must have
> this need of preserving deleted records (for legal or other reasons).
> I tried to talk them into creating a simple audit file where all the
> deletions will be recorded in XML but they were not too happy with
> that.

So you think file is better than DBMS?

> Is there a more satisfying solution to this than have this "deleted"
> flag?

Can I suggest that there is a modelling problem? I can imagine customer
having multiple timestamp columns, for example

table customer (
...
DOB DATE,
married DATE,
divorced DATE,
died DATE
)

but can't possibly see why you need "is_alive", "is_married" boolean
columns.|||andreyNSPAM@.bookexchange.net (NetComrade) wrote in message news:<40a3cbf2.428366208@.localhost>...

> You didn't post database you're using.

Thanks for your answer. This particular customer is using Oracle but
our software is supported on SQL server as well.

> In Oracle you could partition the main table (from where all
> 'cascaded' is coming from) into deleted/undeleted records. Put a view
> on top of the table just as original table would look like with where
> clause 'undeleted' and the undeleted partition would always be used.
> Only certain versions of oracle allow rows to migrate from partition
> to partition, u can look it up.
> However, you would still suffer from 'performance' issues on the other
> tables, if your data is truly huge.
> Best solution would probably be to separate all deleted/undeleted data
> into separate tables, and built a union view on top of them for
> reporting purposes.
> my2c
> Someone might have a better idea.
>
> On 13 May 2004 11:32:21 -0700, robertbrown1971@.yahoo.com (Robert
> Brown) wrote:
> >Our customer (of our ecommerce system) wants to be able to preserve
> >deleted entities in the database so that they can do reporting,
> >auditing etc.
> >The system is quite complex where each end user can belong to multiple
> >institutional affiliations (which can purchase on behalf of the user).
> >The end user also has a rich trail of past transactions affiliations
> >etc. Thus in the schema each user entity is related to many others
> >which in turn relate to yet others and so on.
> >In the past when a user was deleted all of his complex relationships
> >were also deleted in a cascading fashion. But now the customer wants
> >us to add a "deleted" flag to each user so that a user is never
> >_really_ deleted but instead his "deleted" flag is set to true. The
> >system subsequently behaves as if the user did not exist but the
> >customer can still do reports on deleted users.
> >I pointed out that it is not as simple as that because the user entity
> >is related to many, many others so we would have to add this "deleted"
> >flag to every relationship and every other entity and thus have
> >"deleted" past purchases, "deleted" affiliations - a whole shadow
> >schema full of such ghost entities. This would overtime degrade
> >performance since now each query in the system has to add a clause:
> >"where deleted = 0".
> >I assume this is a standard problem since many organizations must have
> >this need of preserving deleted records (for legal or other reasons).
> >I tried to talk them into creating a simple audit file where all the
> >deletions will be recorded in XML but they were not too happy with
> >that.
> >Is there a more satisfying solution to this than have this "deleted"
> >flag?
> >Thanks for your help,
> >- robert
> ......
> We use Oracle 8.1.7.4 on Solaris 2.7 boxes
> remove NSPAM to email|||Note that SQL Server 2000 EE does support partitioned views.

On 14 May 2004 09:14:47 -0700, robertbrown1971@.yahoo.com (Robert Brown) wrote:

>andreyNSPAM@.bookexchange.net (NetComrade) wrote in message news:<40a3cbf2.428366208@.localhost>...
>> You didn't post database you're using.
>Thanks for your answer. This particular customer is using Oracle but
>our software is supported on SQL server as well.
>
>> In Oracle you could partition the main table (from where all
>> 'cascaded' is coming from) into deleted/undeleted records. Put a view
>> on top of the table just as original table would look like with where
>> clause 'undeleted' and the undeleted partition would always be used.
>> Only certain versions of oracle allow rows to migrate from partition
>> to partition, u can look it up.
>>
>> However, you would still suffer from 'performance' issues on the other
>> tables, if your data is truly huge.
>>
>> Best solution would probably be to separate all deleted/undeleted data
>> into separate tables, and built a union view on top of them for
>> reporting purposes.
>>
>> my2c
>> Someone might have a better idea.
>>
>>
>>
>> On 13 May 2004 11:32:21 -0700, robertbrown1971@.yahoo.com (Robert
>> Brown) wrote:
>>
>> >Our customer (of our ecommerce system) wants to be able to preserve
>> >deleted entities in the database so that they can do reporting,
>> >auditing etc.
>>> >The system is quite complex where each end user can belong to multiple
>> >institutional affiliations (which can purchase on behalf of the user).
>> >The end user also has a rich trail of past transactions affiliations
>> >etc. Thus in the schema each user entity is related to many others
>> >which in turn relate to yet others and so on.
>>> >In the past when a user was deleted all of his complex relationships
>> >were also deleted in a cascading fashion. But now the customer wants
>> >us to add a "deleted" flag to each user so that a user is never
>> >_really_ deleted but instead his "deleted" flag is set to true. The
>> >system subsequently behaves as if the user did not exist but the
>> >customer can still do reports on deleted users.
>>> >I pointed out that it is not as simple as that because the user entity
>> >is related to many, many others so we would have to add this "deleted"
>> >flag to every relationship and every other entity and thus have
>> >"deleted" past purchases, "deleted" affiliations - a whole shadow
>> >schema full of such ghost entities. This would overtime degrade
>> >performance since now each query in the system has to add a clause:
>> >"where deleted = 0".
>>> >I assume this is a standard problem since many organizations must have
>> >this need of preserving deleted records (for legal or other reasons).
>> >I tried to talk them into creating a simple audit file where all the
>> >deletions will be recorded in XML but they were not too happy with
>> >that.
>>> >Is there a more satisfying solution to this than have this "deleted"
>> >flag?
>>> >Thanks for your help,
>>> >- robert
>>
>> ......
>> We use Oracle 8.1.7.4 on Solaris 2.7 boxes
>> remove NSPAM to email|||This solution is not specific to Oracle. Most DBMS support partitions and
views.

--
Mike Nicewarner [TeamSybase]
http://www.datamodel.org
mike@.nospam!datamodel.org
Sybase product enhancement requests:
http://www.isug.com/cgi-bin/ISUG2/submit_enhancement

"NetComrade" <andreyNSPAM@.bookexchange.net> wrote in message
news:40a3cbf2.428366208@.localhost...
> You didn't post database you're using.
> In Oracle you could partition the main table (from where all
> 'cascaded' is coming from) into deleted/undeleted records. Put a view
> on top of the table just as original table would look like with where
> clause 'undeleted' and the undeleted partition would always be used.
> Only certain versions of oracle allow rows to migrate from partition
> to partition, u can look it up.
> However, you would still suffer from 'performance' issues on the other
> tables, if your data is truly huge.
> Best solution would probably be to separate all deleted/undeleted data
> into separate tables, and built a union view on top of them for
> reporting purposes.
> my2c
> Someone might have a better idea.
>
> On 13 May 2004 11:32:21 -0700, robertbrown1971@.yahoo.com (Robert
> Brown) wrote:
> >Our customer (of our ecommerce system) wants to be able to preserve
> >deleted entities in the database so that they can do reporting,
> >auditing etc.
> >The system is quite complex where each end user can belong to multiple
> >institutional affiliations (which can purchase on behalf of the user).
> >The end user also has a rich trail of past transactions affiliations
> >etc. Thus in the schema each user entity is related to many others
> >which in turn relate to yet others and so on.
> >In the past when a user was deleted all of his complex relationships
> >were also deleted in a cascading fashion. But now the customer wants
> >us to add a "deleted" flag to each user so that a user is never
> >_really_ deleted but instead his "deleted" flag is set to true. The
> >system subsequently behaves as if the user did not exist but the
> >customer can still do reports on deleted users.
> >I pointed out that it is not as simple as that because the user entity
> >is related to many, many others so we would have to add this "deleted"
> >flag to every relationship and every other entity and thus have
> >"deleted" past purchases, "deleted" affiliations - a whole shadow
> >schema full of such ghost entities. This would overtime degrade
> >performance since now each query in the system has to add a clause:
> >"where deleted = 0".
> >I assume this is a standard problem since many organizations must have
> >this need of preserving deleted records (for legal or other reasons).
> >I tried to talk them into creating a simple audit file where all the
> >deletions will be recorded in XML but they were not too happy with
> >that.
> >Is there a more satisfying solution to this than have this "deleted"
> >flag?
> >Thanks for your help,
> >- robert
> ......
> We use Oracle 8.1.7.4 on Solaris 2.7 boxes
> remove NSPAM to email|||As Leandro and Mikito point out, you have flaws in your design.
First, deleting the user entity is the only thing that is logically deleted,
but rather than make it an indicator, use a date, as in DELETE_DATE as
nullable. Non-null entities are to be ignored.
In addition, all relationships to the user entity should be evaluated to
determine if they need to be sensitive to the user entity's status. Some
may, and others may not. This is a business question, not a technical
question. For instance, if there are invoices and inventory tables linked
in some way to the user entity, would you really want to *not* display that
information just because an associated user entity had been deleted?

I'd really need to see your design and talk to your business to know exactly
what impact this design change would have on your database.

--
Mike Nicewarner [TeamSybase]
http://www.datamodel.org
mike@.nospam!datamodel.org
Sybase product enhancement requests:
http://www.isug.com/cgi-bin/ISUG2/submit_enhancement

"Robert Brown" <robertbrown1971@.yahoo.com> wrote in message
news:240a4d09.0405131032.6c2e9802@.posting.google.c om...
> Our customer (of our ecommerce system) wants to be able to preserve
> deleted entities in the database so that they can do reporting,
> auditing etc.
> The system is quite complex where each end user can belong to multiple
> institutional affiliations (which can purchase on behalf of the user).
> The end user also has a rich trail of past transactions affiliations
> etc. Thus in the schema each user entity is related to many others
> which in turn relate to yet others and so on.
> In the past when a user was deleted all of his complex relationships
> were also deleted in a cascading fashion. But now the customer wants
> us to add a "deleted" flag to each user so that a user is never
> _really_ deleted but instead his "deleted" flag is set to true. The
> system subsequently behaves as if the user did not exist but the
> customer can still do reports on deleted users.
> I pointed out that it is not as simple as that because the user entity
> is related to many, many others so we would have to add this "deleted"
> flag to every relationship and every other entity and thus have
> "deleted" past purchases, "deleted" affiliations - a whole shadow
> schema full of such ghost entities. This would overtime degrade
> performance since now each query in the system has to add a clause:
> "where deleted = 0".
> I assume this is a standard problem since many organizations must have
> this need of preserving deleted records (for legal or other reasons).
> I tried to talk them into creating a simple audit file where all the
> deletions will be recorded in XML but they were not too happy with
> that.
> Is there a more satisfying solution to this than have this "deleted"
> flag?
> Thanks for your help,
> - robert|||robertbrown1971@.yahoo.com (Robert Brown) wrote:
> Our customer (of our ecommerce system) wants to be able to preserve
> deleted entities in the database so that they can do reporting,
> auditing etc.
> The system is quite complex where each end user can belong to multiple
> institutional affiliations (which can purchase on behalf of the user).
> The end user also has a rich trail of past transactions affiliations
> etc. Thus in the schema each user entity is related to many others
> which in turn relate to yet others and so on.
> In the past when a user was deleted all of his complex relationships
> were also deleted in a cascading fashion.

Users are not deleted. They may die, they may be incarcerated, they may
be fired, but they cannot be deleted. Only data is deleted.

> But now the customer wants
> us to add a "deleted" flag to each user so that a user is never
> _really_ deleted but instead his "deleted" flag is set to true. The
> system subsequently behaves as if the user did not exist but the
> customer can still do reports on deleted users.
> I pointed out that it is not as simple as that because the user entity
> is related to many, many others so we would have to add this "deleted"
> flag to every relationship and every other entity and thus have
> "deleted" past purchases,

If I am hit by a bus tomorrow, will the office furniture I ordered (but
which is owned by the company) disappear? Will the patents I've generated
for the company no longer be valid?

> "deleted" affiliations - a whole shadow
> schema full of such ghost entities.

Why would each of these need it's own deleted flag? If they are being
joined against the user table, then they can just rely on the user table to
provide the necessary logic. If they are not being joined against the user
table, then why would you want them to be excluded based on a condition of
a user?

> This would overtime degrade
> performance since now each query in the system has to add a clause:
> "where deleted = 0".

Do they want the reporting to go back indefinately, or would they like
the data to be "really" deleted after a certain time of "virtual" deletion?

> I assume this is a standard problem since many organizations must have
> this need of preserving deleted records (for legal or other reasons).

It is a standard problem, but there is not a standard solution.

Xho

--
------- http://NewsReader.Com/ -------
Usenet Newsgroup Service $9.95/Month 30GB|||>It is a standard problem, but there is not a standard solution.
>Xho

Put a bit column in there and flag it if this record is "deleted".
Or write a trigger and dump the deleted records into an archive
table.

Randy
http://members.aol.com/rsmeiner|||Robert Brown (robertbrown1971@.yahoo.com) writes:
> Our customer (of our ecommerce system) wants to be able to preserve
> deleted entities in the database so that they can do reporting,
> auditing etc.
> The system is quite complex where each end user can belong to multiple
> institutional affiliations (which can purchase on behalf of the user).
> The end user also has a rich trail of past transactions affiliations
> etc. Thus in the schema each user entity is related to many others
> which in turn relate to yet others and so on.
> In the past when a user was deleted all of his complex relationships
> were also deleted in a cascading fashion. But now the customer wants
> us to add a "deleted" flag to each user so that a user is never
> _really_ deleted but instead his "deleted" flag is set to true. The
> system subsequently behaves as if the user did not exist but the
> customer can still do reports on deleted users.
> I pointed out that it is not as simple as that because the user entity
> is related to many, many others so we would have to add this "deleted"
> flag to every relationship and every other entity and thus have
> "deleted" past purchases, "deleted" affiliations - a whole shadow
> schema full of such ghost entities. This would overtime degrade
> performance since now each query in the system has to add a clause:
> "where deleted = 0".

Indeed, this is quite a big change if this was not in the system from
the beginning.

In our system, many items are not deletable, because even if an item
goes away, there might still be plenty of references to it. So in these
tables - accounts, customers, instruments, currencies, to name a few - there
is a deregdate column. This column serves the double purpose of telling
us if the item is still active, and if it is not, when the entity was
deregistered. And, yes it happens that deregistered entities are revived
too!

A non-trivial issue here is to know when a deregistered item should be
included and when it should not. If you are producing a list of last
month's tranactions, it obviously should. But if you are populating a
list of available products to order, deregistered products should not be
included. So when you introduce this concept in your system, you have
a lot to write in your functional specification too.

Another issue which becomes complicated, is referential integrity.
accounts.accresponsible may refer to the users table, but if the account
is active, the user must be too. Currently we do this in triggers, which
is a bit complex, and more difficult than foreign-key constraints. One
thought I've been playing with is to have tables like active_accounts,
active_users etc. This would not be the partition suggested by others,
since active_accounts would only hold the account number, and foreign
keys to other deregisterable items. Thus, active_accounts.accresponsible
would refer to active_users.userid. The full data would still be in
acconts and users, for both active and deregistered items.

I have never considered the performance cost for "AND deregdate IS NULL",
but I would suggest that if you need to access that column, you probably
already access some column which is not in any index, so there is
already a bookmark lookup, so I would not expect any particular penalty.
(This applies to MS SQL Server. Not that I really expect Oracle to be
different, but I don't know Oracle.)

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment