Main Page | Report this Page
Computers Forum Index  »  Computer - Databases  »  Use IDs or strings?...
Page 2 of 2    Goto page Previous  1, 2

Use IDs or strings?...

Author Message
Marco Mariani...
Posted: Thu Dec 04, 2008 7:57 pm
Guest
Walter Mitty wrote:

Quote:
If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
Ed

Elsewhere, it's been said that a key should NOT carry any information beyond
identifying something. You seem to be saying the opposite. Am I reading
you right?

Of course.

Mr. Elsewhere is wrong.

The following links just scratch the issue. I suppose it's been going on
for more than 20 years.

http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-ii-7345
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-iii-7365
 
Walter Mitty...
Posted: Fri Dec 05, 2008 12:04 am
Guest
"Marco Mariani" <marco at (no spam) sferacarta.com> wrote in message
news:mlRZk.4018$J84.960 at (no spam) tornado.fastwebnet.it...
Quote:
Walter Mitty wrote:

If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
Ed

Elsewhere, it's been said that a key should NOT carry any information
beyond identifying something. You seem to be saying the opposite. Am I
reading you right?

Of course.

Mr. Elsewhere is wrong.

The following links just scratch the issue. I suppose it's been going on
for more than 20 years.

http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-ii-7345
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-iii-7365

You are conflating two issues. The first is surrogate keys vs. natural
keys. The second is keys that encode multiple attributes vs. information
free keys.

On the issue of surrogate vs. natural I agree with what I think you said:
natural keys are to be preferred, whenever they are reliable.

On the isuue of whether a key should merely identify or should be made out
of a composite that encodes several attributes, you are wrong. The VIN is a
classic case of a key that was constructed the wrong way. And the problems
with VIN make my case for me.

There is a place where composite keys are worthwhile: when identifying
instances of a relationship. Even in this case, the composition should be
acheived by composing a key from multiple columns, and not by creating a
single column that encodes multiple attributes.
 
Ed Prochak...
Posted: Fri Dec 05, 2008 3:45 pm
Guest
On Dec 4, 1:04 pm, "Walter Mitty" <wami... at (no spam) verizon.net> wrote:
Quote:
"Marco Mariani" <ma... at (no spam) sferacarta.com> wrote in message

news:mlRZk.4018$J84.960 at (no spam) tornado.fastwebnet.it...



Walter Mitty wrote:

 If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
 Ed

Elsewhere, it's been said that a key should NOT carry any information
beyond identifying something.  You seem to be saying the opposite.  Am I
reading you right?

Of course.

Mr. Elsewhere is wrong.

The following links just scratch the issue. I suppose it's been going on
for more than 20 years.

http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-ii-7345
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-iii-7365

You are conflating two issues.  The first is surrogate keys vs. natural
keys.  The second is keys that encode multiple attributes vs. information
free keys.

On the issue of surrogate vs. natural I agree with what I think you said:
natural keys are to be preferred, whenever they are reliable.

Yes.

Quote:

On the isuue of whether a key should merely identify or should be made out
of a composite that encodes several attributes, you are wrong.  The VIN is a
classic case of a key that was constructed the wrong way.  And the problems
with VIN make my case for me.

Enlighten me. Can you give a few examples of the problems? I know of
problems in the implementation of VIN (vehicles assigned the wrong VIN
codes). But I don't think that is the problem you mean.
Quote:

There is a place where composite keys are worthwhile:  when identifying
instances of a relationship.  Even in this case, the composition should be
acheived by composing a key from multiple columns, and not by creating a
single column that encodes multiple attributes.

I'm not sure I see the difference here. Columns are just an
implementation of attributes, aren't they?
(maybe we should that this part of the discussion to
comp.database.theory? your call)
 
Ed Prochak...
Posted: Fri Dec 05, 2008 3:47 pm
Guest
On Dec 4, 8:57 am, Marco Mariani <ma... at (no spam) sferacarta.com> wrote:
Quote:
Walter Mitty wrote:
 If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
 Ed

Elsewhere, it's been said that a key should NOT carry any information beyond
identifying something.  You seem to be saying the opposite.  Am I reading
you right?

Of course.

Mr. Elsewhere is wrong.

Mr. Elsewhere?

I guess it's better than calling me the lame nickname of Mister Ed. I
am just too tired of that one. 8^)


The first article makes my point. Here's a quote from it:
"The essential problem is that an autonumber "id" column contains no
information about the record to which it's connected, and tells you
nothing about that record. It could be a duplicate, it could be
unique, it could have ceased to exist if some idiot deleted the
foreign key constraint."

In the last of those articles, Mr. Berkus makes the same comment that
I have made many times in discussing this issue: It is not that using
ID columns is wrong, but that misusing them is wrong.

Ed
(but ya doesn't have to call me Mister)
 
Marco Mariani...
Posted: Fri Dec 05, 2008 11:08 pm
Guest
Ed Prochak wrote:

Quote:
Elsewhere, it's been said that a key should NOT carry any information beyond
identifying something. You seem to be saying the opposite. Am I reading
you right?
Of course.

Mr. Elsewhere is wrong.

Mr. Elsewhere?

I guess it's better than calling me the lame nickname of Mister Ed.

Sorry Ed, I was jesting. I meant "the people who told/wrote you,
Elsewhere, that bla bla..." are wrong. My coat is the same colour as
yours Smile (*)



(*) even though I have no experience with big databases or huge
applications, would never dream of touching a line of code that runs
inside an airport or train station, and cannot for the life of me
understand why a VIN pkey can have problems.
 
Marco Mariani...
Posted: Fri Dec 05, 2008 11:16 pm
Guest
Walter Mitty wrote:

Quote:
You are conflating two issues. The first is surrogate keys vs. natural
keys. The second is keys that encode multiple attributes vs. information
free keys.

As I reckon, in my admittedly limited knowledge, surrogate ==
information free.

If you have a column which is called ID...
Numeric, and auto-incrementing or serial or whatever your dbms calls
it... BUT you are displaying the column values to users.... and they use
THAT number to identify stuff... well, it is part of the business model.
It's not surrogate anymore. It's a natural key.
Say, if my tennis card number (let n=53) has been given by a clerk or a
DB engine, does it make any difference?


Quote:
of a composite that encodes several attributes, you are wrong. The VIN is a
classic case of a key that was constructed the wrong way. And the problems
with VIN make my case for me.

I don't know of those problems with VINs. But the alternative is... a
non-global key? I can clearly see the problems with them.
 
paul c...
Posted: Fri Dec 05, 2008 11:19 pm
Guest
Ed Prochak wrote:
Quote:
On Dec 4, 1:04 pm, "Walter Mitty" <wami... at (no spam) verizon.net> wrote:
....
There is a place where composite keys are worthwhile: when identifying
instances of a relationship. Even in this case, the composition should be
acheived by composing a key from multiple columns, and not by creating a
single column that encodes multiple attributes.

I'm not sure I see the difference here. Columns are just an
implementation of attributes, aren't they?
(maybe we should that this part of the discussion to
comp.database.theory? your call)


I doubt if it's got anything to do with theory. I'd say more to do with
the psychological, eg., how users prefer to interpret the aspects of
results that don't affect the way a dbms manipulates symbols to emulate
logical operators. Eg., if two results are the same except that one
used VIN "abc..." to identify a vehicle and the other used VIN "xyz..."
to identify the same vehicle, then the difference can't be logical. Up
to the users if they want to perceive "xyz..." as standing for a Ford
Escort or whether they want it stand for an individual registered car.
Maybe the thread should go to comp.database.psych!


Regarding composites, if the users instead want the dbms to separate
Makes from Models, eg., separate "columns", that is an application
requirement, not a theory issue.
 
Walter Mitty...
Posted: Sat Dec 06, 2008 2:28 am
Guest
"Ed Prochak" <edprochak at (no spam) gmail.com> wrote in message
news:3ccb62d0-4bb0-4742-ae85-894252bb4727 at (no spam) k41g2000yqn.googlegroups.com...
On Dec 4, 1:04 pm, "Walter Mitty" <wami... at (no spam) verizon.net> wrote:
Quote:
"Marco Mariani" <ma... at (no spam) sferacarta.com> wrote in message

news:mlRZk.4018$J84.960 at (no spam) tornado.fastwebnet.it...



Walter Mitty wrote:

If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
Ed

Elsewhere, it's been said that a key should NOT carry any information
beyond identifying something. You seem to be saying the opposite. Am I
reading you right?

Of course.

Mr. Elsewhere is wrong.

The following links just scratch the issue. I suppose it's been going on
for more than 20 years.

http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-ii-7345
http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-iii-7365

You are conflating two issues. The first is surrogate keys vs. natural
keys. The second is keys that encode multiple attributes vs. information
free keys.

On the issue of surrogate vs. natural I agree with what I think you said:
natural keys are to be preferred, whenever they are reliable.

Yes.

Quote:

On the isuue of whether a key should merely identify or should be made out
of a composite that encodes several attributes, you are wrong. The VIN is
a
classic case of a key that was constructed the wrong way. And the problems
with VIN make my case for me.

Enlighten me. Can you give a few examples of the problems? I know of
problems in the implementation of VIN (vehicles assigned the wrong VIN
codes). But I don't think that is the problem you mean.
Quote:

There is a place where composite keys are worthwhile: when identifying
instances of a relationship. Even in this case, the composition should be
acheived by composing a key from multiple columns, and not by creating a
single column that encodes multiple attributes.

I'm not sure I see the difference here. Columns are just an
implementation of attributes, aren't they?
(maybe we should that this part of the discussion to
comp.database.theory? your call)


From the point of view of content, I would say, yeah, let's move it over to
comp.database.theory.

But, if we do that, we're going to provoke responses from a certain
individual whose initials are BS.
And I'm not sure I can stand to read his exposition on this particular
subject.

I'll give you the short version: Some of the attributes used to construct
the VIN, (IIRC) are mutable attributes. Some of the people who use the VIN
figure out how to decompose the VIN into the component attributes. Then,
they use the value of the attribute at the time of manufacture as if it were
the current value of the attribute, which it (sometimes) no longer is. They
congratulate themselves on having saved a table lookup (whoop-de-doo) and
shrug off the fact that they deliver outdated results.

Admittedly this problem falls under the general category of VIN misuse,
which is like the comment you made about ID misuse in a different subthread.
All I'm gonna say is that the VIN invites misuse. It's a trap for fools.
If someone such as you or I were to use it as if it were a meaningless
identifier, we would avoid this trap, at the cost of running a little
slower than the geniuses who skip the table lookup.
 
Jasen Betts...
Posted: Sat Dec 06, 2008 7:16 am
Guest
On 2008-12-05, Marco Mariani <marco at (no spam) sferacarta.com> wrote:
Quote:
Ed Prochak wrote:

Elsewhere, it's been said that a key should NOT carry any information beyond
identifying something. You seem to be saying the opposite. Am I reading
you right?
Of course.

Mr. Elsewhere is wrong.

Mr. Elsewhere?

I guess it's better than calling me the lame nickname of Mister Ed.

Sorry Ed, I was jesting. I meant "the people who told/wrote you,
Elsewhere, that bla bla..." are wrong. My coat is the same colour as
yours Smile (*)



(*) even though I have no experience with big databases or huge
applications, would never dream of touching a line of code that runs
inside an airport or train station, and cannot for the life of me
understand why a VIN pkey can have problems.

having registered a home made vehicle once (it was a trailer, and I
got it second-hand) I was instructed to make a VIN plate and given a
form with suggestions on how to make-up a number for it,

while the possibilkity for VIN collisions may be miniscule it is
non-zero

In another application I'm using MAC addresses as primary keys on one
table - obviously that won't work for clients using only PPP for for
networking, but for this application it's not an issue.
 
Ed Prochak...
Posted: Sun Dec 07, 2008 4:53 am
Guest
On Dec 5, 3:28 pm, "Walter Mitty" <wami... at (no spam) verizon.net> wrote:
Quote:
"Ed Prochak" <edproc... at (no spam) gmail.com> wrote in message

[snip]
On the isuue of whether a key should merely identify or should be made out
of a composite that encodes several attributes, you are wrong. The VIN is
a
classic case of a key that was constructed the wrong way. And the problems
with VIN make my case for me.

Enlighten me. Can you give a few examples of the problems? I know of
problems in the implementation of VIN (vehicles assigned the wrong VIN
codes). But I don't think that is the problem you mean.



There is a place where composite keys are worthwhile: when identifying
instances of a relationship. Even in this case, the composition should be
acheived by composing a key from multiple columns, and not by creating a
single column that encodes multiple attributes.

I'm not sure I see the difference here. Columns are just an
implementation of attributes, aren't they?
(maybe we should that this part of the discussion to
comp.database.theory? your call)

From the point of view of content, I would say,  yeah, let's move it over to
comp.database.theory.

I was just a little confused. On rereading your comment I understand
and I generally agree that a multiple column Primary Key is preferable
to blindly using an ID column. I see you prefer that same. I think we
differ on the transition point. IOW, there is a point where a multiple
column Primary Key is so long or complex that a surrogate must be
used. I think that you at that point would go to the informationless
autonumber ID. Possibly due to your experience on the VIN noted below.
Quote:

But, if we do that, we're going to provoke responses from a certain
individual whose initials are BS.
And I'm not sure I can stand to read his exposition on this particular
subject.

fine with me.
Quote:

I'll give you the short version:  Some of the attributes used to construct
the VIN, (IIRC) are mutable attributes.  Some of the people who use the VIN
figure out how to decompose the VIN into the component attributes.  Then,
they use the value of the attribute at the time of manufacture as if it were
the current value of the attribute, which it (sometimes) no longer is.  They
congratulate themselves on having  saved a table lookup (whoop-de-doo)  and
shrug off the fact that they deliver outdated results.

Admittedly this problem falls under the general category of VIN misuse,
which is like the comment you made about ID misuse in a different subthread.
All I'm gonna say is that the VIN invites misuse.  It's a trap for fools.
If someone such as you or I were to use it as if it were a meaningless
identifier,  we would avoid this trap,  at the cost of running a little
slower than the geniuses who skip the table lookup.

Yes, misuse is exactly right. I see your point. I disagree that it is
reason to denigrate VIN codes. I think I would agree that I would
prefer not to work with those "geniuses". 8^)
Your point is made though, and it is a good one.

Ed
 
Ed Prochak...
Posted: Fri Dec 19, 2008 8:14 pm
Guest
Top posting is annoying, Please don't top post.

On Dec 19, 9:40 am, "Elmer Fudd" <bitbuc... at (no spam) 127.0.0.1> wrote:
Quote:
How does one handle cars manufactured before VINs came into use? How about
books before ISBNs began to be used? Which identification systems are
commonly used?

On Wed, 3 Dec 2008 21:43:57 -0800 (PST), Ed Prochak wrote:
[to be safe on the quotes I snipped all but the last one, mine.]
If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
Ed

Mr. Fudd,
If you want it street legal, you need plates. Even for historical
plates, you likely need a VIN. I would assume they might assign a VIN
and attach VIN plates to an antique car. I haven't done this so not
sure precisely how to do it. Call you local department of motor
vehicles. Anyway the rule that applies here is the rule set by law. It
doesn't justify sequence IDs in the database.

I don't see why ISBNs cannot be assigned to older book publishings. It
is not like a VIN where it must appear on the object(book) being
identified. I believe there are rules for assigning ISBNs which can
apply. So this example is a poor excuse for justifying sequence IDs
(which I assume is your point).

Sincerely,
Mr. Ed 8^)
 
Elmer Fudd...
Posted: Fri Dec 19, 2008 8:40 pm
Guest
How does one handle cars manufactured before VINs came into use? How about
books before ISBNs began to be used? Which identification systems are
commonly used?

On Wed, 3 Dec 2008 21:43:57 -0800 (PST), Ed Prochak wrote:

Quote:
On Dec 3, 10:48 am, paul c <toledobythe... at (no spam) oohay.ac> wrote:
Ed Prochak wrote:
On Dec 1, 10:07 am, paul c <toledobythe... at (no spam) oohay.ac> wrote:
Ed Prochak wrote:

...

Are these IDs known to users?
    If the answer to either of these is yes, then use the IDs..
...
If they are useful (pertinent if you like), they will become known..  The
only decision to make about such ID's is whether they have a pertinent
use in one of the applications.

Applications are not the users I was referring to. Applications (code)
are part of the system and must therefore have access to such ID's.
But that's not relevant to the original question.
ed

Alright, to be pedantic, if they are useful to users they will become
known by users.

Agreed.

If so, it is better to plan and design such ID's rather than just
exposing a sequence number value. A VIN (Vehicle Identification
Number) is a good example. It isn't perfect, but it goes a long way
beyond just a number.
Ed
 
Walter Mitty...
Posted: Fri Dec 19, 2008 9:43 pm
Guest
Quote:
"Elmer Fudd" <bitbucket at (no spam) 127.0.0.1> wrote in message
news:ovgohpxrg.kc4num0.pminews at (no spam) news.verizon.net...
How does one handle cars manufactured before VINs came into use? How about
books before ISBNs began to be used? Which identification systems are
commonly used?

On Wed, 3 Dec 2008 21:43:57 -0800 (PST), Ed Prochak wrote:


Eh, Doc.... please refrain from top posting.
 
--CELKO--...
Posted: Tue Dec 23, 2008 11:27 pm
Guest
Quote:
How does one handle cars manufactured before VINs came into use?

You will be issued a VIN if you want to drive the vehicle on a public
road. If you build your own vehicle, then you inherit the VIN on the
engine block or you get issued one. If the manufacturer went out of
business before 1950, you get a dummy value from a DMV or national
agency. Hey, they want your money.

Here are some more details: http://www.vinguard.org/vin.htm

Quote:
How about books before ISBNs began to be used? Which identification systems are commonly used?

Having owned some bookstores in the past, it is much the same scheme.
We used a dummy language sub-code of 99 (97 is Esperanto), a Dewey
Decimal Classification and then a sequence within the DDC.

In retail, if you make a product in-house (bakery in a grocery store),
there are "open codes" in the UPC so you can bar-code the goods.

If you look for it, you will find a standard or two that will help and
give you validation and verification rules.
 
Elmer Fudd...
Posted: Wed Dec 24, 2008 11:04 pm
Guest
Thank you, great to know!

On Tue, 23 Dec 2008 15:27:34 -0800 (PST), --CELKO-- wrote:

Quote:
How does one handle cars manufactured before VINs came into use?

You will be issued a VIN if you want to drive the vehicle on a public
road. If you build your own vehicle, then you inherit the VIN on the
engine block or you get issued one. If the manufacturer went out of
business before 1950, you get a dummy value from a DMV or national
agency. Hey, they want your money.

Here are some more details: http://www.vinguard.org/vin.htm

How about books before ISBNs began to be used? Which identification systems are commonly used?

Having owned some bookstores in the past, it is much the same scheme.
We used a dummy language sub-code of 99 (97 is Esperanto), a Dewey
Decimal Classification and then a sequence within the DDC.

In retail, if you make a product in-house (bakery in a grocery store),
there are "open codes" in the UPC so you can bar-code the goods.

If you look for it, you will find a standard or two that will help and
give you validation and verification rules.
 
 
Page 2 of 2    Goto page Previous  1, 2
All times are GMT
The time now is Mon Mar 22, 2010 3:01 pm