MongoDB for Developers with Python Transcripts
Chapter: Course conclusion
Lecture: Lightning review: Document design
0:01
Next up was document design.
0:03
Some of the concepts and ideas of relational databases still apply here,
0:07
you still are modeling data, you still put it into a database,
0:10
but many of the techniques fall down,
0:13
this whole concept of third normal form
0:15
doesn't make nearly as much sense as it does in a relational database.
0:18
What we focus on more often is really
0:21
how do we make relationships either between documents or within documents.
0:25
We saw the primary question, not the only one, but the most challenging one,
0:30
the one you have to think most carefully about is to embed or not to embed,
0:34
and I gave you a few rules or tips to help you guide this decision.
0:38
One: is the embedded data wanted, do you use it 80 percent of the time or more,
0:44
most of the time when you get that containing document?
0:48
If that's true, you probably want to embed,
0:51
if that's false, maybe consider that as a warning sign not to.
0:54
How often do you want the embedded document without the outer containing document?
0:59
If often what you really want to get access to is these little inside pieces,
1:03
there's a lot of overhead and it really kind of complicates the way
1:07
you access it through your application,
1:09
if you want to get them most of the time, or frequently, on their own.
1:13
Is the embedded data a bounded set?
1:16
Remember, these documents can only be sixteen megabytes or smaller,
1:19
the number is way higher than you really want it to be,
1:22
if this is an unbounded set you're going to continue to add to it,
1:25
it very easily could outgrow the actual size that you're allowed to store.
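To put a rough number on that limit, here is a minimal sketch in plain Python. The 16 MiB cap is MongoDB's documented maximum BSON document size; the helper name max_embedded_items and the 200-byte average entry size are assumptions made up for illustration, and the estimate ignores per-element BSON overhead.

```python
# MongoDB caps a single BSON document at 16 MiB.
MAX_BSON_BYTES = 16 * 1024 * 1024


def max_embedded_items(bytes_per_item: int, parent_overhead: int = 0) -> int:
    """Roughly how many embedded items fit before the document hits the cap.

    Hypothetical helper: a back-of-the-envelope estimate, not an exact
    BSON size calculation.
    """
    if bytes_per_item <= 0:
        raise ValueError("bytes_per_item must be positive")
    return max(0, (MAX_BSON_BYTES - parent_overhead) // bytes_per_item)


# Assumption: each embedded entry averages about 200 bytes.
print(max_embedded_items(200))
```

An unbounded list reaches that ceiling eventually, however large it looks, and read performance usually suffers long before it gets there.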
1:28
Really for a performance reason though, is it a bounded set and is that set small?
1:34
Because if you put huge amounts of data in there,
1:36
you're going to really slow down your read time
1:38
for these database operations that involve this document.
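The four questions can be written down as a small checklist. This is only a sketch of the rules of thumb from this lecture, not a MongoDB API: the EmbedAnswers class, the embed_warnings function, and the field names are all hypothetical, and the 0.8 threshold is the "80 percent of the time" guideline mentioned earlier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EmbedAnswers:
    """Answers to the four embedding questions (hypothetical structure)."""
    used_with_parent_ratio: float  # share of parent reads that also use the embedded data
    often_needed_alone: bool       # is the embedded data often wanted without its parent?
    bounded: bool                  # is the embedded set bounded?
    small: bool                    # is the embedded set small?


def embed_warnings(answers: EmbedAnswers) -> List[str]:
    """Return the warning signs against embedding; an empty list means none were found."""
    warnings = []
    if answers.used_with_parent_ratio < 0.8:
        warnings.append("embedded data is used less than 80% of the time")
    if answers.often_needed_alone:
        warnings.append("embedded data is often needed without the parent")
    if not answers.bounded:
        warnings.append("embedded set is unbounded and may outgrow the size limit")
    if not answers.small:
        warnings.append("embedded set is large and will slow down reads")
    return warnings


# Hypothetical case: a handful of ratings that are almost always read with the parent.
print(embed_warnings(EmbedAnswers(0.95, False, True, True)))
```

An empty list suggests embedding is reasonable; each warning is a reason to consider a separate collection instead.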
1:41
These are the four main rules here,
1:43
you also want to consider how your application accesses this data,
1:47
it might be really easy to answer these four questions
1:50
because there's a very constrained and small set of queries
1:53
you run against your database;
1:55
or it could be that you ask all sorts of questions in highly varied ways,
1:59
in which case it's harder to answer those questions,
2:02
the more types of queries you have the harder it is to know
2:05
whether most of the time you want the embedded data for example.
2:08
The more varied your queries are, the more you'll trend
2:11
towards third normal form, relational style and less embedding.
2:15
One of the situations where you have lots of varied queries is
2:18
if you have this thing called an integration database,
2:21
which we talked about, where you share a database across different applications,
2:24
versus having one dedicated to a particular application
2:27
where you can understand these questions very clearly.
2:30
So when you're designing these documents
2:33
you want to really think most carefully about do you want to embed this data
2:36
or create a soft foreign key type of relationship.
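The two options can be compared side by side using plain Python dictionaries as stand-ins for documents. Everything here is hypothetical and for illustration only: the field names, the sample data, and the publisher_for helper. With a real driver, the second lookup would be a separate query against another collection.

```python
# Option 1: embed. The ratings live inside the book document itself,
# so one read returns the book together with its ratings.
book_embedded = {
    "_id": 1,
    "title": "Example Book",
    "ratings": [
        {"user": "a", "value": 5},
        {"user": "b", "value": 4},
    ],
}

# Option 2: soft foreign key. The book stores only the publisher's _id.
# The database does not enforce this link; the application resolves it.
publishers = {10: {"_id": 10, "name": "Example Press"}}  # stand-in for a collection
book_referencing = {"_id": 2, "title": "Another Book", "publisher_id": 10}


def publisher_for(book, publishers_by_id):
    """Resolve the soft foreign key with a second lookup (None if it is missing)."""
    return publishers_by_id.get(book["publisher_id"])


average = sum(r["value"] for r in book_embedded["ratings"]) / len(book_embedded["ratings"])
print(average)
print(publisher_for(book_referencing, publishers)["name"])
```

Embedding saves the second lookup, while the soft foreign key keeps the book document small and lets the publisher be queried on its own.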