Hello there, i have difficulties in engineering a performant and scalable schema for my application with the following simplified scenario:
- I have users, projects and items
- Many users can be in many projects
— In each project, a user has a distinct set of roles
— A user often looks up his projects (need to search for all projects where the user has a role) - Many items can be in many projects
— In each item, a user has a distinct set of roles
— If user has no role here but roles in project, these will be inherited here
— A user often looks up the items of a project (need to search for all items belonging to the project)
These are not squillions to squillions relationships. I would expect 100-200 projects per user, as well as 5-30 items per project as a fixed maximum - so there is no fear of exceeding the 16MB limit with growing Object ID arrays.
So from schema modeling perspective i considered two options:
1. A list of nested objects containing the relationship + roles.
// Project collection
{
_id: ObjectId("...")
...other fields
users: [ { user: ObjectId("...."), roles: ["RoleA", "RoleB"] }, ... ],
items: [ObjectId("..."), ...]
}
// Item collection
{
_id: ObjectId("...")
...other fields
users: [ { user: ObjectId("...."), roles: ["RoleA", "RoleC"] }, ... ]
}
I fear that this is a resource heavy task to fetch all projects for a user, since the query has to iterate through each users field of each project, potentially containing the userId. Same for the project-items relationship. In turn, this simplifies the process of querying one specific project and checking whether the user has role access.
2. Extra Roles collections
// Project collection
{
_id: ObjectId("..."),
...other fields,
items: [ ObjectId("..."), ... ]
}
// Item collection
{
_id: ObjectId("...")
...other fields
}
// Roles collection
{
_id: ObjectId("..."),
user_id: ObjectId("..."),
ref_id: ObjectId("..."),
ref_type: "Project" | "Item",
roles: ["RoleA", "RoleB"]
}
With this i could quickly query for each project or item where a user has a role, but this would introduce $lookup operators and other types of joins at many places. Consider a user accessing an item (the parent project is always referenced in the request): Now the API has to first check whether this item exists, then whether the parent project exists, then lookup the Roles collection to check if the user has a item or project role before even reaching the service function.
In both cases i see some problems and i could not find a better solution yet. Furthermore, option 2 sounds better to me but also seems to be very SQLish and completely skips the advantage of nested docs regarding the roles.
Regards
David