1 / 1
Apr 2024

Hello there, i have difficulties in engineering a performant and scalable schema for my application with the following simplified scenario:

  • I have users, projects and items
  • Many users can be in many projects
    — In each project, a user has a distinct set of roles
    — A user often looks up his projects (need to search for all projects where the user has a role)
  • Many items can be in many projects
    — In each item, a user has a distinct set of roles
    — If user has no role here but roles in project, these will be inherited here
    — A user often looks up the items of a project (need to search for all items belonging to the project)

These are not squillions to squillions relationships. I would expect 100-200 projects per user, as well as 5-30 items per project as a fixed maximum - so there is no fear of exceeding the 16MB limit with growing Object ID arrays.

So from schema modeling perspective i considered two options:

1. A list of nested objects containing the relationship + roles.

// Project collection { _id: ObjectId("...") ...other fields users: [ { user: ObjectId("...."), roles: ["RoleA", "RoleB"] }, ... ], items: [ObjectId("..."), ...] } // Item collection { _id: ObjectId("...") ...other fields users: [ { user: ObjectId("...."), roles: ["RoleA", "RoleC"] }, ... ] }

I fear that this is a resource heavy task to fetch all projects for a user, since the query has to iterate through each users field of each project, potentially containing the userId. Same for the project-items relationship. In turn, this simplifies the process of querying one specific project and checking whether the user has role access.

2. Extra Roles collections

// Project collection { _id: ObjectId("..."), ...other fields, items: [ ObjectId("..."), ... ] } // Item collection { _id: ObjectId("...") ...other fields } // Roles collection { _id: ObjectId("..."), user_id: ObjectId("..."), ref_id: ObjectId("..."), ref_type: "Project" | "Item", roles: ["RoleA", "RoleB"] }

With this i could quickly query for each project or item where a user has a role, but this would introduce $lookup operators and other types of joins at many places. Consider a user accessing an item (the parent project is always referenced in the request): Now the API has to first check whether this item exists, then whether the parent project exists, then lookup the Roles collection to check if the user has a item or project role before even reaching the service function.

In both cases i see some problems and i could not find a better solution yet. Furthermore, option 2 sounds better to me but also seems to be very SQLish and completely skips the advantage of nested docs regarding the roles.

Regards
David