Archive Pattern

If you need to store historical data dating back a number of years, storing your oldest data in the same database as your more recent data can negatively impact performance, especially if the old data does not need to be accessed frequently. Instead, you can design your schema to archive old data and move that data to a separate storage location.

There are multiple options for how to store your archived data. For example, you can:

  • Move data to external file storage such as Amazon S3.

  • Move data to a separate, cheaper cluster.

  • Move data to a separate collection on the same cluster.

In most cases, moving data to external file storage is the best option in terms of cost and performance. If external file storage is not possible for your use case, consider moving data to a separate cluster or collection.

Before you implement a data archival design pattern, review these best practices:

  • Your archived data should use an embedded data model, rather than using references to other collections. When you query archived data, all relevant components of the data must be from the same time. Embedding data ensures that queries return the related data together.

  • The document's age should be contained in a single field.

  • If you have documents that should never expire or move to the archive, set the document's age field to a string such as "keep forever" to indicate that the document must stay in the active collection.

  • MongoDB Atlas offers Online Archive, which moves infrequently accessed data from your Atlas cluster to a MongoDB-managed, read-only Federated Database Instance on cloud object storage.
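
The tips about a single age field and a never-expire marker can be sketched as a small predicate. This is an illustration only: the field name `date`, the constant `KEEP_FOREVER`, and the function `isArchivable` are assumptions made for this sketch, not part of MongoDB.

```javascript
// Hypothetical marker value; the text only suggests "keep forever" or a similar string.
const KEEP_FOREVER = "keep forever";

// Returns true when a document is old enough to archive.
// Assumes the document's age lives in a single field named `date`.
function isArchivable(doc, cutoff) {
  if (doc.date === KEEP_FOREVER) {
    return false; // pinned to the active collection
  }
  // Only real Date values can be compared against the cutoff.
  return doc.date instanceof Date && doc.date.getTime() < cutoff.getTime();
}

const exampleCutoff = new Date("2020-01-01T00:00:00Z");
console.log(isArchivable({ date: new Date("2018-11-20T15:45:00Z") }, exampleCutoff)); // true
console.log(isArchivable({ date: KEEP_FOREVER }, exampleCutoff)); // false
```

Keeping the age in one field means a single comparison decides whether a document moves. In a MongoDB query, a `$lt` comparison against a date value does not match string values, so documents that carry the marker string are skipped by a date-based filter.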

In this example, an e-commerce store wants to archive data for sales that occurred more than five years ago. The initial dataset contains all sales, and documents for older sales will be moved to a separate collection.

1. Populate sample data

db.sales.insertMany( [
   {
      customer_name: "Hiroshi Tanaka",
      products: [
         {
            product_id: "P1001",
            name: "Wireless Headphones",
            quantity: 1,
            price: 59.99
         },
         {
            product_id: "P1002",
            name: "Phone Charger",
            quantity: 2,
            price: 14.99
         }
      ],
      total_amount: 89.97,
      date: ISODate("2025-01-30T10:15:00Z")
   },
   {
      customer_name: "Aisha Khan",
      products: [
         {
            product_id: "P1003",
            name: "Laptop",
            quantity: 1,
            price: 899.99
         }
      ],
      total_amount: 899.99,
      date: ISODate("2018-11-20T15:45:00Z") // Over 5 years ago
   },
   {
      customer_name: "Fatima Al-Farsi",
      products: [
         {
            product_id: "P1006",
            name: "Gaming Mouse",
            quantity: 1,
            price: 49.99
         },
         {
            product_id: "P1007",
            name: "Mechanical Keyboard",
            quantity: 1,
            price: 129.99
         }
      ],
      total_amount: 179.98,
      date: ISODate("2017-06-15T12:00:00Z") // Over 5 years ago
   },
   {
      customer_name: "Nguyen Minh",
      products: [
         {
            product_id: "P1008",
            name: "Bluetooth Speaker",
            quantity: 2,
            price: 39.99
         }
      ],
      total_amount: 79.98,
      date: ISODate("2025-01-26T09:20:00Z")
   }
] )

2. Write a script to archive old documents

Note

The following script uses MongoDB Shell syntax. To see aggregation and query syntax for your driver, see your driver documentation.

// Set a variable to five years before the time that the script runs
const fiveYearsAgo = new Date();
fiveYearsAgo.setFullYear(fiveYearsAgo.getFullYear() - 5);

// Write old sales to the 'archived_sales' collection
const writeOldSalesPipeline = [
   {
      $match: {
         date: { $lt: fiveYearsAgo }
      }
   },
   {
      $merge: {
         into: {
            db: "test",
            coll: "archived_sales"
         },
         on: "_id"
      }
   }
]

db.sales.aggregate(writeOldSalesPipeline)

// Delete old sales from the active 'sales' collection
try {
   db.sales.deleteMany(
      { date: { $lt: fiveYearsAgo } }
   );
} catch (e) {
   print(e);
}
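
The script above deletes old sales immediately after the merge. A more defensive variant, sketched below, confirms that the archive holds the matching documents before it deletes anything. The function name `archiveOlderThan` and its parameters are hypothetical, and the count comparison is one possible safeguard rather than a required part of the pattern. The sketch assumes MongoDB Shell, where collection methods return their results directly, and assumes the archive collection is in the same database as the source.

```javascript
// Sketch: copy documents older than `cutoff` into the archive, then delete
// them from the source only if the archive holds at least as many matches.
// `db` is the MongoDB Shell database handle; the other names are illustrative.
function archiveOlderThan(db, cutoff, sourceName, archiveName) {
  const filter = { date: { $lt: cutoff } };
  const source = db.getCollection(sourceName);
  const archive = db.getCollection(archiveName);

  // $merge matches on _id, so re-running the copy does not create duplicates.
  // toArray() drains the (empty) result so the stage has finished running.
  source.aggregate([
    { $match: filter },
    { $merge: { into: archiveName, on: "_id" } }
  ]).toArray();

  const expected = source.countDocuments(filter);
  const archived = archive.countDocuments(filter);
  if (archived < expected) {
    throw new Error(
      `archive holds ${archived} of ${expected} documents; nothing was deleted`
    );
  }

  const result = source.deleteMany(filter);
  return { archived: expected, deleted: result.deletedCount };
}
```

In MongoDB Shell, this could be called as `archiveOlderThan(db, fiveYearsAgo, "sales", "archived_sales")`. Because the archive can already contain documents from earlier runs, the check only requires that it holds at least as many matching documents as the source.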

After you run the script, the sales collection no longer contains sales that occurred more than five years ago.

db.sales.find()

Output:

[
   {
      _id: ObjectId('679ced18fa29d32ca7d1abab'),
      customer_name: 'Hiroshi Tanaka',
      products: [
         {
            product_id: 'P1001',
            name: 'Wireless Headphones',
            quantity: 1,
            price: 59.99
         },
         {
            product_id: 'P1002',
            name: 'Phone Charger',
            quantity: 2,
            price: 14.99
         }
      ],
      total_amount: 89.97,
      date: ISODate('2025-01-30T10:15:00.000Z')
   },
   {
      _id: ObjectId('679ced18fa29d32ca7d1abae'),
      customer_name: 'Nguyen Minh',
      products: [
         {
            product_id: 'P1008',
            name: 'Bluetooth Speaker',
            quantity: 2,
            price: 39.99
         }
      ],
      total_amount: 79.98,
      date: ISODate('2025-01-26T09:20:00.000Z')
   }
]

Old sales now exist in the archived_sales collection.

db.archived_sales.find()

Output:

[
   {
      _id: ObjectId('679ced18fa29d32ca7d1abac'),
      customer_name: 'Aisha Khan',
      products: [
         {
            product_id: 'P1003',
            name: 'Laptop',
            quantity: 1,
            price: 899.99
         }
      ],
      total_amount: 899.99,
      date: ISODate('2018-11-20T15:45:00.000Z')
   },
   {
      _id: ObjectId('679ced18fa29d32ca7d1abad'),
      customer_name: 'Fatima Al-Farsi',
      products: [
         {
            product_id: 'P1006',
            name: 'Gaming Mouse',
            quantity: 1,
            price: 49.99
         },
         {
            product_id: 'P1007',
            name: 'Mechanical Keyboard',
            quantity: 1,
            price: 129.99
         }
      ],
      total_amount: 179.98,
      date: ISODate('2017-06-15T12:00:00.000Z')
   }
]
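
The two result sets above can also be checked programmatically instead of by eye. The following is a minimal sketch of such a check. The function name `checkArchiveSplit` is hypothetical, and it assumes both result sets are small enough to load into memory as arrays.

```javascript
// Sketch: report documents that ended up on the wrong side of the cutoff,
// or that exist in both collections. An empty list means the split looks right.
function checkArchiveSplit(activeDocs, archivedDocs, cutoff) {
  const problems = [];
  const activeIds = new Set(activeDocs.map((doc) => String(doc._id)));

  for (const doc of activeDocs) {
    if (doc.date < cutoff) {
      problems.push(`active document ${doc._id} is older than the cutoff`);
    }
  }
  for (const doc of archivedDocs) {
    if (!(doc.date < cutoff)) {
      problems.push(`archived document ${doc._id} is not older than the cutoff`);
    }
    if (activeIds.has(String(doc._id))) {
      problems.push(`document ${doc._id} exists in both collections`);
    }
  }
  return problems;
}
```

In MongoDB Shell, this could be called as `checkArchiveSplit(db.sales.find().toArray(), db.archived_sales.find().toArray(), fiveYearsAgo)`.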
