I have a MongoDB dump (created with --archive) that I’m trying to restore to our database using mongorestore. However, when I try to do this, it fails with the following error:
Failed: corruption found in archive; 858927154 is neither a valid bson length nor a archive terminator
We received the dump from our former cloud provider, so I don’t know much about how it was created. I do know that it was created with mongodump 100.6.0 against MongoDB 6.0.2, and I have tried restoring with that exact version, but it still fails. This is the command I ran:
mongorestore -h localhost --archive=Dump.mongodump -vvvv
It produced the following output:
2023-09-02T18:39:19.230+0000 archive prelude quiltmc_modmail.logs
2023-09-02T18:39:19.230+0000 archive prelude quiltmc_modmail.config
2023-09-02T18:39:19.230+0000 archive format version "0.1"
2023-09-02T18:39:19.230+0000 archive server version "6.0.2"
2023-09-02T18:39:19.230+0000 archive tool version "100.6.0"
2023-09-02T18:39:19.232+0000 preparing collections to restore from
2023-09-02T18:39:19.232+0000 using as dump root directory
2023-09-02T18:39:19.232+0000 reading collections for database quiltmc_modmail in quiltmc_modmail
2023-09-02T18:39:19.232+0000 found collection quiltmc_modmail.logs bson to restore to quiltmc_modmail.logs
2023-09-02T18:39:19.232+0000 found collection metadata from quiltmc_modmail.logs to restore to quiltmc_modmail.logs
2023-09-02T18:39:19.232+0000 adding intent for quiltmc_modmail.logs
2023-09-02T18:39:19.232+0000 found collection quiltmc_modmail.config bson to restore to quiltmc_modmail.config
2023-09-02T18:39:19.232+0000 found collection metadata from quiltmc_modmail.config to restore to quiltmc_modmail.config
2023-09-02T18:39:19.232+0000 adding intent for quiltmc_modmail.config
2023-09-02T18:39:19.243+0000 demux End
2023-09-02T18:39:19.243+0000 demux finishing (err:corruption found in archive; 858927154 is neither a valid bson length nor a archive terminator)
2023-09-02T18:39:19.243+0000 received from namespaceChan
2023-09-02T18:39:19.243+0000 reading metadata for quiltmc_modmail.logs from archive 'Dump.mongodump'
2023-09-02T18:39:19.244+0000 reading metadata for quiltmc_modmail.config from archive 'Dump.mongodump'
2023-09-02T18:39:19.244+0000 restoring up to 4 collections in parallel
2023-09-02T18:39:19.244+0000 starting restore routine with id=3
2023-09-02T18:39:19.244+0000 ending restore routine with id=3, no more work to do
2023-09-02T18:39:19.244+0000 starting restore routine with id=1
2023-09-02T18:39:19.244+0000 ending restore routine with id=1, no more work to do
2023-09-02T18:39:19.244+0000 starting restore routine with id=0
2023-09-02T18:39:19.244+0000 ending restore routine with id=0, no more work to do
2023-09-02T18:39:19.244+0000 starting restore routine with id=2
2023-09-02T18:39:19.244+0000 ending restore routine with id=2, no more work to do
2023-09-02T18:39:19.244+0000 building indexes up to 4 collections in parallel
2023-09-02T18:39:19.244+0000 starting index build routine with id=3
2023-09-02T18:39:19.244+0000 restoring indexes for collection quiltmc_modmail.logs from metadata
2023-09-02T18:39:19.244+0000 index: &idx.IndexDocument{Options:primitive.M{"default_language":"english", "language_override":"language", "name":"messages.content_text_messages.author.name_text_key_text", "textIndexVersion":3, "v":2, "weights":primitive.M{"key":1, "messages.author.name":1, "messages.content":1}}, Key:primitive.D{primitive.E{Key:"_fts", Value:"text"}, primitive.E{Key:"_ftsx", Value:1}}, PartialFilterExpression:primitive.D(nil)}
2023-09-02T18:39:19.244+0000 run create Index command for indexes: messages.content_text_messages.author.name_text_key_text
2023-09-02T18:39:19.245+0000 starting index build routine with id=0
2023-09-02T18:39:19.245+0000 no indexes to restore for collection quiltmc_modmail.config
2023-09-02T18:39:19.245+0000 ending index build routine with id=0, no more work to do
2023-09-02T18:39:19.245+0000 starting index build routine with id=1
2023-09-02T18:39:19.245+0000 ending index build routine with id=1, no more work to do
2023-09-02T18:39:19.245+0000 starting index build routine with id=2
2023-09-02T18:39:19.245+0000 ending index build routine with id=2, no more work to do
2023-09-02T18:39:19.248+0000 ending index build routine with id=3, no more work to do
2023-09-02T18:39:19.248+0000 Failed: corruption found in archive; 858927154 is neither a valid bson length nor a archive terminator
2023-09-02T18:39:19.248+0000 0 document(s) restored successfully. 0 document(s) failed to restore.
I have examined the dump in a hex editor and found that the 32-bit integer 858927154 is 0x33323032 in hex. Stored little-endian, as BSON lengths are, that is the byte sequence 32 30 32 33, which is the ASCII text 2023. The database contains quite a few dates, so it’s hard to know exactly where the problematic data is in the file. The last date in the file occurs here:
32303233 2D30372D 33315431 353A3035 3A30312E 3539322B 30303030 09646F6E 65206475 6D70696E 67207175 696C746D 635F6D6F 646D6169 6C2E6C6F 67732028 37313120 646F6375 6D656E74 73290D0A 45000000 02646200 10000000 7175696C 746D635F 6D6F646D 61696C00 02636F6C 6C656374 696F6E00 05000000 6C6F6773 0008454F 46000112 43524300 AD1DD5DB A30C5FCD 00FFFFFF FF
Which encodes to:
2023-07-31T15:05:01.592+0000 done dumping quiltmc_modmail.logs (711 documents)
Edbquiltmc_modmailcollectionlogsEOFCRCÕÛ£_Íÿÿÿÿ
I’m not sure if this is the cause, though, because the file contains a similar “done dumping” message for each collection the process finished dumping.
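For reference, here is a minimal Python sketch of the decoding described above. It only uses the error value and the hex bytes quoted in this post; the variable names are made up for illustration. It shows that the rejected "length" is the ASCII text 2023, and that in this excerpt the four bytes immediately after the log line do form a plausible BSON length. That fits the idea that a log line, rather than a document, was read where a length was expected, but it is only a hypothesis.

```python
import struct

# The value mongorestore rejected as a BSON length.
bad_length = 858927154

# BSON lengths are little-endian 32-bit integers, so pack it the same way.
raw = struct.pack("<i", bad_length)
print(hex(bad_length), raw)  # 0x33323032 b'2023'

# The last bytes of the archive, copied from the hex editor excerpt above.
tail_hex = (
    "32303233 2D30372D 33315431 353A3035 3A30312E 3539322B 30303030 09646F6E "
    "65206475 6D70696E 67207175 696C746D 635F6D6F 646D6169 6C2E6C6F 67732028 "
    "37313120 646F6375 6D656E74 73290D0A 45000000 02646200 10000000 7175696C "
    "746D635F 6D6F646D 61696C00 02636F6C 6C656374 696F6E00 05000000 6C6F6773 "
    "0008454F 46000112 43524300 AD1DD5DB A30C5FCD 00FFFFFF FF"
)
tail = bytes.fromhex(tail_hex)

# The log line ends with CRLF; everything after it is binary archive data.
log_end = tail.index(b"\r\n") + 2
log_line = tail[:log_end].decode("ascii")
print(repr(log_line))

# Read the next four bytes as a little-endian int32, as a BSON reader would.
(next_length,) = struct.unpack_from("<i", tail, log_end)
document = tail[log_end:log_end + next_length]
terminator = tail[log_end + next_length:]
print(next_length, len(document), terminator.hex())
```

In this excerpt the 69-byte document after the log line is followed by exactly the four FF bytes that end the file.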
I did create a dump of another database for testing purposes, and I’ve verified that I can restore it. It was created with a newer version of MongoDB (6.0.8, tools version 100.6.0), but I noticed that it lacked the “done dumping” messages. I also noticed that the test dump ended with:
454F4600 01124352 43000000 00000000 000000FF FFFFFF
which the hex editor displayed as EOF CRC ...., whereas the corrupted dump ends with:
454F4600 01124352 4300AD1D D5DBA30C 5FCD00FF FFFFFF
which the hex editor displayed as EOF CRC . ... _. .... instead.
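To make the comparison concrete, the following Python sketch diffs the two endings byte by byte, using only the hex values quoted above. The variable names are illustrative, and reading the eight bytes after CRC as a per-collection checksum is an assumption based on the field name, not something I have confirmed.

```python
# Endings of the two archives, copied from the hex editor excerpts above.
good_tail = bytes.fromhex("454F4600 01124352 43000000 00000000 000000FF FFFFFF")
bad_tail = bytes.fromhex("454F4600 01124352 4300AD1D D5DBA30C 5FCD00FF FFFFFF")

# Offsets at which the two endings disagree.
differing = [i for i, (g, b) in enumerate(zip(good_tail, bad_tail)) if g != b]
print(differing)

# The value of the CRC field starts right after its NUL-terminated name.
crc_start = good_tail.index(b"CRC\x00") + 4
good_crc = good_tail[crc_start:crc_start + 8]
bad_crc = bad_tail[crc_start:crc_start + 8]
print(good_crc.hex(), bad_crc.hex())

# Both endings close the document with 00 and then the FF FF FF FF terminator.
same_ending = good_tail[-5:] == bad_tail[-5:] == b"\x00\xff\xff\xff\xff"
print(same_ending)
```

The only bytes that differ are the eight bytes of the CRC value (offsets 10 to 17). If that value is a checksum of the collection data, it would be expected to differ between two unrelated dumps, so the ending of the corrupted file may be perfectly valid on its own.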
I tried replacing the end of the corrupted file with the end of the working file, but that didn’t help.
I found absolutely nothing about this error online, so I wondered if anyone had any ideas about what I might try next.