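I recently went through this same scenario. To solve it, use mode("append") and add two options: operationType and upsertDocument.
With operationType set to "replace" and upsertDocument enabled, a document that already exists (matched on _id by default) is replaced instead of duplicated, and a document that does not exist yet is inserted.
Example of this solution: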
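# Fill in your own connection URI, database, and collection names.
(
    df_test.write
    .format("mongodb")
    .mode("append")
    .option("connection.uri", "")
    .option("database", "")
    .option("collection", "")
    .option("ignoreNullValues", True)
    .option("operationType", "replace")  # replace the matching document
    .option("upsertDocument", True)      # insert it when there is no match
    .save()
)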
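To make the replace-plus-upsert behaviour concrete, here is a small plain-Python sketch of what I understand these two options to do. It is only a toy model and not the connector's code: the function name replace_with_upsert, the sample documents, and the assumption that matching happens on _id are all mine, for illustration.

```python
def replace_with_upsert(collection, incoming, id_field="_id"):
    """Toy model: replace documents whose id matches, insert the rest."""
    # Index the existing documents by their id (assumed to be "_id").
    by_id = {doc[id_field]: doc for doc in collection}
    for doc in incoming:
        # Same id: the whole document is replaced. New id: it is inserted.
        by_id[doc[id_field]] = dict(doc)
    return list(by_id.values())


existing = [{"_id": 1, "name": "old"}, {"_id": 2, "name": "keep"}]
batch = [{"_id": 1, "name": "new"}, {"_id": 3, "name": "added"}]

result = replace_with_upsert(existing, batch)
for doc in result:
    print(doc)
```

The document with _id 1 is overwritten, _id 2 is left alone, and _id 3 is added, so running the same batch twice does not create duplicates.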
Follow the link to learn more about these configuration options: