Join two collections from different databases: is there a program available?

I would like to combine or correlate two collections that live in different databases (on different replica sets).

  1. What options do I have to join these collections?
  2. How can I combine them programmatically?
  3. Is there Python code for this?

Thanks,
Giri

Thank you for the question @giribabu_venugopal2! You could try something like this:

Connect to MongoDB Replica Sets

from pymongo import MongoClient

# Connection to first MongoDB replica set
client1 = MongoClient('mongodb://<username>:<password>@primary1:27017,secondary1:27018/?replicaSet=rs0')
db1 = client1['database1']

# Connection to second MongoDB replica set
client2 = MongoClient('mongodb://<username>:<password>@primary2:27019,secondary2:27020/?replicaSet=rs1')
db2 = client2['database2']
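
If the username or password contains characters such as `@`, `:` or `/`, they need to be percent-encoded before they go into the connection string. Here is a minimal sketch using only the standard library. `build_uri` and its parameters are hypothetical names for illustration (not part of PyMongo), the credentials are made up, and the hosts are the placeholders from the snippet above:

```python
from urllib.parse import quote_plus


def build_uri(username, password, hosts, replica_set):
    """Build a replica-set connection string with escaped credentials.

    Hypothetical helper for illustration; not a PyMongo function.
    """
    user = quote_plus(username)  # escapes '@', ':', '/' and similar
    pwd = quote_plus(password)
    host_list = ",".join(hosts)
    return f"mongodb://{user}:{pwd}@{host_list}/?replicaSet={replica_set}"


# Made-up credentials; hosts are the placeholders used above
uri1 = build_uri("app_user", "p@ss:word/1",
                 ["primary1:27017", "secondary1:27018"], "rs0")
print(uri1)
```

The resulting string can then be passed to `MongoClient(uri1)` in place of the hand-written one.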

Extract Data from Each Collection

collection1 = db1['collection1']
collection2 = db2['collection2']

# Fetch data from both collections
# Apply any filters you need to reduce the data size
docs1 = list(collection1.find({}))
docs2 = list(collection2.find({}))
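
Loading both collections fully into memory may not be practical for large data sets. One alternative, sketched here, is to read the first collection in batches and ask the second collection only for the matching keys with the `$in` operator. The helper names (`chunked`, `fetch_matches`), the batch size, and the `'common_key'` field are assumptions for illustration. The sketch also assumes the key values are scalars (strings, numbers, ObjectIds), and the second argument only needs a `find()` method that behaves like PyMongo's:

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from any iterable (e.g. a cursor)."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def fetch_matches(left_docs, right_collection, key="common_key", batch_size=1000):
    """Yield (left_doc, right_doc) pairs that share the same key value.

    Only one batch of left documents and its matches are held in memory.
    """
    for batch in chunked(left_docs, batch_size):
        keys = [doc[key] for doc in batch if key in doc]
        if not keys:
            continue  # no document in this batch has the join field
        # One query per batch instead of one query per document
        right_by_key = {
            doc[key]: doc
            for doc in right_collection.find({key: {"$in": keys}})
        }
        for left in batch:
            if key in left and left[key] in right_by_key:
                yield left, right_by_key[left[key]]
```

With the connections above, this could be used as `for doc1, doc2 in fetch_matches(collection1.find({}), collection2): ...`. An index on `common_key` in the second collection would likely help each `$in` query.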

Combine Data

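combined_data = []

# Assuming 'common_key' is the field present in both collections.
# Build a dictionary keyed by 'common_key' for efficient lookup,
# skipping documents that lack the field (avoids a KeyError).
docs2_dict = {doc['common_key']: doc for doc in docs2 if 'common_key' in doc}

for doc1 in docs1:
    if 'common_key' not in doc1:
        continue  # nothing to join on
    match = docs2_dict.get(doc1['common_key'])
    if match is not None:
        # Merge documents; on duplicate field names (including '_id'),
        # the value from collection2 overwrites the one from collection1
        combined_data.append({**doc1, **match})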
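
Merging two documents with `**` unpacking means that any field present in both (including `_id`) keeps only the value from the second document. If you need both, one option is to keep each side under its own field. This is a rough sketch under stated assumptions: `join_documents`, the `how` parameter, the `"left"`/`"right"` field names, and the sample data are all made up for illustration:

```python
def join_documents(left_docs, right_docs, key="common_key", how="inner"):
    """Join two lists of dicts on `key`, keeping each side under its own field.

    how="inner" keeps only matches; how="left" keeps every left document
    and sets "right" to None when there is no match.
    """
    if how not in ("inner", "left"):
        raise ValueError("how must be 'inner' or 'left'")

    # Index the right side once; if a key repeats, the last document wins
    right_by_key = {doc[key]: doc for doc in right_docs if key in doc}

    joined = []
    for left in left_docs:
        right = right_by_key.get(left[key]) if key in left else None
        if right is None and how == "inner":
            continue
        joined.append({key: left.get(key), "left": left, "right": right})
    return joined


# Made-up sample documents
orders = [{"_id": 1, "common_key": "a", "total": 10},
          {"_id": 2, "common_key": "b", "total": 20}]
customers = [{"_id": 9, "common_key": "a", "name": "Ada"}]

print(join_documents(orders, customers))
print(join_documents(orders, customers, how="left"))
```

Nesting keeps both `_id` values intact, at the cost of a slightly deeper document shape.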

Hope that helps!

Alex