Read From MongoDB
Use the `MongoSpark.load` method to create an RDD representing a collection. The following example loads the collection specified in the `SparkConf`:
```scala
val rdd = MongoSpark.load(sc)
println(rdd.count)
println(rdd.first.toJson)
```
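The collection to load comes from the `spark.mongodb.input.uri` setting on the `SparkConf`. As a minimal sketch of that setup, assuming a local `mongod` and a hypothetical `test.spark` namespace:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: the URI, database ("test"), and collection ("spark")
// below are placeholder values, not required names.
val conf = new SparkConf()
  .setAppName("ReadFromMongoDB")
  .setMaster("local[*]")
  .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.spark")

val sc = new SparkContext(conf)
```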
To specify a different collection, database, and other read configuration settings, pass a `ReadConfig` to `MongoSpark.load()`.
Using a ReadConfig
`MongoSpark.load()` can accept a `ReadConfig` object which specifies various read configuration settings, such as the collection or the read preference. The following example reads from the `spark` collection with a `secondaryPreferred` read preference:
```scala
import com.mongodb.spark.config._

val readConfig = ReadConfig(Map("collection" -> "spark", "readPreference.name" -> "secondaryPreferred"), Some(ReadConfig(sc)))
val customRdd = MongoSpark.load(sc, readConfig)
println(customRdd.count)
println(customRdd.first.toJson)
```
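Because the example supplies `Some(ReadConfig(sc))` as a default, any option not present in the `Map` falls back to the values in the `SparkConf`. The same pattern can override the database as well; a sketch, using hypothetical `people` and `contacts` names:

```scala
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config._

// Sketch: override the database and collection; other options (such as
// the connection URI) still come from the SparkConf defaults.
// "people" and "contacts" are hypothetical names.
val readConfig = ReadConfig(
  Map("database" -> "people", "collection" -> "contacts"),
  Some(ReadConfig(sc)))
val customRdd = MongoSpark.load(sc, readConfig)
```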
SparkContext Load Helper Methods
`SparkContext` has an implicit helper method `loadFromMongoDB()` to load data from MongoDB. For example, use the `loadFromMongoDB()` method without any arguments to load the collection specified in the `SparkConf`:
```scala
sc.loadFromMongoDB() // Uses the SparkConf for configuration
```
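The helper is attached to `SparkContext` by an implicit conversion, so the connector package needs to be imported for it to resolve; a minimal sketch:

```scala
import com.mongodb.spark._ // brings the implicit SparkContext helpers into scope

val rdd = sc.loadFromMongoDB()
println(rdd.count)
```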
Call `loadFromMongoDB()` with a `ReadConfig` object to specify a different MongoDB server address, database, and collection. See input configuration settings for available settings:
```scala
sc.loadFromMongoDB(ReadConfig(Map("uri" -> "mongodb://example.com/database.collection"))) // Uses the ReadConfig
```
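The RDD returned by these helpers is a `MongoRDD`, so further read-side behavior can be layered on; for instance, the connector supports pushing an aggregation pipeline down to MongoDB with `withPipeline`. A sketch, assuming documents with a hypothetical numeric `test` field:

```scala
import org.bson.Document

// Sketch: run a $match stage inside MongoDB so only matching documents
// are shipped to Spark; the "test" field is a hypothetical example.
val rdd = sc.loadFromMongoDB()
val filteredRdd = rdd.withPipeline(Seq(Document.parse("{ $match: { test : { $gt : 5 } } }")))
println(filteredRdd.count)
```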