Data Federation Changelog

Note

Release notes mention only releases with feature changes

MongoDB releases Atlas Data Federation every week, continuously improving Atlas Data Federation performance and stability. These release notes capture only those releases that contain feature changes. If a particular Atlas Data Federation release contains only performance and stability improvements, it is not included in these release notes. To identify which release version you are using, check the release version string for the release date.

  • Supports the ability to read Parquet files with zstd column compression.

  • Changes which regions process data for unsupported regions:

    • ca-central-1 (Montreal) will process data from ca-west-1 (Alberta) instead of eu-west-1 (Ireland).

    • ap-northeast-1 (Tokyo) will process data:

      • from ap-northeast-2 (Seoul) instead of eu-west-1 (Ireland).

      • from ap-northeast-3 (Osaka) instead of eu-west-1 (Ireland).

      • from ap-east-1 (Hong Kong) instead of ap-southeast-2 (Sydney).

  • Fixes an issue where the killOp command was prevented from terminating a query.

  • Adds support for the $sql stage on Atlas Data Federation views.

  • Fixes an issue where, when creating a view, Atlas Data Federation was not properly checking permissions.

  • Resolves an issue where generated dates in the ISO 8601 format caused incorrect query results.

  • Fixes an issue that caused the Python driver to fail to connect when using AWS IAM authentication.

  • Fixes an issue that caused $limit queries to fail with an InternalError.

  • Improves template partition filtering in the query planner, which improves query performance in certain cases against multiple blob storage sources.

  • Makes Azure data stores generally available for both your development and production deployments.

  • Supports AWS IAM if you configure AWS IAM for authentication. This is currently unsupported in the Atlas UI.

  • Changes the error CommandNotSupported to CommandNotFound.

  • Improves performance when reading from Parquet files.

  • Adds support for modifying views with collMod.

  • Adds automatic recognition of .jsonl files as JSON Lines files.

  • Fixes an issue with sqlGenerateSchema where it wouldn't run on Online Archive source.

  • Improves error message for exceeding the maxTimeMS limit.

  • Improves explain() results for queries that target Atlas Data Lake datasets and Online Archives.

  • Improves performance for queries that utilize the $ne operator.

  • Supports the $$SEARCH_META aggregation variable when you run $search queries on an Atlas cluster through Atlas Data Federation.

  • Introduces new onboarding experience with templates for the most common Atlas Data Federation use cases.

  • Improves error message for $out to S3 queries to provide more detail.

  • Optimizes partition attributes for selecting files on S3 when using the $in operator in aggregation pipelines.

  • Modifies the behavior of $queryHistory to indicate a query is complete when all batches have been uploaded as cursor files, all batches have been returned to the user, or there is an error.

  • Adds the ability to use BSON data for the comment field in commands.

  • Improves error messages when querying a document over 16MB.

  • Fixes a correctness issue for $getFields where Atlas Data Federation differed from MongoDB when querying an empty sub-document.

  • Improves stability and performance for $out to S3 when writing to Parquet.

  • Fixes an issue with $match queries that resulted in documents not being returned when querying on nested documents within an array where any nested document was missing the target field.

  • Improves performance and stability when writing to Parquet using $out to S3.

  • Adds the ability to use any BSON type with the $comment operator and query in $queryHistory. (Starting in MongoDB 5.1, the $comment operator was removed).

  • Atlas Data Federation now returns MongoDB 6.2.0 in the buildInfo output.

  • Adds the ability to limit the amount of data that Atlas Data Federation processes for your federated database instances to control costs.

  • Improves error messages when a client attempts to insert, update, or delete a document in a federated database instance.

  • Adds the application name to connections that Atlas Data Federation creates to your Atlas clusters.

  • Adds the ability to set and update the storage configuration using the Atlas Data Federation API.

  • Fixes an issue that caused maxTimeMS with a batchSize of 0 to fail.

  • Adds new capabilities to the storage configuration to support data provenance and improved flexibility for federation.

  • Adds AWS region ap-southeast-1 (Singapore).

  • Updates Atlas Data Federation to MongoDB 6.0.2.

  • Improves performance and stability.

  • Improves query performance on Atlas Data Lake Datasets using sort metadata to optimize queries.

  • Fixes an issue that caused Atlas Data Federation to fail to read a Parquet file when the top-level or root schema was marked as REPEATED or OPTIONAL.

  • Improves stability when writing to Parquet using $out to S3.

  • Fixes an issue with $not and $in in pipelines that caused an unsupported expression panic.

  • Improves performance for $out to S3 queries that write to Parquet file format.

  • Updates the default maximum row group size to 128MB for the Parquet writer.

  • Improves $group stages on Data Lake Dataset partition fields.

  • Fixes aggregation pipelines with multiple $lookup stages where one stage defines a field and another removes the same field.

  • Fixes how Atlas Data Federation handles files in S3 that end with the delimiter character (e.g. '/').

  • Improves performance and stability.

  • Adds support for optionally specifying an ISODate format to optimize performance for date-type partitions.

  • Improves performance and stability.

  • Performs $merge in chunks.

  • Improves performance and stability.

  • Atlas now charges for the total number of bytes that Atlas Data Federation processes from HTTP sources.

  • Adds support for the background option on the $merge aggregation stage.

  • Improves performance and stability.

  • Adds support for Atlas Data Lake as a "Store Type" to the createStore command.

  • Improves error messaging for Federated $search queries.

  • Renames and relaunches Atlas Data Lake as Atlas Data Federation.

    Important

    The federated query engine service previously called Atlas Data Lake is now called Atlas Data Federation. To learn more about Atlas Data Federation, see Set Up and Query Data Federation.

  • Improves performance and stability.

  • Disables support for the MySQL dialect.

  • Improves performance and stability.

  • Improves performance and stability.

  • Supports the following new MongoDB 5.2 aggregation operators:

    • $sortArray

    • $topN

    • $bottomN

    • $maxN

    • $firstN

    • $lastN
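The operators listed above could be combined in a pipeline along the following lines. This is a minimal sketch, not taken from the release notes: the function name `buildLeaderboardPipeline` and the field names `scores`, `team`, `player`, and `total` are hypothetical, and the operator shapes follow the general MongoDB 5.2 aggregation syntax.

```javascript
// Builds an aggregation pipeline as plain data (no database connection needed)
// that exercises the MongoDB 5.2 operators named above.
// All collection field names used here are hypothetical.
function buildLeaderboardPipeline(n) {
  return [
    // $sortArray: sort each document's array of numbers in descending order.
    { $set: { scores: { $sortArray: { input: "$scores", sortBy: -1 } } } },
    {
      $group: {
        _id: "$team",
        // $topN / $bottomN: first and last n players in the given sort order.
        topPlayers: { $topN: { n: n, sortBy: { total: -1 }, output: "$player" } },
        bottomPlayers: { $bottomN: { n: n, sortBy: { total: -1 }, output: "$player" } },
        // $maxN: the n largest values of "total" in each group.
        highestTotals: { $maxN: { input: "$total", n: n } },
        // $firstN / $lastN: first and last n values in the order documents arrive.
        firstPlayers: { $firstN: { input: "$player", n: n } },
        lastPlayers: { $lastN: { input: "$player", n: n } },
      },
    },
  ];
}

const leaderboardPipeline = buildLeaderboardPipeline(3);
console.log(JSON.stringify(leaderboardPipeline, null, 2));
```

In mongosh, a pipeline built this way would typically be passed to `aggregate()` on a collection of the federated database instance.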

  • Fixes a bug to allow you to use read preference for sharded clusters.

  • Improves performance and stability.

  • Improves performance and stability.

  • Imposes an upper limit on maxRowGroupSize.

  • Improves performance and stability.

  • Supports queries on collections whose names begin with "system", but doesn't support queries on collections whose names begin with "system." (including the trailing dot).

  • Improves performance and stability.

  • Adds support for the $maxTimeMS option.

  • Improves performance and stability.

  • Allows connections to Data Lakes via private endpoints.

  • Adds support for X.509 authorization.

  • Adds support for empty field parameters with the $setField aggregation expression.

  • Fixes an issue where commands returned zero exit codes on failure.

  • Fixes an issue where documents with empty subdocuments written to Parquet contained empty Parquet groups.

  • Updates EstimateRowGroupSize to report UncompressedSize for documents stored in Parquet.

  • Adjusts the minimum value for maxRowGroupSize when using $out to Parquet to 16MB.

  • Removes support for using $out to write documents that contain duplicate fields to Parquet.

  • Improves error messages for $out.

  • Improves performance and stability.

  • Includes X.509 users in the usersInfo command output.

  • Improves SCRAM authentication performance.

  • Improves performance and stability.

  • Adds support for the authenticate command.

  • Preserves binary subtypes in the Parquet reader/writer.

  • Provides various stability improvements.

  • Improves collStats and dbStats command performance and stability.

  • Adds support for the $merge aggregation pipeline stage.

  • Allows localField and foreignField with a more expressive $lookup aggregation pipeline stage syntax.

  • Implements the $count accumulator.

  • Improves performance.

  • Improves error messaging.

  • Adds computeTime and automaticRefreshInProgress fields to the collStats and dbStats command outputs.

  • Supports dropping non-existent stores and databases from the storage configuration.

  • Includes partitions.count in collStats command output.

  • Allows downloading Data Federation query logs from the UI and API.

  • Removes restriction on large collection namespaces.

  • Adds option to bypass cache for collStats and dbStats to fetch the most recent statistics.

  • Supports serverStatus command.

  • Improves stability and performance.

  • Supports public S3 data stores with the public configuration flag.

  • Supports Zstandard compression when federating queries to Atlas clusters.

  • Adds db field to dbStats result.

  • Supports selecting read preference, read tags, and max staleness through the storage configuration for Atlas Cluster stores.

  • Rejects commands sent with a Versioned API set.

  • Enables the count parameter in the Data Lake $collStats aggregation stage.

  • No longer permits $collStats in $facet sub-pipelines.

  • Enforces maximum document size for $facet after processing each item.

  • Improves performance for $match stages.

  • Improves error messaging.

  • Improves stability and performance.

  • Includes improved support for Parquet.

  • Supports M0, M2, & M5 Atlas clusters as data sources.

  • Adds regex pattern matching option for wildcard collections from Atlas Clusters.

  • Includes updated error messages for query execution limit.

  • Generates storage configuration automatically for the first time after user authentication.

  • Returns connection ID through the hello command.

  • Supports $geoNear on Atlas Data Lake collections that span multiple Atlas clusters.

  • Includes various performance improvements.

  • Includes improved error messages for terminated queries.

  • Includes new onboarding and storage configuration interface.

  • Improves the SQL schema error message.

  • Supports query pushdown to collections composed of multiple Atlas collections.

  • Improves stability and performance.

  • Adds a new $sql formatVersion to reduce the data size of the result set.

  • Improves performance of $lookup.

  • Adds "verbosity": "queryPlannerExtended" support to the explain command to filter out non-matching partitions.

  • Adds support for $$NOW.

  • Reports Atlas Data Lake as MongoDB version 4.4 to tools.

  • Adds support for the background option on the $out to Atlas aggregation stage.

  • Includes stability and performance improvements.

  • Adds {background: true} option, which allows queries to run in the background for $out to S3 stage.

  • Introduces $queryHistory aggregation stage to view past queries.

  • Includes various performance and stability improvements.

  • Supports Parquet, CSV, and TSV formats for $out to S3.

  • Adds a rolling limit for cursors.

  • Improves error messages for commands that cannot be parsed.

  • Supports defaultFormat for files in publicly accessible URLs in HTTP stores.

  • Limits the number of simultaneous queries to 30 per federated database instance.

  • Supports bzip2 compression format.

  • Supports comment option for the aggregate command.

  • Includes various performance and stability improvements.

  • Supports killOp command for terminating a long-running query.

  • Adds configuration for maximum number of wildcard collections for S3 federated database instance stores.

  • Improves $out to S3 write performance.

  • Includes general performance and stability improvements.

  • Adds correlationID to the $currentOp output.

  • Includes general performance and stability improvements.

  • Relaxes $out S3 region requirement.

  • Includes improved storage configuration error messages.

  • Includes general performance and stability improvements.

  • Supports $collStats aggregation pipeline stage.

  • Includes performance optimizations for ORC files.

  • Includes general performance and stability improvements.

  • Adds support for the skip and limit fields to the count() command.

  • Adds storageValidateConfig command to validate your federated database instance storage configuration.

  • Includes bug fixes and performance improvements.

  • Includes general performance and stability improvements.

  • Adds support for Atlas Clusters as a data source.

  • Improves performance for the $lookup aggregation pipeline stage.

  • Adds support for evaluating string $convert expressions in the filename for $out to S3.

  • Updates Parquet support for MAP types.

  • Improves error messaging for $out to S3.

  • Adds a command to generate a storage configuration.

  • Automates storage configuration generation for newly created federated database instances.

  • Allows writing partitioning-aware data to S3 using $out in Data Federation.

  • Generates Storage Configs when Atlas creates a federated database instance.

  • Adds support for $out to S3.

  • Updates support for Apache Parquet LIST element.

  • Upgrades wire protocol support to 4.2 from 3.6.

  • Adds support for verbosity in the explain plan.

  • Fixes stability issues.

  • Improves performance.

  • Supports the $currentOp stage so that you can monitor query progress on long-running queries.

  • Updates the isodate attribute to accept additional formats.

  • Refreshes the metadata catalog when you use Storage Configuration commands.

  • Includes various performance and stability improvements.

  • Supports filename field references for $out.

  • Supports $toString in $out to S3.

  • Supports optionally granting federated database instance write access to S3 buckets, enabling use of $out semantics to write directly to those buckets.

  • Adds incremental store, database, collection, and view commands for storage configuration management.

  • Limits collections returned for wildcard collections to 1,000.

  • Updates the storage configuration format.

  • Supports cross-database $lookup queries.

  • Supports lowercase and uppercase file extensions.

  • Template segments now support dot-separated attribute names that correspond to nested fields.

  • Allows the defaultFormat to be specified without a leading dot.

  • Supports filtering based on stripes for files in ORC format.

  • Allows query attributes to be extracted after the first stage.

  • Includes several performance and stability improvements.

  • Supports partition definition for the following:

    • epoch_secs, which is seconds since the Unix Epoch

    • epoch_millis, which is milliseconds since the Unix Epoch

    • UUID, which is binary subtype 4
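To make the two epoch-based partition types concrete, the sketch below converts a timestamp into the values that would appear in a file path. The helper names, the `/events/...` layout, and the `{createdAt epoch_secs}` path template are illustrative assumptions, not taken from these release notes.

```javascript
// epoch_secs: whole seconds since the Unix Epoch (1970-01-01T00:00:00Z).
function toEpochSecs(date) {
  return Math.floor(date.getTime() / 1000);
}

// epoch_millis: milliseconds since the Unix Epoch.
function toEpochMillis(date) {
  return date.getTime();
}

// Hypothetical path template for a data source partitioned by a
// "createdAt" attribute stored as epoch seconds (assumed syntax).
const pathTemplate = "/events/{createdAt epoch_secs}/*";

// Builds the kind of object key such a template is meant to match.
// The "data.json" file name is made up for illustration.
function examplePath(date) {
  return `/events/${toEpochSecs(date)}/data.json`;
}

const createdAt = new Date("2021-01-01T00:00:00Z");
console.log(toEpochSecs(createdAt));   // 1609459200
console.log(toEpochMillis(createdAt)); // 1609459200000
console.log(examplePath(createdAt));   // /events/1609459200/data.json
```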

  • Includes several performance and stability improvements.

  • Adds support for reading Apache ORC files.

  • Supports filtering partitions by Parquet file row group statistics.

  • Supports ObjectIds in the path when specifying partition databases.<database>.<collection>.[n].definition.

  • Returns an error if a query produces a document larger than 16 MiB.

  • The $indexStats stage now produces an empty list of indexes instead of an error.

  • Supports $out to S3 storage format in JSON.

  • $match now implicitly treats all terms as conjunctions.

  • No longer parses empty files.

  • Fixes an issue that caused the {$match: {$expr: {$and: []}}} expression to terminate the connection.

  • Allows nested fields in partition definitions.

  • No longer enumerates directories on S3 when a single subdirectory containing all the partitions matching the query is identified.

  • Fixes an issue where the new storage configuration did not appear on the issuing connection after running setStorageConfig.

  • Adds support for the getLastError database command.

  • Fixes a bug with how union types are handled in Avro.

  • Supports $out aggregation pipeline stage to S3.

  • listIndexes now always returns an empty list.

  • Translates dot-delimited CSV and TSV keys into subdocuments.

  • Storage configuration error message now includes a link to the documentation.

  • Supports the XLSX file format.

  • Includes the correlation ID in query execution error messages.

  • Returns an error to the client when the cursor storage limit is reached.

  • Returns an error to the client on the last getMore if the cursor storage limit is exceeded.

  • Supports listCommands. For example: db.runCommand({"listCommands": 1})

  • Includes partition size information in the output of explain().

  • Returns the first batch of cursor results more quickly.

  • Improves performance of $lookup when combined with $unwind.

  • Automatically supports SCRAM-SHA-1 credentials without requiring drivers to specify this authentication mechanism.

  • Provides a descriptive error message when the file format is unknown.

  • Provides additional validation on setStorageConfig.

Initial public preview release of Set Up and Query Data Federation.
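
Several entries above concern writing query results with $out to S3: Parquet output, maxRowGroupSize, file names built with $toString, and the {background: true} option. A minimal sketch of how these pieces might fit together follows. The bucket, region, prefix, and field names are hypothetical, and the exact option shapes are assumptions that should be checked against the $out reference for Atlas Data Federation.

```javascript
// Builds a $out stage that writes Parquet files to S3.
// The bucket, region, and prefix are supplied by the caller; the "year"
// field used in the file name is a hypothetical document field.
function buildOutToS3Stage({ bucket, region, prefix }) {
  return {
    $out: {
      s3: {
        bucket,
        region,
        // File name derived from a field reference, converted with $toString.
        filename: { $concat: [prefix, "/", { $toString: "$year" }, "/"] },
        // Parquet output with an explicit row group size (assumed unit format).
        format: { name: "parquet", maxRowGroupSize: "128MB" },
      },
    },
  };
}

const outPipeline = [
  { $match: { status: "archived" } },
  buildOutToS3Stage({
    bucket: "my-example-bucket",
    region: "us-east-1",
    prefix: "archive",
  }),
];

// Ask for the query to keep running in the background.
const aggregateOptions = { background: true };

// In mongosh, connected to a federated database instance, this might be run as:
//   db.getCollection("events").aggregate(outPipeline, aggregateOptions);
console.log(JSON.stringify({ outPipeline, aggregateOptions }, null, 2));
```

Building the stage as plain data keeps the sketch runnable without a connection; only the commented-out `aggregate()` call requires a federated database instance with write access to the bucket.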