Glossary

$cmd

A special virtual collection that exposes MongoDB’s database commands. To use database commands, see Issue Commands.

_id

A field required in every MongoDB document. The _id field must have a unique value. You can think of the _id field as the document’s primary key. If you create a new document without an _id field, MongoDB automatically creates the field and assigns a unique BSON ObjectId.

accumulator

An expression in the aggregation framework that maintains state between documents in the aggregation pipeline. For a list of accumulator operations, see $group.
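
As a rough illustration of state maintained between documents, the Python sketch below mimics a sum-style accumulator applied per group. The function name `group_sum` and the sample documents are hypothetical, and this is not how MongoDB implements $group internally.

```python
from collections import defaultdict

def group_sum(documents, key_field, value_field):
    """Group documents by key_field and sum value_field within each group."""
    totals = defaultdict(int)  # accumulator state: one running total per group
    for doc in documents:
        totals[doc[key_field]] += doc[value_field]
    return dict(totals)

# Hypothetical sample documents.
orders = [
    {"customer": "a", "amount": 5},
    {"customer": "b", "amount": 7},
    {"customer": "a", "amount": 3},
]
result = group_sum(orders, "customer", "amount")
```

The running totals are the state that persists as each document passes through.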

action

An operation the user can perform on a resource. Actions and resources combine to create privileges. See action.

admin database

A privileged database. Users must have access to the admin database to run certain administrative commands. For a list of administrative commands, see Instance Administration Commands.

aggregation

Any of a variety of operations that reduce and summarize large sets of data. MongoDB’s aggregate() and mapReduce() methods are two examples of aggregation operations. For more information, see Aggregation Concepts.

aggregation framework

The set of MongoDB operators that let you calculate aggregate values without having to use map-reduce. For a list of operators, see Aggregation Reference.

arbiter

A member of a replica set that exists solely to vote in elections. Arbiters do not replicate data. See Replica Set Arbiter.

authentication

Verification of a user’s identity. See Authentication.

authorization

Provisioning of access to databases and operations. See Authorization.

B-tree

A data structure commonly used by database management systems to store indexes. MongoDB uses B-trees for its indexes.

balancer

An internal MongoDB process that runs in the context of a sharded cluster and manages the migration of chunks. Administrators must disable the balancer for all maintenance operations on a sharded cluster. See Sharded Collection Balancing.

BSON

A serialization format used to store documents and make remote procedure calls in MongoDB. “BSON” is a portmanteau of the words “binary” and “JSON”. Think of BSON as a binary representation of JSON (JavaScript Object Notation) documents. See BSON Types and MongoDB Extended JSON.

BSON types

The set of types supported by the BSON serialization format. For a list of BSON types, see BSON Types.

CAP Theorem

Given three properties of computing systems, consistency, availability, and partition tolerance, a distributed computing system can provide any two of these features, but never all three.

capped collection

A fixed-sized collection that automatically overwrites its oldest entries when it reaches its maximum size. The MongoDB oplog that is used in replication is a capped collection. See Capped Collections.
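
The overwrite behavior can be pictured with a bounded queue. This Python sketch limits the collection by entry count for brevity, which is a simplifying assumption; the variable names are hypothetical.

```python
from collections import deque

# Hypothetical capped "collection" limited to 3 entries.
capped = deque(maxlen=3)
for n in range(5):
    capped.append({"n": n})  # appending past maxlen discards the oldest entry

remaining = [doc["n"] for doc in capped]
```

After five inserts only the three most recent entries remain, in insertion order.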

checksum

A calculated value used to ensure data integrity. The md5 algorithm is sometimes used as a checksum.

chunk

A contiguous range of shard key values within a particular shard. Chunk ranges are inclusive of the lower boundary and exclusive of the upper boundary. MongoDB splits chunks when they grow beyond the configured chunk size, which by default is 64 megabytes. MongoDB migrates chunks when a shard contains too many chunks of a collection relative to other shards. See Data Partitioning and Sharding Mechanics.
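
The boundary rule (inclusive lower, exclusive upper) can be sketched in Python as a lookup over sorted split points. The split points and function names are hypothetical, and the sketch assumes a single integer shard key.

```python
import bisect

# Hypothetical split points; chunk i covers [bounds[i], bounds[i + 1]).
bounds = [float("-inf"), 0, 100, 200, float("inf")]

def chunk_index(shard_key_value):
    """Return the index of the chunk whose range contains the value."""
    # bisect_right places a value equal to a boundary in the chunk that
    # starts at that boundary, which matches the inclusive lower bound.
    return bisect.bisect_right(bounds, shard_key_value) - 1

def chunk_range(shard_key_value):
    """Return the (lower, upper) boundaries of the containing chunk."""
    i = chunk_index(shard_key_value)
    return bounds[i], bounds[i + 1]
```

A value of 100 falls in the chunk that starts at 100, while 99 falls in the chunk that ends at 100.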

client

The application layer that uses a database for data persistence and storage. Drivers provide the interface level between the application layer and the database server.

cluster

See sharded cluster.

collection

A grouping of MongoDB documents. A collection is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema. Documents within a collection can have different fields. Typically, all documents in a collection have a similar or related purpose. See What is a namespace in MongoDB?.

collection scan

Collection scans are a query execution strategy where MongoDB must inspect every document in a collection to see if it matches the query criteria. These queries are very inefficient and do not use indexes. See Query Optimization for details about query execution strategies.

compound index

An index consisting of two or more keys. See Compound Indexes.

config database

An internal database that holds the metadata associated with a sharded cluster. Applications and administrators should not modify the config database in the course of normal operation. See Config Database.

config server

A mongod instance that stores all the metadata associated with a sharded cluster. A production sharded cluster requires three config servers, each on a separate machine. See Config Servers.

control script

A simple shell script, typically located in the /etc/rc.d or /etc/init.d directory, and used by the system’s initialization process to start, restart or stop a daemon process.

CRUD

An acronym for the fundamental operations of a database: Create, Read, Update, and Delete. See MongoDB CRUD Operations.

CSV

A text-based data format consisting of comma-separated values. This format is commonly used to exchange data between relational databases since the format is well-suited to tabular data. You can import CSV files using mongoimport.

cursor

A pointer to the result set of a query. Clients can iterate through a cursor to retrieve results. By default, cursors timeout after 10 minutes of inactivity. See Cursors.

daemon

The conventional name for a background, non-interactive process.

data directory

The file-system location where the mongod stores data files. The dbPath option specifies the data directory.

data-center awareness

A property that allows clients to address members in a system based on their locations. Replica sets implement data-center awareness using tagging. See Data Center Awareness.

database

A physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases.

database command

A MongoDB operation, other than an insert, update, remove, or query. For a list of database commands, see Database Commands. To use database commands, see Issue Commands.

database profiler

A tool that, when enabled, keeps a record on all long-running operations in a database’s system.profile collection. The profiler is most often used to diagnose slow queries. See Database Profiling.

datum

A set of values used to define measurements on the earth. MongoDB uses the WGS84 datum in certain geospatial calculations. See Geospatial Indexes and Queries.

dbpath

The location of MongoDB’s data file storage. See dbPath.

delayed member

A replica set member that cannot become primary and applies operations at a specified delay. The delay is useful for protecting data from human error (e.g. unintentionally deleted databases) or updates that have unforeseen effects on the production database. See Delayed Replica Set Members.

diagnostic log

A verbose log of operations stored in the dbpath. See the --diaglog option.

document

A record in a MongoDB collection and the basic unit of data in MongoDB. Documents are analogous to JSON objects but exist in the database in a more type-rich format known as BSON. See Documents.

dot notation

MongoDB uses the dot notation to access the elements of an array and to access the fields of a subdocument. See Dot Notation.
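
A minimal Python sketch of how a dotted path might be resolved against a nested document. The helper `resolve` is hypothetical and covers only direct field and array-index access, not every case the query language supports.

```python
def resolve(document, path):
    """Follow a dotted path through nested dicts and lists."""
    current = document
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # array element by zero-based index
        else:
            current = current[part]       # subdocument field by name
    return current

# Hypothetical sample document.
doc = {"name": {"first": "Ada"}, "tags": ["x", "y"]}
```

Here `resolve(doc, "name.first")` reaches into the subdocument and `resolve(doc, "tags.1")` selects the second array element.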

draining

The process of removing or “shedding” chunks from one shard to another. Administrators must drain shards before removing them from the cluster. See Remove Shards from an Existing Sharded Cluster.

driver

A client library for interacting with MongoDB in a particular language. See MongoDB Drivers and Client Libraries.

election

The process by which members of a replica set select a primary on startup and in the event of a failure. See Replica Set Elections.

eventual consistency

A property of a distributed system that allows changes to the system to propagate gradually. In a database system, this means that readable members are not required to reflect the latest writes at all times. In MongoDB, reads to a primary have strict consistency; reads to secondaries have eventual consistency.

expression

In the context of the aggregation framework, expressions are the stateless transformations that operate on the data that passes through a pipeline. See Aggregation Concepts.

failover

The process that allows a secondary member of a replica set to become primary in the event of a failure. See Replica Set High Availability.

field

A name-value pair in a document. A document has zero or more fields. Fields are analogous to columns in relational databases. See Document Structure.

field path

Path to a field in the document. To specify a field path, use a string that prefixes the field name with a dollar sign ($).

firewall

A system level networking filter that restricts access based on, among other things, IP address. Firewalls form a part of an effective network security strategy. See Firewalls.

fsync

A system call that flushes all dirty, in-memory pages to disk. MongoDB calls fsync() on its database files at least every 60 seconds. See fsync.

geohash

A geohash value is a binary representation of the location on a coordinate grid. See Calculation of Geohash Values for 2d Indexes.

GeoJSON

A geospatial data interchange format based on JavaScript Object Notation (JSON). GeoJSON is used in geospatial queries. For supported GeoJSON objects, see Location Data. For the GeoJSON format specification, see http://geojson.org/geojson-spec.html.

geospatial

Data that relates to geographical location. In MongoDB, you may store, index, and query data according to geographical parameters. See Geospatial Indexes and Queries.

GridFS

A convention for storing large files in a MongoDB database. All of the official MongoDB drivers support this convention, as does the mongofiles program. See GridFS and GridFS Reference.

hashed shard key

A special type of shard key that uses a hash of the value in the shard key field to distribute documents among members of the sharded cluster. See Hashed Index.

haystack index

A geospatial index that enhances searches by creating “buckets” of objects grouped by a second criterion. See geoHaystack Indexes.

hidden member

A replica set member that cannot become primary and is invisible to client applications. See Hidden Replica Set Members.

idempotent

The quality of an operation to produce the same result given the same input, whether run once or run multiple times.
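
To make the distinction concrete, the Python sketch below contrasts an idempotent operation with one that is not. The function names and the sample document are hypothetical.

```python
def set_status(doc):
    """Idempotent: applying it again leaves the result unchanged."""
    return {**doc, "status": "done"}

def increment_count(doc):
    """Not idempotent: every application changes the result again."""
    return {**doc, "count": doc["count"] + 1}

# Hypothetical starting document.
start = {"status": "new", "count": 0}
```

Applying `set_status` twice gives the same document as applying it once, whereas each call to `increment_count` produces a different count.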

index

A data structure that optimizes queries. See Index Concepts.

initial sync

The replica set operation that replicates data from an existing replica set member to a new or restored replica set member. See Initial Sync.

interrupt point

A point in an operation’s lifecycle when it can safely abort. MongoDB only terminates an operation at designated interrupt points. See Terminate Running Operations.

IPv6

A revision to the IP (Internet Protocol) standard that provides a significantly larger address space to more effectively support the number of hosts on the contemporary Internet.

ISODate

The international date format used by mongo to display dates. The format is: YYYY-MM-DD HH:MM:SS.millis.

JavaScript

A popular scripting language originally designed for web browsers. The MongoDB shell and certain server-side functions use a JavaScript interpreter. See Server-side JavaScript for more information.

journal

A sequential, binary transaction log used to bring the database into a valid state in the event of a hard shutdown. Journaling writes data first to the journal and then to the core data files. MongoDB enables journaling by default for 64-bit builds of MongoDB version 2.0 and newer. Journal files are pre-allocated and exist as files in the data directory. See Journaling Mechanics.

JSON

JavaScript Object Notation. A human-readable, plain text format for expressing structured data with support in many programming languages. For more information, see http://www.json.org. Certain MongoDB tools render an approximation of MongoDB BSON documents in JSON format. See MongoDB Extended JSON.

JSON document

A JSON document is a collection of fields and values in a structured format. For sample JSON documents, see http://json.org/example.html.

JSONP

JSON with Padding. Refers to a method of injecting JSON into applications. Presents potential security concerns.

least privilege

An authorization policy that gives a user only the amount of access that is essential to that user’s work and no more.

legacy coordinate pairs

The format used for geospatial data prior to MongoDB version 2.4. This format stores geospatial data as points on a planar coordinate system (e.g. [ x, y ]). See Geospatial Indexes and Queries.

LineString

A LineString is defined by an array of two or more positions. A closed LineString with four or more positions is called a LinearRing, as described in the GeoJSON LineString specification: http://geojson.org/geojson-spec.html#linestring. To use a LineString in MongoDB, see GeoJSON Objects.

lock

MongoDB uses locks to ensure that concurrent operations do not affect correctness. MongoDB uses both read locks and write locks. For more information, see What type of locking does MongoDB use?.

LVM

Logical volume manager. LVM is a program that abstracts disk images from physical devices and provides a number of raw disk manipulation and snapshot capabilities useful for system management. For information on LVM and MongoDB, see Backup and Restore Using LVM on a Linux System.

map-reduce

A data processing and aggregation paradigm consisting of a “map” phase that selects data and a “reduce” phase that transforms the data. In MongoDB, you can run arbitrary aggregations over data using map-reduce. For map-reduce implementation, see Map-Reduce. For all approaches to aggregation, see Aggregation Concepts.
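
The two phases can be sketched in plain Python. This is a conceptual illustration with hypothetical function names and data, not the mapReduce() interface itself.

```python
from collections import defaultdict

def map_phase(doc):
    """Select data: emit a (key, value) pair for one document."""
    yield doc["category"], doc["qty"]

def reduce_phase(key, values):
    """Transform data: combine all values emitted for one key."""
    return sum(values)

def map_reduce(documents):
    emitted = defaultdict(list)
    for doc in documents:
        for key, value in map_phase(doc):
            emitted[key].append(value)  # collect emitted values per key
    return {key: reduce_phase(key, values) for key, values in emitted.items()}

# Hypothetical sample documents.
items = [
    {"category": "fruit", "qty": 2},
    {"category": "tool", "qty": 1},
    {"category": "fruit", "qty": 3},
]
totals = map_reduce(items)
```

Each document passes through the map phase once, and the reduce phase runs once per distinct key.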

mapping type

A structure in programming languages that associates keys with values, where keys may nest other pairs of keys and values (e.g. dictionaries, hashes, maps, and associative arrays). The properties of these structures depend on the language specification and implementation. Generally the order of keys in mapping types is arbitrary and not guaranteed.

master

The database that receives all writes in a conventional master-slave replication. In MongoDB, replica sets replace master-slave replication for most use cases. For more information on master-slave replication, see Master Slave Replication.

md5

A hashing algorithm used to efficiently provide reproducible unique strings to identify and checksum data. MongoDB uses md5 to identify chunks of data for GridFS. See filemd5.

MIB

Management Information Base. MongoDB uses MIB files to define the type of data tracked by SNMP in the MongoDB Enterprise edition.

MIME

Multipurpose Internet Mail Extensions. A standard set of type and encoding definitions used to declare the encoding and type of data in multiple data storage, transmission, and email contexts. The mongofiles tool provides an option to specify a MIME type to describe a file inserted into GridFS storage.

mongo

The MongoDB shell. The mongo process starts the MongoDB shell as a daemon connected to either a mongod or mongos instance. The shell has a JavaScript interface. See mongo and mongo Shell Methods.

mongod

The MongoDB database server. The mongod process starts the MongoDB server as a daemon. The MongoDB server manages data requests and formats and manages background operations. See mongod.

MongoDB

An open-source document-based database system. “MongoDB” derives from the word “humongous” because of the database’s ability to scale up with ease and hold very large amounts of data. MongoDB stores documents in collections within databases.

MongoDB Enterprise

A commercial edition of MongoDB that includes additional features. For more information, see MongoDB Subscriptions.

mongos

The routing and load balancing process that acts as an interface between an application and a MongoDB sharded cluster. See mongos.

namespace

The canonical name for a collection or index in MongoDB. The namespace is a combination of the database name and the name of the collection or index, like so: [database-name].[collection-or-index-name]. All documents belong to a namespace. See What is a namespace in MongoDB?.
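
A small Python sketch of taking a namespace apart. It assumes that the database name contains no dots, so the first dot separates the two parts; the helper name `split_namespace` is hypothetical.

```python
def split_namespace(namespace):
    """Split '[database-name].[collection-or-index-name]' at the first dot."""
    database, _, name = namespace.partition(".")
    return database, name
```

For example, `split_namespace("test.system.profile")` yields the database `test` and the collection `system.profile`.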

natural order

The order that a database stores documents on disk. Typically, the order of documents on disks reflects insertion order, except when a document moves internally because an update operation increases its size. In capped collections, insertion order and natural order are identical because documents do not move internally. MongoDB returns documents in forward natural order for a find() query with no parameters. MongoDB returns documents in reverse natural order for a find() query sorted with a parameter of $natural:-1. See $natural.

ObjectId

A special 12-byte BSON type that guarantees uniqueness within the collection. The ObjectId is generated based on timestamp, machine ID, process ID, and a process-local incremental counter. MongoDB uses ObjectId values as the default values for _id fields.
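
The sketch below packs the four components named above into 12 bytes. The byte widths used (4-byte timestamp, 3-byte machine ID, 2-byte process ID, 3-byte counter) are an assumption not stated in this glossary, and `make_object_id` is a hypothetical helper, not a driver API.

```python
import struct

def make_object_id(timestamp, machine_id, process_id, counter):
    """Pack the four components into 12 big-endian bytes."""
    return (
        struct.pack(">I", timestamp)                 # 4 bytes: seconds since epoch
        + machine_id[:3].ljust(3, b"\x00")           # 3 bytes: machine identifier
        + struct.pack(">H", process_id & 0xFFFF)     # 2 bytes: process ID
        + struct.pack(">I", counter & 0xFFFFFF)[1:]  # 3 bytes: incrementing counter
    )

# Hypothetical input values.
oid = make_object_id(1400000000, b"abc", 4242, 1)
```

Because the timestamp occupies the leading bytes, values built this way sort roughly by creation time.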

operator

A keyword beginning with a $ used to express an update, complex query, or data transformation. For example, $gt is the query language’s “greater than” operator. For available operators, see Operators.

oplog

A capped collection that stores an ordered history of logical writes to a MongoDB database. The oplog is the basic mechanism enabling replication in MongoDB. See Replica Set Oplog.

ordered query plan

A query plan that returns results in the order consistent with the sort() order. See Query Plans.

orphaned document

In a sharded cluster, orphaned documents are those documents on a shard that also exist in chunks on other shards as a result of failed migrations or incomplete migration cleanup due to abnormal shutdown. Delete orphaned documents using cleanupOrphaned to reclaim disk space and reduce confusion.

padding

The extra space allocated to a document on disk to prevent moving the document when it grows as the result of update() operations. See Record Allocation Strategies.

padding factor

An automatically-calibrated constant used to determine how much extra space MongoDB should allocate per document container on disk. A padding factor of 1 means that MongoDB will allocate only the amount of space needed for the document. A padding factor of 2 means that MongoDB will allocate twice the amount of space required by the document. See Record Allocation Strategies.
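
The arithmetic is a single multiplication, sketched here in Python. The function name `allocated_bytes` is hypothetical, and truncating to whole bytes is a simplifying assumption.

```python
def allocated_bytes(document_size, padding_factor):
    """Space reserved for a document of document_size bytes."""
    return int(document_size * padding_factor)
```

With a 100-byte document, a padding factor of 1 reserves 100 bytes, 1.5 reserves 150 bytes, and 2 reserves 200 bytes.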

page fault

Page faults can occur as MongoDB reads from or writes data to parts of its data files that are not currently located in physical memory. In contrast, operating system page faults happen when physical memory is exhausted and pages of physical memory are swapped to disk.

See Page Faults and What are page faults?.

partition

A distributed system architecture that splits data into ranges. Sharding uses partitioning. See Data Partitioning.

passive member

A member of a replica set that cannot become primary because its priority is 0. See Priority 0 Replica Set Members.

pcap

A packet-capture format used by mongosniff to record packets captured from network interfaces and display them as human-readable MongoDB operations. See Options.

PID

A process identifier. UNIX-like systems assign a unique-integer PID to each running process. You can use a PID to inspect a running process and send signals to it. See /proc File System.

pipe

A communication channel in UNIX-like systems allowing independent processes to send and receive data. In the UNIX shell, piped operations allow users to direct the output of one command into the input of another.

pipeline

A series of operations in an aggregation process. See Aggregation Concepts.

Point

A single coordinate pair as described in the GeoJSON Point specification: http://geojson.org/geojson-spec.html#point. To use a Point in MongoDB, see GeoJSON Objects.

Polygon

An array of LinearRing coordinate arrays, as described in the GeoJSON Polygon specification: http://geojson.org/geojson-spec.html#polygon. For Polygons with multiple rings, the first must be the exterior ring and any others must be interior rings or holes.

MongoDB does not permit the exterior ring to self-intersect. Interior rings must be fully contained within the outer loop and cannot intersect or overlap with each other. See GeoJSON Objects.

powerOf2Sizes

A per-collection setting that changes and normalizes the way MongoDB allocates space for each document, in an effort to maximize storage reuse and to reduce fragmentation. This is the default for TTL Collections. See collMod and usePowerOf2Sizes.

pre-splitting

An operation performed before inserting data that divides the range of possible shard key values into chunks to facilitate easy insertion and high write throughput. In some cases pre-splitting expedites the initial distribution of documents in a sharded cluster by manually dividing the collection rather than waiting for the MongoDB balancer to do so. See Create Chunks in a Sharded Cluster.

primary

In a replica set, the primary member is the current master instance, which receives all write operations. See Primary.

primary key

A record’s unique immutable identifier. In an RDBMS, the primary key is typically an integer stored in each row’s id field. In MongoDB, the _id field holds a document’s primary key which is usually a BSON ObjectId.

primary shard

The shard that holds all the un-sharded collections. See Primary Shard.

priority

A configurable value that helps determine which members in a replica set are most likely to become primary. See priority.

privilege

A combination of specified resource and actions permitted on the resource. See privilege.

projection

A document given to a query that specifies which fields MongoDB returns in the result set. See Limit Fields to Return from a Query. For a list of projection operators, see Projection Operators.

query

A read request. MongoDB uses a JSON-like query language that includes a variety of query operators with names that begin with a $ character. In the mongo shell, you can issue queries using the find() and findOne() methods. See Read Operations Overview.

query optimizer

A process that generates query plans. For each query, the optimizer generates a plan that matches the query to the index that will return results as efficiently as possible. The optimizer reuses the query plan each time the query runs. If a collection changes significantly, the optimizer creates a new query plan. See Query Plans.

query shape

A combination of query predicate, sort, and projection specifications.

For the query predicate, only the structure of the predicate, including the field names, is significant; the values in the query predicate are insignificant. As such, a query predicate { type: 'food' } is equivalent to the query predicate { type: 'utensil' } for a query shape.
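
One way to picture the equivalence is to replace every value with a placeholder and compare what is left. The Python helper `predicate_shape` below is a hypothetical sketch, not how the server computes shapes.

```python
def predicate_shape(predicate):
    """Keep field names and nesting; replace every value with a placeholder."""
    if isinstance(predicate, dict):
        return {field: predicate_shape(value) for field, value in predicate.items()}
    if isinstance(predicate, list):
        return [predicate_shape(item) for item in predicate]
    return "?"  # the concrete value is insignificant for the shape
```

Under this sketch, `{"type": "food"}` and `{"type": "utensil"}` reduce to the same shape, while a predicate on a different field does not.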

RDBMS

Relational Database Management System. A database management system based on the relational model, typically using SQL as the query language.

read lock

In the context of a reader-writer lock, a lock that while held allows concurrent readers but no writers. See What type of locking does MongoDB use?.

read preference

A setting that determines how clients direct read operations. Read preference affects all replica sets, including shards. By default, MongoDB directs reads to primaries for strict consistency. However, you may also direct reads to secondaries for eventually consistent reads. See Read Preference.

record size

The space allocated for a document including the padding. For more information on padding, see Record Allocation Strategies and compact.

recovering

A replica set member status indicating that a member is not ready to begin normal activities of a secondary or primary. Recovering members are unavailable for reads.

replica pairs

The precursor to the MongoDB replica sets.

Deprecated since version 1.6.

replica set

A cluster of MongoDB servers that implements master-slave replication and automated failover. MongoDB’s recommended replication strategy. See Replication.

replication

A feature allowing multiple database servers to share the same data, thereby ensuring redundancy and facilitating load balancing. See Replication.

replication lag

The length of time between the last operation in the primary’s oplog and the last operation applied to a particular secondary. In general, you want to keep replication lag as small as possible. See Replication Lag.
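
Expressed as a calculation, the lag is the difference between two timestamps. The Python sketch below uses hypothetical names and sample times.

```python
from datetime import datetime

def replication_lag_seconds(primary_last_op, secondary_last_applied):
    """Seconds between the primary's last oplog entry and the secondary's last applied one."""
    return (primary_last_op - secondary_last_applied).total_seconds()

# Hypothetical sample timestamps.
primary_time = datetime(2014, 1, 1, 12, 0, 10)
secondary_time = datetime(2014, 1, 1, 12, 0, 4)
lag = replication_lag_seconds(primary_time, secondary_time)
```

In this example the secondary trails the primary by six seconds.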

resident memory

The subset of an application’s memory currently stored in physical RAM. Resident memory is a subset of virtual memory, which includes memory mapped to physical RAM and to disk.

resource

A database, collection, set of collections, or cluster. A privilege permits actions on a specified resource. See resource.

REST

An API design pattern centered around the idea of resources and the CRUD operations that apply to them. Typically REST is implemented over HTTP. MongoDB provides a simple HTTP REST interface that allows HTTP clients to run commands against the server. See REST Interface and REST API.

role

A set of privileges that permit actions on specified resources. Roles assigned to a user determine the user’s access to resources and operations. See Security Introduction.

rollback

A process that reverts write operations to ensure the consistency of all replica set members. See Rollbacks During Replica Set Failover.

secondary

A replica set member that replicates the contents of the master database. Secondary members may handle read requests, but only the primary members can handle write operations. See Secondaries.

secondary index

A database index that improves query performance by minimizing the amount of work that the query engine must perform to fulfill a query. See Indexes.

set name

The arbitrary name given to a replica set. All members of a replica set must have the same name specified with the replSetName setting or the --replSet option.

shard

A single mongod instance or replica set that stores some portion of a sharded cluster’s total data set. In production, all shards should be replica sets. See Shards.

shard key

The field MongoDB uses to distribute documents among members of a sharded cluster. See Shard Keys.

sharded cluster

The set of nodes comprising a sharded MongoDB deployment. A sharded cluster consists of three config processes, one or more replica sets, and one or more mongos routing processes. See Sharded Cluster Components.

sharding

A database architecture that partitions data by key ranges and distributes the data among two or more database instances. Sharding enables horizontal scaling. See Sharding.

shell helper

A method in the mongo shell that provides a more concise syntax for a database command. Shell helpers improve the general interactive experience. See mongo Shell Methods.

single-master replication

A replication topology where only a single database instance accepts writes. Single-master replication ensures consistency and is the replication topology employed by MongoDB. See Replica Set Primary.

slave

A read-only database that replicates operations from a master database in conventional master/slave replication. In MongoDB, replica sets replace master/slave replication for most use cases. However, for information on master/slave replication, see Master Slave Replication.

split

The division between chunks in a sharded cluster. See Chunk Splits in a Sharded Cluster.

SQL

Structured Query Language (SQL) is a common special-purpose programming language used for interaction with a relational database, including access control, insertions, updates, queries, and deletions. There are some similar elements in the basic SQL syntax supported by different database vendors, but most implementations have their own dialects, data types, and interpretations of proposed SQL standards. Complex SQL is generally not directly portable between major RDBMS products. SQL is often used as a metonym for relational databases.

SSD

Solid State Disk. A high-performance disk drive that uses solid state electronics for persistence, as opposed to the rotating platters and movable read/write heads used by traditional mechanical hard drives.

stale

Refers to the amount of time a secondary member of a replica set trails behind the current state of the primary’s oplog. If a secondary becomes too stale, it can no longer use replication to catch up to the current state of the primary. See Replica Set Oplog and Replica Set Data Synchronization for more information.

standalone

An instance of mongod that is running as a single server and not as part of a replica set. To convert a standalone into a replica set, see Convert a Standalone to a Replica Set.

strict consistency

A property of a distributed system requiring that all members always reflect the latest changes to the system. In a database system, this means that any system that can provide data must reflect the latest writes at all times. In MongoDB, reads from a primary have strict consistency; reads from secondary members have eventual consistency.

sync

The replica set operation where members replicate data from the primary. Sync first occurs when MongoDB creates or restores a member, which is called initial sync. Sync then occurs continually to keep the member updated with changes to the replica set’s data. See Replica Set Data Synchronization.

syslog

On UNIX-like systems, a logging process that provides a uniform standard for servers and processes to submit logging information. MongoDB provides an option to send output to the host’s syslog system. See syslogFacility.

tag

A label applied to a replica set member or shard and used by clients to issue data-center-aware operations. For more information on using tags with replica sets and with shards, see the following sections of this manual: Tag Sets and Behavior and Operations.

tailable cursor

For a capped collection, a tailable cursor is a cursor that remains open after the client exhausts the results in the initial cursor. As clients insert new documents into the capped collection, the tailable cursor continues to retrieve documents. See Create Tailable Cursor.

TSV

A text-based data format consisting of tab-separated values. This format is commonly used to exchange data between relational databases, since the format is well-suited to tabular data. You can import TSV files using mongoimport.

TTL

Stands for “time to live” and represents an expiration time or period for a given piece of information to remain in a cache or other temporary storage before the system deletes it or ages it out. MongoDB has a TTL collection feature. See Expire Data from Collections by Setting TTL.

unique index

An index that enforces uniqueness for a particular field across a single collection. See Unique Indexes.

unordered query plan

A query plan that returns results in an order inconsistent with the sort() order. See Query Plans.

upsert

An option for update operations. If set to true, the update operation will either update the first document matched by a query or insert a new document if none matches. The new document will have the fields implied by the operation. Both update() and findAndModify() support this option. See Upsert Option.
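
The update-or-insert behavior can be sketched over an in-memory list. The Python function `update` below is hypothetical, matches on simple equality only, and assumes the inserted document combines the query fields with the updated fields.

```python
def update(collection, query, fields, upsert=False):
    """Update the first matching document, or insert one when upsert is true."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            doc.update(fields)  # modify only the first match
            return "updated"
    if upsert:
        collection.append({**query, **fields})  # no match: build a new document
        return "inserted"
    return "no match"

# Hypothetical sample collection.
inventory = [{"sku": "a", "qty": 1}]
first = update(inventory, {"sku": "a"}, {"qty": 5}, upsert=True)
second = update(inventory, {"sku": "b"}, {"qty": 2}, upsert=True)
third = update(inventory, {"sku": "c"}, {"qty": 9})
```

The first call updates the existing document, the second inserts a new one, and the third does nothing because upsert is not set.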

virtual memory

An application’s working memory, typically residing both on disk and in physical RAM.

WGS84

The default datum MongoDB uses to calculate geometry over an Earth-like sphere. MongoDB uses the WGS84 datum for geospatial queries on GeoJSON objects. See the “EPSG:4326: WGS 84” specification: http://spatialreference.org/ref/epsg/4326/.

working set

The data that MongoDB uses most often. This data is preferably held in RAM, solid-state drive (SSD), or other fast media. See What is the working set?.

write concern

Specifies whether a write operation has succeeded. Write concern allows your application to detect insertion errors or unavailable mongod instances. For replica sets, you can configure write concern to confirm replication to a specified number of members. See Write Concern.

write lock

A lock on the database for a given writer. When a process writes to the database, it takes an exclusive write lock to prevent other processes from writing or reading. For more information on locks, see FAQ: Concurrency.

writeBacks

The process within the sharding system that ensures that writes issued to a shard that is not responsible for the relevant chunk get applied to the proper shard. For related information, see What does writebacklisten in the log mean? and writeBacksQueued.
