Introduction to MongoDB
MongoDB is a popular, open-source NoSQL database that stores data in flexible, JSON-like documents. Instead of using tables and rows as in traditional relational databases, MongoDB uses collections and documents, allowing for a more natural representation of data in many modern applications.
Analogy: MongoDB vs. Traditional Databases
Think of the difference between MongoDB and traditional relational databases like the difference between a filing cabinet and a spreadsheet:
- Relational Database (Spreadsheet):
- Data is organized in rigid tables with predefined columns
- Each row must conform to the same structure
- Adding a new column affects the entire table
- Relationships between tables are defined explicitly
- MongoDB (Filing Cabinet):
- Data is stored in folders (collections) containing documents
- Each document can have its own unique structure
- You can add new fields to some documents without affecting others
- Related information can be embedded directly within documents
Both approaches have their strengths, but MongoDB's flexibility makes it particularly well-suited for:
- Rapidly evolving data structures
- Applications with complex, hierarchical data
- Development workflows with frequent schema changes
- Large-scale, distributed systems where horizontal scaling is important
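To make the flexibility concrete, here is a minimal sketch in plain JavaScript with no database involved. It models a collection as an array whose documents have different shapes. The `products` array and all of its field names are invented for illustration.

```javascript
// Hypothetical "products" collection modeled as a plain array of documents.
const products = [
  { _id: 1, name: "T-shirt", price: 15, sizes: ["S", "M", "L"] },
  { _id: 2, name: "Laptop", price: 999, specs: { ramGb: 16, cpu: "8-core" } },
];

// Add a new field to a single document; the other document is untouched.
products[0].color = "blue";

// Collect the (sorted) field names used by each document.
const fieldsPerDocument = products.map((doc) => Object.keys(doc).sort());

console.log(fieldsPerDocument);
```

In a relational table, the equivalent change would mean adding a `color` column to every row.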
Key Features of MongoDB
- Document-Oriented: Data stored in flexible, JSON-like BSON documents
- Flexible Schema: Documents in a collection can have different fields and structures; optional validation rules can enforce a structure when needed
- Rich Query Language: Supports dynamic queries, field indexing, and real-time aggregation
- High Performance: Indexes speed up queries, sharding spreads load across servers, and the storage engine keeps frequently used data in an in-memory cache
- High Availability: Replica sets ensure redundancy and automatic failover
- Horizontal Scalability: Scale-out architecture through sharding for massive datasets
- Geospatial Support: Built-in features for location-based data
- Aggregation Framework: Powerful pipeline for data transformation and analysis
- Transactions: ACID transactions for multi-document operations
MongoDB vs. Relational Databases
| Concept | Relational (SQL) Database | MongoDB (NoSQL) |
|---|---|---|
| Data Structure | Tables with rows and columns | Collections with documents |
| Schema | Fixed, predefined | Dynamic, flexible |
| Relationships | Through foreign keys and joins | Through embedded documents or references |
| Query Language | SQL (Structured Query Language) | JSON-based query language |
| Scaling | Vertical scaling (larger servers) | Horizontal scaling (more servers) |
| Transactions | ACID transactions by default | Multi-document ACID transactions (since v4.0) |
| Join Operations | Native support for complex joins | $lookup aggregation stage (less efficient) |
| Data Integrity | Enforced through constraints | Application-enforced (mostly) |
| Best For | Complex relationships, transactions | Rapid development, flexible schema, scaling |
When to Choose MongoDB
- Large volumes of rapidly changing data - MongoDB's horizontal scaling makes it suitable for big data applications
- Projects requiring rapid iteration - Schema-less design enables agile development
- Applications with complex, hierarchical data structures - Document model can represent nested structures naturally
- Content management systems - Flexibility for various content types with different attributes
- Real-time analytics - Aggregation framework and speed for analytics workloads
- Catalog or product data - Ideal for handling products with varying attributes
- IoT applications - Can handle high write loads and time-series data
When to Consider Alternatives
- Applications requiring complex transactions - While MongoDB supports transactions, relational databases may be better optimized
- Highly relational data with many joins - Normalized data with many relationships may be more efficiently handled in a relational database
- When rigid schema enforcement is critical - MongoDB offers optional schema validation, but relational databases enforce structure and constraints such as foreign keys by default
- Applications requiring complex queries with multiple joins - SQL may provide more efficient queries for certain complex operations
MongoDB Document Structure
Understanding Documents and Collections
The basic components of MongoDB's structure are:
- Database: Contains collections; similar to a database in relational systems
- Collection: Group of MongoDB documents; roughly equivalent to a table in relational databases
- Document: A record in MongoDB, stored in BSON format (Binary JSON)
- Field: A key-value pair in a document; similar to a column in relational databases
Document Format: BSON
MongoDB stores data in BSON (Binary JSON) format, which extends the JSON model to provide additional data types and efficiency. BSON documents can contain a variety of data types:
| Data Type | Description | Example in JavaScript |
|---|---|---|
| String | UTF-8 character string | "Hello, MongoDB!" |
| Integer | 32-bit or 64-bit integer | 42 |
| Double | 64-bit floating point | 3.14159 |
| Boolean | true or false | true |
| Array | Ordered list of values | ["red", "green", "blue"] |
| Object | Embedded document | { name: "John", age: 30 } |
| null | Null value | null |
| Date | DateTime value | new Date() |
| ObjectId | Unique identifier | ObjectId("507f1f77bcf86cd799439011") |
| Binary Data | Binary data | Buffer.from("binary") |
| Regular Expression | JavaScript RegExp | /pattern/i |
| Timestamp | Internal timestamp | Timestamp(1412180887, 1) |
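The sketch below suggests why the extra types matter. It uses only built-in JavaScript, with no MongoDB driver, and round-trips a document through plain JSON. The field names are made up. The `Date` comes back as a string, and an integer above 2^53 loses precision. BSON's Date and 64-bit integer types are meant to avoid both problems.

```javascript
// A document containing a Date and an integer larger than a JavaScript
// Number can represent exactly.
const original = {
  createdAt: new Date("2020-10-17T09:34:33.123Z"),
  bigCounter: 9007199254740993n, // BigInt literal: 2^53 + 1
};

// JSON cannot serialize BigInt at all, so convert it to a Number first.
// The conversion silently rounds to the nearest representable double.
const asJson = JSON.stringify({
  createdAt: original.createdAt,
  bigCounter: Number(original.bigCounter),
});
const roundTripped = JSON.parse(asJson);

console.log(typeof roundTripped.createdAt); // "string": the Date type is gone
console.log(roundTripped.createdAt instanceof Date); // false
console.log(BigInt(roundTripped.bigCounter) === original.bigCounter); // false
```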
Document Structure: Key Concepts
Example MongoDB Document:
{
"_id": ObjectId("5f8a76e910bd12b4e4c9a0f1"),
"username": "johndoe",
"email": "john@example.com",
"profile": {
"firstName": "John",
"lastName": "Doe",
"birthDate": ISODate("1990-07-15T00:00:00Z"),
"address": {
"street": "123 Main St",
"city": "New York",
"state": "NY",
"zipCode": "10001"
}
},
"interests": ["programming", "hiking", "photography"],
"accountCreated": ISODate("2020-10-17T09:34:33.123Z"),
"isActive": true,
"loginCount": 42,
"lastLogin": ISODate("2023-05-20T14:25:16.789Z")
}
Key concepts of MongoDB document structure:
- Document ID: Each document has a unique _id field that acts as a primary key
- Embedded Documents: Documents can contain nested documents, allowing for hierarchical data structures
- Arrays: Documents can contain arrays of values or even arrays of embedded documents
- Field Names: Field names are case-sensitive and cannot contain the null character
- Dynamic Schema: Documents in the same collection can have different fields
- Document Size Limit: Maximum BSON document size is 16 megabytes
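MongoDB queries address embedded fields and array positions with dot notation, for example `"profile.address.city"`. The helper below is a hypothetical, plain-JavaScript sketch of that idea applied to an in-memory object. `getByPath` is not a MongoDB function, and it does not reproduce MongoDB's full array-matching behavior.

```javascript
// Read a value from a nested document using a dot-separated path.
// Returns undefined if any step along the path is missing.
function getByPath(doc, path) {
  return path.split(".").reduce(
    (current, key) => (current == null ? undefined : current[key]),
    doc
  );
}

const user = {
  username: "johndoe",
  profile: {
    firstName: "John",
    address: { city: "New York", zipCode: "10001" },
  },
  interests: ["programming", "hiking", "photography"],
};

console.log(getByPath(user, "profile.address.city")); // "New York"
console.log(getByPath(user, "interests.1")); // "hiking"
console.log(getByPath(user, "profile.phone.mobile")); // undefined
```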
Data Modeling in MongoDB
Embedded Documents vs. References
One of the most important decisions in MongoDB schema design is whether to embed related data or use references:
Embedded Document Approach:
// User document with embedded addresses
{
"_id": ObjectId("..."),
"username": "johndoe",
"email": "john@example.com",
"addresses": [
{
"type": "home",
"street": "123 Main St",
"city": "New York",
"state": "NY",
"zipCode": "10001"
},
{
"type": "work",
"street": "456 Business Ave",
"city": "New York",
"state": "NY",
"zipCode": "10002"
}
]
}
Reference Approach:
// User document with references to addresses
{
"_id": ObjectId("5f8a76e910bd12b4e4c9a0f1"),
"username": "johndoe",
"email": "john@example.com",
"addressIds": [
ObjectId("5f8a77a110bd12b4e4c9a0f2"),
ObjectId("5f8a77a110bd12b4e4c9a0f3")
]
}
// Address documents in a separate collection
{
"_id": ObjectId("5f8a77a110bd12b4e4c9a0f2"),
"userId": ObjectId("5f8a76e910bd12b4e4c9a0f1"),
"type": "home",
"street": "123 Main St",
"city": "New York",
"state": "NY",
"zipCode": "10001"
}
{
"_id": ObjectId("5f8a77a110bd12b4e4c9a0f3"),
"userId": ObjectId("5f8a76e910bd12b4e4c9a0f1"),
"type": "work",
"street": "456 Business Ave",
"city": "New York",
"state": "NY",
"zipCode": "10002"
}
When to Embed vs. When to Reference
| Use Embedding When | Use References When |
|---|---|
| Entities have a "contains" relationship | Entities have a "refers to" relationship |
| One entity always appears with another | Entities can be queried independently |
| Entities have a one-to-few relationship | Entities have a one-to-many or many-to-many relationship |
| Embedded data doesn't grow without bound | Related data set can grow very large |
| Fast reads are a priority | Data consistency is more important than read performance |
| Atomic updates are required | Related data is accessed infrequently |
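One practical cost of references is the extra lookup needed to reassemble related data. The sketch below illustrates this with in-memory arrays standing in for two collections. The string ids and the `withAddresses` helper are assumptions made for readability. Against a real database, the lookup would be a second query or a `$lookup` stage.

```javascript
// Hypothetical in-memory stand-ins for two collections.
const users = [
  { _id: "u1", username: "johndoe", addressIds: ["a1", "a2"] },
];
const addresses = [
  { _id: "a1", userId: "u1", type: "home", city: "New York" },
  { _id: "a2", userId: "u1", type: "work", city: "New York" },
];

// Resolve a user's address references into full address documents.
function withAddresses(user, addressCollection) {
  const byId = new Map(addressCollection.map((a) => [a._id, a]));
  const resolved = user.addressIds
    .map((id) => byId.get(id))
    .filter((a) => a !== undefined); // skip dangling references
  const { addressIds, ...rest } = user;
  return { ...rest, addresses: resolved };
}

const embeddedView = withAddresses(users[0], addresses);
console.log(embeddedView.addresses.map((a) => a.type)); // [ 'home', 'work' ]
```

With the embedded approach, the same result is available from a single read of the user document.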
Common Data Modeling Patterns
One-to-One Relationship
Typically implemented with embedding:
{
"_id": ObjectId("..."),
"user": "johndoe",
"profile": {
"firstName": "John",
"lastName": "Doe",
"bio": "Software developer"
}
}
One-to-Few Relationship
Best implemented with embedding:
{
"_id": ObjectId("..."),
"name": "ACME Inc.",
"contacts": [
{ "name": "John Doe", "position": "CEO", "email": "john@acme.com" },
{ "name": "Jane Smith", "position": "CFO", "email": "jane@acme.com" }
]
}
One-to-Many Relationship
Can be implemented with either approach depending on size:
Parent referencing children:
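// Author document
// Note: ids such as "author123" are readable placeholders; a real ObjectId
// is built from a 24-character hexadecimal string.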
{
"_id": ObjectId("author123"),
"name": "Stephen King",
"bookIds": [
ObjectId("book1"),
ObjectId("book2"),
ObjectId("book3")
]
}
Or child referencing parent (more common):
// Book documents
{
"_id": ObjectId("book1"),
"title": "The Shining",
"authorId": ObjectId("author123")
}
Many-to-Many Relationship
Typically implemented with references and sometimes a separate collection:
// Student
{
"_id": ObjectId("student1"),
"name": "Alice",
"courseIds": [
ObjectId("course1"),
ObjectId("course2")
]
}
// Course
{
"_id": ObjectId("course1"),
"name": "Database Design",
"studentIds": [
ObjectId("student1"),
ObjectId("student2")
]
}
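When both sides store id arrays, every enrollment has to touch two documents. The following plain-JavaScript sketch shows that bookkeeping on in-memory objects. The `enroll` helper, the string ids, and the "Algorithms" course are illustrative assumptions.

```javascript
// Enroll a student in a course, updating both sides of the relationship.
// Ids are simple strings here; in MongoDB they would typically be ObjectIds.
function enroll(student, course) {
  if (!student.courseIds.includes(course._id)) {
    student.courseIds.push(course._id);
  }
  if (!course.studentIds.includes(student._id)) {
    course.studentIds.push(student._id);
  }
}

const alice = { _id: "student1", name: "Alice", courseIds: ["course1"] };
const algorithms = { _id: "course2", name: "Algorithms", studentIds: [] };

enroll(alice, algorithms);
enroll(alice, algorithms); // calling twice does not create duplicates

console.log(alice.courseIds); // [ 'course1', 'course2' ]
console.log(algorithms.studentIds); // [ 'student1' ]
```

In MongoDB itself, the `$addToSet` update operator gives the same no-duplicates behavior, and updating both documents atomically requires a multi-document transaction.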
MongoDB ObjectIDs
The _id field is a special field that serves as a primary key for MongoDB documents. By default, MongoDB generates a unique ObjectId for this field.
Key facts about ObjectIds:
- Globally Unique: Combines a timestamp, a per-process random value, and an incrementing counter, so collisions across servers are extremely unlikely
- Generated by Default: MongoDB creates one if not provided when inserting a document
- Contains Timestamp: The first 4 bytes represent the creation time
- Sortable: ObjectIds are roughly sortable by creation time
- 12-byte Value: Typically represented as a 24-character hexadecimal string
- Custom IDs Allowed: You can provide your own _id value instead of using an ObjectId
Working with ObjectIds:
// Creating a new ObjectId
const { ObjectId } = require('mongodb');
const newId = new ObjectId();
console.log(newId.toString()); // e.g., "507f191e810c19729de860ea"
// Getting creation timestamp from ObjectId
const timestamp = newId.getTimestamp();
console.log(timestamp); // e.g., 2023-05-21T12:34:56.000Z
// Creating an ObjectId from a string
const existingId = new ObjectId("507f191e810c19729de860ea");
// Creating an ObjectId for a specific time (the remaining bytes are zeroed)
const specificTime = new Date("2023-01-01");
const timeBasedId = ObjectId.createFromTime(Math.floor(specificTime.getTime() / 1000));
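The timestamp layout can also be explored without the driver. The sketch below works directly on the 24-character hex string. Both helper names are hypothetical, and it assumes times that fit in an unsigned 32-bit seconds value.

```javascript
// Decode the creation time embedded in an ObjectId's 24-character hex string.
// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
function objectIdHexToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("Expected a 24-character hexadecimal string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Build the smallest ObjectId hex string for a given time: the timestamp
// followed by 8 zero bytes. This can serve as a lower bound when filtering
// documents by _id creation time.
function dateToObjectIdHex(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

console.log(objectIdHexToDate("507f191e810c19729de860ea").toISOString());
// 2012-10-17T20:46:22.000Z

const lowerBound = dateToObjectIdHex(new Date("2023-01-01T00:00:00Z"));
console.log(lowerBound.length); // 24
```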
Setting Up MongoDB
Installation Options
- Local Installation: Install MongoDB directly on your machine
- MongoDB Atlas: Cloud-hosted MongoDB service (free tier available)
- Docker Container: Run MongoDB in a Docker container
- MongoDB Compass: GUI for exploring and manipulating MongoDB data
Local Installation
For Windows:
- Download the MongoDB installer from the official MongoDB website
- Run the installer and follow the setup wizard
- MongoDB is installed as a service by default
- Data is stored in C:\Program Files\MongoDB\Server\[version]\data by default
For macOS (using Homebrew):
# Install MongoDB
brew tap mongodb/brew
brew install mongodb-community
# Start MongoDB service
brew services start mongodb-community
For Linux (Ubuntu):
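# Import MongoDB public key
# (5.0 and "focal" below are examples; match them to your MongoDB and Ubuntu
# versions. Note that apt-key is deprecated on newer Ubuntu releases.)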
wget -qO - https://www.mongodb.org/static/pgp/server-5.0.asc | sudo apt-key add -
# Create list file for MongoDB
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/5.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-5.0.list
# Reload local package database
sudo apt-get update
# Install MongoDB packages
sudo apt-get install -y mongodb-org
# Start MongoDB service
sudo systemctl start mongod
# Enable MongoDB to start on boot
sudo systemctl enable mongod
MongoDB Atlas
MongoDB Atlas is a cloud-hosted MongoDB service:
- Go to MongoDB Atlas
- Create a free account and sign in
- Create a new cluster (the free tier is sufficient for learning)
- Configure database access (username and password)
- Configure network access (add your IP address to the IP access list)
- Get your connection string from the "Connect" button
Sample connection string format:
mongodb+srv://username:password@cluster0.mongodb.net/myDatabase?retryWrites=true&w=majority
Docker Container
Run MongoDB in a Docker container:
# Pull the MongoDB image
docker pull mongo
# Run MongoDB container
docker run -d -p 27017:27017 --name mongodb mongo:latest
# To run with a persistent data volume
docker run -d -p 27017:27017 --name mongodb \
-v mongodb_data:/data/db mongo:latest
# Connect to MongoDB shell in the container
docker exec -it mongodb mongosh
MongoDB Compass
MongoDB Compass is an interactive tool for querying, exploring, and visualizing your MongoDB data:
- Download MongoDB Compass from the official website
- Install and launch the application
- Connect to your MongoDB instance:
- For local MongoDB: mongodb://localhost:27017
- For MongoDB Atlas: Use the connection string provided by Atlas
MongoDB Connection Options
Common connection string parameters:
- retryWrites=true: Automatically retry certain write operations once if they fail with a transient error
- w=majority: Require acknowledgment from a majority of replica set members
- readPreference=primary: Read only from the primary node
- authSource=admin: Database to use for authentication
- ssl=true: Use an SSL/TLS connection (tls=true is the current name for this option)
- connectTimeoutMS=30000: Connection timeout in milliseconds
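Connection strings are easy to get wrong when credentials contain reserved characters, which must be percent-encoded. The sketch below assembles one from its parts using only built-in JavaScript. `buildMongoUri` and the host name are hypothetical, and the result is only a string; nothing here connects to a server.

```javascript
// Build a MongoDB connection string from its parts.
// Username and password are percent-encoded so characters such as "@", ":"
// or "/" do not break the URI.
function buildMongoUri({ user, password, host, database, options = {} }) {
  const credentials =
    `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const query = new URLSearchParams(options).toString();
  const base = `mongodb+srv://${credentials}@${host}/${database}`;
  return query ? `${base}?${query}` : base;
}

const uri = buildMongoUri({
  user: "appUser",
  password: "p@ss/word",
  host: "cluster0.example.mongodb.net", // hypothetical host name
  database: "myDatabase",
  options: { retryWrites: "true", w: "majority" },
});

console.log(uri);
// mongodb+srv://appUser:p%40ss%2Fword@cluster0.example.mongodb.net/myDatabase?retryWrites=true&w=majority
```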
MongoDB Shell Basics
MongoDB provides a JavaScript shell interface called mongosh (MongoDB Shell) for interacting with the database.
Connecting to MongoDB:
# Connect to local MongoDB
mongosh
# Connect to a specific database
mongosh myDatabase
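# Connect with authentication (omit the --password value to be prompted for it,
# which keeps the password out of your shell history)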
mongosh --username myUsername --password myPassword --authenticationDatabase admin
# Connect to MongoDB Atlas
mongosh "mongodb+srv://username:password@cluster0.mongodb.net/myDatabase"
Basic MongoDB Shell Commands:
// Show all databases
show dbs
// Switch to a database (it is actually created when data is first written to it)
use myDatabase
// Show collections in the current database
show collections
// Create a collection
db.createCollection("users")
// Insert a document
db.users.insertOne({
username: "johndoe",
email: "john@example.com",
age: 30,
active: true
})
// Find documents
db.users.find() // All documents
db.users.find({age: 30}) // With a filter
db.users.findOne({username: "johndoe"}) // Single document
// Update a document
db.users.updateOne(
{ username: "johndoe" },
{ $set: { email: "john.doe@example.com" } }
)
// Delete a document
db.users.deleteOne({ username: "johndoe" })
// Count documents
db.users.countDocuments()
// Drop a collection
db.users.drop()
// Drop a database
db.dropDatabase()
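To build intuition for what a filter such as `{ age: 30 }` and an update with `$set` do, here is a deliberately simplified in-memory imitation in plain JavaScript. It handles only top-level equality filters. `matches` and `updateOneSet` are hypothetical helpers, not MongoDB's implementation.

```javascript
// True if every field in the filter equals the same top-level field in the doc.
function matches(doc, filter) {
  return Object.entries(filter).every(([key, value]) => doc[key] === value);
}

// Mimic updateOne with $set: change only the named fields of the first match.
// Returns 1 if a document matched, otherwise 0.
function updateOneSet(collection, filter, fieldsToSet) {
  const target = collection.find((doc) => matches(doc, filter));
  if (!target) return 0;
  Object.assign(target, fieldsToSet);
  return 1;
}

const users = [
  { username: "johndoe", email: "john@example.com", age: 30, active: true },
  { username: "janedoe", email: "jane@example.com", age: 25, active: true },
];

const matched = updateOneSet(
  users,
  { username: "johndoe" },
  { email: "john.doe@example.com" }
);

console.log(matched); // 1
console.log(users[0]); // email changed; username, age, and active unchanged
console.log(users.filter((doc) => matches(doc, { age: 30 })).length); // 1
```

The point to notice is that `$set` leaves every field it does not name untouched.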
Query Formatting and Navigation:
// Format results for better readability (mongosh already pretty-prints by default)
db.users.find().pretty()
// Limit results
db.users.find().limit(5)
// Skip results (for pagination)
db.users.find().skip(5).limit(5)
// Sort results (1 for ascending, -1 for descending)
db.users.find().sort({ age: 1 })
db.users.find().sort({ lastLogin: -1 })
// Combine these operations
db.users.find({ active: true })
.sort({ lastLogin: -1 })
.skip(10)
.limit(10)
.pretty()
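The `skip(10).limit(10)` call above corresponds to page 2 with 10 results per page. The sketch below shows that arithmetic, plus an in-memory imitation of sort, then skip, then limit, which is the order MongoDB applies them in regardless of how the cursor methods are chained. The helper names and sample data are assumptions.

```javascript
// Convert a 1-based page number into skip/limit values.
function pageToSkipLimit(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

// In-memory equivalent of find().sort({ age: 1 }).skip(n).limit(m).
function paginate(docs, page, pageSize) {
  const { skip, limit } = pageToSkipLimit(page, pageSize);
  return [...docs] // copy so the input array is not reordered
    .sort((a, b) => a.age - b.age) // sort first...
    .slice(skip, skip + limit); // ...then skip and limit
}

const people = [
  { name: "A", age: 41 }, { name: "B", age: 23 }, { name: "C", age: 35 },
  { name: "D", age: 29 }, { name: "E", age: 52 },
];

console.log(pageToSkipLimit(2, 10)); // { skip: 10, limit: 10 }
console.log(paginate(people, 2, 2).map((p) => p.name)); // [ 'C', 'A' ]
```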
Practical Activities
Activity 1: Setting Up MongoDB
- Install MongoDB locally or create a MongoDB Atlas account
- Install MongoDB Compass for visual exploration
- Connect to your MongoDB instance using the MongoDB Shell
- Create a new database for your web development projects
- Create several test collections
Activity 2: Document Structure Practice
Design document structures for the following scenarios:
- A blog platform with users, posts, and comments
- An e-commerce store with products, categories, and orders
- A social media application with users, posts, and friend relationships
- A library management system with books, authors, and borrowing records
For each scenario:
- Decide when to use embedded documents vs. references
- Create example documents showing the structure
- Consider performance implications for common operations
Activity 3: Basic MongoDB Operations
Using the MongoDB Shell:
- Create a collection called "contacts"
- Insert at least 5 contact documents with various fields (name, email, phone, address, etc.)
- Query contacts based on different criteria (exact match, range, etc.)
- Update contact information
- Delete a contact
- Practice with sorting, limiting, and skipping results
Key Takeaways
- MongoDB is a NoSQL database that stores data in flexible, JSON-like documents
- Documents are grouped into collections, which are analogous to tables in relational databases
- MongoDB's flexible schema allows different documents in the same collection to have different fields
- Data in MongoDB can be structured with either embedded documents or references between documents
- Each document has a unique _id field, typically an ObjectId generated by MongoDB
- MongoDB is particularly well-suited for applications with evolving schemas and complex hierarchical data
- MongoDB can be installed locally, run in Docker, or used as a cloud service via MongoDB Atlas
- The MongoDB Shell (mongosh) provides a JavaScript interface for interacting with the database
Next Steps
In our next lecture, we'll explore CRUD operations in MongoDB in more detail and learn how to perform these operations using Node.js with the MongoDB native driver.