
Storage Elements in AEM 6.0
OVERVIEW OF PLATFORM CHANGES
One of the most important changes in AEM 6.0 is the innovation at the repository level.
Currently, there are two node storage implementations available in AEM 6: Tar storage and MongoDB
storage.
TAR STORAGE
The Tar storage uses tar files. It stores the content as various types of records within larger segments.
Journals are used to track the latest state of the repository.
It was built around several key design principles:
• Immutable Segments
The content is stored in segments that can be up to 256KiB in size. They are immutable, which makes it
easy to cache frequently accessed segments and reduce system errors that may corrupt the repository.
Each segment is identified by a unique identifier (UUID) and contains a continuous subset of the content
tree. In addition, segments can reference other content. Each segment keeps a list of UUIDs of other
referenced segments.
• Locality
Related records like a node and its immediate children are usually stored in the same segment. This makes
searching the repository very fast and avoids most cache misses for typical clients that access more than
one related node per session.
• Compactness
The formatting of records is optimized for size to reduce IO costs and to fit as much content in caches as
possible.
Running a freshly installed AEM instance with Tar Storage
By default, AEM 6.0 uses the Tar storage to store nodes and binaries, using the default configuration
options. To manually configure its storage settings, follow the procedure below:
1. Download the AEM 6.0 quickstart jar and place it in a new folder.
2. Unpack AEM by running:
java -jar cq-quickstart-6.0.0.jar -unpack
3. Create a folder named crx-quickstart\install in the installation directory.
4. Create a file called org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg in the
newly created folder.
5. Edit the file and set the configuration options. The following options are available for Segment Node
Store, which is the basis of AEM's Tar storage implementation:
• repository.home: Path to the repository home, under which various repository-related data is stored. By
default, segment files are stored under the crx-quickstart/segmentstore directory.
• tarmk.size: Maximum size of a tar file in MB. The default is 256 MB.
6. Start AEM.
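Steps 3 to 5 of this procedure can be scripted. The following is a minimal sketch rather than an official script. It assumes a POSIX shell, a variable named AEM_HOME (a name chosen here for illustration) that points at the installation directory, and a configuration file with one key=value pair per line. The repository.home value is an illustrative assumption; tarmk.size repeats the documented default.

```shell
#!/bin/sh
# Sketch only: AEM_HOME is a hypothetical variable naming the installation directory.
AEM_HOME="${AEM_HOME:-.}"
INSTALL_DIR="$AEM_HOME/crx-quickstart/install"
CFG="$INSTALL_DIR/org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg"

# Step 3: create the install folder (no error if it already exists).
mkdir -p "$INSTALL_DIR"

# Steps 4 and 5: write the two options described above, one key=value pair per line.
# repository.home is an assumed example value; tarmk.size is the documented default.
cat > "$CFG" <<EOF
repository.home=crx-quickstart/repository
tarmk.size=256
EOF

echo "Wrote $CFG"
```

Run the script from the folder that contains crx-quickstart, or set AEM_HOME first, and then start AEM as described in step 6.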
MONGO STORAGE
The MongoDB storage leverages MongoDB for sharding and clustering. The repository tree is kept in one
MongoDB database where each node is a separate document.
It has several particularities:
• Revisions
For each update (commit) of the content, a new revision is created. A revision is basically a string that
consists of three elements:
© 2012 Adobe Systems Incorporated.
All rights reserved.
Page 1
Created on 2015-01-20
1. A timestamp derived from the system time of the machine it was generated on
2. A counter to distinguish revisions created with the same timestamp
3. The cluster node id where the revision was created
• Branches
Branches are supported, which allows clients to stage multiple changes and make them visible with a single
merge call.
• Previous documents
MongoDB storage adds data to a document with every modification. However, it only deletes data if a
cleanup is explicitly triggered. Old data is moved when a certain threshold is met. Previous documents only
contain immutable data, which means they only contain committed and merged revisions.
• Cluster node metadata
Data about active and inactive cluster nodes is kept in the database in order to facilitate cluster operations.
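To make the three revision elements listed above concrete, the sketch below composes a revision-like string from a timestamp, a counter and a cluster node id. The exact layout used here, an "r" prefix followed by three hexadecimal fields separated by hyphens, is an assumption for illustration and is not specified in this document. The function name make_revision is hypothetical.

```shell
#!/bin/sh
# Sketch only: the "r<timestamp>-<counter>-<clusterId>" layout with hexadecimal
# fields is an assumption for illustration, not something this document specifies.
make_revision() {
  # $1 = timestamp in milliseconds, $2 = counter, $3 = cluster node id
  printf 'r%x-%x-%x\n' "$1" "$2" "$3"
}

# Two revisions created with the same timestamp on cluster node 1
# differ only in the counter field.
make_revision 1000 0 1
make_revision 1000 1 1
```

The counter is what keeps the two strings distinct when the timestamps collide, and the cluster node id keeps revisions from different cluster nodes apart.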
[Figure: A typical AEM cluster setup with MongoDB storage]
Running a freshly installed AEM instance with Mongo Storage
AEM 6.0 can be configured to run with MongoDB storage by following the procedure below:
1. Download the AEM 6.0 quickstart jar and place it into a new folder.
2. Unpack AEM by running the following command:
java -jar cq-quickstart-6.0.0.jar -unpack
3. Make sure that MongoDB is installed and an instance of mongod is running. For more info, see
Installing MongoDB.
4. Create a folder named crx-quickstart\install in the installation directory.
5. Configure the node store by creating a configuration file with the name of the configuration you want to
use in the crx-quickstart\install directory.
The Document Node Store (which is the basis for AEM's MongoDB storage implementation) uses a file
called org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg
6. Edit the file and set your configuration options. The following options are available:
• mongouri: The MongoURI required to connect to the Mongo database. The default is
mongodb://localhost:27017
• db: Name of the Mongo database. By default, new AEM 6 installations use aem-author as the
database name.
• cache: The cache size in MB. This is distributed among the various caches used in
DocumentNodeStore. The default is 256.
• changesSize: Size in MB of the capped collection used in Mongo for caching the diff output. The
default is 256.
• customBlobStore: Boolean value indicating that a custom data store will be used. The default is
false.
7. Create a configuration file with the PID of the data store you wish to use and edit the file in order to set
the configuration options. For more info, please see Configuring Node Stores and Data Stores.
8. Start the AEM 6 jar with a MongoDB storage backend by running:
java -jar cq-quickstart-6.0.0.jar -r crx3,crx3mongo
Here, -r specifies the backend runmode. In this example, AEM starts with MongoDB support.
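The node store configuration from steps 4 to 6 can be written by a short script. This is a minimal sketch under stated assumptions: a POSIX shell, a hypothetical AEM_HOME variable pointing at the installation directory, and a configuration file with one key=value pair per line. Every value is the default quoted in the option list above, so the file only makes those defaults explicit.

```shell
#!/bin/sh
# Sketch only: AEM_HOME is a hypothetical variable naming the installation directory.
AEM_HOME="${AEM_HOME:-.}"
INSTALL_DIR="$AEM_HOME/crx-quickstart/install"
CFG="$INSTALL_DIR/org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg"

# Step 4: create the install folder (no error if it already exists).
mkdir -p "$INSTALL_DIR"

# Steps 5 and 6: write the documented options with their documented defaults.
cat > "$CFG" <<EOF
mongouri=mongodb://localhost:27017
db=aem-author
cache=256
changesSize=256
customBlobStore=false
EOF

echo "Wrote $CFG"
```

Adjust mongouri and db to match the running mongod instance before starting AEM with the crx3,crx3mongo runmodes.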
MAINTAINING THE REPOSITORY
Compacting Tar Files
As data is never overwritten in a tar file, the disk usage increases even when only updating existing data.
To make up for the growing size of the repository, AEM employs a garbage collection mechanism called Tar
Compaction. The mechanism will reclaim disk space by removing obsolete data from the repository.
Revision Clean Up
By default, tar file compaction runs automatically each night between 2 am and 5 am. Compaction can
also be triggered manually in the Operations Dashboard via a maintenance job called Revision
Clean Up.
To start Revision Clean Up you need to:
1. Open AEM.
2. In the main AEM window, go to Tools - Operations - Dashboard - Maintenance or directly browse to
http://localhost:4502/libs/granite/operations/content/maintenance.html
3. Click on Daily Maintenance Window.
4. Hover over the Revision Clean Up window and press the Start button.
The icon will turn orange to indicate that the Revision Clean Up job is running. You can stop it at any time by
hovering the mouse over the icon and pressing the Stop button.
Invoking Revision Garbage Collection via the JMX Console
1. Open the JMX Console by going to http://localhost:4502/system/console/jmx
2. Click the RevisionGarbageCollection MBean.
3. In the next window, click startRevisionGC() and then Invoke to start the Revision Garbage Collection
job.
NOTE
Due to the mechanics of the garbage collection, the first run will actually add 256 MB of disk
space. Subsequent runs will work as expected and start shrinking the repository size.
The Oak-run tool
For situations where normal garbage collection doesn't work, Adobe provides a manual Tar compaction tool
called Oak-run. It can be requested by logging a ticket with Adobe AEM Support.
The tool is a runnable jar that can be run manually to compact the repository. Because no other
client can access the repository while the tool is running, the repository must be shut down before
executing it.
Normal operation of the tool also requires old checkpoints to be cleared before the compaction takes
place. Because of this, a full content reindex will be required after running the tool. For more information on
indexing, please see Queries and Indexing.
The procedure to run the tool is:
1. First, delete old checkpoints:
java -jar oak-run.jar checkpoints install-folder/crx-quickstart/repository/segmentstore rm-all
2. Run the compaction and wait for it to complete:
java -jar oak-run.jar compact install-folder/crx-quickstart/repository/segmentstore
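The two steps can be chained so that compaction only starts once the checkpoints have been removed successfully. The wrapper below is a minimal sketch, not part of the Oak-run tool: the variables INSTALL_FOLDER, OAK_RUN_JAR and DRY_RUN and the functions run and compact_offline are hypothetical names introduced here. By default it only prints the commands, so it is safe to try before the repository has been shut down.

```shell
#!/bin/sh
# Sketch only: INSTALL_FOLDER, OAK_RUN_JAR and DRY_RUN are hypothetical variables.
INSTALL_FOLDER="${INSTALL_FOLDER:-install-folder}"
OAK_RUN_JAR="${OAK_RUN_JAR:-oak-run.jar}"
DRY_RUN="${DRY_RUN:-1}"
SEGMENT_STORE="$INSTALL_FOLDER/crx-quickstart/repository/segmentstore"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1 must succeed before step 2 starts; && stops the sequence on failure.
compact_offline() {
  run java -jar "$OAK_RUN_JAR" checkpoints "$SEGMENT_STORE" rm-all &&
  run java -jar "$OAK_RUN_JAR" compact "$SEGMENT_STORE"
}

compact_offline
```

Set DRY_RUN=0 to execute the commands for real, and only after AEM has been shut down.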