Lecture 1.1 What is PAT and How to use it? Content

Content
● A short reminder of the CMS EDM and Analysis Workflow
● The answer to the question: What is PAT?
● An introduction to the PAT DataFormats
● Configuration of the PAT DataFormats
● An introduction to the PAT Workflow
● Support and Documentation
PAT Tutorial June 2010
Reminder of the Event Data Model
● Configurable edm::Modules communicate with/via the EventContent
● Same file structure (i.e. root) for: Gen-Sim-Digi-Reco-Analysis
● Single framework for Reconstruction (POGs) and Analysis (PAGs)
Typical CMS Analysis Workflow
● Prompt reconstruction at Tier-0.
● Central skims at Tier-1's.
● Users run cmsRun at Tier-2's:
● Perform high level analysis steps.
● Preselect events.
● Write their own user-defined EventContent to private T2/T3 space.
● The latter step might be iterated.
● Copy reduced datasets to your favorite machine.
● Run your final analysis/produce plots.
PAT helps you to create a user-defined EventContent
What is the Physics Analysis Toolkit
PAT is a toolkit that is part of the CMSSW framework
● It serves as a well tested and supported common ground for group and user analyses.
● It facilitates reproducibility and comprehensibility of analyses.
● It is an interface between the sometimes complicated EDM and the simple mind of the common user.
● You can view it as a common language between CMS analysts:
● If another CMS analyst describes a PAT analysis to you, you can easily understand what he/she is talking about.
Three Aspects of PAT
Interface
● between RECO expertise & Analysis Level
● simplifies access via DataFormats
● channels expertise (via POG & PAG contacts)
Common Tool
● approved algorithms & sensible defaults
● synergy (everybody can profit from recent developments)
● quick start into analysis for beginners
● crossing point between POGs & PAGs ('vertical integration')
Common Format
● facilitates transfer & comparisons
● PAG common configurations
● sustained provenance
Facilitated Access to Event Information
● Do you know how to access this event information within the EDM?
[Figure: a reco::Candidate and the event information associated with it: Object Id, Cluster shapes; Isolation (different from defaults); Correction Factors, Object Resolutions; JetFlavor; Generator Match, Trigger Match; Associated Tracks, JetCharge; BTag Algorithms, TagInfos; More, ...]
● With PAT Candidates you get this just by calling member functions!
● Note: Each PAT Candidate IS a corresponding reco::RecoCandidate (and more)
The PAT Data Formats
● All pat::Objects inherit from their corresponding reco::RecoCandidates
● A PAT Candidate is a reco::RecoCandidate PLUS more.
PAT Candidate Member Functions
Check the Documentation: SWGuidePATDataFormats
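The "is a reco::RecoCandidate PLUS more" relation can be pictured with a small toy model. This is a plain-Python sketch only: the classes RecoJet and PatJet, their method names and the tagger name "toyTagger" are hypothetical stand-ins chosen for illustration, not the CMSSW classes or their member functions (see SWGuidePATDataFormats for those).

```python
class RecoJet:
    """Hypothetical stand-in for a reco-level candidate (not the CMSSW class)."""

    def __init__(self, pt, eta, phi):
        self._pt, self._eta, self._phi = pt, eta, phi

    def pt(self):
        return self._pt

    def eta(self):
        return self._eta

    def phi(self):
        return self._phi


class PatJet(RecoJet):
    """Hypothetical stand-in for a PAT candidate: a RecoJet plus extra information."""

    def __init__(self, pt, eta, phi, discriminators=None, jet_charge=None):
        super().__init__(pt, eta, phi)
        # Extra information carried by the object itself.
        self._discriminators = dict(discriminators or {})
        self._jet_charge = jet_charge

    def b_discriminator(self, name):
        """Return the stored discriminator value for the given tagger name."""
        return self._discriminators[name]

    def jet_charge(self):
        return self._jet_charge


jet = PatJet(52.3, 0.7, 1.2, discriminators={"toyTagger": 0.91}, jet_charge=-0.3)

# The PAT-like object still behaves like the reco-like object ...
print(isinstance(jet, RecoJet), jet.pt(), jet.eta())
# ... and the extra information is reached by member functions as well.
print(jet.b_discriminator("toyTagger"), jet.jet_charge())
```

The point of the sketch is only the design: code written against the base class keeps working, while the derived object carries the additional information with it.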
Combine Flexibility and User Friendliness
● You can choose for yourself whether you really need all the extra information that the PAT Candidates provide.
● Still, you don't need to know how EDM/PAT manages this access for you under the hood.
[Figure: Flexibility, User Friendliness, Maximal Configuration]
● The key is: configuration of DataFormats by cfi file! (E.g. for pat::Jets).
Configuration of PAT DataFormats
You can configure the content of the DataFormats yourself (example: pat::Jet)!
import FWCore.ParameterSet.Config as cms

patJets = cms.EDProducer("PATJetProducer",
    ...
    # embedding of AOD items
    embedCaloTowers = cms.bool(False),
    embedPFCandidates = cms.bool(False),
    # jet energy corrections
    addJetCorrFactors = cms.bool(True),
    jetCorrFactorsSource = cms.VInputTag("patJetCorrFactors"),
    # btag information
    addBTagInfo = cms.bool(True),
    addDiscriminators = cms.bool(True),
    discriminatorSources = cms.VInputTag(
        ...
    ),
    # clone tag infos ATTENTION: these take lots of space!
    # usually the discriminators from the default algos
    # are sufficient
    addTagInfos = cms.bool(True),
    tagInfoSources = cms.VInputTag(
        ...
    ),
    # track association
    addAssociatedTracks = cms.bool(True),
    trackAssociationSource = cms.InputTag("ak5JetTracksAssociatorAtVertex"),
    # jet charge
    addJetCharge = cms.bool(True),
    jetChargeSource = cms.InputTag("patJetCharge"),
    # add jet ID
    addJetID = cms.bool(True),
    jetIDMap = cms.InputTag("ak5JetID"),
)
Size: 14 kB/event (for ttbar)
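A per-event size translates directly into storage needs. The following sketch does that arithmetic for the 14 kB/event quoted here; the function name, the sample size of one million events and the convention 1 GB = 10^6 kB are assumptions made for illustration.

```python
def tuple_size_gb(n_events, kb_per_event):
    """Approximate tuple size in GB, taking 1 GB = 10**6 kB (assumed convention)."""
    return n_events * kb_per_event / 1.0e6


n_events = 1000000  # hypothetical sample size
size_gb = tuple_size_gb(n_events, 14.0)  # 14 kB/event as quoted for ttbar
print("%d events at 14 kB/event: about %.1f GB" % (n_events, size_gb))
```

The same function can be reused with any other per-event size, for example after switching embedding options on or off.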
The PAT Workflow
Have a look at: SWGuidePATWorkflow
Stages shown in the workflow diagram:
● Pre-production steps before PAT Candidate creation
● PAT Candidate creation
● Main collection (w/o cleaning)
● Main collection (with cleaning)
This workflow is reflected in the structure of the python directory in the PatAlgos package (don't be shy, check it out!)
EventContent of the default PAT Tuple
● Have a look at patEventContent_cff.py
● Have a look at patTemplate_cfg.py
● Size: 20 kB/event (for ttbar)
But decide for yourself what your PAT Tuple should look like (add reco::Tracks or reco::GenParticles to the Event Content or BTag information to the jets, etc ... ).
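Such a customization might look like the following configuration fragment. It is a sketch under assumptions, not taken from this lecture: it assumes that patTemplate_cfg.py defines a process with an output module called out and a jet producer labelled patJets, and that the RECO collections carry the labels generalTracks and genParticles. It only runs inside a CMSSW environment.

```python
# Configuration fragment (sketch): needs a CMSSW environment, not standalone Python.
# Assumption: patTemplate_cfg defines `process`, `process.out` and `process.patJets`.
from PhysicsTools.PatAlgos.patTemplate_cfg import *

# Extend the EventContent of the output file.
# The collection labels below are assumed standard RECO labels.
process.out.outputCommands += [
    'keep recoTracks_generalTracks_*_*',       # add reco::Tracks
    'keep recoGenParticles_genParticles_*_*',  # add reco::GenParticles
]

# Keep the b-tag TagInfos in the jets (remember that these take lots of space).
process.patJets.addTagInfos = True
```

Every extra collection increases the size per event, so it is worth re-checking the event size after each change.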
The concept of Maximal Configuration
● Configure your own DataFormats via embedding (see Lecture 2.2/Exercise 06).
● Configure your workflow via tools that PAT provides (see Lecture 2.1/Exercise 05).
● Add any extra info you need to the EventContent.
● Apply selections via the StringCutParser.
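A selection via the StringCutParser is written as a plain string in the configuration. The fragment below is a minimal sketch, assuming the PATJetSelector module from PatAlgos and an input collection labelled patJets; the module label selectedPatJets and the cut values are illustrative only. It needs a CMSSW environment to be loaded.

```python
# Configuration fragment (sketch): needs a CMSSW environment, not standalone Python.
import FWCore.ParameterSet.Config as cms

# Assumption: a PAT jet collection labelled "patJets" exists upstream.
selectedPatJets = cms.EDFilter("PATJetSelector",
    src = cms.InputTag("patJets"),
    # The cut string is evaluated on each jet; the values are illustrative only.
    cut = cms.string("pt > 30. & abs(eta) < 2.4")
)
```

Because the cut is a configuration parameter, it can be changed without recompiling any code.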
The Code Location
DataFormats/PatCandidates
● Definition of all PAT Candidates.
● pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, ...
PhysicsTools/PatAlgos
● Implementation and filling of all data formats.
● Definition of common workflow and PAT tools.
PhysicsTools/PatUtils
● Definition of common tools and helper functions used in PatAlgos.
PhysicsTools/PatExamples
● Location of many examples, e.g. all non-trivial examples used during this Tutorial.
Development
PAT is part of any CMSSW release. We recommend using it from the release!
Have a look at: SWGuidePATRecipes
Development (cont'd)
If you already want to use features/fixes that will go into the next release, follow the PAT release notes in the corresponding development branch.
Support
Check the main entry page of PAT in the software guide: SWGuidePAT
A short extract of possible support:
● Lecturers & Tutors
● Hypernews
● Community
● POG/PAG contacts
● Developers
● The well-developed PAT Documentation!
Documentation
● SWGuidePAT/WorkBookPAT: Main documentation pages.
● WorkBookPATDataFormats: Description of all PAT Candidates.
● WorkBookPATWorkflow: Description of the PAT workflow.
● WorkBookPATConfiguration: Description of the configuration of PAT.
● SWGuidePATTools: Description of all PAT tools.
● WorkBookPATTutorial: Tutorials and examples to get started.
● SWGuidePATRecipes: Installation recipes.
● SWGuidePATEventSize: Tools for event size estimate.
And last but not least: This Tutorial and/or former Tutorials...
Exercises
By now you should be prepared to do the following Exercises on WorkBookPATTutorial: Have Fun!
● Exercise 1 (WorkBookPATDocNavigationExercise)
The PAT Documentation is one of the most looked after parts of the WorkBook. Knowing your documentation and how to use it can speed up your learning enormously. Learn more about the PAT Documentation and how to make effective use of it.
● Exercise 2 (WorkBookTupleCreationExercise)
Learn how the default PAT tuple is produced, so that you are prepared to produce your own PAT tuples.
● Exercise 3 (WorkBookTupleCrapExercise)
This is part of the CRAB tutorial. Once you are doing large-scale analyses you will need CRAB.