CRAB documentation at CERN

The official CRAB documentation can be found here

A CRAB example

This is an example of how to run a simple analysis using CRAB on LXPLUS.

The CRAB version used for this example is CRAB_2_6_3_patch_2

Prerequisites:

  • a ZH account on LXPLUS
  • a valid GRID certificate installed on your ZH account on LXPLUS (see GridCertificates)

  1. log in to LXPLUS as <LXPLUSusername>
  2. setup your analysis (we will use the CMSSW_3_1_4 jet analysis example available here)
    cmsrel CMSSW_3_1_4
    cd CMSSW_3_1_4/src
    cvs co -r V01-08-08-02 CondFormats/JetMETObjects
    cvs co -r V01-08-21-01 JetMETCorrections/Configuration
    cvs co -r V00-04-08 RecoJets/JetAnalyzers
    scram b
    cmsenv
    cd RecoJets/JetAnalyzers/test
    cmsRun runL2L3JetCorrectionExample_cfg.py
  3. if cmsRun was executed without problems, you may now prepare the crab.cfg file
    Sample crab.cfg:
    #=================================================================================================
    [CRAB]                                                              
    
    jobtype = cmssw
    scheduler = glite
    ### NOTE: if you just set the name of a server (pi, lnl, etc.),
    ###       crab will submit the jobs through that server
    #server_name = bari                                             
    #                                                               
    [CMSSW]                                                         
    
    ### The data you want to access (to be found on DBS)
    #datasetpath=/ttbar_inclusive_TopRex/CMSSW_1_3_1-Spring07-1122/GEN-SIM-DIGI-RECO
    datasetpath=none                                                                
    
    ### The ParameterSet you want to use
    pset=pythia.cfg                     
    
    ### Splitting parameters
    #total_number_of_events=-1
    total_number_of_events=10 
    #events_per_job = 1000    
    number_of_jobs = 5        
    
    ### The output files (comma separated list)
    output_file = mcpool.root                  
    
    [USER]
    
    ### OUTPUT files Management
    ##  output back into UI    
    return_data = 1            
    
    ### To use a specific name for the UI directory where CRAB will create the jobs to submit (with full path).
    ### the default directory will be "crab_0_<date>_<time>"
    #ui_working_dir = /full/path/Name_of_Directory                                                   
    
    ### To specify the UI directory where to store the CMS executable output
    ### FULL path is mandatory. Default is <ui_working_dir>/res.
    #outputdir= /full/path/yourOutDir                                         
    
    ### To specify the UI directory where to store the stderr, stdout and .BrokerInfo of submitted jobs
    ### FULL path is mandatory. Default is <ui_working_dir>/res.
    #logdir= /full/path/yourLogDir                                                                     
    
    ### OUTPUT files INTO A SE
    copy_data = 0             
    
    ### if you want to copy data to an "official CMS site"
    ### you have to specify its official site name
    #storage_element = T2_IT_Bari                        
    ### the user_remote_dir will be created under the SE mountpoint
    ### in the case of publication this directory is not considered
    #user_remote_dir = name_directory_you_want                     
    
    ### if you want to copy your data at CAF
    #storage_element = T2_CH_CAF            
    ### the user_remote_dir will be created under the SE mountpoint
    ### in the case of publication this directory is not considered
    #user_remote_dir = name_directory_you_want                     
    
    ### if you want to copy your data to your area in castor at cern
    ### or in a "not official CMS site" you have to specify the complete name of SE
    #storage_element=srm-cms.cern.ch
    ### this directory is the mountpoint of the SE
    #storage_path=/srm/managerv2?SFN=/castor/cern.ch/
    ### directory or tree of directories under the mountpoint
    #user_remote_dir = name_directory_you_want
    
    
    ### To publish the produced output in a local instance of DBS set publish_data = 1
    publish_data=0
    ### Specify the dataset name. The full path will be <primarydataset>/<publish_data_name>/USER
    publish_data_name = name_you_prefer
    ### Specify the URL of the DBS instance where CRAB has to publish the output files
    #dbs_url_for_publication = https://cmsdbsprod.cern.ch:8443/cms_dbs_caf_analysis_01_writer/servlet/DBSServlet
    
    ### To specify additional files to be put in the InputSandBox
    ### write the full path if the files are not in the current directory
    ### (wildcards * are allowed): comma separated list
    #additional_input_files = file1, file2, /full/path/file3
    
    ### if using a CRAB server
    #thresholdLevel = 100
    #eMail = your@Email.address
    
    [GRID]
    #
    ## RB/WMS management:
    rb = CERN
    
    ##  Black and White Lists management:
    ## By Storage
    se_black_list = T0,T1
    #se_white_list =
    
    ## By ComputingElement
    #ce_black_list =
    #ce_white_list =
    
    [CONDORG]
    
    # Set this to condor to override the batchsystem defined in gridcat.
    #batchsystem = condor
    
    # Specify additional condor_g requirements
    # use this requirement to run on CMS-dedicated hardware
    # globus_rsl = (condor_submit=(requirements 'ClusterName == \"CMS\" && (Arch == \"INTEL\" || Arch == \"X86_64\")'))
    # use this requirement to run on the new hardware
    #globus_rsl = (condor_submit=(requirements 'regexp(\"cms-*\",Machine)'))
    #=================================================================================================
    
    Modify the sample crab.cfg file according to your needs:
    • change the datasetpath, e.g.
      datasetpath=/QCDDiJet_Pt170to230/Summer09-MC_31X_V3-v1/GEN-SIM-RECO
    • change the pset to the one used before:
      pset=runL2L3JetCorrectionExample_cfg.py
    • adjust the splitting parameters (set two of the three), e.g.
      total_number_of_events=-1 
      events_per_job = 10000
      # number_of_jobs = 20
      
    • adjust the output file(s), e.g.
      output_file = CorJetHisto_SC5PF.root
    • describe where output files are to be saved, e.g.
      copy_data = 1
      storage_element = T2_GR_Ioannina
      user_remote_dir = MyTest
      # The area in which your output file will be written is:
      #   site's endpoint + /store/user/<LXPLUSusername>/<user_remote_dir>/<output-file-name>
      #   for T2_GR_Ioannina, the site's endpoint is srm://grid02.physics.uoi.gr/dpm/physics.uoi.gr/home/cms
      
    • adjust ce_black_list, ce_white_list, se_black_list and se_white_list as needed.
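    The "two out of three" splitting rule can be checked before creating jobs. A minimal sketch (a hypothetical helper, not part of CRAB) that counts the uncommented splitting parameters in a crab.cfg-style file:

```shell
# Hypothetical helper (not part of CRAB): CRAB expects exactly two of the
# three splitting parameters to be uncommented in the [CMSSW] section.
count_splitting() {
    # count lines that set one of the three splitting parameters
    grep -Ec '^[[:space:]]*(total_number_of_events|events_per_job|number_of_jobs)[[:space:]]*=' "$1"
}

# write a small sample config and check it
cat > /tmp/sample_crab.cfg <<'EOF'
[CMSSW]
total_number_of_events=-1
events_per_job = 10000
# number_of_jobs = 20
EOF
count_splitting /tmp/sample_crab.cfg   # prints 2
```

    If the count is not 2, comment out or uncomment parameters until exactly two remain active.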
  4. setup the GRID environment
    source /afs/cern.ch/cms/LCG/LCG-2/UI/cms_ui_env.sh
    voms-proxy-init -voms cms -hours 192 -vomslife 192:0
    • You will be asked to enter your GRID passphrase
      
      
  5. setup the CRAB environment
    source /afs/cern.ch/cms/ccs/wm/scripts/Crab/crab.sh
    
  6. create the CRAB jobs
    crab -create
    
  7. submit the CRAB jobs to the GRID
    crab -submit
    
  8. monitor the job status
    crab -status
    
  9. get the job output
    crab -getoutput
    
  10. look for the files now stored on the Storage Element:
    lcg-ls -l srm://grid02.physics.uoi.gr/dpm/physics.uoi.gr/home/cms/store/user/<LXPLUSusername>/MyTest/
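    The path queried above can also be composed by hand from its pieces. A sketch (the username jsmith is a placeholder; the T2_GR_Ioannina endpoint is the one quoted in step 3):

```shell
# Illustrative: compose the SRM path where the output lands, following
# <site endpoint>/store/user/<LXPLUSusername>/<user_remote_dir>/<output-file-name>
ENDPOINT="srm://grid02.physics.uoi.gr/dpm/physics.uoi.gr/home/cms"  # T2_GR_Ioannina
USERNAME="jsmith"                    # placeholder: your LXPLUS username
REMOTE_DIR="MyTest"                  # user_remote_dir from crab.cfg
OUTFILE="CorJetHisto_SC5PF.root"     # output_file from crab.cfg

SRM_PATH="${ENDPOINT}/store/user/${USERNAME}/${REMOTE_DIR}/${OUTFILE}"
echo "$SRM_PATH"   # the full srm:// path to pass to lcg-ls / lcg-cp
```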
    

-- IoannisPapadopoulos - 2012-10-31
